Philipp Kunz 899045e6aa fix(protocol): Fix inline timing metadata parsing and enhance test coverage for performance metrics and timing edge cases

2025-05-26 08:22:26 +00:00


Architecture Overview

Project Structure

This project integrates tstest with tapbundle through a modular architecture:

tstest (/ts/) - The test runner that discovers and executes test files
tapbundle (/ts_tapbundle/) - The TAP testing framework for writing tests
tapbundle_node (/ts_tapbundle_node/) - Node.js-specific testing utilities

How Components Work Together

Test Execution Flow

CLI Entry Point (cli.js → cli.ts.js → cli.child.ts)
- The CLI uses tsx to run TypeScript files directly
- Accepts glob patterns to find test files
- Supports options like --verbose, --quiet, --web
Test Discovery
- tstest scans for test files matching the provided pattern
- Defaults to test/**/*.ts when no pattern is specified
- Supports both file and directory modes
Test Runner
- Each test file imports tap and expect from tapbundle
- Tests are written using tap.test() with async functions
- Browser tests are compiled with esbuild and run in Chromium via Puppeteer

Key Integration Points

Import Structure
- Test files import from local tapbundle: import { tap, expect } from '../../ts_tapbundle/index.js'
- Node-specific tests also import from tapbundle_node: import { tapNodeTools } from '../../ts_tapbundle_node/index.js'
WebHelpers
- Browser tests can use webhelpers for DOM manipulation
- webhelpers.html - Template literal for creating HTML strings
- webhelpers.fixture - Creates DOM elements from HTML strings
- Automatically detects browser environment and only enables in browser context
Build System
- Uses tsbuild tsfolders to compile TypeScript (invoked by pnpm build)
- Maintains separate output directories: /dist_ts/, /dist_ts_tapbundle/, /dist_ts_tapbundle_node/, /dist_ts_tapbundle_protocol/
- Compilation order is resolved automatically based on dependencies in tspublish.json files
- Protocol imports use compiled dist directories:
```
// In ts/tstest.classes.tap.parser.ts
import { ProtocolParser } from '../dist_ts_tapbundle_protocol/index.js';

// In ts_tapbundle/tapbundle.classes.tap.ts  
import { ProtocolEmitter } from '../dist_ts_tapbundle_protocol/index.js';
```

Test Scripts

The package.json defines several test scripts:

test - Builds and runs all tests (tapbundle and tstest)
test:tapbundle - Runs tapbundle framework tests
test:tstest - Runs tstest's own tests
Both support :verbose variants for detailed output

Environment Detection

The framework automatically detects the runtime environment:

Node.js tests run directly via tsx
Browser tests are compiled and served via a local server
WebHelpers are only enabled in browser environment

This architecture allows for seamless testing across both Node.js and browser environments while maintaining a clean separation of concerns.

Logging System

Log File Naming (Fixed in v1.9.1)

When using the --logfile flag, tstest creates log files in .nogit/testlogs/. The log file naming was updated to preserve directory structure and prevent collisions:

Old behavior: test/tapbundle/test.ts → .nogit/testlogs/test.log
New behavior: test/tapbundle/test.ts → .nogit/testlogs/test__tapbundle__test.log

This fix ensures that test files with the same basename in different directories don't overwrite each other's logs. The implementation:

Takes the relative path from the current working directory
Replaces path separators (/) with double underscores (__)
Removes the .ts extension
Creates a flat filename that preserves the directory structure
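The four steps above can be sketched as a small pure function. The name `logFileNameFor` is hypothetical, and the sketch assumes the path has already been made relative to the current working directory; it is an illustration, not tstest's actual code.

```typescript
// Hypothetical helper for illustration only.
// Input: a test file path relative to the current working directory.
function logFileNameFor(relativeTestPath: string): string {
  const normalized = relativeTestPath
    .replace(/\\/g, '/') // tolerate Windows separators (an assumption)
    .replace(/^\.\//, ''); // drop a leading "./"
  const withoutExtension = normalized.replace(/\.ts$/, '');
  // Flatten the directory structure into a single file name.
  return withoutExtension.split('/').join('__') + '.log';
}

console.log(logFileNameFor('test/tapbundle/test.ts'));
// test__tapbundle__test.log
```

Two files named test.ts in different directories now map to different log names, which is the collision the fix addresses.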

Test Timing Display (Fixed in v1.9.2)

Fixed an issue where test timing was displayed incorrectly with duplicate values like:

Before: ✅ test name # time=133ms (0ms)
After: ✅ test name (133ms)

The cause was the TAP parser regex: its greedy (.*) group captured the rest of the line, including the TAP timing comment, so the comment ended up in the test name and no timing value was parsed. Changing the group from (.*) to (.*?) makes it non-greedy, which separates the test name from the timing metadata.
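The difference between the greedy and non-greedy group can be reproduced with a pair of simplified patterns. These patterns and the `parseName` helper are assumptions made for illustration; the real parser regex is not shown in these notes.

```typescript
// Simplified, hypothetical patterns; only the quantifier on the name group differs.
const greedy = /^ok (\d+) - (.*)(?: # time=(\d+)ms)?$/;
const lazy = /^ok (\d+) - (.*?)(?: # time=(\d+)ms)?$/;

interface ParsedLine {
  name: string;
  timeMs: number | undefined;
}

function parseName(pattern: RegExp, line: string): ParsedLine | null {
  const match = pattern.exec(line);
  if (!match) return null;
  return {
    name: match[2],
    timeMs: match[3] === undefined ? undefined : Number(match[3]),
  };
}

const sampleLine = 'ok 1 - test name # time=133ms';
const greedyResult = parseName(greedy, sampleLine);
// greedyResult: name is "test name # time=133ms", timeMs is undefined
const lazyResult = parseName(lazy, sampleLine);
// lazyResult: name is "test name", timeMs is 133
```

With the greedy group the optional timing group never gets a chance to match, which is consistent with the duplicated "# time=133ms (0ms)" display described above.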

Protocol Limitations and Improvements

Current TAP Protocol Issues

The current implementation uses standard TAP format with metadata in comments:

ok 1 - test name # time=123ms

This has several limitations:

Delimiter Conflict: Test descriptions containing # can break parsing
Regex Fragility: Complex regex patterns that are hard to maintain
Limited Metadata: Difficult to add rich error information or custom data

Planned Protocol V2

A new internal protocol is being designed that will:

Use Unicode delimiters ⟦TSTEST:⟧ that won't conflict with test content
Support structured JSON metadata
Allow rich error reporting with stack traces and diffs
Completely replace v1 protocol (no backwards compatibility)
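As a rough sketch of the idea, structured metadata can be wrapped in the Unicode markers and recovered without depending on # as a delimiter. The helper names `emitMeta` and `parseMeta` are hypothetical; the META block layout follows the example given in the timing section of these notes, and the actual classes live in ts_tapbundle_protocol/.

```typescript
const MARKER_START = '⟦TSTEST:';
const MARKER_END = '⟧';

// Hypothetical helper: wraps structured metadata in a META block.
function emitMeta(metadata: Record<string, unknown>): string {
  return `${MARKER_START}META:${JSON.stringify(metadata)}${MARKER_END}`;
}

// Hypothetical helper: returns the metadata if the whole line is a META block.
function parseMeta(line: string): Record<string, unknown> | null {
  const prefix = `${MARKER_START}META:`;
  if (!line.startsWith(prefix) || !line.endsWith(MARKER_END)) return null;
  const json = line.slice(prefix.length, line.length - MARKER_END.length);
  try {
    return JSON.parse(json) as Record<string, unknown>;
  } catch {
    return null; // not valid JSON, so not a metadata block
  }
}

const metaLine = emitMeta({ time: 456, file: 'test.ts' });
// metaLine: ⟦TSTEST:META:{"time":456,"file":"test.ts"}⟧
const roundTripped = parseMeta(metaLine);
```

Because the payload is JSON, a description such as "handles # in names" can travel inside the metadata without confusing the parser.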

ts_tapbundle_protocol Directory

The protocol v2 implementation is contained in a separate ts_tapbundle_protocol directory:

Isomorphic Code: All protocol code works in both browser and Node.js environments
No Platform Dependencies: No Node.js-specific imports, ensuring true cross-platform compatibility
Clean Separation: Protocol logic is isolated from platform-specific code in tstest and tapbundle
Shared Implementation: Both tstest (parser) and tapbundle (emitter) use the same protocol classes
Build Process:
- Compiled by pnpm build via tsbuild to dist_ts_tapbundle_protocol/
- Build order managed through tspublish.json files
- Other modules import from the compiled dist directory, not source

This architectural decision ensures the protocol can be used in any JavaScript environment without modification and maintains proper build dependencies.

See readme.protocol.md for the full specification and ts_tapbundle_protocol/ for the implementation.

Protocol V2 Implementation Status

The Protocol V2 has been implemented to fix issues with TAP protocol parsing when test descriptions contain special characters like #, ###SNAPSHOT###, or protocol markers like ⟦TSTEST:ERROR⟧.

Implementation Details:

Protocol Components:
- ProtocolEmitter - Generates protocol v2 messages (used by tapbundle)
- ProtocolParser - Parses protocol v2 messages (used by tstest)
- Uses Unicode markers ⟦TSTEST: and ⟧ to avoid conflicts with test content
Current Status:
- ✅ Basic protocol emission and parsing works
- ✅ Handles test descriptions with special characters correctly
- ✅ Supports metadata for timing, tags, errors
- ⚠️ Protocol messages sometimes appear in console output (parsing not catching all cases)
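Key Findings (from the initial implementation; the skip and todo gaps have since been resolved, see Fixed Issues):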
- tap.skip.test() doesn't create actual test objects, just logs and increments counter
- tap.todo() method is not implemented (no addTodo method in Tap class)
- Protocol parser's isBlockStart was fixed to only match exact block markers, not partial matches in test descriptions
Import Paths:
- tstest imports from: import { ProtocolParser } from '../dist_ts_tapbundle_protocol/index.js';
- tapbundle imports from: import { ProtocolEmitter } from '../dist_ts_tapbundle_protocol/index.js';

Test Configuration System (Phase 2)

The Test Configuration System has been implemented to provide global settings and lifecycle hooks for tests.

Key Features:

00init.ts Discovery:
- Automatically detects 00init.ts files in the same directory as test files
- Creates a temporary loader file that imports both 00init.ts and the test file
- Loader files are cleaned up automatically after test execution
Settings Inheritance:
- Global settings from 00init.ts → File-level settings → Test-level settings
- Settings include: timeout, retries, retryDelay, bail, concurrency
- Lifecycle hooks: beforeAll, afterAll, beforeEach, afterEach
Implementation Details:
- SettingsManager class handles settings inheritance and merging
- tap.settings() API allows configuration at any level
- Lifecycle hooks are integrated into test execution flow
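The inheritance order (global → file → test) can be illustrated with a simple merge in which later levels win. This is a sketch under assumptions: the `TestSettings` shape only lists the setting names mentioned above, and `mergeSettings` is a hypothetical stand-in for what SettingsManager does.

```typescript
// Assumed shape, based on the setting names listed above.
interface TestSettings {
  timeout?: number;
  retries?: number;
  retryDelay?: number;
  bail?: boolean;
  concurrency?: number;
}

// Hypothetical merge: later levels override earlier ones.
// Note: a key explicitly set to undefined would also override in this sketch.
function mergeSettings(...levels: TestSettings[]): TestSettings {
  return levels.reduce<TestSettings>(
    (merged, level) => ({ ...merged, ...level }),
    {},
  );
}

const globalSettings: TestSettings = { timeout: 5000, retries: 2 };
const fileSettings: TestSettings = { timeout: 10000 };
const testSettings: TestSettings = { retries: 0 };

const effective = mergeSettings(globalSettings, fileSettings, testSettings);
// effective: { timeout: 10000, retries: 0 }
```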

Important Development Notes:

Local Development: When developing tstest itself, use node cli.js instead of globally installed tstest to test changes
Console Output Buffering: Console output from tests is buffered and only displayed for failing tests. TAP-compliant comments (lines starting with #) are always shown.
TypeScript Warnings: Fixed async/await warnings in movePreviousLogFiles() by using sync versions of file operations
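The console output buffering rule noted above can be expressed as a small filter. The function name `visibleOutput` and its inputs are hypothetical; the sketch only restates the rule and does not reflect how tstest stores its buffers.

```typescript
// Hypothetical sketch: decides which captured lines of one test get printed.
function visibleOutput(capturedLines: string[], testPassed: boolean): string[] {
  if (!testPassed) {
    return capturedLines; // failing tests show their full buffered output
  }
  // Passing tests only surface TAP-compliant comment lines.
  return capturedLines.filter((line) => line.startsWith('#'));
}

const captured = ['connecting to server', '# using fixture A', 'done'];
const shownOnPass = visibleOutput(captured, true);
// shownOnPass: ['# using fixture A']
const shownOnFail = visibleOutput(captured, false);
// shownOnFail: all three lines
```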

Enhanced Communication Features (Phase 3)

The Enhanced Communication system has been implemented to provide rich, real-time feedback during test execution.

Key Features:

Event-Based Test Lifecycle Reporting:
- test:queued - Test is ready to run
- test:started - Test execution begins
- test:completed - Test finishes (with pass/fail status)
- suite:started - Test suite/describe block begins
- suite:completed - Test suite/describe block ends
- hook:started - Lifecycle hook (beforeEach/afterEach) begins
- hook:completed - Lifecycle hook finishes
- assertion:failed - Assertion failure with detailed information
Visual Diff Output for Assertion Failures:
- String Diffs: Character-by-character comparison with colored output
- Object/Array Diffs: Deep property comparison showing added/removed/changed properties
- Primitive Diffs: Clear display of expected vs actual values
- Colorized Output: Green for expected, red for actual, yellow for differences
- Smart Formatting: Multi-line strings and complex objects are formatted for readability
Real-Time Test Progress API:
- Tests emit progress events as they execute
- tstest parser processes events and updates display in real-time
- Structured event format carries rich metadata (timing, errors, diffs)
- Seamless integration with existing TAP protocol via Protocol V2

Implementation Details:

Events are transmitted via Protocol V2's EVENT block type
Event data is JSON-encoded within protocol markers
Parser handles events asynchronously for real-time updates
Visual diffs are generated using custom diff algorithms for each data type
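A minimal sketch of how such an event might be encoded and decoded follows. The event names come from the list above; the field names (`eventType`, `timestamp`, `data`), the single-line layout, and the helper names are assumptions, so consult readme.protocol.md for the real format.

```typescript
type TestEventType =
  | 'test:queued'
  | 'test:started'
  | 'test:completed'
  | 'suite:started'
  | 'suite:completed'
  | 'hook:started'
  | 'hook:completed'
  | 'assertion:failed';

// Assumed event shape for illustration.
interface TestEvent {
  eventType: TestEventType;
  timestamp: number;
  data: Record<string, unknown>;
}

// Hypothetical encoder: JSON payload inside the protocol markers.
function encodeEvent(event: TestEvent): string {
  return `⟦TSTEST:EVENT:${JSON.stringify(event)}⟧`;
}

// Hypothetical decoder: returns null for anything that is not an EVENT line.
function decodeEvent(line: string): TestEvent | null {
  const match = /^⟦TSTEST:EVENT:(.*)⟧$/.exec(line);
  if (!match) return null;
  try {
    return JSON.parse(match[1]) as TestEvent;
  } catch {
    return null;
  }
}

const encoded = encodeEvent({
  eventType: 'test:completed',
  timestamp: 1716711746000,
  data: { name: 'adds numbers', passed: true },
});
const decoded = decodeEvent(encoded);
```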

Watch Mode (Phase 4)

tstest now supports watch mode for automatic test re-runs on file changes.

Usage

tstest test/**/*.ts --watch
tstest test/specific.ts -w

Features

Automatic Re-runs: Tests re-run when any watched file changes
Debouncing: Multiple rapid changes are batched (300ms delay)
Clear Output: Console is cleared before each run for clean results
Status Updates: Shows which files triggered the re-run
Graceful Exit: Press Ctrl+C to stop watching
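The debouncing behaviour can be sketched without real timers by passing timestamps in explicitly. `ChangeBatcher` is a hypothetical class written for this illustration; the actual watch mode relies on @push.rocks/smartchok and presumably uses timers instead.

```typescript
// Hypothetical, timer-free model of the 300ms debounce.
class ChangeBatcher {
  private pending = new Set<string>();
  private lastChangeAt = 0;
  private quietMs: number;

  constructor(quietMs: number = 300) {
    this.quietMs = quietMs;
  }

  // Record a changed file; every new change restarts the quiet period.
  add(file: string, nowMs: number): void {
    this.pending.add(file);
    this.lastChangeAt = nowMs;
  }

  // Returns the batched files once quietMs has passed without changes,
  // otherwise null (nothing pending, or changes are still arriving).
  flushIfQuiet(nowMs: number): string[] | null {
    if (this.pending.size === 0) return null;
    if (nowMs - this.lastChangeAt < this.quietMs) return null;
    const files = Array.from(this.pending);
    this.pending.clear();
    return files;
  }
}

const batcher = new ChangeBatcher();
batcher.add('ts/index.ts', 0);
batcher.add('test/test.basic.ts', 100);
const tooEarly = batcher.flushIfQuiet(350); // only 250ms since the last change
const batch = batcher.flushIfQuiet(400); // 300ms of quiet: one re-run for both files
```

Both changes end up in a single batch, which matches the "multiple rapid changes are batched" behaviour and also gives the list of files that triggered the re-run.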

Options

--watch or -w: Enable watch mode
--watch-ignore: Comma-separated patterns to ignore (e.g., --watch-ignore node_modules,dist)

Implementation Details

Uses @push.rocks/smartchok for cross-platform file watching
Watches the entire project directory from where tests are run
Ignores changes matching the ignore patterns
Shows "Waiting for file changes..." between runs

Fixed Issues

tap.skip.test(), tap.todo.test(), and tap.only.test() (Fixed)

Previously reported issues with these methods have been resolved:

tap.skip.test() - Now properly creates test objects that are counted in test results
- Tests marked with skip.test() appear in the test count
- Shows as passed with skip directive in TAP output
- markAsSkipped() method added to handle pre-test skip marking
tap.todo.test() - Fully implemented with test object creation
- Supports both tap.todo.test('description') and tap.todo.test('description', testFunc)
- Todo tests are counted and marked with todo directive
- Both regular and parallel todo tests supported
tap.only.test() - Works correctly for focused testing
- When .only tests exist, only those tests run
- Other tests are not executed but still counted
- Both regular and parallel only tests supported

These fixes ensure accurate test counts and proper TAP-compliant output for all test states.
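The counting and selection rules can be modelled as follows. This is a sketch with hypothetical names (`PlannedTest`, `planRun`); it assumes skipped and todo tests are never executed, as the timing notes in this document state, and it does not mirror the Tap class internals.

```typescript
type TestMode = 'normal' | 'skip' | 'todo' | 'only';

interface PlannedTest {
  name: string;
  mode: TestMode;
}

// Hypothetical planner: every test is counted, but only some are executed.
function planRun(tests: PlannedTest[]): { counted: number; toExecute: string[] } {
  const hasOnly = tests.some((test) => test.mode === 'only');
  const toExecute = tests
    .filter((test) => (hasOnly ? test.mode === 'only' : test.mode === 'normal'))
    .map((test) => test.name);
  return { counted: tests.length, toExecute };
}

const focusedPlan = planRun([
  { name: 'parses input', mode: 'normal' },
  { name: 'legacy path', mode: 'skip' },
  { name: 'new feature', mode: 'only' },
]);
// focusedPlan: counted is 3, toExecute is ['new feature']
```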

Test Timing Implementation

Timing Architecture

Test timing is captured using @push.rocks/smarttime's HrtMeasurement class, which provides high-resolution timing:

Timing Capture:
- Each TapTest instance has its own HrtMeasurement
- Timer starts immediately before test function execution
- Timer stops after test completes (or fails/times out)
- Millisecond precision is used for reporting
Protocol Integration:
- Timing is embedded in TAP output using Protocol V2 markers
- Inline format for simple timing: ok 1 - test name ⟦TSTEST:time:123⟧
- Block format for complex metadata: ⟦TSTEST:META:{"time":456,"file":"test.ts"}⟧
Performance Metrics Calculation:
- Average is calculated from sum of individual test times, not total runtime
- Slowest test detection prefers tests with >0ms duration
- Failed tests still contribute their execution time to metrics
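The metric rules above can be sketched as a small summary function. `TimedResult` and `summarize` are hypothetical names, and the sketch assumes durations are already rounded to whole milliseconds.

```typescript
interface TimedResult {
  name: string;
  durationMs: number;
}

// Hypothetical summary: average over individual test times (not wall-clock
// runtime), and a slowest test that prefers entries with a duration above 0ms.
function summarize(results: TimedResult[]): {
  averageMs: number;
  slowest: TimedResult | null;
} {
  if (results.length === 0) return { averageMs: 0, slowest: null };
  const sum = results.reduce((total, result) => total + result.durationMs, 0);
  const nonZero = results.filter((result) => result.durationMs > 0);
  const candidates = nonZero.length > 0 ? nonZero : results;
  const slowest = candidates.reduce((a, b) => (b.durationMs > a.durationMs ? b : a));
  return { averageMs: sum / results.length, slowest };
}

const summary = summarize([
  { name: 'skipped test', durationMs: 0 },
  { name: 'slow failing test', durationMs: 120 },
  { name: 'quick test', durationMs: 30 },
]);
// summary.averageMs is 50, summary.slowest.name is 'slow failing test'
```

Because the average divides the sum of per-test durations by the number of tests, parallel tests with overlapping execution do not distort it the way a wall-clock total would.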

Edge Cases and Considerations

Sub-millisecond Tests:
- Very fast tests may report 0ms due to millisecond rounding
- Performance metrics handle this by showing "All tests completed in <1ms" when appropriate
Special Test States:
- Skipped tests: Report 0ms (not executed)
- Todo tests: Report 0ms (not executed)
- Failed tests: Report actual execution time before failure
- Timeout tests: Report time until timeout occurred
Parallel Test Timing:
- Each parallel test tracks its own execution time independently
- Parallel tests may have overlapping execution periods
- Total suite time reflects wall-clock time, not sum of test times
Hook Timing:
- beforeEach/afterEach hooks are not included in individual test times
- Only the actual test function execution is measured
Retry Timing:
- When tests retry, only the final attempt's duration is reported
- Each retry attempt emits separate test:started events

Parser Fix for Timing Metadata

The protocol parser was fixed to correctly handle inline timing metadata:

The condition was changed from !simpleMatch[1].includes(':'), which rejected any marker content containing a colon, to a check that accepts simple key:value pairs
The new check excludes prefixed formats (META:, SKIP:, TODO:, EVENT:) while still parsing simple formats like time:250

This ensures timing metadata is correctly extracted and displayed in test results.
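A sketch of the corrected rule is shown below. The function name `parseInlineMetadata` and both regular expressions are assumptions for illustration; only the marker syntax and the excluded prefixes come from the notes above.

```typescript
// Prefixes that belong to other marker formats and must not be
// treated as simple key:value metadata.
const PREFIXED_FORMATS = ['META:', 'SKIP:', 'TODO:', 'EVENT:'];

// Hypothetical parser: collects simple key:value pairs from inline markers.
function parseInlineMetadata(line: string): Record<string, string> {
  const result: Record<string, string> = {};
  const markerPattern = /⟦TSTEST:([^⟧]+)⟧/g;
  let match: RegExpExecArray | null;
  while ((match = markerPattern.exec(line)) !== null) {
    const content = match[1];
    if (PREFIXED_FORMATS.some((prefix) => content.startsWith(prefix))) {
      continue; // handled elsewhere, not a simple pair
    }
    const simple = /^([A-Za-z]\w*):([^:]+)$/.exec(content);
    if (simple) {
      result[simple[1]] = simple[2];
    }
  }
  return result;
}

const inlineTiming = parseInlineMetadata('ok 1 - test name ⟦TSTEST:time:250⟧');
// inlineTiming: { time: '250' }
const prefixedOnly = parseInlineMetadata('⟦TSTEST:META:{"time":456,"file":"test.ts"}⟧');
// prefixedOnly: {} because META: is not a simple key:value pair
```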
