Improvement Plan for tstest and tapbundle

!! FIRST: Reread /home/philkunz/.claude/CLAUDE.md to ensure following all guidelines !!

Improved Internal Protocol (NEW - Critical) COMPLETED

Current Issues RESOLVED

  • TAP protocol uses # for metadata which conflicts with test descriptions containing #
  • Fragile regex parsing that breaks with special characters
  • Limited extensibility for new metadata types

Proposed Solution: Protocol V2 IMPLEMENTED

  • Use Unicode delimiters ⟦TSTEST:META:{}⟧ that won't appear in test names
  • Structured JSON metadata format
  • Separate protocol blocks for complex data (errors, snapshots)
  • Complete replacement of v1 (no backwards compatibility needed)
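
As an illustration of the delimiter idea above, here is a minimal sketch of writing and reading back a metadata marker. The helper names (formatMeta, parseMeta) and the TestMeta fields are assumptions made for this example; the authoritative format is the one in readme.protocol.md.

```typescript
// Hypothetical helpers, not the actual ts_tapbundle_protocol API.
interface TestMeta {
  time?: number;   // duration in ms (assumed field)
  tags?: string[]; // assumed field
}

const META_PREFIX = '⟦TSTEST:META:';
const META_SUFFIX = '⟧';

// Wraps JSON metadata in the Unicode delimiters.
function formatMeta(meta: TestMeta): string {
  return `${META_PREFIX}${JSON.stringify(meta)}${META_SUFFIX}`;
}

// Splits a line into its human-readable text and optional metadata.
// Lines without a well-formed marker are returned unchanged.
function parseMeta(line: string): { text: string; meta?: TestMeta } {
  const start = line.lastIndexOf(META_PREFIX);
  const end = line.lastIndexOf(META_SUFFIX);
  if (start === -1 || end < start) {
    return { text: line };
  }
  const json = line.slice(start + META_PREFIX.length, end);
  try {
    return { text: line.slice(0, start).trimEnd(), meta: JSON.parse(json) as TestMeta };
  } catch {
    return { text: line };
  }
}

// A '#' in the test description no longer collides with the metadata.
const line = `ok 1 - handles issue #42 ${formatMeta({ time: 12, tags: ['unit'] })}`;
const parsed = parseMeta(line);
```

Because the marker is located by its delimiters rather than by a regex over the description, characters such as # in test names pass through untouched.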

Implementation COMPLETED

  • Phase 1: Create protocol v2 implementation in ts_tapbundle_protocol
  • Phase 2: Replace all v1 code in both tstest and tapbundle with v2
  • Phase 3: Delete all v1 parsing and generation code

ts_tapbundle_protocol Directory

The protocol v2 implementation lives in the ts_tapbundle_protocol directory as isomorphic TypeScript code:

  • Isomorphic Design: All code must work in both browser and Node.js environments
  • No Node.js Imports: No Node.js-specific modules allowed (no fs, path, child_process, etc.)
  • Protocol Classes: Contains classes implementing all sides of the protocol:
    • ProtocolEmitter: For generating protocol v2 messages (used by tapbundle)
    • ProtocolParser: For parsing protocol v2 messages (used by tstest)
    • ProtocolMessage: Base classes for different message types
    • ProtocolTypes: TypeScript interfaces and types for protocol structures
  • Pure TypeScript: Only browser-compatible APIs and pure TypeScript/JavaScript code
  • Build Integration:
    • Compiled by pnpm build (via tsbuild) to dist_ts_tapbundle_protocol/
    • Build order defined in tspublish.json files
    • Imported by ts and ts_tapbundle modules from the compiled dist directory

See readme.protocol.md for detailed specification.
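
One way to satisfy the "no Node.js imports" constraint is to have the emitter write through an injected callback instead of touching process.stdout or console directly. The sketch below assumes a hypothetical SketchEmitter class and message layout; the real ProtocolEmitter API may look different.

```typescript
// Hypothetical shape for illustration only.
type LineSink = (line: string) => void;

class SketchEmitter {
  private write: LineSink;

  constructor(write: LineSink) {
    this.write = write;
  }

  // Emits one result line with JSON metadata inside the Unicode delimiters.
  emitTestResult(
    ok: boolean,
    index: number,
    description: string,
    meta: Record<string, unknown> = {},
  ): void {
    const status = ok ? 'ok' : 'not ok';
    this.write(`${status} ${index} - ${description} ⟦TSTEST:META:${JSON.stringify(meta)}⟧`);
  }
}

// The caller decides where lines go, so the same class runs in Node.js and the browser.
const collected: string[] = [];
const emitter = new SketchEmitter((l) => collected.push(l));
emitter.emitTestResult(true, 1, 'parses # in names', { time: 3 });
```

In Node.js the sink could be a function that forwards to console.log, while a browser runner could forward the same lines over whatever channel it already uses.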

Test Configuration System (NEW)

Global Test Configuration via 00init.ts

  • Discovery: Check for test/00init.ts before running tests
  • Execution: Import and execute before any test files if found
  • Purpose: Define project-wide default test settings
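
The discovery step can be pictured as separating the init file from the regular test files before anything runs. This is a sketch under assumptions: planRun and RunPlan are hypothetical names, and the real loader may resolve paths differently.

```typescript
// Hypothetical helper: splits a discovered file list into the init file and the tests.
interface RunPlan {
  initFile?: string;   // present only when <testDir>/00init.ts was found
  testFiles: string[]; // everything else, in discovery order
}

function planRun(discoveredFiles: string[], testDir = 'test'): RunPlan {
  const initPath = `${testDir}/00init.ts`;
  const initFile = discoveredFiles.find((f) => f === initPath);
  const testFiles = discoveredFiles.filter((f) => f !== initPath);
  return { initFile, testFiles };
}

const plan = planRun(['test/test.basic.ts', 'test/00init.ts', 'test/test.api.ts']);
// plan.initFile would be imported and executed first, then each entry of plan.testFiles.
```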

tap.settings() API

interface TapSettings {
  // Timing
  timeout?: number;              // Default timeout for all tests (ms)
  slowThreshold?: number;        // Mark tests as slow if they exceed this (ms)
  
  // Execution Control
  bail?: boolean;                // Stop on first test failure
  retries?: number;              // Number of retries for failed tests
  retryDelay?: number;           // Delay between retries (ms)
  
  // Output Control
  suppressConsole?: boolean;     // Suppress console output in passing tests
  verboseErrors?: boolean;       // Show full stack traces
  showTestDuration?: boolean;    // Show duration for each test
  
  // Parallel Execution
  maxConcurrency?: number;       // Max parallel tests (for .para files)
  isolateTests?: boolean;        // Run each test in fresh context
  
  // Lifecycle Hooks
  beforeAll?: () => Promise<void> | void;
  afterAll?: () => Promise<void> | void;
  beforeEach?: (testName: string) => Promise<void> | void;
  afterEach?: (testName: string, passed: boolean) => Promise<void> | void;
  
  // Environment
  env?: Record<string, string>;  // Additional environment variables
  
  // Features
  enableSnapshots?: boolean;     // Enable snapshot testing
  snapshotDirectory?: string;    // Custom snapshot directory
  updateSnapshots?: boolean;     // Update snapshots instead of comparing
}

Settings Inheritance

  • Global (00init.ts) → File level → Test level
  • More specific settings override less specific ones
  • Arrays/objects are merged, primitives are replaced
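
The inheritance rules can be sketched as a layered merge. Only three TapSettings fields are modeled here, and mergeSettings is a hypothetical helper rather than the actual implementation; it replaces primitives and shallow-merges the env object, with later (more specific) layers winning.

```typescript
// Subset of TapSettings, for illustration only.
interface SettingsSubset {
  timeout?: number;
  retries?: number;
  env?: Record<string, string>;
}

// Layers are passed from least specific (global) to most specific (test level).
function mergeSettings(...layers: SettingsSubset[]): SettingsSubset {
  const result: SettingsSubset = {};
  for (const layer of layers) {
    // Primitives: the more specific value replaces the inherited one.
    if (layer.timeout !== undefined) result.timeout = layer.timeout;
    if (layer.retries !== undefined) result.retries = layer.retries;
    // Objects: keys are merged, with the more specific layer winning per key.
    if (layer.env !== undefined) result.env = { ...result.env, ...layer.env };
  }
  return result;
}

const globalSettings: SettingsSubset = { timeout: 5000, retries: 1, env: { NODE_ENV: 'test' } };
const fileSettings: SettingsSubset = { timeout: 10000, env: { DEBUG: '1' } };
const testSettings: SettingsSubset = { retries: 3 };
const effective = mergeSettings(globalSettings, fileSettings, testSettings);
```

Here the file-level timeout overrides the global one, the test-level retries override the global retries, and env ends up containing both NODE_ENV and DEBUG.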

Implementation Phases

  1. Core Infrastructure: Settings storage and merge logic
  2. Discovery: 00init.ts loading mechanism
  3. Application: Apply settings to test execution
  4. Advanced: Parallel execution and snapshot configuration

1. Enhanced Communication Between tapbundle and tstest COMPLETED

1.1 Real-time Test Progress API COMPLETED

  • Create a bidirectional communication channel between tapbundle and tstest
  • Emit events for test lifecycle stages (start, progress, completion)
  • Allow tstest to subscribe to tapbundle events for better progress reporting
  • Implement a standardized message format for test metadata
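
A minimal sketch of the subscription side of such a channel follows. The event names and fields are assumptions chosen for the example, and only one direction (tapbundle to tstest) is shown.

```typescript
// Hypothetical event shapes; the real protocol events may use different names and fields.
type TestEvent =
  | { type: 'test:started'; name: string }
  | { type: 'test:progress'; name: string; percent: number }
  | { type: 'test:completed'; name: string; passed: boolean; durationMs: number };

type Listener = (event: TestEvent) => void;

class TestEventBus {
  private listeners: Listener[] = [];

  // Registers a listener and returns a function that removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(event: TestEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

const bus = new TestEventBus();
const seen: string[] = [];
const unsubscribe = bus.subscribe((e) => seen.push(`${e.type}:${e.name}`));
bus.emit({ type: 'test:started', name: 'login works' });
bus.emit({ type: 'test:completed', name: 'login works', passed: true, durationMs: 42 });
unsubscribe();
bus.emit({ type: 'test:started', name: 'ignored' }); // no listener left, nothing recorded
```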

1.2 Rich Error Reporting COMPLETED

  • Pass structured error objects from tapbundle to tstest
  • Include stack traces, code snippets, and contextual information
  • Support for error categorization (assertion failures, timeouts, uncaught exceptions)
  • Visual diff output for failed assertions
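
The categorization above could be modeled roughly as follows. The StructuredTestError shape and the heuristics in toStructuredError are assumptions for illustration; in particular, detecting timeouts by message text is a simplification.

```typescript
type ErrorCategory = 'assertion' | 'timeout' | 'uncaught';

// Hypothetical structured error passed from tapbundle to tstest.
interface StructuredTestError {
  category: ErrorCategory;
  message: string;
  stack?: string;
  expected?: unknown;
  actual?: unknown;
}

function toStructuredError(err: unknown): StructuredTestError {
  if (err instanceof Error) {
    const details = err as Error & { expected?: unknown; actual?: unknown };
    let category: ErrorCategory = 'uncaught';
    if ('expected' in details || 'actual' in details) {
      // Assertion libraries commonly attach expected/actual to the error.
      category = 'assertion';
    } else if (/timed? ?out/i.test(err.message)) {
      // Heuristic: matches "timeout", "timed out", and "time out".
      category = 'timeout';
    }
    return {
      category,
      message: err.message,
      stack: err.stack,
      expected: details.expected,
      actual: details.actual,
    };
  }
  // Non-Error throwables (strings, numbers, ...) are treated as uncaught.
  return { category: 'uncaught', message: String(err) };
}
```

The expected and actual values are what a visual diff would be rendered from on the tstest side.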

2. Enhanced toolsArg Functionality

2.3 Test Data and Context Sharing (Partial)

tap.test('data-driven test', async (toolsArg) => {
  // Parameterized test data (not yet implemented)
  const testData = toolsArg.data<TestInput>();
  expect(processData(testData)).toEqual(expected);
});

3. Nested Tests and Test Suites

3.2 Hierarchical Test Organization (Not yet implemented)

  • Support for multiple levels of nesting
  • Inherited context and configuration from parent suites
  • Aggregated reporting for test suites
  • Suite-level lifecycle hooks

4. Advanced Test Features

4.1 Snapshot Testing (Basic implementation complete)

4.2 Performance Benchmarking

tap.test('performance test', async (toolsArg) => {
  const benchmark = toolsArg.benchmark();
  
  // Run operation
  await expensiveOperation();
  
  // Assert performance constraints
  benchmark.expect({
    maxDuration: 1000,
    maxMemory: '100MB'
  });
});
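
Since toolsArg.benchmark() is still a proposal, the following is only a sketch of how the constraint check behind benchmark.expect() might work. The helper names are hypothetical, and the sketch assumes binary units (1 MB = 1024 * 1024 bytes).

```typescript
// Hypothetical helpers; toolsArg.benchmark() itself is not implemented yet.
interface BenchmarkLimits {
  maxDuration?: number; // ms
  maxMemory?: string;   // e.g. '100MB'
}

interface Measurement {
  durationMs: number;
  memoryBytes: number;
}

// Converts strings such as '100MB' to bytes (binary units assumed).
function parseMemoryLimit(limit: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)$/i.exec(limit.trim());
  if (!match) {
    throw new Error(`Invalid memory limit: ${limit}`);
  }
  const factors: Record<string, number> = {
    B: 1,
    KB: 1024,
    MB: 1024 * 1024,
    GB: 1024 * 1024 * 1024,
  };
  return Number(match[1]) * factors[match[2].toUpperCase()];
}

// Returns a list of human-readable violations; an empty list means all limits were met.
function checkLimits(measured: Measurement, limits: BenchmarkLimits): string[] {
  const violations: string[] = [];
  if (limits.maxDuration !== undefined && measured.durationMs > limits.maxDuration) {
    violations.push(`duration ${measured.durationMs}ms exceeds ${limits.maxDuration}ms`);
  }
  if (limits.maxMemory !== undefined && measured.memoryBytes > parseMemoryLimit(limits.maxMemory)) {
    violations.push(`memory ${measured.memoryBytes} bytes exceeds ${limits.maxMemory}`);
  }
  return violations;
}
```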

5. Test Execution Improvements

5.2 Watch Mode COMPLETED

  • Automatically re-run tests on file changes
  • Debounced file change detection (300ms)
  • Clear console output between runs
  • Shows which files triggered re-runs
  • Graceful exit with Ctrl+C
  • --watch-ignore option for excluding patterns
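
The debouncing behaviour can be sketched as below. ChangeDebouncer is a hypothetical class, not the actual watch-mode code; the timer functions are injectable so the logic can be exercised without waiting for real timeouts.

```typescript
type Schedule = (fn: () => void, ms: number) => unknown;
type Cancel = (handle: unknown) => void;

class ChangeDebouncer {
  private pending = new Set<string>();
  private handle: unknown = null;
  private onFlush: (files: string[]) => void;
  private delayMs: number;
  private schedule: Schedule;
  private cancel: Cancel;

  constructor(
    onFlush: (files: string[]) => void,
    delayMs = 300,
    schedule: Schedule = (fn, ms) => setTimeout(fn, ms),
    cancel: Cancel = (h) => clearTimeout(h as ReturnType<typeof setTimeout>),
  ) {
    this.onFlush = onFlush;
    this.delayMs = delayMs;
    this.schedule = schedule;
    this.cancel = cancel;
  }

  // Records a changed file and restarts the quiet-period timer.
  fileChanged(path: string): void {
    this.pending.add(path);
    if (this.handle !== null) this.cancel(this.handle);
    this.handle = this.schedule(() => this.flush(), this.delayMs);
  }

  // Reports all files collected during the quiet period in one batch.
  private flush(): void {
    const files = Array.from(this.pending).sort();
    this.pending.clear();
    this.handle = null;
    this.onFlush(files);
  }
}
```

A burst of saves within the delay window therefore triggers a single re-run, and the flushed list is what would be shown as the files that triggered it.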

5.3 Advanced Test Filtering (Partial) ⚠️

# Exclude tests by pattern (not yet implemented)
tstest --exclude "**/slow/**"

# Run only failed tests from last run (not yet implemented)
tstest --failed

# Run tests modified in git (not yet implemented)
tstest --changed

6. Reporting and Analytics

6.1 Custom Reporters

  • Plugin architecture for custom reporters
  • Built-in reporters: JSON, JUnit, HTML, Markdown
  • Real-time streaming reporters
  • Aggregated test metrics and trends
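
A plugin architecture for reporters could start from a small interface like the one sketched here. SketchReporter, JsonSummaryReporter, and runWithReporters are hypothetical names used only to illustrate the idea.

```typescript
interface ReportedResult {
  name: string;
  passed: boolean;
  durationMs: number;
}

// Hypothetical plugin contract: receive results as they arrive, then produce a final report.
interface SketchReporter {
  onTestResult(result: ReportedResult): void;
  onRunComplete(): string;
}

class JsonSummaryReporter implements SketchReporter {
  private results: ReportedResult[] = [];

  onTestResult(result: ReportedResult): void {
    this.results.push(result);
  }

  onRunComplete(): string {
    const failed = this.results.filter((r) => !r.passed).length;
    return JSON.stringify({ total: this.results.length, failed, results: this.results });
  }
}

// Feeds every result to every reporter and collects their final outputs.
function runWithReporters(results: ReportedResult[], reporters: SketchReporter[]): string[] {
  for (const result of results) {
    for (const reporter of reporters) reporter.onTestResult(result);
  }
  return reporters.map((reporter) => reporter.onRunComplete());
}

const reports = runWithReporters(
  [
    { name: 'adds numbers', passed: true, durationMs: 3 },
    { name: 'handles # in names', passed: false, durationMs: 7 },
  ],
  [new JsonSummaryReporter()],
);
```

JUnit, HTML, or Markdown output would then be further classes implementing the same interface, and a streaming reporter could emit output from onTestResult instead of waiting for onRunComplete.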

6.2 Coverage Integration

  • Built-in code coverage collection
  • Coverage thresholds and enforcement
  • Coverage trending over time
  • Integration with CI/CD pipelines

6.3 Test Analytics Dashboard

  • Web-based dashboard for test results
  • Historical test performance data
  • Flaky test detection
  • Test impact analysis

7. Developer Experience

7.1 Better Error Messages

  • Clear, actionable error messages
  • Suggestions for common issues
  • Links to documentation
  • Code examples in error output

Implementation Phases

Phase 1: Improved Internal Protocol (Priority: Critical) COMPLETED

  1. Create ts_tapbundle_protocol directory with isomorphic protocol v2 implementation
    • Implement ProtocolEmitter class for message generation
    • Implement ProtocolParser class for message parsing
    • Define ProtocolMessage types and interfaces
    • Ensure all code is browser and Node.js compatible
    • Add tspublish.json to configure build order
  2. Update build configuration to compile ts_tapbundle_protocol first
  3. Replace TAP parser in tstest with Protocol V2 parser importing from dist_ts_tapbundle_protocol
  4. Replace TAP generation in tapbundle with Protocol V2 emitter importing from dist_ts_tapbundle_protocol
  5. Delete all v1 TAP parsing code from tstest
  6. Delete all v1 TAP generation code from tapbundle
  7. Test with real-world test suites containing special characters

Phase 2: Test Configuration System (Priority: High) COMPLETED

  1. Implement tap.settings() API with TypeScript interfaces
  2. Add 00init.ts discovery and loading mechanism
  3. Implement settings inheritance and merge logic
  4. Apply settings to test execution (timeouts, retries, etc.)

Phase 3: Enhanced Communication (Priority: High) COMPLETED

  1. Build on Protocol V2 for richer communication
  2. Implement real-time test progress API
  3. Add structured error reporting with diffs and traces

Phase 4: Developer Experience (Priority: Medium) PARTIALLY COMPLETED

  1. Add watch mode (completed, see 5.2)
  2. Implement custom reporters
  3. Complete advanced test filtering options
  4. Add performance benchmarking API

Phase 5: Analytics and Performance (Priority: Low) NOT STARTED

  1. Build test analytics dashboard
  2. Implement coverage integration
  3. Create trend analysis tools
  4. Add test impact analysis

Technical Considerations

API Design Principles

  • Clean, modern API design without legacy constraints
  • Progressive enhancement approach
  • Well-documented features and APIs
  • Clear, simple interfaces

Performance Goals

  • Minimal overhead for test execution
  • Efficient parallel execution
  • Fast test discovery
  • Optimized browser test bundling

Integration Points

  • Clean interfaces between tstest and tapbundle
  • Extensible plugin architecture
  • Standard test result format
  • Compatible with existing CI/CD tools

Summary of Remaining Work

Completed

  • Protocol V2: Full implementation with Unicode delimiters, structured metadata, and special character handling
  • Test Configuration System: tap.settings() API, 00init.ts discovery, settings inheritance, lifecycle hooks
  • Enhanced Communication: Event-based test lifecycle reporting, visual diff output for assertion failures, real-time test progress API
  • Rich Error Reporting: Stack traces, error metadata, and visual diffs through protocol
  • Tags Filtering: --tags option for running specific tagged tests

Existing Features (Not in Plan)

  • Timeout Support: --timeout option and per-test timeouts
  • Test Retries: tap.retry() for flaky test handling
  • Parallel Tests: .testParallel() for concurrent execution
  • Snapshot Testing: Basic implementation with toMatchSnapshot()
  • Test Lifecycle: describe() blocks with beforeEach/afterEach
  • Skip Tests: tap.skip.test() (creates test objects so the test count stays accurate)
  • Log Files: --logfile option saves output to .nogit/testlogs/
  • Test Range: --startFrom and --stopAt for partial runs

⚠️ Partially Completed

  • Advanced Test Filtering: Have --tags but missing --exclude, --failed, --changed

Not Started

High Priority

Medium Priority

  1. Developer Experience

    • Watch mode for file changes
    • Custom reporters (JSON, JUnit, HTML, Markdown)
    • Performance benchmarking API
    • Better error messages with suggestions
  2. Enhanced toolsArg

    • Test data injection
    • Context sharing between tests
    • Parameterized tests
  3. Test Organization

    • Hierarchical test suites
    • Nested describe blocks
    • Suite-level lifecycle hooks

Low Priority

  1. Analytics and Performance
    • Test analytics dashboard
    • Code coverage integration
    • Trend analysis
    • Flaky test detection

Recently Fixed Issues

  • tap.todo(): Now fully implemented with test object creation
  • tap.skip.test(): Now creates test objects and maintains accurate test count
  • tap.only.test(): Works correctly - when .only tests exist, only those run

Remaining Minor Issues

  • Protocol Output: Some protocol messages still appear in console output

Next Recommended Steps

  1. Add Watch Mode (Phase 4) - high developer value for fast feedback
  2. Implement Custom Reporters - important for CI/CD integration
  3. Implement performance benchmarking API
  4. Add better error messages with suggestions