# Improvement Plan for tstest and tapbundle

!! FIRST: Reread /home/philkunz/.claude/CLAUDE.md to ensure following all guidelines !!

## Improved Internal Protocol (NEW - Critical) ✅ COMPLETED

### Current Issues ✅ RESOLVED
- ✅ TAP protocol uses `#` for metadata which conflicts with test descriptions containing `#`
- ✅ Fragile regex parsing that breaks with special characters
- ✅ Limited extensibility for new metadata types

### Proposed Solution: Protocol V2 ✅ IMPLEMENTED
- ✅ Use Unicode delimiters `⟦TSTEST:META:{}⟧` that won't appear in test names
- ✅ Structured JSON metadata format
- ✅ Separate protocol blocks for complex data (errors, snapshots)
- ✅ Complete replacement of v1 (no backwards compatibility needed)

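
To illustrate the delimiter approach, here is a minimal sketch that appends JSON metadata to a result line and extracts it again. The helper names `emitWithMeta` and `parseMeta` and the `TestMeta` fields are hypothetical and only serve the example; the actual wire format is the one specified in `readme.protocol.md`.

```typescript
// Hypothetical metadata shape, for illustration only.
interface TestMeta {
  time?: number;
  tags?: string[];
}

const META_OPEN = '⟦TSTEST:META:';
const META_CLOSE = '⟧';

// Append a JSON metadata block to a plain result line.
function emitWithMeta(line: string, meta: TestMeta): string {
  return `${line} ${META_OPEN}${JSON.stringify(meta)}${META_CLOSE}`;
}

// Split a line into its human-readable text and the optional metadata.
function parseMeta(line: string): { text: string; meta?: TestMeta } {
  const start = line.lastIndexOf(META_OPEN);
  const end = line.lastIndexOf(META_CLOSE);
  if (start === -1 || end < start) {
    return { text: line }; // no metadata block on this line
  }
  const json = line.slice(start + META_OPEN.length, end);
  const text = line.slice(0, start).replace(/\s+$/, '');
  return { text, meta: JSON.parse(json) as TestMeta };
}

// A `#` in the test description no longer collides with the metadata.
const encoded = emitWithMeta('ok 1 - handles issue #42', { time: 12, tags: ['unit'] });
const decoded = parseMeta(encoded);
console.log(decoded.text);
console.log(decoded.meta);
```

Because the metadata travels as JSON between fixed delimiters, the parser needs no regex over the description text, which is the fragility the v1 format suffered from.
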
### Implementation ✅ COMPLETED
- ✅ Phase 1: Create protocol v2 implementation in ts_tapbundle_protocol
- ✅ Phase 2: Replace all v1 code in both tstest and tapbundle with v2
- ✅ Phase 3: Delete all v1 parsing and generation code

#### ts_tapbundle_protocol Directory
The protocol v2 implementation is contained in the `ts_tapbundle_protocol` directory as isomorphic TypeScript code:
- **Isomorphic Design**: All code must work in both browser and Node.js environments
- **No Node.js Imports**: No Node.js-specific modules allowed (no fs, path, child_process, etc.)
- **Protocol Classes**: Contains classes implementing all sides of the protocol:
  - ✅ `ProtocolEmitter`: For generating protocol v2 messages (used by tapbundle)
  - ✅ `ProtocolParser`: For parsing protocol v2 messages (used by tstest)
  - ✅ `ProtocolMessage`: Base classes for different message types
  - ✅ `ProtocolTypes`: TypeScript interfaces and types for protocol structures
- **Pure TypeScript**: Only browser-compatible APIs and pure TypeScript/JavaScript code
- **Build Integration**:
  - Compiled by `pnpm build` (via tsbuild) to `dist_ts_tapbundle_protocol/`
  - Build order defined in tspublish.json files
  - Imported by ts and ts_tapbundle modules from the compiled dist directory

See `readme.protocol.md` for detailed specification.

## Test Configuration System (NEW)

### Global Test Configuration via 00init.ts
- **Discovery**: Check for `test/00init.ts` before running tests
- **Execution**: Import and execute before any test files if found
- **Purpose**: Define project-wide default test settings

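
The discovery and execution order can be sketched as a small pure function. This is only an illustration under stated assumptions: `planTestRun` and `TestRunPlan` are hypothetical names, and the sketch assumes the runner already holds the list of discovered file paths.

```typescript
// Hypothetical helper: separates the global init file from the regular test files.
interface TestRunPlan {
  initFile?: string;    // imported and executed first when present
  testFiles: string[];  // remaining test files, in their original order
}

function planTestRun(testDir: string, discoveredFiles: string[]): TestRunPlan {
  const initPath = `${testDir}/00init.ts`;
  const plan: TestRunPlan = { testFiles: [] };
  for (const file of discoveredFiles) {
    if (file === initPath) {
      plan.initFile = file;
    } else {
      plan.testFiles.push(file);
    }
  }
  return plan;
}

const plan = planTestRun('test', ['test/test.basic.ts', 'test/00init.ts', 'test/test.api.ts']);
console.log(plan.initFile);
console.log(plan.testFiles);
```

A runner following this plan would import `plan.initFile` (if set) before any entry of `plan.testFiles`, so that the project-wide defaults are in place when the first test file loads.
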
### tap.settings() API
```typescript
interface TapSettings {
  // Timing
  timeout?: number;              // Default timeout for all tests (ms)
  slowThreshold?: number;        // Mark tests as slow if they exceed this (ms)

  // Execution Control
  bail?: boolean;                // Stop on first test failure
  retries?: number;              // Number of retries for failed tests
  retryDelay?: number;           // Delay between retries (ms)

  // Output Control
  suppressConsole?: boolean;     // Suppress console output in passing tests
  verboseErrors?: boolean;       // Show full stack traces
  showTestDuration?: boolean;    // Show duration for each test

  // Parallel Execution
  maxConcurrency?: number;       // Max parallel tests (for .para files)
  isolateTests?: boolean;        // Run each test in fresh context

  // Lifecycle Hooks
  beforeAll?: () => Promise<void> | void;
  afterAll?: () => Promise<void> | void;
  beforeEach?: (testName: string) => Promise<void> | void;
  afterEach?: (testName: string, passed: boolean) => Promise<void> | void;

  // Environment
  env?: Record<string, string>;  // Additional environment variables

  // Features
  enableSnapshots?: boolean;     // Enable snapshot testing
  snapshotDirectory?: string;    // Custom snapshot directory
  updateSnapshots?: boolean;     // Update snapshots instead of comparing
}
```

### Settings Inheritance
- Global (00init.ts) → File level → Test level
- More specific settings override less specific ones
- Arrays/objects are merged, primitives are replaced

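
The inheritance rules can be sketched as a recursive merge. This is a minimal illustration, not the actual implementation: `mergeSettings` is a hypothetical name, "merged" is assumed to mean concatenation for arrays and key-wise merging for plain objects, and functions such as lifecycle hooks are treated like primitives and replaced.

```typescript
type Settings = Record<string, unknown>;

function isPlainObject(value: unknown): value is Settings {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Merge `override` into `base`: arrays are concatenated (assumption), plain objects
// are merged recursively, and everything else (primitives, functions) is replaced.
function mergeSettings(base: Settings, override: Settings): Settings {
  const result: Settings = { ...base };
  for (const key of Object.keys(override)) {
    const next = override[key];
    const prev = result[key];
    if (next === undefined) continue; // unset values never override
    if (Array.isArray(prev) && Array.isArray(next)) {
      result[key] = prev.concat(next);
    } else if (isPlainObject(prev) && isPlainObject(next)) {
      result[key] = mergeSettings(prev, next);
    } else {
      result[key] = next;
    }
  }
  return result;
}

// Global (00init.ts) -> file level -> test level
const globalSettings: Settings = { timeout: 5000, retries: 1, env: { NODE_ENV: 'test' } };
const fileSettings: Settings = { timeout: 10000, env: { API_URL: 'http://localhost' } };
const testSettings: Settings = { retries: 3 };

const effective = [fileSettings, testSettings].reduce(mergeSettings, globalSettings);
console.log(effective);
```

Here the file level replaces `timeout`, the test level replaces `retries`, and `env` ends up containing the keys from both the global and the file level.
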
### Implementation Phases
1. **Core Infrastructure**: Settings storage and merge logic
2. **Discovery**: 00init.ts loading mechanism
3. **Application**: Apply settings to test execution
4. **Advanced**: Parallel execution and snapshot configuration

## 1. Enhanced Communication Between tapbundle and tstest ✅ COMPLETED

### 1.1 Real-time Test Progress API ✅ COMPLETED
- ✅ Create a bidirectional communication channel between tapbundle and tstest
- ✅ Emit events for test lifecycle stages (start, progress, completion)
- ✅ Allow tstest to subscribe to tapbundle events for better progress reporting
- ✅ Implement a standardized message format for test metadata

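
As a rough sketch of the subscription side, the following shows a typed event channel for the lifecycle stages listed above. The event names, fields, and the `TestEventChannel` class are hypothetical, and the sketch covers only the tapbundle-to-tstest direction, not the full bidirectional channel.

```typescript
// Hypothetical event shapes, for illustration only.
type TestEvent =
  | { type: 'test:start'; name: string }
  | { type: 'test:progress'; name: string; percent: number }
  | { type: 'test:complete'; name: string; passed: boolean; durationMs: number };

type Listener = (event: TestEvent) => void;

class TestEventChannel {
  private listeners: Listener[] = [];

  // Register a listener; the returned function removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Deliver an event to every current listener.
  emit(event: TestEvent): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

const channel = new TestEventChannel();
const seen: string[] = [];
const unsubscribe = channel.subscribe((event) => {
  seen.push(`${event.type}:${event.name}`);
});

channel.emit({ type: 'test:start', name: 'parses metadata' });
channel.emit({ type: 'test:complete', name: 'parses metadata', passed: true, durationMs: 8 });
unsubscribe();
channel.emit({ type: 'test:start', name: 'ignored after unsubscribe' });
console.log(seen);
```

Modelling the events as a discriminated union lets a subscriber switch on `event.type` and get the matching fields checked by the compiler.
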
### 1.2 Rich Error Reporting ✅ COMPLETED
- ✅ Pass structured error objects from tapbundle to tstest
- ✅ Include stack traces, code snippets, and contextual information
- ✅ Support for error categorization (assertion failures, timeouts, uncaught exceptions)
- ✅ Visual diff output for failed assertions

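
A structured error of this kind might look like the sketch below. The `StructuredTestError` shape and the `toStructuredError` helper are hypothetical, and categorizing by `error.name` is an assumption made for the example rather than a description of how tapbundle classifies errors.

```typescript
type ErrorCategory = 'assertion' | 'timeout' | 'uncaught';

// Hypothetical payload passed from the test process to the runner.
interface StructuredTestError {
  category: ErrorCategory;
  message: string;
  stack?: string;
  expected?: unknown; // kept so the runner can render a diff
  actual?: unknown;
}

// Convert any thrown value into a serializable, categorized error object.
function toStructuredError(err: unknown): StructuredTestError {
  if (err instanceof Error) {
    const category: ErrorCategory =
      err.name === 'AssertionError' ? 'assertion'
      : err.name === 'TimeoutError' ? 'timeout'
      : 'uncaught';
    const details = err as Error & { expected?: unknown; actual?: unknown };
    return {
      category,
      message: err.message,
      stack: err.stack,
      expected: details.expected,
      actual: details.actual,
    };
  }
  // Non-Error values (strings, numbers, ...) can be thrown too.
  return { category: 'uncaught', message: String(err) };
}

const timeoutError = new Error('test exceeded 5000ms');
timeoutError.name = 'TimeoutError';
console.log(toStructuredError(timeoutError).category);
console.log(toStructuredError('plain string thrown').message);
```

Keeping `expected` and `actual` as data instead of pre-rendered text leaves the visual diff to the receiving side.
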
## 2. Enhanced toolsArg Functionality

### 2.3 Test Data and Context Sharing (Partial)
```typescript
tap.test('data-driven test', async (toolsArg) => {
  // Parameterized test data (not yet implemented)
  // TestInput, processData and expected are placeholders for project-specific code
  const testData = toolsArg.data<TestInput>();
  expect(processData(testData)).toEqual(expected);
});
```

## 3. Nested Tests and Test Suites

### 3.2 Hierarchical Test Organization (Not yet implemented)
- Support for multiple levels of nesting
- Inherited context and configuration from parent suites
- Aggregated reporting for test suites
- Suite-level lifecycle hooks

## 4. Advanced Test Features

### 4.1 Snapshot Testing ✅ (Basic implementation complete)

### 4.2 Performance Benchmarking
```typescript
// Proposed API (not yet implemented)
tap.test('performance test', async (toolsArg) => {
  const benchmark = toolsArg.benchmark();

  // Run operation (expensiveOperation is a placeholder)
  await expensiveOperation();

  // Assert performance constraints
  benchmark.expect({
    maxDuration: 1000,
    maxMemory: '100MB'
  });
});
```

## 5. Test Execution Improvements

### 5.2 Watch Mode ✅ COMPLETED
- Automatically re-run tests on file changes
- Debounced file change detection (300ms)
- Clear console output between runs
- Shows which files triggered re-runs
- Graceful exit with Ctrl+C
- `--watch-ignore` option for excluding patterns

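
The debouncing behaviour can be illustrated with a small helper that collects changed paths and reports them once no further change has arrived within the delay. `createChangeDebouncer` is a hypothetical name and this is a sketch of the general technique, not the watch mode's actual code; the source of the change events (for example a file watcher) is left out.

```typescript
// Collects changed file paths and reports them once no new change
// has arrived for `delayMs` milliseconds.
function createChangeDebouncer(onFlush: (files: string[]) => void, delayMs: number = 300) {
  let pending: string[] = [];
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Report the collected files immediately and reset the state.
  function flush(): void {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
    if (pending.length === 0) return;
    const files = pending;
    pending = [];
    onFlush(files);
  }

  // Record a change and restart the quiet-period timer.
  function notify(file: string): void {
    if (pending.indexOf(file) === -1) pending.push(file);
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(flush, delayMs);
  }

  return { notify, flush };
}

const watcher = createChangeDebouncer((files) => {
  console.log('re-running tests, triggered by:', files);
});

// Two rapid changes result in a single re-run after the quiet period.
watcher.notify('ts/index.ts');
watcher.notify('test/test.basic.ts');
```

Deduplicating the pending paths also gives the runner the list of files that triggered the re-run, which matches the "shows which files triggered re-runs" behaviour described above.
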
### 5.3 Advanced Test Filtering (Partial) ⚠️
```bash
# Exclude tests by pattern (not yet implemented)
tstest --exclude "**/slow/**"

# Run only failed tests from last run (not yet implemented)
tstest --failed

# Run tests modified in git (not yet implemented)
tstest --changed
```

## 6. Reporting and Analytics

### 6.1 Custom Reporters
- Plugin architecture for custom reporters
- Built-in reporters: JSON, JUnit, HTML, Markdown
- Real-time streaming reporters
- Aggregated test metrics and trends

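
One possible shape for such a plugin contract is sketched below with a JSON reporter as the example. The `Reporter` interface, `JsonReporter`, and `runWithReporters` are hypothetical names for illustration; the plan does not define the reporter API yet.

```typescript
interface TestResult {
  name: string;
  passed: boolean;
  durationMs: number;
}

// Hypothetical reporter plugin contract.
interface Reporter {
  onTestEnd(result: TestResult): void; // called as each test finishes (streaming)
  onRunEnd(): string;                  // returns the rendered report
}

class JsonReporter implements Reporter {
  private results: TestResult[] = [];

  onTestEnd(result: TestResult): void {
    this.results.push(result);
  }

  onRunEnd(): string {
    const failed = this.results.filter((r) => !r.passed).length;
    return JSON.stringify({ total: this.results.length, failed, results: this.results }, null, 2);
  }
}

// Feed every result to every registered reporter, then collect the reports.
function runWithReporters(results: TestResult[], reporters: Reporter[]): string[] {
  for (const result of results) {
    for (const reporter of reporters) {
      reporter.onTestEnd(result);
    }
  }
  return reporters.map((reporter) => reporter.onRunEnd());
}

const reports = runWithReporters(
  [
    { name: 'parses metadata', passed: true, durationMs: 8 },
    { name: 'handles timeout', passed: false, durationMs: 5003 },
  ],
  [new JsonReporter()],
);
console.log(reports[0]);
```

A JUnit, HTML, or Markdown reporter would implement the same two methods and differ only in how `onRunEnd` renders the collected results.
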
### 6.2 Coverage Integration
- Built-in code coverage collection
- Coverage thresholds and enforcement
- Coverage trending over time
- Integration with CI/CD pipelines

### 6.3 Test Analytics Dashboard
- Web-based dashboard for test results
- Historical test performance data
- Flaky test detection
- Test impact analysis

## 7. Developer Experience

### 7.1 Better Error Messages
- Clear, actionable error messages
- Suggestions for common issues
- Links to documentation
- Code examples in error output

## Implementation Phases

### Phase 1: Improved Internal Protocol (Priority: Critical) ✅ COMPLETED
1. ✅ Create ts_tapbundle_protocol directory with isomorphic protocol v2 implementation
   - ✅ Implement ProtocolEmitter class for message generation
   - ✅ Implement ProtocolParser class for message parsing
   - ✅ Define ProtocolMessage types and interfaces
   - ✅ Ensure all code is browser and Node.js compatible
   - ✅ Add tspublish.json to configure build order
2. ✅ Update build configuration to compile ts_tapbundle_protocol first
3. ✅ Replace TAP parser in tstest with Protocol V2 parser importing from dist_ts_tapbundle_protocol
4. ✅ Replace TAP generation in tapbundle with Protocol V2 emitter importing from dist_ts_tapbundle_protocol
5. ✅ Delete all v1 TAP parsing code from tstest
6. ✅ Delete all v1 TAP generation code from tapbundle
7. ✅ Test with real-world test suites containing special characters

### Phase 2: Test Configuration System (Priority: High) ✅ COMPLETED
1. ✅ Implement tap.settings() API with TypeScript interfaces
2. ✅ Add 00init.ts discovery and loading mechanism
3. ✅ Implement settings inheritance and merge logic
4. ✅ Apply settings to test execution (timeouts, retries, etc.)

### Phase 3: Enhanced Communication (Priority: High) ✅ COMPLETED
1. ✅ Build on Protocol V2 for richer communication
2. ✅ Implement real-time test progress API
3. ✅ Add structured error reporting with diffs and traces

### Phase 4: Developer Experience (Priority: Medium) ⚠️ PARTIALLY COMPLETED
1. ✅ Add watch mode (see 5.2)
2. Implement custom reporters
3. Complete advanced test filtering options
4. Add performance benchmarking API

### Phase 5: Analytics and Performance (Priority: Low) ❌ NOT STARTED
1. Build test analytics dashboard
2. Implement coverage integration
3. Create trend analysis tools
4. Add test impact analysis

## Technical Considerations

### API Design Principles
- Clean, modern API design without legacy constraints
- Progressive enhancement approach
- Well-documented features and APIs
- Clear, simple interfaces

### Performance Goals
- Minimal overhead for test execution
- Efficient parallel execution
- Fast test discovery
- Optimized browser test bundling

### Integration Points
- Clean interfaces between tstest and tapbundle
- Extensible plugin architecture
- Standard test result format
- Compatible with existing CI/CD tools

## Summary of Remaining Work

### ✅ Completed
- **Protocol V2**: Full implementation with Unicode delimiters, structured metadata, and special character handling
- **Test Configuration System**: tap.settings() API, 00init.ts discovery, settings inheritance, lifecycle hooks
- **Enhanced Communication**: Event-based test lifecycle reporting, visual diff output for assertion failures, real-time test progress API
- **Rich Error Reporting**: Stack traces, error metadata, and visual diffs through protocol
- **Watch Mode**: Automatic re-runs on file changes with debounced detection (300ms) and `--watch-ignore`
- **Tags Filtering**: `--tags` option for running specific tagged tests

### ✅ Existing Features (Not in Plan)
- **Timeout Support**: `--timeout` option and per-test timeouts
- **Test Retries**: `tap.retry()` for flaky test handling
- **Parallel Tests**: `.testParallel()` for concurrent execution
- **Snapshot Testing**: Basic implementation with `toMatchSnapshot()`
- **Test Lifecycle**: `describe()` blocks with `beforeEach`/`afterEach`
- **Skip Tests**: `tap.skip.test()`
- **Log Files**: `--logfile` option saves output to `.nogit/testlogs/`
- **Test Range**: `--startFrom` and `--stopAt` for partial runs

### ⚠️ Partially Completed
- **Advanced Test Filtering**: Have `--tags` but missing `--exclude`, `--failed`, `--changed`

### ❌ Not Started

#### High Priority
- None remaining: the critical and high-priority phases (1 to 3) are completed

#### Medium Priority
1. **Developer Experience**
   - Custom reporters (JSON, JUnit, HTML, Markdown)
   - Performance benchmarking API
   - Better error messages with suggestions

2. **Enhanced toolsArg**
   - Test data injection
   - Context sharing between tests
   - Parameterized tests

3. **Test Organization**
   - Hierarchical test suites
   - Nested describe blocks
   - Suite-level lifecycle hooks

#### Low Priority
4. **Analytics and Performance**
   - Test analytics dashboard
   - Code coverage integration
   - Trend analysis
   - Flaky test detection

### Recently Fixed Issues ✅
- **tap.todo()**: Now fully implemented with test object creation
- **tap.skip.test()**: Now creates test objects and maintains accurate test count
- **tap.only.test()**: Works correctly - when .only tests exist, only those run

### Remaining Minor Issues
- **Protocol Output**: Some protocol messages still appear in console output

### Next Recommended Steps
1. Implement Custom Reporters (Phase 4) - important for CI/CD integration
2. Implement performance benchmarking API
3. Add better error messages with suggestions