Compare commits

...

11 Commits

Author SHA1 Message Date
6524adea18 1.9.0
Some checks failed
Default (tags) / security (push) Failing after 0s
Default (tags) / test (push) Failing after 0s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-11-04 02:19:57 +00:00
4bf0c02618 feat(context): Add intelligent DiffProcessor to summarize and prioritize git diffs and integrate it into the commit context pipeline 2025-11-04 02:19:57 +00:00
f84a65217d 1.8.3
Some checks failed
Default (tags) / security (push) Failing after 0s
Default (tags) / test (push) Failing after 0s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-11-04 01:37:15 +00:00
3f22fc91ae fix(context): Prevent enormous git diffs and OOM during context building by adding exclusion patterns, truncation, and diagnostic logging 2025-11-04 01:37:15 +00:00
11e65b92ec 1.8.2
Some checks failed
Default (tags) / security (push) Failing after 0s
Default (tags) / test (push) Failing after 0s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-11-03 17:53:03 +00:00
0a3080518f 1.8.1
Some checks failed
Default (tags) / security (push) Failing after 0s
Default (tags) / test (push) Failing after 0s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-11-03 17:50:09 +00:00
d0a4ddbb4b fix(git diff): improve git diff 2025-11-03 17:49:35 +00:00
481339d3cb 1.8.0
Some checks failed
Default (tags) / security (push) Failing after 0s
Default (tags) / test (push) Failing after 0s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-11-03 13:37:16 +00:00
ebc3d760af feat(context): Wire OpenAI provider through task context factory and add git-diff support to iterative context builder 2025-11-03 13:37:16 +00:00
a6d678e36c 1.7.0
Some checks failed
Default (tags) / security (push) Failing after 0s
Default (tags) / test (push) Failing after 0s
Default (tags) / release (push) Has been skipped
Default (tags) / metadata (push) Has been skipped
2025-11-03 13:19:29 +00:00
8c3e16a4f2 feat(IterativeContextBuilder): Add iterative AI-driven context builder and integrate into task factory; add tests and iterative configuration 2025-11-03 13:19:29 +00:00
20 changed files with 1662 additions and 137 deletions

View File

@@ -1,5 +1,42 @@
# Changelog
## 2025-11-04 - 1.9.0 - feat(context)
Add intelligent DiffProcessor to summarize and prioritize git diffs and integrate it into the commit context pipeline
- Add DiffProcessor (ts/context/diff-processor.ts) to intelligently process git diffs: include small files fully, summarize medium files (head/tail sampling), and mark very large files as metadata-only to stay within token budgets.
- Integrate DiffProcessor into commit workflow (ts/aidocs_classes/commit.ts): preprocess raw diffs, emit processed diff statistics, and pass a token-efficient diff section into the TaskContextFactory for commit context generation.
- Export DiffProcessor and its types through the context index and types (ts/context/index.ts, ts/context/types.ts) so other context components can reuse it.
- Add comprehensive tests for the DiffProcessor behavior and integration (test/test.diffprocessor.node.ts) covering small/medium/large diffs, added/deleted files, prioritization, token budgets, and formatting for context.
- Minor adjustments across context/task factories and builders to accept and propagate processed diff strings rather than raw diffs, reducing risk of token overflows during iterative context building.
## 2025-11-04 - 1.8.3 - fix(context)
Prevent enormous git diffs and OOM during context building by adding exclusion patterns, truncation, and diagnostic logging
- Add comprehensive git diff exclusion globs (locks, build artifacts, maps, bundles, IDE folders, logs, caches) when collecting uncommitted diffs to avoid noisy/huge diffs
- Pass glob patterns directly to smartgit.getUncommittedDiff for efficient server-side matching
- Emit diagnostic statistics for diffs (files changed, total characters, estimated tokens, number of exclusion patterns) and warn on unusually large diffs
- Introduce pre-tokenization safety checks in iterative context builder: truncate raw diff text if it exceeds MAX_DIFF_CHARS and throw a clear error if token count still exceeds MAX_DIFF_TOKENS
- Format and log token counts using locale-aware formatting for clarity
- Improve robustness of commit context generation to reduce risk of OOM / model-limit overruns
## 2025-11-03 - 1.8.0 - feat(context)
Wire OpenAI provider through task context factory and add git-diff support to iterative context builder
- Pass AiDoc.openaiInstance through TaskContextFactory into IterativeContextBuilder to reuse the same OpenAI provider and avoid reinitialization.
- IterativeContextBuilder now accepts an optional OpenAiProvider and an additionalContext string; when provided, git diffs (or other extra context) are prepended to the AI context and token counts are updated.
- createContextForCommit now forwards the git diff into the iterative builder so commit-specific context includes the diff.
- Updated aidocs_classes (commit, description, readme) to supply the existing openaiInstance when creating the TaskContextFactory.
## 2025-11-03 - 1.7.0 - feat(IterativeContextBuilder)
Add iterative AI-driven context builder and integrate into task factory; add tests and iterative configuration
- Introduce IterativeContextBuilder: iterative, token-aware context construction that asks the AI which files to load and evaluates context sufficiency.
- Switch TaskContextFactory to use IterativeContextBuilder for readme, description and commit tasks (replaces earlier EnhancedContext flow for these tasks).
- Add iterative configuration options (maxIterations, firstPassFileLimit, subsequentPassFileLimit, temperature, model) in types and ConfigManager and merge support for user config.
- Update CLI (tokens and aidoc flows) to use the iterative context factory and improve task handling and messaging.
- Add test coverage: test/test.iterativecontextbuilder.node.ts to validate initialization, iterative builds, token budget respect and multiple task types.
- Enhance ContextCache, LazyFileLoader, ContextAnalyzer and ContextTrimmer to support the iterative pipeline and smarter prioritization/prompts.
## 2025-11-03 - 1.6.1 - fix(context)
Improve context building, caching and test robustness

View File

@@ -1,6 +1,6 @@
{
"name": "@git.zone/tsdoc",
"version": "1.6.1",
"version": "1.9.0",
"private": false,
"description": "A comprehensive TypeScript documentation tool that leverages AI to generate and enhance project documentation, including dynamic README creation, API docs via TypeDoc, and smart commit message generation.",
"type": "module",

8
pnpm-lock.yaml generated
View File

@@ -3799,8 +3799,8 @@ packages:
resolution: {integrity: sha512-I9jwMn07Sy/IwOj3zVkVik2JTvgpaykDZEigL6Rx6N9LbMywwUSMtxET+7lVoDLLd3O3IXwJwvuuns8UB/HeAg==}
engines: {node: '>=4'}
minimatch@10.0.3:
resolution: {integrity: sha512-IPZ167aShDZZUMdRk66cyQAW3qr0WzbHkPdMYa8bzZhlHhO3jALbKdxcaak7W9FfT2rZNpQuUu4Od7ILEpXSaw==}
minimatch@10.1.1:
resolution: {integrity: sha512-enIvLvRAFZYXJzkCYG5RKmPfrFArdLv+R+lbQ53BmIMLIry74bjKzX6iHAm8WYamJkhSSEabrWN5D97XnKObjQ==}
engines: {node: 20 || >=22}
minimatch@3.1.2:
@@ -9797,7 +9797,7 @@ snapshots:
dependencies:
foreground-child: 3.3.1
jackspeak: 4.1.1
minimatch: 10.0.3
minimatch: 10.1.1
minipass: 7.1.2
package-json-from-dist: 1.0.1
path-scurry: 2.0.0
@@ -10680,7 +10680,7 @@ snapshots:
min-indent@1.0.1: {}
minimatch@10.0.3:
minimatch@10.1.1:
dependencies:
'@isaacs/brace-expansion': 5.0.0

View File

@@ -33,7 +33,10 @@ tap.test('should build commit object', async () => {
expect(commitObject).toHaveProperty('recommendedNextVersionLevel');
expect(commitObject).toHaveProperty('recommendedNextVersionScope');
expect(commitObject).toHaveProperty('recommendedNextVersionMessage');
});
})
tap.test('should stop AIdocs', async () => {
await aidocs.stop();
});
tap.start();

View File

@@ -0,0 +1,304 @@
import { tap, expect } from '@git.zone/tstest/tapbundle';
import { DiffProcessor } from '../ts/context/diff-processor.js';
// Sample diff strings for testing
const createSmallDiff = (filepath: string, addedLines = 5, removedLines = 3): string => {
const lines: string[] = [];
lines.push(`--- a/${filepath}`);
lines.push(`+++ b/${filepath}`);
lines.push(`@@ -1,10 +1,12 @@`);
for (let i = 0; i < removedLines; i++) {
lines.push(`-removed line ${i + 1}`);
}
for (let i = 0; i < addedLines; i++) {
lines.push(`+added line ${i + 1}`);
}
lines.push(' unchanged line');
return lines.join('\n');
};
const createMediumDiff = (filepath: string): string => {
const lines: string[] = [];
lines.push(`--- a/${filepath}`);
lines.push(`+++ b/${filepath}`);
lines.push(`@@ -1,100 +1,150 @@`);
// 150 lines of changes
for (let i = 0; i < 75; i++) {
lines.push(`+added line ${i + 1}`);
}
for (let i = 0; i < 75; i++) {
lines.push(`-removed line ${i + 1}`);
}
return lines.join('\n');
};
const createLargeDiff = (filepath: string): string => {
const lines: string[] = [];
lines.push(`--- a/${filepath}`);
lines.push(`+++ b/${filepath}`);
lines.push(`@@ -1,1000 +1,1500 @@`);
// 2500 lines of changes
for (let i = 0; i < 1250; i++) {
lines.push(`+added line ${i + 1}`);
}
for (let i = 0; i < 1250; i++) {
lines.push(`-removed line ${i + 1}`);
}
return lines.join('\n');
};
const createDeletedFileDiff = (filepath: string): string => {
return `--- a/${filepath}
+++ /dev/null
@@ -1,5 +0,0 @@
-deleted line 1
-deleted line 2
-deleted line 3
-deleted line 4
-deleted line 5`;
};
const createAddedFileDiff = (filepath: string): string => {
return `--- /dev/null
+++ b/${filepath}
@@ -0,0 +1,5 @@
+added line 1
+added line 2
+added line 3
+added line 4
+added line 5`;
};
tap.test('DiffProcessor should parse small diff correctly', async () => {
const processor = new DiffProcessor();
const smallDiff = createSmallDiff('src/test.ts', 5, 3);
const result = processor.processDiffs([smallDiff]);
expect(result.totalFiles).toEqual(1);
expect(result.fullDiffs.length).toEqual(1);
expect(result.summarizedDiffs.length).toEqual(0);
expect(result.metadataOnly.length).toEqual(0);
expect(result.totalTokens).toBeGreaterThan(0);
});
tap.test('DiffProcessor should summarize medium diff', async () => {
const processor = new DiffProcessor();
const mediumDiff = createMediumDiff('src/medium-file.ts');
const result = processor.processDiffs([mediumDiff]);
expect(result.totalFiles).toEqual(1);
expect(result.fullDiffs.length).toEqual(0);
expect(result.summarizedDiffs.length).toEqual(1);
expect(result.metadataOnly.length).toEqual(0);
// Verify the summarized diff contains the sample
const formatted = processor.formatForContext(result);
expect(formatted).toInclude('SUMMARIZED DIFFS');
expect(formatted).toInclude('lines omitted');
});
tap.test('DiffProcessor should handle large diff as metadata only', async () => {
const processor = new DiffProcessor();
const largeDiff = createLargeDiff('dist/bundle.js');
const result = processor.processDiffs([largeDiff]);
expect(result.totalFiles).toEqual(1);
expect(result.fullDiffs.length).toEqual(0);
expect(result.summarizedDiffs.length).toEqual(0);
expect(result.metadataOnly.length).toEqual(1);
const formatted = processor.formatForContext(result);
expect(formatted).toInclude('METADATA ONLY');
expect(formatted).toInclude('dist/bundle.js');
});
tap.test('DiffProcessor should prioritize source files over build artifacts', async () => {
const processor = new DiffProcessor();
const diffs = [
createSmallDiff('dist/bundle.js'),
createSmallDiff('src/important.ts'),
createSmallDiff('build/output.js'),
createSmallDiff('src/core.ts'),
];
const result = processor.processDiffs(diffs);
expect(result.totalFiles).toEqual(4);
// Source files should be included fully first
const formatted = processor.formatForContext(result);
const srcImportantIndex = formatted.indexOf('src/important.ts');
const srcCoreIndex = formatted.indexOf('src/core.ts');
const distBundleIndex = formatted.indexOf('dist/bundle.js');
const buildOutputIndex = formatted.indexOf('build/output.js');
// Source files should appear before build artifacts
expect(srcImportantIndex).toBeLessThan(distBundleIndex);
expect(srcCoreIndex).toBeLessThan(buildOutputIndex);
});
tap.test('DiffProcessor should respect token budget', async () => {
const processor = new DiffProcessor({
maxDiffTokens: 500, // Very small budget to force metadata-only
});
// Create multiple large diffs that will exceed budget
const diffs = [
createLargeDiff('src/file1.ts'),
createLargeDiff('src/file2.ts'),
createLargeDiff('src/file3.ts'),
createLargeDiff('src/file4.ts'),
];
const result = processor.processDiffs(diffs);
expect(result.totalTokens).toBeLessThanOrEqual(500);
// With such a small budget and large files, most should be metadata only
expect(result.metadataOnly.length).toBeGreaterThanOrEqual(2);
});
tap.test('DiffProcessor should handle deleted files', async () => {
const processor = new DiffProcessor();
const deletedDiff = createDeletedFileDiff('src/old-file.ts');
const result = processor.processDiffs([deletedDiff]);
expect(result.totalFiles).toEqual(1);
// Small deleted file should be included fully
expect(result.fullDiffs.length).toEqual(1);
const formatted = processor.formatForContext(result);
expect(formatted).toInclude('src/old-file.ts');
// Verify the file appears in the output
expect(formatted).toInclude('FULL DIFFS');
});
tap.test('DiffProcessor should handle added files', async () => {
const processor = new DiffProcessor();
const addedDiff = createAddedFileDiff('src/new-file.ts');
const result = processor.processDiffs([addedDiff]);
expect(result.totalFiles).toEqual(1);
// Small added file should be included fully
expect(result.fullDiffs.length).toEqual(1);
const formatted = processor.formatForContext(result);
expect(formatted).toInclude('src/new-file.ts');
// Verify the file appears in the output
expect(formatted).toInclude('FULL DIFFS');
});
tap.test('DiffProcessor should handle mixed file sizes', async () => {
const processor = new DiffProcessor();
const diffs = [
createSmallDiff('src/small.ts'),
createMediumDiff('src/medium.ts'),
createLargeDiff('dist/large.js'),
];
const result = processor.processDiffs(diffs);
expect(result.totalFiles).toEqual(3);
expect(result.fullDiffs.length).toEqual(1); // small file
expect(result.summarizedDiffs.length).toEqual(1); // medium file
expect(result.metadataOnly.length).toEqual(1); // large file
const formatted = processor.formatForContext(result);
expect(formatted).toInclude('FULL DIFFS (1 files)');
expect(formatted).toInclude('SUMMARIZED DIFFS (1 files)');
expect(formatted).toInclude('METADATA ONLY (1 files)');
});
tap.test('DiffProcessor should handle empty diff array', async () => {
const processor = new DiffProcessor();
const result = processor.processDiffs([]);
expect(result.totalFiles).toEqual(0);
expect(result.fullDiffs.length).toEqual(0);
expect(result.summarizedDiffs.length).toEqual(0);
expect(result.metadataOnly.length).toEqual(0);
expect(result.totalTokens).toEqual(0);
});
tap.test('DiffProcessor should generate comprehensive summary', async () => {
const processor = new DiffProcessor();
const diffs = [
createSmallDiff('src/file1.ts'),
createSmallDiff('src/file2.ts'),
createMediumDiff('src/file3.ts'),
createLargeDiff('dist/bundle.js'),
];
const result = processor.processDiffs(diffs);
const formatted = processor.formatForContext(result);
expect(formatted).toInclude('GIT DIFF SUMMARY');
expect(formatted).toInclude('Files changed: 4 total');
expect(formatted).toInclude('included in full');
expect(formatted).toInclude('summarized');
expect(formatted).toInclude('metadata only');
expect(formatted).toInclude('Estimated tokens:');
expect(formatted).toInclude('END OF GIT DIFF');
});
tap.test('DiffProcessor should handle custom options', async () => {
const processor = new DiffProcessor({
maxDiffTokens: 50000,
smallFileLines: 30,
mediumFileLines: 150,
sampleHeadLines: 10,
sampleTailLines: 10,
});
const mediumDiff = createMediumDiff('src/file.ts'); // 150 lines
const result = processor.processDiffs([mediumDiff]);
// With custom settings, this should be summarized (exactly at the mediumFileLines threshold)
expect(result.summarizedDiffs.length).toEqual(1);
});
tap.test('DiffProcessor should prioritize test files appropriately', async () => {
const processor = new DiffProcessor();
const diffs = [
createSmallDiff('src/core.ts'),
createSmallDiff('test/core.test.ts'),
createSmallDiff('config.json'),
];
const result = processor.processDiffs(diffs);
const formatted = processor.formatForContext(result);
// Source files should come before test files
const srcIndex = formatted.indexOf('src/core.ts');
const testIndex = formatted.indexOf('test/core.test.ts');
expect(srcIndex).toBeLessThan(testIndex);
});
tap.test('DiffProcessor should handle files with no changes gracefully', async () => {
const processor = new DiffProcessor();
const emptyDiff = `--- a/src/file.ts
+++ b/src/file.ts
@@ -1,1 +1,1 @@`;
const result = processor.processDiffs([emptyDiff]);
expect(result.totalFiles).toEqual(1);
expect(result.fullDiffs.length).toEqual(1); // Still included as a small file
});
export default tap.start();

View File

@@ -0,0 +1,147 @@
import { tap, expect } from '@git.zone/tstest/tapbundle';
import * as path from 'path';
import { IterativeContextBuilder } from '../ts/context/iterative-context-builder.js';
import type { IIterativeConfig, TaskType } from '../ts/context/types.js';
import * as qenv from '@push.rocks/qenv';
// Test project directory
const testProjectRoot = path.join(process.cwd());
// Helper to check if OPENAI_TOKEN is available
async function hasOpenAIToken(): Promise<boolean> {
try {
const qenvInstance = new qenv.Qenv();
const token = await qenvInstance.getEnvVarOnDemand('OPENAI_TOKEN');
return !!token;
} catch (error) {
return false;
}
}
tap.test('IterativeContextBuilder should create instance with default config', async () => {
const builder = new IterativeContextBuilder(testProjectRoot);
expect(builder).toBeInstanceOf(IterativeContextBuilder);
});
tap.test('IterativeContextBuilder should create instance with custom config', async () => {
const customConfig: Partial<IIterativeConfig> = {
maxIterations: 3,
firstPassFileLimit: 5,
subsequentPassFileLimit: 3,
temperature: 0.5,
model: 'gpt-4',
};
const builder = new IterativeContextBuilder(testProjectRoot, customConfig);
expect(builder).toBeInstanceOf(IterativeContextBuilder);
});
tap.test('IterativeContextBuilder should initialize successfully', async () => {
if (!(await hasOpenAIToken())) {
console.log('⚠️ Skipping initialization test - OPENAI_TOKEN not available');
return;
}
const builder = new IterativeContextBuilder(testProjectRoot);
await builder.initialize();
// If we get here without error, initialization succeeded
expect(true).toEqual(true);
});
tap.test('IterativeContextBuilder should build context iteratively for readme task', async () => {
if (!(await hasOpenAIToken())) {
console.log('⚠️ Skipping iterative build test - OPENAI_TOKEN not available');
return;
}
const builder = new IterativeContextBuilder(testProjectRoot, {
maxIterations: 2, // Limit iterations for testing
firstPassFileLimit: 3,
subsequentPassFileLimit: 2,
});
await builder.initialize();
const result = await builder.buildContextIteratively('readme');
// Verify result structure
expect(result).toBeTypeOf('object');
expect(result.context).toBeTypeOf('string');
expect(result.context.length).toBeGreaterThan(0);
expect(result.tokenCount).toBeTypeOf('number');
expect(result.tokenCount).toBeGreaterThan(0);
expect(result.includedFiles).toBeInstanceOf(Array);
expect(result.includedFiles.length).toBeGreaterThan(0);
expect(result.iterationCount).toBeTypeOf('number');
expect(result.iterationCount).toBeGreaterThan(0);
expect(result.iterationCount).toBeLessThanOrEqual(2);
expect(result.iterations).toBeInstanceOf(Array);
expect(result.iterations.length).toEqual(result.iterationCount);
expect(result.apiCallCount).toBeTypeOf('number');
expect(result.apiCallCount).toBeGreaterThan(0);
expect(result.totalDuration).toBeTypeOf('number');
expect(result.totalDuration).toBeGreaterThan(0);
// Verify iteration structure
for (const iteration of result.iterations) {
expect(iteration.iteration).toBeTypeOf('number');
expect(iteration.filesLoaded).toBeInstanceOf(Array);
expect(iteration.tokensUsed).toBeTypeOf('number');
expect(iteration.totalTokensUsed).toBeTypeOf('number');
expect(iteration.decision).toBeTypeOf('object');
expect(iteration.duration).toBeTypeOf('number');
}
console.log(`✅ Iterative context build completed:`);
console.log(` Iterations: ${result.iterationCount}`);
console.log(` Files: ${result.includedFiles.length}`);
console.log(` Tokens: ${result.tokenCount}`);
console.log(` API calls: ${result.apiCallCount}`);
console.log(` Duration: ${(result.totalDuration / 1000).toFixed(2)}s`);
});
tap.test('IterativeContextBuilder should respect token budget', async () => {
if (!(await hasOpenAIToken())) {
console.log('⚠️ Skipping token budget test - OPENAI_TOKEN not available');
return;
}
const builder = new IterativeContextBuilder(testProjectRoot, {
maxIterations: 5,
});
await builder.initialize();
const result = await builder.buildContextIteratively('description');
// Token count should not exceed budget significantly (allow 5% margin for safety)
const configManager = (await import('../ts/context/config-manager.js')).ConfigManager.getInstance();
const maxTokens = configManager.getMaxTokens();
expect(result.tokenCount).toBeLessThanOrEqual(maxTokens * 1.05);
console.log(`✅ Token budget respected: ${result.tokenCount}/${maxTokens}`);
});
tap.test('IterativeContextBuilder should work with different task types', async () => {
if (!(await hasOpenAIToken())) {
console.log('⚠️ Skipping task types test - OPENAI_TOKEN not available');
return;
}
const taskTypes: TaskType[] = ['readme', 'description', 'commit'];
for (const taskType of taskTypes) {
const builder = new IterativeContextBuilder(testProjectRoot, {
maxIterations: 2,
firstPassFileLimit: 2,
});
await builder.initialize();
const result = await builder.buildContextIteratively(taskType);
expect(result.includedFiles.length).toBeGreaterThan(0);
console.log(`${taskType}: ${result.includedFiles.length} files, ${result.tokenCount} tokens`);
}
});
export default tap.start();

View File

@@ -1,8 +0,0 @@
import { expect, tap } from '@push.rocks/tapbundle';
import * as tsdoc from '../ts/index.js';
tap.test('first test', async () => {
console.log('test');
});
tap.start();

View File

@@ -3,6 +3,6 @@
*/
export const commitinfo = {
name: '@git.zone/tsdoc',
version: '1.6.1',
version: '1.9.0',
description: 'A comprehensive TypeScript documentation tool that leverages AI to generate and enhance project documentation, including dynamic README creation, API docs via TypeDoc, and smart commit message generation.'
}

View File

@@ -1,6 +1,7 @@
import * as plugins from '../plugins.js';
import { AiDoc } from '../classes.aidoc.js';
import { ProjectContext } from './projectcontext.js';
import { DiffProcessor } from '../context/diff-processor.js';
export interface INextCommitObject {
recommendedNextVersionLevel: 'fix' | 'feat' | 'BREAKING CHANGE'; // the recommended next version level of the project
@@ -27,18 +28,101 @@ export class Commit {
smartgitInstance,
this.projectDir
);
const diffStringArray = await gitRepo.getUncommittedDiff([
// Define comprehensive exclusion patterns
// smartgit@3.3.0+ supports glob patterns natively
const excludePatterns = [
// Lock files
'pnpm-lock.yaml',
'package-lock.json',
]);
'npm-shrinkwrap.json',
'yarn.lock',
'deno.lock',
'bun.lockb',
// Build artifacts (main culprit for large diffs!)
'dist/**',
'dist_*/**', // dist_ts, dist_web, etc.
'build/**',
'.next/**',
'out/**',
'public/dist/**',
// Compiled/bundled files
'**/*.js.map',
'**/*.d.ts.map',
'**/*.min.js',
'**/*.bundle.js',
'**/*.chunk.js',
// IDE/Editor directories
'.claude/**',
'.cursor/**',
'.vscode/**',
'.idea/**',
'**/*.swp',
'**/*.swo',
// Logs and caches
'.nogit/**',
'**/*.log',
'.cache/**',
'.rpt2_cache/**',
'coverage/**',
'.nyc_output/**',
];
// Pass glob patterns directly to smartgit - it handles matching internally
const diffStringArray = await gitRepo.getUncommittedDiff(excludePatterns);
// Process diffs intelligently using DiffProcessor
let processedDiffString: string;
if (diffStringArray.length > 0) {
// Diagnostic logging for raw diff statistics
const totalChars = diffStringArray.join('\n\n').length;
const estimatedTokens = Math.ceil(totalChars / 4);
console.log(`📊 Raw git diff statistics:`);
console.log(` Files changed: ${diffStringArray.length}`);
console.log(` Total characters: ${totalChars.toLocaleString()}`);
console.log(` Estimated tokens: ${estimatedTokens.toLocaleString()}`);
console.log(` Exclusion patterns: ${excludePatterns.length}`);
// Use DiffProcessor to intelligently handle large diffs
const diffProcessor = new DiffProcessor({
maxDiffTokens: 100000, // Reserve 100k tokens for diffs
smallFileLines: 50, // Include files <= 50 lines fully
mediumFileLines: 200, // Summarize files <= 200 lines
sampleHeadLines: 20, // Show first 20 lines
sampleTailLines: 20, // Show last 20 lines
});
const processedDiff = diffProcessor.processDiffs(diffStringArray);
processedDiffString = diffProcessor.formatForContext(processedDiff);
console.log(`📝 Processed diff statistics:`);
console.log(` Full diffs: ${processedDiff.fullDiffs.length} files`);
console.log(` Summarized: ${processedDiff.summarizedDiffs.length} files`);
console.log(` Metadata only: ${processedDiff.metadataOnly.length} files`);
console.log(` Final tokens: ${processedDiff.totalTokens.toLocaleString()}`);
if (estimatedTokens > 50000) {
console.log(`✅ DiffProcessor reduced token usage: ${estimatedTokens.toLocaleString()}${processedDiff.totalTokens.toLocaleString()}`);
}
} else {
processedDiffString = 'No changes.';
}
// Use the new TaskContextFactory for optimized context
const taskContextFactory = new (await import('../context/index.js')).TaskContextFactory(this.projectDir);
const taskContextFactory = new (await import('../context/index.js')).TaskContextFactory(
this.projectDir,
this.aiDocsRef.openaiInstance
);
await taskContextFactory.initialize();
// Generate context specifically for commit task
const contextResult = await taskContextFactory.createContextForCommit(
diffStringArray[0] ? diffStringArray.join('\n\n') : 'No changes.'
);
const contextResult = await taskContextFactory.createContextForCommit(processedDiffString);
// Get the optimized context string
let contextString = contextResult.context;

View File

@@ -19,7 +19,10 @@ export class Description {
public async build() {
// Use the new TaskContextFactory for optimized context
const taskContextFactory = new (await import('../context/index.js')).TaskContextFactory(this.projectDir);
const taskContextFactory = new (await import('../context/index.js')).TaskContextFactory(
this.projectDir,
this.aiDocsRef.openaiInstance
);
await taskContextFactory.initialize();
// Generate context specifically for description task

View File

@@ -18,7 +18,10 @@ export class Readme {
let finalReadmeString = ``;
// Use the new TaskContextFactory for optimized context
const taskContextFactory = new (await import('../context/index.js')).TaskContextFactory(this.projectDir);
const taskContextFactory = new (await import('../context/index.js')).TaskContextFactory(
this.projectDir,
this.aiDocsRef.openaiInstance
);
await taskContextFactory.initialize();
// Generate context specifically for readme task

View File

@@ -64,7 +64,7 @@ export class AiDoc {
await this.npmextraKV.writeKey('OPENAI_TOKEN', this.openaiToken);
}
}
if (!this.openaiToken) {
if (!this.openaiToken && this.npmextraKV) {
this.openaiToken = await this.npmextraKV.readKey('OPENAI_TOKEN');
}
@@ -76,7 +76,11 @@ export class AiDoc {
}
public async stop() {
await this.openaiInstance.stop();
if (this.openaiInstance) {
await this.openaiInstance.stop();
}
// No explicit cleanup needed for npmextraKV or aidocInteract
// They don't keep event loop alive
}
public async buildReadme(projectDirArg: string) {

View File

@@ -57,29 +57,25 @@ export const run = async () => {
logger.log('info', `Calculating context token count...`);
// Determine context mode based on args
let contextMode: context.ContextMode = 'full';
if (argvArg.trim || argvArg.trimmed) {
contextMode = 'trimmed';
} else if (argvArg.summarize || argvArg.summarized) {
contextMode = 'summarized';
}
// Get task type if specified
let taskType: context.TaskType | undefined = undefined;
if (argvArg.task) {
if (['readme', 'commit', 'description'].includes(argvArg.task)) {
taskType = argvArg.task as context.TaskType;
} else {
logger.log('warn', `Unknown task type: ${argvArg.task}. Using default context.`);
logger.log('warn', `Unknown task type: ${argvArg.task}. Using default (readme).`);
taskType = 'readme';
}
} else {
// Default to readme if no task specified
taskType = 'readme';
}
// Use enhanced context
// Use iterative context building
const taskFactory = new context.TaskContextFactory(paths.cwd);
await taskFactory.initialize();
let contextResult: context.IContextResult;
let contextResult: context.IIterativeContextResult;
if (argvArg.all) {
// Show stats for all task types
@@ -100,21 +96,8 @@ export const run = async () => {
return;
}
if (taskType) {
// Get context for specific task
contextResult = await taskFactory.createContextForTask(taskType);
} else {
// Get generic context with specified mode
const enhancedContext = new context.EnhancedContext(paths.cwd);
await enhancedContext.initialize();
enhancedContext.setContextMode(contextMode);
if (argvArg.maxTokens) {
enhancedContext.setTokenBudget(parseInt(argvArg.maxTokens, 10));
}
contextResult = await enhancedContext.buildContext();
}
// Get context for specific task
contextResult = await taskFactory.createContextForTask(taskType);
// Display results
logger.log('ok', `Total context token count: ${contextResult.tokenCount}`);

View File

@@ -9,7 +9,8 @@ import type {
ICacheConfig,
IAnalyzerConfig,
IPrioritizationWeights,
ITierConfig
ITierConfig,
IIterativeConfig
} from './types.js';
/**
@@ -98,6 +99,13 @@ export class ConfigManager {
essential: { minScore: 0.8, trimLevel: 'none' },
important: { minScore: 0.5, trimLevel: 'light' },
optional: { minScore: 0.2, trimLevel: 'aggressive' }
},
iterative: {
maxIterations: 5,
firstPassFileLimit: 10,
subsequentPassFileLimit: 5,
temperature: 0.3,
model: 'gpt-4-turbo-preview'
}
};
}
@@ -216,6 +224,14 @@ export class ConfigManager {
};
}
// Merge iterative configuration
if (userConfig.iterative) {
result.iterative = {
...result.iterative,
...userConfig.iterative
};
}
return result;
}
@@ -331,6 +347,19 @@ export class ConfigManager {
};
}
/**
* Get iterative configuration
*/
public getIterativeConfig(): IIterativeConfig {
return this.config.iterative || {
maxIterations: 5,
firstPassFileLimit: 10,
subsequentPassFileLimit: 5,
temperature: 0.3,
model: 'gpt-4-turbo-preview'
};
}
/**
* Clear the config cache (force reload on next access)
*/

View File

@@ -1,6 +1,7 @@
import * as plugins from '../plugins.js';
import * as fs from 'fs';
import type { ICacheEntry, ICacheConfig } from './types.js';
import { logger } from '../logging.js';
/**
* ContextCache provides persistent caching of file contents and token counts

View File

@@ -0,0 +1,341 @@
/**
* Intelligent git diff processor that handles large diffs by sampling and prioritization
* instead of blind truncation.
*/
export interface IDiffFileInfo {
filepath: string;
status: 'added' | 'modified' | 'deleted';
linesAdded: number;
linesRemoved: number;
totalLines: number;
estimatedTokens: number;
diffContent: string;
}
export interface IProcessedDiff {
summary: string; // Human-readable overview
fullDiffs: string[]; // Small files included fully
summarizedDiffs: string[]; // Medium files with head/tail
metadataOnly: string[]; // Large files, just stats
totalFiles: number;
totalTokens: number;
}
export interface IDiffProcessorOptions {
maxDiffTokens?: number; // Maximum tokens for entire diff section (default: 100000)
smallFileLines?: number; // Files <= this are included fully (default: 50)
mediumFileLines?: number; // Files <= this are summarized (default: 200)
sampleHeadLines?: number; // Lines to show at start of medium files (default: 20)
sampleTailLines?: number; // Lines to show at end of medium files (default: 20)
}
export class DiffProcessor {
private options: Required<IDiffProcessorOptions>;
constructor(options: IDiffProcessorOptions = {}) {
this.options = {
maxDiffTokens: options.maxDiffTokens ?? 100000,
smallFileLines: options.smallFileLines ?? 50,
mediumFileLines: options.mediumFileLines ?? 200,
sampleHeadLines: options.sampleHeadLines ?? 20,
sampleTailLines: options.sampleTailLines ?? 20,
};
}
/**
* Process an array of git diffs into a structured, token-efficient format
*/
public processDiffs(diffStringArray: string[]): IProcessedDiff {
// Parse all diffs into file info objects
const fileInfos: IDiffFileInfo[] = diffStringArray
.map(diffString => this.parseDiffFile(diffString))
.filter(info => info !== null) as IDiffFileInfo[];
// Prioritize files (source files first, build artifacts last)
const prioritized = this.prioritizeFiles(fileInfos);
const result: IProcessedDiff = {
summary: '',
fullDiffs: [],
summarizedDiffs: [],
metadataOnly: [],
totalFiles: prioritized.length,
totalTokens: 0,
};
let tokensUsed = 0;
const tokenBudget = this.options.maxDiffTokens;
// Categorize and include files based on size and token budget
for (const fileInfo of prioritized) {
const remainingBudget = tokenBudget - tokensUsed;
if (remainingBudget <= 0) {
// Budget exhausted - rest are metadata only
result.metadataOnly.push(this.formatMetadataOnly(fileInfo));
continue;
}
if (fileInfo.totalLines <= this.options.smallFileLines) {
// Small file - include fully if budget allows
if (fileInfo.estimatedTokens <= remainingBudget) {
const statusPrefix = this.getFileStatusPrefix(fileInfo);
result.fullDiffs.push(`${statusPrefix}${fileInfo.diffContent}`);
tokensUsed += fileInfo.estimatedTokens;
} else {
result.metadataOnly.push(this.formatMetadataOnly(fileInfo));
}
} else if (fileInfo.totalLines <= this.options.mediumFileLines) {
// Medium file - try to include summary with head/tail
const summary = this.extractDiffSample(
fileInfo,
this.options.sampleHeadLines,
this.options.sampleTailLines
);
const summaryTokens = Math.ceil(summary.length / 4); // Rough estimate
if (summaryTokens <= remainingBudget) {
result.summarizedDiffs.push(summary);
tokensUsed += summaryTokens;
} else {
result.metadataOnly.push(this.formatMetadataOnly(fileInfo));
}
} else {
// Large file - metadata only
result.metadataOnly.push(this.formatMetadataOnly(fileInfo));
}
}
result.totalTokens = tokensUsed;
result.summary = this.generateSummary(result);
return result;
}
/**
* Format the processed diff for inclusion in context
*/
public formatForContext(processed: IProcessedDiff): string {
const sections: string[] = [];
// Summary section
sections.push('====== GIT DIFF SUMMARY ======');
sections.push(processed.summary);
sections.push('');
// Full diffs section
if (processed.fullDiffs.length > 0) {
sections.push(`====== FULL DIFFS (${processed.fullDiffs.length} files) ======`);
sections.push(processed.fullDiffs.join('\n\n'));
sections.push('');
}
// Summarized diffs section
if (processed.summarizedDiffs.length > 0) {
sections.push(`====== SUMMARIZED DIFFS (${processed.summarizedDiffs.length} files) ======`);
sections.push(processed.summarizedDiffs.join('\n\n'));
sections.push('');
}
// Metadata only section
if (processed.metadataOnly.length > 0) {
sections.push(`====== METADATA ONLY (${processed.metadataOnly.length} files) ======`);
sections.push(processed.metadataOnly.join('\n'));
sections.push('');
}
sections.push('====== END OF GIT DIFF ======');
return sections.join('\n');
}
/**
* Parse a single git diff string into file information
*/
private parseDiffFile(diffString: string): IDiffFileInfo | null {
if (!diffString || diffString.trim().length === 0) {
return null;
}
const lines = diffString.split('\n');
let filepath = '';
let status: 'added' | 'modified' | 'deleted' = 'modified';
let linesAdded = 0;
let linesRemoved = 0;
// Parse diff header to extract filepath and status
for (const line of lines) {
if (line.startsWith('--- a/')) {
filepath = line.substring(6);
} else if (line.startsWith('+++ b/')) {
const newPath = line.substring(6);
if (newPath === '/dev/null') {
status = 'deleted';
} else if (filepath === '/dev/null') {
status = 'added';
filepath = newPath;
} else {
filepath = newPath;
}
} else if (line.startsWith('+') && !line.startsWith('+++')) {
linesAdded++;
} else if (line.startsWith('-') && !line.startsWith('---')) {
linesRemoved++;
}
}
const totalLines = linesAdded + linesRemoved;
const estimatedTokens = Math.ceil(diffString.length / 4);
return {
filepath,
status,
linesAdded,
linesRemoved,
totalLines,
estimatedTokens,
diffContent: diffString,
};
}
/**
* Prioritize files by importance (source files before build artifacts)
*/
private prioritizeFiles(files: IDiffFileInfo[]): IDiffFileInfo[] {
return files.sort((a, b) => {
const scoreA = this.getFileImportanceScore(a.filepath);
const scoreB = this.getFileImportanceScore(b.filepath);
return scoreB - scoreA; // Higher score first
});
}
/**
* Calculate importance score for a file path
*/
private getFileImportanceScore(filepath: string): number {
// Source files - highest priority
if (filepath.match(/^(src|lib|app|components|pages|api)\//)) {
return 100;
}
// Test files - high priority
if (filepath.match(/\.(test|spec)\.(ts|js|tsx|jsx)$/) || filepath.startsWith('test/')) {
return 80;
}
// Configuration files - medium-high priority
if (filepath.match(/\.(json|yaml|yml|toml|config\.(ts|js))$/)) {
return 60;
}
// Documentation - medium priority
if (filepath.match(/\.(md|txt|rst)$/)) {
return 40;
}
// Build artifacts - low priority
if (filepath.match(/^(dist|build|out|\.next|public\/dist)\//)) {
return 10;
}
// Everything else - default priority
return 50;
}
/**
* Extract head and tail lines from a diff, omitting the middle
*/
private extractDiffSample(fileInfo: IDiffFileInfo, headLines: number, tailLines: number): string {
const lines = fileInfo.diffContent.split('\n');
const totalLines = lines.length;
if (totalLines <= headLines + tailLines) {
// File is small enough to include fully
return fileInfo.diffContent;
}
// Extract file metadata from diff header
const headerLines: string[] = [];
let bodyStartIndex = 0;
for (let i = 0; i < lines.length; i++) {
if (lines[i].startsWith('@@')) {
headerLines.push(...lines.slice(0, i + 1));
bodyStartIndex = i + 1;
break;
}
}
const bodyLines = lines.slice(bodyStartIndex);
const head = bodyLines.slice(0, headLines);
const tail = bodyLines.slice(-tailLines);
const omittedLines = bodyLines.length - headLines - tailLines;
const statusEmoji = fileInfo.status === 'added' ? '' :
fileInfo.status === 'deleted' ? '' : '📝';
const parts: string[] = [];
parts.push(`${statusEmoji} FILE: ${fileInfo.filepath}`);
parts.push(`CHANGES: +${fileInfo.linesAdded} lines, -${fileInfo.linesRemoved} lines (${fileInfo.totalLines} total)`);
parts.push('');
parts.push(...headerLines);
parts.push(...head);
parts.push('');
parts.push(`[... ${omittedLines} lines omitted - use Read tool to see full file ...]`);
parts.push('');
parts.push(...tail);
return parts.join('\n');
}
/**
* Get file status prefix with emoji
*/
private getFileStatusPrefix(fileInfo: IDiffFileInfo): string {
const statusEmoji = fileInfo.status === 'added' ? '' :
fileInfo.status === 'deleted' ? '' : '📝';
return `${statusEmoji} `;
}
/**
* Extract filepath from diff content
*/
private extractFilepathFromDiff(diffContent: string): string {
const lines = diffContent.split('\n');
for (const line of lines) {
if (line.startsWith('+++ b/')) {
return line.substring(6);
}
}
return 'unknown';
}
/**
* Format file info as metadata only
*/
private formatMetadataOnly(fileInfo: IDiffFileInfo): string {
const statusEmoji = fileInfo.status === 'added' ? '' :
fileInfo.status === 'deleted' ? '' : '📝';
return `${statusEmoji} ${fileInfo.filepath} (+${fileInfo.linesAdded}, -${fileInfo.linesRemoved})`;
}
/**
* Generate human-readable summary of processed diff
*/
private generateSummary(result: IProcessedDiff): string {
const parts: string[] = [];
parts.push(`Files changed: ${result.totalFiles} total`);
parts.push(`- ${result.fullDiffs.length} included in full`);
parts.push(`- ${result.summarizedDiffs.length} summarized (head/tail shown)`);
parts.push(`- ${result.metadataOnly.length} metadata only`);
parts.push(`Estimated tokens: ~${result.totalTokens.toLocaleString()}`);
if (result.metadataOnly.length > 0) {
parts.push('');
parts.push('NOTE: Some files excluded to stay within token budget.');
parts.push('Use Read tool with specific file paths to see full content.');
}
return parts.join('\n');
}
}

View File

@@ -5,6 +5,7 @@ import { ContextTrimmer } from './context-trimmer.js';
import { LazyFileLoader } from './lazy-file-loader.js';
import { ContextCache } from './context-cache.js';
import { ContextAnalyzer } from './context-analyzer.js';
import { DiffProcessor } from './diff-processor.js';
import type {
ContextMode,
IContextConfig,
@@ -22,7 +23,12 @@ import type {
ICacheEntry,
IFileDependencies,
IFileAnalysis,
IAnalysisResult
IAnalysisResult,
IIterativeConfig,
IIterativeContextResult,
IDiffFileInfo,
IProcessedDiff,
IDiffProcessorOptions
} from './types.js';
export {
@@ -34,6 +40,7 @@ export {
LazyFileLoader,
ContextCache,
ContextAnalyzer,
DiffProcessor,
};
// Types
@@ -54,5 +61,10 @@ export type {
ICacheEntry,
IFileDependencies,
IFileAnalysis,
IAnalysisResult
IAnalysisResult,
IIterativeConfig,
IIterativeContextResult,
IDiffFileInfo,
IProcessedDiff,
IDiffProcessorOptions
};

View File

@@ -0,0 +1,523 @@
import * as plugins from '../plugins.js';
import * as fs from 'fs';
import { logger } from '../logging.js';
import type {
TaskType,
IFileMetadata,
IFileInfo,
IIterativeContextResult,
IIterationState,
IFileSelectionDecision,
IContextSufficiencyDecision,
IIterativeConfig,
} from './types.js';
import { LazyFileLoader } from './lazy-file-loader.js';
import { ContextCache } from './context-cache.js';
import { ContextAnalyzer } from './context-analyzer.js';
import { ConfigManager } from './config-manager.js';
/**
* Iterative context builder that uses AI to intelligently select files
* across multiple iterations until sufficient context is gathered
*/
export class IterativeContextBuilder {
private projectRoot: string;
private lazyLoader: LazyFileLoader;
private cache: ContextCache;
private analyzer: ContextAnalyzer;
private config: Required<IIterativeConfig>;
private tokenBudget: number = 190000;
private openaiInstance: plugins.smartai.OpenAiProvider;
private externalOpenaiInstance?: plugins.smartai.OpenAiProvider;
/**
* Creates a new IterativeContextBuilder
* @param projectRoot - Root directory of the project
* @param config - Iterative configuration
* @param openaiInstance - Optional pre-configured OpenAI provider instance
*/
constructor(
projectRoot: string,
config?: Partial<IIterativeConfig>,
openaiInstance?: plugins.smartai.OpenAiProvider
) {
this.projectRoot = projectRoot;
this.lazyLoader = new LazyFileLoader(projectRoot);
this.cache = new ContextCache(projectRoot);
this.analyzer = new ContextAnalyzer(projectRoot);
this.externalOpenaiInstance = openaiInstance;
// Default configuration
this.config = {
maxIterations: config?.maxIterations ?? 5,
firstPassFileLimit: config?.firstPassFileLimit ?? 10,
subsequentPassFileLimit: config?.subsequentPassFileLimit ?? 5,
temperature: config?.temperature ?? 0.3,
model: config?.model ?? 'gpt-4-turbo-preview',
};
}
/**
* Initialize the builder
*/
public async initialize(): Promise<void> {
await this.cache.init();
const configManager = ConfigManager.getInstance();
await configManager.initialize(this.projectRoot);
this.tokenBudget = configManager.getMaxTokens();
// Use external OpenAI instance if provided, otherwise create a new one
if (this.externalOpenaiInstance) {
this.openaiInstance = this.externalOpenaiInstance;
} else {
// Initialize OpenAI instance from environment
const qenvInstance = new plugins.qenv.Qenv();
const openaiToken = await qenvInstance.getEnvVarOnDemand('OPENAI_TOKEN');
if (!openaiToken) {
throw new Error('OPENAI_TOKEN environment variable is required for iterative context building');
}
this.openaiInstance = new plugins.smartai.OpenAiProvider({
openaiToken,
});
await this.openaiInstance.start();
}
}
/**
* Build context iteratively using AI decision making
* @param taskType - Type of task being performed
* @param additionalContext - Optional additional context (e.g., git diff for commit tasks)
* @returns Complete iterative context result
*/
public async buildContextIteratively(taskType: TaskType, additionalContext?: string): Promise<IIterativeContextResult> {
const startTime = Date.now();
logger.log('info', '🤖 Starting iterative context building...');
logger.log('info', ` Task: ${taskType}, Budget: ${this.tokenBudget} tokens, Max iterations: ${this.config.maxIterations}`);
// Phase 1: Scan project files for metadata
logger.log('info', '📋 Scanning project files...');
const metadata = await this.scanProjectFiles(taskType);
const totalEstimatedTokens = metadata.reduce((sum, m) => sum + m.estimatedTokens, 0);
logger.log('info', ` Found ${metadata.length} files (~${totalEstimatedTokens} estimated tokens)`);
// Phase 2: Analyze files for initial prioritization
logger.log('info', '🔍 Analyzing file dependencies and importance...');
const analysis = await this.analyzer.analyze(metadata, taskType, []);
logger.log('info', ` Analysis complete in ${analysis.analysisDuration}ms`);
// Track state across iterations
const iterations: IIterationState[] = [];
let totalTokensUsed = 0;
let apiCallCount = 0;
let loadedContent = '';
const includedFiles: IFileInfo[] = [];
// If additional context (e.g., git diff) is provided, prepend it
if (additionalContext) {
// CRITICAL SAFETY: Check raw string size BEFORE tokenization to prevent OOM
const MAX_DIFF_CHARS = 500000; // ~125k tokens max (conservative 4 chars/token ratio)
const MAX_DIFF_TOKENS = 150000; // Hard token limit for safety
// First check: raw character count
if (additionalContext.length > MAX_DIFF_CHARS) {
const originalSize = additionalContext.length;
logger.log('warn', `⚠️ Git diff too large (${originalSize.toLocaleString()} chars > ${MAX_DIFF_CHARS.toLocaleString()} limit)`);
logger.log('warn', ` This likely includes build artifacts (dist/, *.js.map, bundles, etc.)`);
logger.log('warn', ` Truncating to first ${MAX_DIFF_CHARS.toLocaleString()} characters.`);
logger.log('warn', ` Consider: git stash build files, improve .gitignore, or review uncommitted changes.`);
additionalContext = additionalContext.substring(0, MAX_DIFF_CHARS) +
'\n\n[... DIFF TRUNCATED - exceeded size limit of ' + MAX_DIFF_CHARS.toLocaleString() + ' chars ...]';
}
const diffSection = `
====== GIT DIFF ======
${additionalContext}
====== END OF GIT DIFF ======
`;
// Second check: actual token count after truncation
const diffTokens = this.countTokens(diffSection);
if (diffTokens > MAX_DIFF_TOKENS) {
logger.log('error', `❌ Git diff still too large after truncation (${diffTokens.toLocaleString()} tokens > ${MAX_DIFF_TOKENS.toLocaleString()} limit)`);
throw new Error(
`Git diff size (${diffTokens.toLocaleString()} tokens) exceeds maximum (${MAX_DIFF_TOKENS.toLocaleString()} tokens). ` +
`This indicates massive uncommitted changes, likely build artifacts. ` +
`Please commit or stash dist/, build/, or other generated files.`
);
}
loadedContent = diffSection;
totalTokensUsed += diffTokens;
logger.log('info', `📝 Added git diff to context (${diffTokens.toLocaleString()} tokens)`);
}
// Phase 3: Iterative file selection and loading
for (let iteration = 1; iteration <= this.config.maxIterations; iteration++) {
const iterationStart = Date.now();
logger.log('info', `\n🤔 Iteration ${iteration}/${this.config.maxIterations}: Asking AI which files to examine...`);
const remainingBudget = this.tokenBudget - totalTokensUsed;
logger.log('info', ` Token budget remaining: ${remainingBudget}/${this.tokenBudget} (${Math.round((remainingBudget / this.tokenBudget) * 100)}%)`);
// Get AI decision on which files to load
const decision = await this.getFileSelectionDecision(
metadata,
analysis.files.slice(0, 30), // Top 30 files by importance
taskType,
iteration,
totalTokensUsed,
remainingBudget,
loadedContent
);
apiCallCount++;
logger.log('info', ` AI reasoning: ${decision.reasoning}`);
logger.log('info', ` AI requested ${decision.filesToLoad.length} files`);
// Load requested files
const iterationFiles: IFileInfo[] = [];
let iterationTokens = 0;
if (decision.filesToLoad.length > 0) {
logger.log('info', '📥 Loading requested files...');
for (const filePath of decision.filesToLoad) {
try {
const fileInfo = await this.loadFile(filePath);
if (totalTokensUsed + fileInfo.tokenCount! <= this.tokenBudget) {
const formattedFile = this.formatFileForContext(fileInfo);
loadedContent += formattedFile;
includedFiles.push(fileInfo);
iterationFiles.push(fileInfo);
iterationTokens += fileInfo.tokenCount!;
totalTokensUsed += fileInfo.tokenCount!;
logger.log('info', `${fileInfo.relativePath} (${fileInfo.tokenCount} tokens)`);
} else {
logger.log('warn', `${fileInfo.relativePath} - would exceed budget, skipping`);
}
} catch (error) {
logger.log('warn', ` ✗ Failed to load ${filePath}: ${error.message}`);
}
}
}
// Record iteration state
const iterationDuration = Date.now() - iterationStart;
iterations.push({
iteration,
filesLoaded: iterationFiles,
tokensUsed: iterationTokens,
totalTokensUsed,
decision,
duration: iterationDuration,
});
logger.log('info', ` Iteration ${iteration} complete: ${iterationFiles.length} files loaded, ${iterationTokens} tokens used`);
// Check if we should continue
if (totalTokensUsed >= this.tokenBudget * 0.95) {
logger.log('warn', '⚠️ Approaching token budget limit, stopping iterations');
break;
}
// Ask AI if context is sufficient
if (iteration < this.config.maxIterations) {
logger.log('info', '🤔 Asking AI if context is sufficient...');
const sufficiencyDecision = await this.evaluateContextSufficiency(
loadedContent,
taskType,
iteration,
totalTokensUsed,
remainingBudget - iterationTokens
);
apiCallCount++;
logger.log('info', ` AI decision: ${sufficiencyDecision.sufficient ? '✅ SUFFICIENT' : '⏭️ NEEDS MORE'}`);
logger.log('info', ` Reasoning: ${sufficiencyDecision.reasoning}`);
if (sufficiencyDecision.sufficient) {
logger.log('ok', '✅ Context building complete - AI determined context is sufficient');
break;
}
}
}
const totalDuration = Date.now() - startTime;
logger.log('ok', `\n✅ Iterative context building complete!`);
logger.log('info', ` Files included: ${includedFiles.length}`);
logger.log('info', ` Token usage: ${totalTokensUsed}/${this.tokenBudget} (${Math.round((totalTokensUsed / this.tokenBudget) * 100)}%)`);
logger.log('info', ` Iterations: ${iterations.length}, API calls: ${apiCallCount}`);
logger.log('info', ` Total duration: ${(totalDuration / 1000).toFixed(2)}s`);
return {
context: loadedContent,
tokenCount: totalTokensUsed,
includedFiles,
trimmedFiles: [],
excludedFiles: [],
tokenSavings: 0,
iterationCount: iterations.length,
iterations,
apiCallCount,
totalDuration,
};
}
/**
* Scan project files based on task type
*/
private async scanProjectFiles(taskType: TaskType): Promise<IFileMetadata[]> {
const configManager = ConfigManager.getInstance();
const taskConfig = configManager.getTaskConfig(taskType);
const includeGlobs = taskConfig?.includePaths?.map(p => `${p}/**/*.ts`) || [
'ts/**/*.ts',
'ts*/**/*.ts'
];
const configGlobs = [
'package.json',
'readme.md',
'readme.hints.md',
'npmextra.json'
];
return await this.lazyLoader.scanFiles([...configGlobs, ...includeGlobs]);
}
/**
* Get AI decision on which files to load
*/
private async getFileSelectionDecision(
allMetadata: IFileMetadata[],
analyzedFiles: any[],
taskType: TaskType,
iteration: number,
tokensUsed: number,
remainingBudget: number,
loadedContent: string
): Promise<IFileSelectionDecision> {
const isFirstIteration = iteration === 1;
const fileLimit = isFirstIteration
? this.config.firstPassFileLimit
: this.config.subsequentPassFileLimit;
const systemPrompt = this.buildFileSelectionPrompt(
allMetadata,
analyzedFiles,
taskType,
iteration,
tokensUsed,
remainingBudget,
loadedContent,
fileLimit
);
const response = await this.openaiInstance.chat({
systemMessage: `You are an AI assistant that helps select the most relevant files for code analysis.
You must respond ONLY with valid JSON that can be parsed with JSON.parse().
Do not wrap the JSON in markdown code blocks or add any other text.`,
userMessage: systemPrompt,
messageHistory: [],
});
// Parse JSON response, handling potential markdown formatting
const content = response.message.replace('```json', '').replace('```', '').trim();
const parsed = JSON.parse(content);
return {
reasoning: parsed.reasoning || 'No reasoning provided',
filesToLoad: parsed.files_to_load || [],
estimatedTokensNeeded: parsed.estimated_tokens_needed,
};
}
/**
* Build prompt for file selection
*/
private buildFileSelectionPrompt(
metadata: IFileMetadata[],
analyzedFiles: any[],
taskType: TaskType,
iteration: number,
tokensUsed: number,
remainingBudget: number,
loadedContent: string,
fileLimit: number
): string {
const taskDescriptions = {
readme: 'generating a comprehensive README that explains the project\'s purpose, features, and API',
commit: 'analyzing code changes to generate an intelligent commit message',
description: 'generating a concise project description for package.json',
};
const alreadyLoadedFiles = loadedContent
? loadedContent.split('\n======').slice(1).map(section => {
const match = section.match(/START OF FILE (.+?) ======/);
return match ? match[1] : '';
}).filter(Boolean)
: [];
const availableFiles = metadata
.filter(m => !alreadyLoadedFiles.includes(m.relativePath))
.map(m => {
const analysis = analyzedFiles.find(a => a.path === m.path);
return `- ${m.relativePath} (${m.size} bytes, ~${m.estimatedTokens} tokens${analysis ? `, importance: ${analysis.importanceScore.toFixed(2)}` : ''})`;
})
.join('\n');
return `You are building context for ${taskDescriptions[taskType]} in a TypeScript project.
ITERATION: ${iteration}
TOKENS USED: ${tokensUsed}/${tokensUsed + remainingBudget} (${Math.round((tokensUsed / (tokensUsed + remainingBudget)) * 100)}%)
REMAINING BUDGET: ${remainingBudget} tokens
${alreadyLoadedFiles.length > 0 ? `FILES ALREADY LOADED:\n${alreadyLoadedFiles.map(f => `- ${f}`).join('\n')}\n\n` : ''}AVAILABLE FILES (not yet loaded):
${availableFiles}
Your task: Select up to ${fileLimit} files that will give you the MOST understanding for this ${taskType} task.
${iteration === 1 ? `This is the FIRST iteration. Focus on:
- Main entry points (index.ts, main exports)
- Core classes and interfaces
- Package configuration
` : `This is iteration ${iteration}. You've already seen some files. Now focus on:
- Files that complement what you've already loaded
- Dependencies of already-loaded files
- Missing pieces for complete understanding
`}
Consider:
1. File importance scores (if provided)
2. File paths (ts/index.ts is likely more important than ts/internal/utils.ts)
3. Token efficiency (prefer smaller files if they provide good information)
4. Remaining budget (${remainingBudget} tokens)
Respond in JSON format:
{
"reasoning": "Brief explanation of why you're selecting these files",
"files_to_load": ["path/to/file1.ts", "path/to/file2.ts"],
"estimated_tokens_needed": 15000
}`;
}
/**
* Evaluate if current context is sufficient
*/
private async evaluateContextSufficiency(
loadedContent: string,
taskType: TaskType,
iteration: number,
tokensUsed: number,
remainingBudget: number
): Promise<IContextSufficiencyDecision> {
const prompt = `You have been building context for a ${taskType} task across ${iteration} iterations.
CURRENT STATE:
- Tokens used: ${tokensUsed}
- Remaining budget: ${remainingBudget}
- Files loaded: ${loadedContent.split('\n======').length - 1}
CONTEXT SO FAR:
${loadedContent.substring(0, 3000)}... (truncated for brevity)
Question: Do you have SUFFICIENT context to successfully complete the ${taskType} task?
Consider:
- For README: Do you understand the project's purpose, main features, API surface, and usage patterns?
- For commit: Do you understand what changed and why?
- For description: Do you understand the project's core value proposition?
Respond in JSON format:
{
"sufficient": true or false,
"reasoning": "Detailed explanation of your decision"
}`;
const response = await this.openaiInstance.chat({
systemMessage: `You are an AI assistant that evaluates whether gathered context is sufficient for a task.
You must respond ONLY with valid JSON that can be parsed with JSON.parse().
Do not wrap the JSON in markdown code blocks or add any other text.`,
userMessage: prompt,
messageHistory: [],
});
// Parse JSON response, handling potential markdown formatting
const content = response.message.replace('```json', '').replace('```', '').trim();
const parsed = JSON.parse(content);
return {
sufficient: parsed.sufficient || false,
reasoning: parsed.reasoning || 'No reasoning provided',
};
}
/**
* Load a single file with caching
*/
private async loadFile(filePath: string): Promise<IFileInfo> {
// Try cache first
const cached = await this.cache.get(filePath);
if (cached) {
return {
path: filePath,
relativePath: plugins.path.relative(this.projectRoot, filePath),
contents: cached.contents,
tokenCount: cached.tokenCount,
};
}
// Load from disk
const contents = await plugins.smartfile.fs.toStringSync(filePath);
const tokenCount = this.countTokens(contents);
const relativePath = plugins.path.relative(this.projectRoot, filePath);
// Cache it
const stats = await fs.promises.stat(filePath);
await this.cache.set({
path: filePath,
contents,
tokenCount,
mtime: Math.floor(stats.mtimeMs),
cachedAt: Date.now(),
});
return {
path: filePath,
relativePath,
contents,
tokenCount,
};
}
/**
* Format a file for inclusion in context
*/
private formatFileForContext(file: IFileInfo): string {
return `
====== START OF FILE ${file.relativePath} ======
${file.contents}
====== END OF FILE ${file.relativePath} ======
`;
}
/**
* Count tokens in text
*/
private countTokens(text: string): number {
try {
const tokens = plugins.gptTokenizer.encode(text);
return tokens.length;
} catch (error) {
return Math.ceil(text.length / 4);
}
}
}

View File

@@ -1,22 +1,25 @@
import * as plugins from '../plugins.js';
import { EnhancedContext } from './enhanced-context.js';
import { IterativeContextBuilder } from './iterative-context-builder.js';
import { ConfigManager } from './config-manager.js';
import type { IContextResult, TaskType } from './types.js';
import type { IIterativeContextResult, TaskType } from './types.js';
/**
* Factory class for creating task-specific context
* Factory class for creating task-specific context using iterative context building
*/
export class TaskContextFactory {
private projectDir: string;
private configManager: ConfigManager;
private openaiInstance?: any; // OpenAI provider instance
/**
* Create a new TaskContextFactory
* @param projectDirArg The project directory
* @param openaiInstance Optional pre-configured OpenAI provider instance
*/
constructor(projectDirArg: string) {
constructor(projectDirArg: string, openaiInstance?: any) {
this.projectDir = projectDirArg;
this.configManager = ConfigManager.getInstance();
this.openaiInstance = openaiInstance;
}
/**
@@ -29,71 +32,52 @@ export class TaskContextFactory {
/**
* Create context for README generation
*/
public async createContextForReadme(): Promise<IContextResult> {
const contextBuilder = new EnhancedContext(this.projectDir);
await contextBuilder.initialize();
// Get README-specific configuration
const taskConfig = this.configManager.getTaskConfig('readme');
if (taskConfig.mode) {
contextBuilder.setContextMode(taskConfig.mode);
}
// Build the context for README task
return await contextBuilder.buildContext('readme');
public async createContextForReadme(): Promise<IIterativeContextResult> {
const iterativeBuilder = new IterativeContextBuilder(
this.projectDir,
this.configManager.getIterativeConfig(),
this.openaiInstance
);
await iterativeBuilder.initialize();
return await iterativeBuilder.buildContextIteratively('readme');
}
/**
* Create context for description generation
*/
public async createContextForDescription(): Promise<IContextResult> {
const contextBuilder = new EnhancedContext(this.projectDir);
await contextBuilder.initialize();
// Get description-specific configuration
const taskConfig = this.configManager.getTaskConfig('description');
if (taskConfig.mode) {
contextBuilder.setContextMode(taskConfig.mode);
}
// Build the context for description task
return await contextBuilder.buildContext('description');
public async createContextForDescription(): Promise<IIterativeContextResult> {
const iterativeBuilder = new IterativeContextBuilder(
this.projectDir,
this.configManager.getIterativeConfig(),
this.openaiInstance
);
await iterativeBuilder.initialize();
return await iterativeBuilder.buildContextIteratively('description');
}
/**
* Create context for commit message generation
* @param gitDiff Optional git diff to include
* @param gitDiff Optional git diff to include in the context
*/
public async createContextForCommit(gitDiff?: string): Promise<IContextResult> {
const contextBuilder = new EnhancedContext(this.projectDir);
await contextBuilder.initialize();
// Get commit-specific configuration
const taskConfig = this.configManager.getTaskConfig('commit');
if (taskConfig.mode) {
contextBuilder.setContextMode(taskConfig.mode);
}
// Build the context for commit task
const contextResult = await contextBuilder.buildContext('commit');
// If git diff is provided, add it to the context
if (gitDiff) {
contextBuilder.updateWithGitDiff(gitDiff);
}
return contextBuilder.getContextResult();
public async createContextForCommit(gitDiff?: string): Promise<IIterativeContextResult> {
const iterativeBuilder = new IterativeContextBuilder(
this.projectDir,
this.configManager.getIterativeConfig(),
this.openaiInstance
);
await iterativeBuilder.initialize();
return await iterativeBuilder.buildContextIteratively('commit', gitDiff);
}
/**
* Create context for any task type
* @param taskType The task type to create context for
* @param additionalContent Optional additional content to include
* @param additionalContent Optional additional content (currently not used)
*/
public async createContextForTask(
taskType: TaskType,
additionalContent?: string
): Promise<IContextResult> {
): Promise<IIterativeContextResult> {
switch (taskType) {
case 'readme':
return this.createContextForReadme();
@@ -102,10 +86,8 @@ export class TaskContextFactory {
case 'commit':
return this.createContextForCommit(additionalContent);
default:
// Generic context for unknown task types
const contextBuilder = new EnhancedContext(this.projectDir);
await contextBuilder.initialize();
return await contextBuilder.buildContext();
// Default to readme for unknown task types
return this.createContextForReadme();
}
}

View File

@@ -66,6 +66,8 @@ export interface IContextConfig {
prioritization?: IPrioritizationWeights;
/** Tier configuration for adaptive trimming */
tiers?: ITierConfig;
/** Iterative context building configuration */
iterative?: IIterativeConfig;
}
/**
@@ -245,3 +247,78 @@ export interface IAnalysisResult {
/** Analysis duration in ms */
analysisDuration: number;
}
/**
* Configuration for iterative context building
*/
export interface IIterativeConfig {
/** Maximum number of iterations allowed */
maxIterations?: number;
/** Maximum files to request in first iteration */
firstPassFileLimit?: number;
/** Maximum files to request in subsequent iterations */
subsequentPassFileLimit?: number;
/** Temperature for AI decision making (0-1) */
temperature?: number;
/** Model to use for iterative decisions */
model?: string;
}
/**
* AI decision for file selection
*/
export interface IFileSelectionDecision {
/** AI's reasoning for file selection */
reasoning: string;
/** File paths to load */
filesToLoad: string[];
/** Estimated tokens needed */
estimatedTokensNeeded?: number;
}
/**
* AI decision for context sufficiency
*/
export interface IContextSufficiencyDecision {
/** Whether context is sufficient */
sufficient: boolean;
/** AI's reasoning */
reasoning: string;
/** Additional files needed (if not sufficient) */
additionalFilesNeeded?: string[];
}
/**
* State for a single iteration
*/
export interface IIterationState {
/** Iteration number (1-based) */
iteration: number;
/** Files loaded in this iteration */
filesLoaded: IFileInfo[];
/** Tokens used in this iteration */
tokensUsed: number;
/** Total tokens used so far */
totalTokensUsed: number;
/** AI decision made in this iteration */
decision: IFileSelectionDecision | IContextSufficiencyDecision;
/** Duration of this iteration in ms */
duration: number;
}
/**
* Result of iterative context building
*/
export interface IIterativeContextResult extends IContextResult {
/** Number of iterations performed */
iterationCount: number;
/** Details of each iteration */
iterations: IIterationState[];
/** Total API calls made */
apiCallCount: number;
/** Total duration in ms */
totalDuration: number;
}
// Export DiffProcessor types
export type { IDiffFileInfo, IProcessedDiff, IDiffProcessorOptions } from './diff-processor.js';