diff --git a/changelog.md b/changelog.md
index 5d9440c..2bea5ce 100644
--- a/changelog.md
+++ b/changelog.md
@@ -1,5 +1,18 @@
 # Changelog
 
+## 2026-01-20 - 1.4.0 - feat(docs)
+document Dual-Agent Driver/Guardian architecture, new standard tools, streaming/vision support, progress events, and updated API/export docs
+
+- Add DualAgentOrchestrator concept and describe Driver/Guardian agents and BaseToolWrapper
+- Document six standard tools including new JsonValidatorTool and expanded descriptions for Filesystem, Http, Shell, Browser, Deno
+- Add examples for scoped filesystem with exclusion patterns and line-range reads
+- Add token streaming (onToken) and progress events (onProgress) examples and event types
+- Document vision support for passing images as base64 and example usage
+- Expose additional config options in docs: name, verbose, maxResultChars, maxHistoryMessages, onProgress, onToken, logPrefix
+- Document additional result fields: toolCallCount, rejectionCount, toolLog, and error
+- Update API signatures in docs: run(task, options?) and registerScopedFilesystemTool(basePath, excludePatterns?)
+- Update re-exports to include IFilesystemToolOptions, TDenoPermission, JsonValidatorTool and re-export several smartai types
+
 ## 2026-01-20 - 1.3.0 - feat(smartagent)
 add JsonValidatorTool and support passing base64-encoded images with task runs (vision-capable models); bump @push.rocks/smartai to ^0.12.0
 
diff --git a/readme.hints.md b/readme.hints.md
index b886c16..75ec482 100644
--- a/readme.hints.md
+++ b/readme.hints.md
@@ -1,15 +1,39 @@
 # Project Readme Hints
 
 ## Overview
-`@push.rocks/smartagent` is an agentic framework built on top of `@push.rocks/smartai`. It provides autonomous AI agent capabilities including tool use, multi-step reasoning, and conversation memory.
+`@push.rocks/smartagent` is a dual-agent agentic framework built on top of `@push.rocks/smartai`. It implements a Driver/Guardian architecture where the Driver proposes tool calls and the Guardian evaluates them against security policies.
 
 ## Architecture
-- **SmartAgent**: Main class that wraps SmartAi and adds agentic behaviors
-- **plugins.ts**: Imports and re-exports smartai
-- **index.ts**: Main entry point, exports SmartAgent class and relevant types
+- **DualAgentOrchestrator**: Main entry point, coordinates Driver and Guardian agents
+- **DriverAgent**: Reasons about tasks, plans steps, proposes tool calls
+- **GuardianAgent**: Evaluates tool calls against configured policies
+- **BaseToolWrapper**: Base class for creating custom tools
+- **plugins.ts**: Imports and re-exports smartai and other dependencies
+
+## Standard Tools
+1. **FilesystemTool** - File operations with scoping and exclusion patterns
+2. **HttpTool** - HTTP requests
+3. **ShellTool** - Secure shell commands (no injection possible)
+4. **BrowserTool** - Web page interaction via Puppeteer
+5. **DenoTool** - Sandboxed TypeScript/JavaScript execution
+6. **JsonValidatorTool** - JSON validation and formatting
+
+## Key Features
+- Token streaming support (`onToken` callback)
+- Vision support (pass images as base64)
+- Progress events (`onProgress` callback)
+- Scoped filesystem with exclusion patterns
+- Result truncation with configurable limits
+- History windowing to manage token usage
 
 ## Key Dependencies
-- `@push.rocks/smartai`: Provides the underlying multi-modal AI provider interface
+- `@push.rocks/smartai`: Multi-provider AI interface
+- `@push.rocks/smartfs`: Filesystem operations
+- `@push.rocks/smartshell`: Shell command execution
+- `@push.rocks/smartbrowser`: Browser automation
+- `@push.rocks/smartdeno`: Deno code execution
+- `@push.rocks/smartrequest`: HTTP requests
+- `minimatch`: Glob pattern matching for exclusions
 
 ## Test Structure
 - Tests use `@git.zone/tstest/tapbundle`
diff --git a/readme.md b/readme.md
index 9539a39..635e78e 100644
--- a/readme.md
+++ b/readme.md
@@ -50,6 +50,7 @@ flowchart TB
         Shell["Shell"]
         Browser["Browser"]
         Deno["Deno"]
+        JSON["JSON Validator"]
     end
 
     Task --> Orchestrator
@@ -99,7 +100,7 @@ await orchestrator.stop();
 
 ## Standard Tools
 
-SmartAgent comes with five battle-tested tools out of the box:
+SmartAgent comes with six battle-tested tools out of the box:
 
 ### 🗂️ FilesystemTool
 
@@ -117,11 +118,29 @@ File and directory operations powered by `@push.rocks/smartfs`.
 ```
 
-**Scoped Filesystem**: Lock file operations to a specific directory:
+**Scoped Filesystem**: Lock file operations to a specific directory with optional exclusion patterns:
 
 ```typescript
 // Only allow access within a specific directory
 orchestrator.registerScopedFilesystemTool('/home/user/workspace');
+
+// With exclusion patterns (glob syntax)
+orchestrator.registerScopedFilesystemTool('/home/user/workspace', [
+  '.nogit/**',
+  'node_modules/**',
+  '*.secret',
+]);
+```
+
+**Line-range Reading**: Read specific portions of large files:
+
+```typescript
+
+  filesystem
+  read
+  {"path": "/var/log/app.log", "startLine": 100, "endLine": 150}
+  Reading only the relevant log section to avoid token overload
+
 ```
 
 ### 🌐 HttpTool
@@ -212,6 +231,105 @@ By default, code runs **fully sandboxed with no permissions**. Permissions must
 ```
 
+### 📋 JsonValidatorTool
+
+Validate and format JSON data. Perfect for agents to self-check their JSON output before completing tasks.
+
+**Actions**: `validate`, `format`
+
+```typescript
+// Validate JSON with required field checking
+
+  json
+  validate
+  {
+    "jsonString": "{\"name\": \"test\", \"version\": \"1.0.0\"}",
+    "requiredFields": ["name", "version", "description"]
+  }
+  Ensuring the config has all required fields before saving
+
+
+// Pretty-print JSON
+
+  json
+  format
+  {"jsonString": "{\"compact\":true,\"data\":[1,2,3]}"}
+  Formatting JSON for readable output
+
+```
+
+## 🎥 Streaming Support
+
+SmartAgent supports token-by-token streaming for real-time output during LLM generation:
+
+```typescript
+const orchestrator = new DualAgentOrchestrator({
+  openaiToken: 'sk-...',
+  defaultProvider: 'openai',
+  guardianPolicyPrompt: '...',
+
+  // Token streaming callback
+  onToken: (token, source) => {
+    // source is 'driver' or 'guardian'
+    process.stdout.write(token);
+  },
+});
+```
+
+This is perfect for CLI applications or UIs that need to show progress as the agent thinks.
+
+## 🖼️ Vision Support
+
+Pass images to vision-capable models for multimodal tasks:
+
+```typescript
+import { readFileSync } from 'fs';
+
+// Load image as base64
+const imageBase64 = readFileSync('screenshot.png').toString('base64');
+
+// Run task with images
+const result = await orchestrator.run(
+  'Analyze this UI screenshot and describe any usability issues',
+  { images: [imageBase64] }
+);
+```
+
+## 📊 Progress Events
+
+Get real-time feedback on task execution with the `onProgress` callback:
+
+```typescript
+const orchestrator = new DualAgentOrchestrator({
+  openaiToken: 'sk-...',
+  guardianPolicyPrompt: '...',
+  logPrefix: '[MyAgent]', // Optional prefix for log messages
+
+  onProgress: (event) => {
+    // Pre-formatted log message ready for output
+    console.log(event.logMessage);
+
+    // Or handle specific event types
+    switch (event.type) {
+      case 'tool_proposed':
+        console.log(`Proposing: ${event.toolName}.${event.action}`);
+        break;
+      case 'tool_approved':
+        console.log(`✓ Approved`);
+        break;
+      case 'tool_rejected':
+        console.log(`✗ Rejected: ${event.reason}`);
+        break;
+      case 'task_completed':
+        console.log(`Done in ${event.iteration} iterations`);
+        break;
+    }
+  },
+});
+```
+
+**Event Types**: `task_started`, `iteration_started`, `tool_proposed`, `guardian_evaluating`, `tool_approved`, `tool_rejected`, `tool_executing`, `tool_completed`, `task_completed`, `clarification_needed`, `max_iterations`, `max_rejections`
+
 ## Guardian Policy Examples
 
 The Guardian's power comes from your policy. Here are battle-tested examples:
@@ -294,10 +412,19 @@ interface IDualAgentOptions {
   // Agent configuration
   driverSystemMessage?: string; // Custom system message for Driver
   guardianPolicyPrompt: string; // REQUIRED: Policy for Guardian to enforce
+  name?: string; // Agent system name
+  verbose?: boolean; // Enable verbose logging
 
   // Limits
   maxIterations?: number; // Max task iterations (default: 20)
   maxConsecutiveRejections?: number; // Abort after N rejections (default: 3)
+  maxResultChars?: number; // Max chars for tool results before truncation (default: 15000)
+  maxHistoryMessages?: number; // Max history messages for API (default: 20)
+
+  // Callbacks
+  onProgress?: (event: IProgressEvent) => void; // Progress event callback
+  onToken?: (token: string, source: 'driver' | 'guardian') => void; // Streaming callback
+  logPrefix?: string; // Prefix for log messages
 }
 ```
 
@@ -311,6 +438,10 @@ interface IDualAgentRunResult {
   iterations: number; // Number of iterations taken
   history: IAgentMessage[]; // Full conversation history
   status: TDualAgentRunStatus; // 'completed' | 'max_iterations_reached' | etc.
+  toolCallCount?: number; // Number of tool calls made
+  rejectionCount?: number; // Number of Guardian rejections
+  toolLog?: IToolExecutionLog[]; // Detailed tool execution log
+  error?: string; // Error message if status is 'error'
 }
 
 type TDualAgentRunStatus =
@@ -365,6 +496,7 @@ class MyCustomTool extends BaseToolWrapper {
     return {
       success: true,
       result: { processed: params.input },
+      summary: `Processed input: ${params.input}`, // Optional human-readable summary
     };
   }
 
@@ -439,11 +571,11 @@ const orchestrator = new DualAgentOrchestrator({
 |--------|-------------|
 | `start()` | Initialize all tools and AI providers |
 | `stop()` | Cleanup all tools and resources |
-| `run(task: string)` | Execute a task and return result |
-| `continueTask(input: string)` | Continue a task with user input |
+| `run(task, options?)` | Execute a task with optional images for vision |
+| `continueTask(input)` | Continue a task with user input |
 | `registerTool(tool)` | Register a custom tool |
 | `registerStandardTools()` | Register all built-in tools |
-| `registerScopedFilesystemTool(basePath)` | Register filesystem tool with path restriction |
+| `registerScopedFilesystemTool(basePath, excludePatterns?)` | Register filesystem tool with path restriction |
 | `setGuardianPolicy(policy)` | Update Guardian policy at runtime |
 | `getHistory()` | Get conversation history |
 | `getToolNames()` | Get list of registered tool names |
@@ -459,14 +591,18 @@ export { GuardianAgent } from '@push.rocks/smartagent';
 
 // Tools
 export { BaseToolWrapper } from '@push.rocks/smartagent';
-export { FilesystemTool } from '@push.rocks/smartagent';
+export { FilesystemTool, type IFilesystemToolOptions } from '@push.rocks/smartagent';
 export { HttpTool } from '@push.rocks/smartagent';
 export { ShellTool } from '@push.rocks/smartagent';
 export { BrowserTool } from '@push.rocks/smartagent';
-export { DenoTool } from '@push.rocks/smartagent';
+export { DenoTool, type TDenoPermission } from '@push.rocks/smartagent';
+export { JsonValidatorTool } from '@push.rocks/smartagent';
 
 // Types and interfaces
-export * from '@push.rocks/smartagent'; // All interfaces
+export * from '@push.rocks/smartagent'; // All interfaces
+
+// Re-exported from @push.rocks/smartai
+export { type ISmartAiOptions, type TProvider, type ChatMessage, type ChatOptions, type ChatResponse };
 ```
 
 ## License and Legal Information
@@ -483,7 +619,7 @@ Use of these trademarks must comply with Task Venture Capital GmbH's Trademark G
 
 ### Company Information
 
-Task Venture Capital GmbH
+Task Venture Capital GmbH
 Registered at District Court Bremen HRB 35230 HB, Germany
 
 For any legal inquiries or further information, please contact us via email at hello@task.vc.
diff --git a/ts/00_commitinfo_data.ts b/ts/00_commitinfo_data.ts
index a714aae..75ed38f 100644
--- a/ts/00_commitinfo_data.ts
+++ b/ts/00_commitinfo_data.ts
@@ -3,6 +3,6 @@
  */
 export const commitinfo = {
   name: '@push.rocks/smartagent',
-  version: '1.3.0',
+  version: '1.4.0',
   description: 'an agentic framework built on top of @push.rocks/smartai'
 }
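
Reviewer note: the `onToken` and `onProgress` callbacks documented in the readme hunks can be exercised without a live provider. The following is a minimal sketch, not part of the patch. It assumes only the callback shapes shown in the diff (`(token, source)` with `source` being `'driver'` or `'guardian'`, and an event object carrying `type` and `logMessage`). The names `TTokenSource`, `IProgressEventSketch`, and `createRunRecorder` are hypothetical local stand-ins, not exports of `@push.rocks/smartagent`.

```typescript
// Local stand-ins for the documented callback shapes. The real types ship with
// @push.rocks/smartagent and may carry more fields than are assumed here.
type TTokenSource = 'driver' | 'guardian';

interface IProgressEventSketch {
  type: string; // e.g. 'tool_proposed', 'tool_rejected', 'task_completed'
  logMessage: string; // pre-formatted message, as used in the readme example
  toolName?: string;
  action?: string;
  reason?: string;
}

// Hypothetical helper: buffers streamed tokens per source and counts progress
// events by type so a caller can inspect them after a run.
function createRunRecorder() {
  const tokens: Record<TTokenSource, string[]> = { driver: [], guardian: [] };
  const eventCounts = new Map<string, number>();

  return {
    // Matches the documented onToken signature: (token, source) => void
    onToken: (token: string, source: TTokenSource): void => {
      tokens[source].push(token);
    },
    // Matches the documented onProgress signature: (event) => void
    onProgress: (event: IProgressEventSketch): void => {
      eventCounts.set(event.type, (eventCounts.get(event.type) ?? 0) + 1);
    },
    // Concatenated text streamed so far by one agent
    text: (source: TTokenSource): string => tokens[source].join(''),
    // Number of events of a given type seen so far
    count: (type: string): number => eventCounts.get(type) ?? 0,
  };
}
```

If the shapes hold, `recorder.onToken` and `recorder.onProgress` could be passed straight into the `DualAgentOrchestrator` options, and `recorder.count('tool_rejected')` could then be compared against the `rejectionCount` field of the run result as a sanity check.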