feat(docs): document Dual-Agent Driver/Guardian architecture, new standard tools, streaming/vision support, progress events, and updated API/export docs

This commit is contained in:
2026-01-20 01:30:26 +00:00
parent 60f8bbe1b6
commit 52d1d128c7
4 changed files with 188 additions and 15 deletions

View File

@@ -1,5 +1,18 @@
# Changelog
## 2026-01-20 - 1.4.0 - feat(docs)
document Dual-Agent Driver/Guardian architecture, new standard tools, streaming/vision support, progress events, and updated API/export docs
- Add DualAgentOrchestrator concept and describe Driver/Guardian agents and BaseToolWrapper
- Document six standard tools including new JsonValidatorTool and expanded descriptions for Filesystem, Http, Shell, Browser, Deno
- Add examples for scoped filesystem with exclusion patterns and line-range reads
- Add token streaming (onToken) and progress events (onProgress) examples and event types
- Document vision support for passing images as base64 and example usage
- Expose additional config options in docs: name, verbose, maxResultChars, maxHistoryMessages, onProgress, onToken, logPrefix
- Document additional result fields: toolCallCount, rejectionCount, toolLog, and error
- Update API signatures in docs: run(task, options?) and registerScopedFilesystemTool(basePath, excludePatterns?)
- Update re-exports to include IFilesystemToolOptions, TDenoPermission, JsonValidatorTool and re-export several smartai types
## 2026-01-20 - 1.3.0 - feat(smartagent)
add JsonValidatorTool and support passing base64-encoded images with task runs (vision-capable models); bump @push.rocks/smartai to ^0.12.0

View File

@@ -1,15 +1,39 @@
# Project Readme Hints
## Overview
`@push.rocks/smartagent` is an agentic framework built on top of `@push.rocks/smartai`. It provides autonomous AI agent capabilities including tool use, multi-step reasoning, and conversation memory.
`@push.rocks/smartagent` is a dual-agent agentic framework built on top of `@push.rocks/smartai`. It implements a Driver/Guardian architecture where the Driver proposes tool calls and the Guardian evaluates them against security policies.
## Architecture
- **SmartAgent**: Main class that wraps SmartAi and adds agentic behaviors
- **plugins.ts**: Imports and re-exports smartai
- **index.ts**: Main entry point, exports SmartAgent class and relevant types
- **DualAgentOrchestrator**: Main entry point, coordinates Driver and Guardian agents
- **DriverAgent**: Reasons about tasks, plans steps, proposes tool calls
- **GuardianAgent**: Evaluates tool calls against configured policies
- **BaseToolWrapper**: Base class for creating custom tools
- **plugins.ts**: Imports and re-exports smartai and other dependencies
## Standard Tools
1. **FilesystemTool** - File operations with scoping and exclusion patterns
2. **HttpTool** - HTTP requests
3. **ShellTool** - Secure shell commands (no injection possible)
4. **BrowserTool** - Web page interaction via Puppeteer
5. **DenoTool** - Sandboxed TypeScript/JavaScript execution
6. **JsonValidatorTool** - JSON validation and formatting
## Key Features
- Token streaming support (`onToken` callback)
- Vision support (pass images as base64)
- Progress events (`onProgress` callback)
- Scoped filesystem with exclusion patterns
- Result truncation with configurable limits
- History windowing to manage token usage
## Key Dependencies
- `@push.rocks/smartai`: Provides the underlying multi-modal AI provider interface
- `@push.rocks/smartai`: Multi-provider AI interface
- `@push.rocks/smartfs`: Filesystem operations
- `@push.rocks/smartshell`: Shell command execution
- `@push.rocks/smartbrowser`: Browser automation
- `@push.rocks/smartdeno`: Deno code execution
- `@push.rocks/smartrequest`: HTTP requests
- `minimatch`: Glob pattern matching for exclusions
## Test Structure
- Tests use `@git.zone/tstest/tapbundle`

154
readme.md
View File

@@ -50,6 +50,7 @@ flowchart TB
Shell["Shell"]
Browser["Browser"]
Deno["Deno"]
JSON["JSON Validator"]
end
Task --> Orchestrator
@@ -99,7 +100,7 @@ await orchestrator.stop();
## Standard Tools
SmartAgent comes with five battle-tested tools out of the box:
SmartAgent comes with six battle-tested tools out of the box:
### 🗂️ FilesystemTool
@@ -117,11 +118,29 @@ File and directory operations powered by `@push.rocks/smartfs`.
</tool_call>
```
**Scoped Filesystem**: Lock file operations to a specific directory:
**Scoped Filesystem**: Lock file operations to a specific directory with optional exclusion patterns:
```typescript
// Only allow access within a specific directory
orchestrator.registerScopedFilesystemTool('/home/user/workspace');
// With exclusion patterns (glob syntax)
orchestrator.registerScopedFilesystemTool('/home/user/workspace', [
'.nogit/**',
'node_modules/**',
'*.secret',
]);
```
**Line-range Reading**: Read specific portions of large files:
```typescript
<tool_call>
<tool>filesystem</tool>
<action>read</action>
<params>{"path": "/var/log/app.log", "startLine": 100, "endLine": 150}</params>
<reasoning>Reading only the relevant log section to avoid token overload</reasoning>
</tool_call>
```
### 🌐 HttpTool
@@ -212,6 +231,105 @@ By default, code runs **fully sandboxed with no permissions**. Permissions must
</tool_call>
```
### 📋 JsonValidatorTool
Validate and format JSON data. Perfect for agents to self-check their JSON output before completing tasks.
**Actions**: `validate`, `format`
```typescript
// Validate JSON with required field checking
<tool_call>
<tool>json</tool>
<action>validate</action>
<params>{
"jsonString": "{\"name\": \"test\", \"version\": \"1.0.0\"}",
"requiredFields": ["name", "version", "description"]
}</params>
<reasoning>Ensuring the config has all required fields before saving</reasoning>
</tool_call>
// Pretty-print JSON
<tool_call>
<tool>json</tool>
<action>format</action>
<params>{"jsonString": "{\"compact\":true,\"data\":[1,2,3]}"}</params>
<reasoning>Formatting JSON for readable output</reasoning>
</tool_call>
```
## 🎥 Streaming Support
SmartAgent supports token-by-token streaming for real-time output during LLM generation:
```typescript
const orchestrator = new DualAgentOrchestrator({
openaiToken: 'sk-...',
defaultProvider: 'openai',
guardianPolicyPrompt: '...',
// Token streaming callback
onToken: (token, source) => {
// source is 'driver' or 'guardian'
process.stdout.write(token);
},
});
```
This is perfect for CLI applications or UIs that need to show progress as the agent thinks.
## 🖼️ Vision Support
Pass images to vision-capable models for multimodal tasks:
```typescript
import { readFileSync } from 'fs';
// Load image as base64
const imageBase64 = readFileSync('screenshot.png').toString('base64');
// Run task with images
const result = await orchestrator.run(
'Analyze this UI screenshot and describe any usability issues',
{ images: [imageBase64] }
);
```
## 📊 Progress Events
Get real-time feedback on task execution with the `onProgress` callback:
```typescript
const orchestrator = new DualAgentOrchestrator({
openaiToken: 'sk-...',
guardianPolicyPrompt: '...',
logPrefix: '[MyAgent]', // Optional prefix for log messages
onProgress: (event) => {
// Pre-formatted log message ready for output
console.log(event.logMessage);
// Or handle specific event types
switch (event.type) {
case 'tool_proposed':
console.log(`Proposing: ${event.toolName}.${event.action}`);
break;
case 'tool_approved':
console.log(`✓ Approved`);
break;
case 'tool_rejected':
console.log(`✗ Rejected: ${event.reason}`);
break;
case 'task_completed':
console.log(`Done in ${event.iteration} iterations`);
break;
}
},
});
```
**Event Types**: `task_started`, `iteration_started`, `tool_proposed`, `guardian_evaluating`, `tool_approved`, `tool_rejected`, `tool_executing`, `tool_completed`, `task_completed`, `clarification_needed`, `max_iterations`, `max_rejections`
## Guardian Policy Examples
The Guardian's power comes from your policy. Here are battle-tested examples:
@@ -294,10 +412,19 @@ interface IDualAgentOptions {
// Agent configuration
driverSystemMessage?: string; // Custom system message for Driver
guardianPolicyPrompt: string; // REQUIRED: Policy for Guardian to enforce
name?: string; // Agent system name
verbose?: boolean; // Enable verbose logging
// Limits
maxIterations?: number; // Max task iterations (default: 20)
maxConsecutiveRejections?: number; // Abort after N rejections (default: 3)
maxResultChars?: number; // Max chars for tool results before truncation (default: 15000)
maxHistoryMessages?: number; // Max history messages for API (default: 20)
// Callbacks
onProgress?: (event: IProgressEvent) => void; // Progress event callback
onToken?: (token: string, source: 'driver' | 'guardian') => void; // Streaming callback
logPrefix?: string; // Prefix for log messages
}
```
@@ -311,6 +438,10 @@ interface IDualAgentRunResult {
iterations: number; // Number of iterations taken
history: IAgentMessage[]; // Full conversation history
status: TDualAgentRunStatus; // 'completed' | 'max_iterations_reached' | etc.
toolCallCount?: number; // Number of tool calls made
rejectionCount?: number; // Number of Guardian rejections
toolLog?: IToolExecutionLog[]; // Detailed tool execution log
error?: string; // Error message if status is 'error'
}
type TDualAgentRunStatus =
@@ -365,6 +496,7 @@ class MyCustomTool extends BaseToolWrapper {
return {
success: true,
result: { processed: params.input },
summary: `Processed input: ${params.input}`, // Optional human-readable summary
};
}
@@ -439,11 +571,11 @@ const orchestrator = new DualAgentOrchestrator({
|--------|-------------|
| `start()` | Initialize all tools and AI providers |
| `stop()` | Cleanup all tools and resources |
| `run(task: string)` | Execute a task and return result |
| `continueTask(input: string)` | Continue a task with user input |
| `run(task, options?)` | Execute a task with optional images for vision |
| `continueTask(input)` | Continue a task with user input |
| `registerTool(tool)` | Register a custom tool |
| `registerStandardTools()` | Register all built-in tools |
| `registerScopedFilesystemTool(basePath)` | Register filesystem tool with path restriction |
| `registerScopedFilesystemTool(basePath, excludePatterns?)` | Register filesystem tool with path restriction |
| `setGuardianPolicy(policy)` | Update Guardian policy at runtime |
| `getHistory()` | Get conversation history |
| `getToolNames()` | Get list of registered tool names |
@@ -459,14 +591,18 @@ export { GuardianAgent } from '@push.rocks/smartagent';
// Tools
export { BaseToolWrapper } from '@push.rocks/smartagent';
export { FilesystemTool } from '@push.rocks/smartagent';
export { FilesystemTool, type IFilesystemToolOptions } from '@push.rocks/smartagent';
export { HttpTool } from '@push.rocks/smartagent';
export { ShellTool } from '@push.rocks/smartagent';
export { BrowserTool } from '@push.rocks/smartagent';
export { DenoTool } from '@push.rocks/smartagent';
export { DenoTool, type TDenoPermission } from '@push.rocks/smartagent';
export { JsonValidatorTool } from '@push.rocks/smartagent';
// Types and interfaces
export * from '@push.rocks/smartagent'; // All interfaces
export * from '@push.rocks/smartagent'; // All interfaces
// Re-exported from @push.rocks/smartai
export { type ISmartAiOptions, type TProvider, type ChatMessage, type ChatOptions, type ChatResponse };
```
## License and Legal Information
@@ -483,7 +619,7 @@ Use of these trademarks must comply with Task Venture Capital GmbH's Trademark G
### Company Information
Task Venture Capital GmbH
Task Venture Capital GmbH
Registered at District Court Bremen HRB 35230 HB, Germany
For any legal inquiries or further information, please contact us via email at hello@task.vc.

View File

@@ -3,6 +3,6 @@
*/
export const commitinfo = {
name: '@push.rocks/smartagent',
version: '1.3.0',
version: '1.4.0',
description: 'an agentic framework built on top of @push.rocks/smartai'
}