
Performance Best Practices

Optimize your OpenSIN agents and workflows for speed, cost, and resource efficiency.

Model Routing

Use the SmartModelRouter to automatically select the cheapest model capable of handling each task:

typescript
import { SmartModelRouter } from '@opensin/sdk'

const router = new SmartModelRouter({
  models: {
    trivial: 'gpt-4o-mini',
    simple: 'gpt-4o-mini',
    moderate: 'gpt-4o',
    complex: 'claude-sonnet-4-6',
    expert: 'claude-opus-4-6',
  },
})

// Automatically classifies task complexity and routes
const model = router.selectModel(prompt)
// "fix typo in README" → gpt-4o-mini (cheap)
// "refactor auth system" → claude-opus-4-6 (expert)

Routing alone can reduce LLM costs by 60-80% for typical workloads, because most requests never reach the expensive models.
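
To make the routing idea concrete, here is a minimal standalone sketch of tier-based model selection. It is not the SDK's classifier: the keyword rules, the `classify` helper, and the `simple` fallback are all assumptions made for illustration, and only the tier-to-model table comes from the example above.

```typescript
// Hypothetical sketch of tier-based routing; NOT the SmartModelRouter implementation.
type Tier = 'trivial' | 'simple' | 'moderate' | 'complex' | 'expert'

// Same tier-to-model mapping as the configuration shown above.
const models: Record<Tier, string> = {
  trivial: 'gpt-4o-mini',
  simple: 'gpt-4o-mini',
  moderate: 'gpt-4o',
  complex: 'claude-sonnet-4-6',
  expert: 'claude-opus-4-6',
}

// Illustrative keyword rules (an assumption), checked from most to least demanding.
const rules: Array<{ tier: Tier; pattern: RegExp }> = [
  { tier: 'expert', pattern: /\b(refactor|architect|migrate|redesign)\b/i },
  { tier: 'complex', pattern: /\b(implement|debug|optimi[sz]e)\b/i },
  { tier: 'moderate', pattern: /\b(add|update|write|test)\b/i },
  { tier: 'trivial', pattern: /\b(typo|rename|format)\b/i },
]

// Return the first tier whose pattern matches the prompt.
function classify(prompt: string): Tier {
  for (const { tier, pattern } of rules) {
    if (pattern.test(prompt)) return tier
  }
  return 'simple' // fallback when no rule matches
}

function selectModel(prompt: string): string {
  return models[classify(prompt)]
}

console.log(selectModel('fix typo in README'))   // gpt-4o-mini
console.log(selectModel('refactor auth system')) // claude-opus-4-6
```

A production classifier would likely weigh prompt length, attached context, and past failures rather than keywords alone; the point here is only that cheap tiers absorb the bulk of simple requests.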

Context Window Management

Large context windows are expensive. Use the ContextCompressor to keep token usage minimal:

typescript
import { ContextCompressor } from '@opensin/sdk'

const compressor = new ContextCompressor({
  maxTokens: 8000,
  strategy: 'sliding-window', // or 'summarize', 'relevance'
})

// Compress conversation history before sending to LLM
const compressed = compressor.compress(messages)

Strategies

| Strategy | Best For | Token Savings |
| --- | --- | --- |
| sliding-window | Long conversations | 40-60% |
| summarize | Research tasks | 50-70% |
| relevance | Code editing | 60-80% |
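
The sliding-window strategy can be sketched as follows. This is an illustrative standalone version, not the ContextCompressor internals: the `Message` shape, the four-characters-per-token estimate, and the choice to always keep system messages are assumptions.

```typescript
// Hypothetical sketch of a sliding-window compressor; not the SDK's code.
interface Message {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Rough estimate (about 4 characters per token). A real tokenizer is more accurate.
const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4)

function slidingWindow(messages: Message[], maxTokens: number): Message[] {
  // Assumption: system messages are always kept and count against the budget.
  const system = messages.filter((m) => m.role === 'system')
  const rest = messages.filter((m) => m.role !== 'system')
  let budget = maxTokens - system.reduce((sum, m) => sum + estimateTokens(m), 0)

  // Walk backwards from the newest message and keep whatever still fits.
  const kept: Message[] = []
  for (const m of [...rest].reverse()) {
    const cost = estimateTokens(m)
    if (cost > budget) break
    kept.unshift(m) // restore chronological order
    budget -= cost
  }
  return [...system, ...kept]
}
```

The window drops the oldest turns first, which suits long conversations where recent context matters most. The other two strategies trade extra work (an LLM summary or a relevance ranking) for higher savings.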

Parallel Tool Execution

Run independent tool calls concurrently instead of sequentially:

typescript
import { ParallelToolExecutor } from '@opensin/sdk'

const executor = new ParallelToolExecutor({
  maxWorkers: 8,
  timeout: 30_000,
})

// These three independent calls run in parallel (up to ~3x faster than running them sequentially)
const results = await executor.execute([
  { tool: 'read', args: { path: 'src/auth.ts' } },
  { tool: 'read', args: { path: 'src/db.ts' } },
  { tool: 'grep', args: { pattern: 'TODO', path: 'src/' } },
])

The executor uses path-scoped concurrency control to prevent conflicting writes to the same file.
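
One way to picture path-scoped concurrency control is a planner that groups calls into waves, where no wave contains a write that conflicts with another call on the same path. The sketch below is a simplified assumption, not the executor's actual algorithm: the `planWaves` helper is hypothetical, only `write` and `edit` are treated as mutating tools, and paths are compared by exact string (directory overlap is ignored).

```typescript
// Hypothetical sketch of path-scoped scheduling; not the ParallelToolExecutor internals.
interface ToolCall {
  tool: string
  args: { path: string }
}

// Assumption for this sketch: only these tools modify files.
const WRITE_TOOLS = new Set(['write', 'edit'])

// Place each call in the earliest wave that comes after every conflicting earlier call.
function planWaves(calls: ToolCall[]): ToolCall[][] {
  const waves: ToolCall[][] = []
  const lastWrite = new Map<string, number>() // path -> wave of the latest write
  const lastRead = new Map<string, number>()  // path -> wave of the latest read

  for (const call of calls) {
    const path = call.args.path
    const isWrite = WRITE_TOOLS.has(call.tool)

    // Every call must run after the latest write to its path.
    let wave = (lastWrite.get(path) ?? -1) + 1
    // A write must also run after the latest read of its path.
    if (isWrite) wave = Math.max(wave, (lastRead.get(path) ?? -1) + 1)

    const bucket = waves[wave] ?? (waves[wave] = [])
    bucket.push(call)
    if (isWrite) lastWrite.set(path, wave)
    else lastRead.set(path, wave)
  }
  return waves
}
```

Each wave can then be run with `Promise.all`, one wave after another. The three read-only calls in the example above touch different paths, so they would all land in the first wave.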

Caching

Session Caching

Persist session state to avoid re-reading files on every turn:

typescript
import { SessionManager } from '@opensin/sdk'

const session = new SessionManager({
  persistence: 'file',    // or 'memory', 'supabase'
  cacheReads: true,        // cache file reads within session
  ttl: 3600,               // cache TTL in seconds (1 hour)
})
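
The effect of `cacheReads` with a TTL can be illustrated with a small read-through cache. This is a sketch under stated assumptions, not the SessionManager implementation: the `ReadCache` class, its `load` callback, and the injectable clock are hypothetical names introduced for the example.

```typescript
// Hypothetical read-through cache with a TTL; not the SessionManager internals.
class ReadCache {
  private entries = new Map<string, { value: string; expiresAt: number }>()
  private ttlSeconds: number
  private now: () => number

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlSeconds: number, now: () => number = Date.now) {
    this.ttlSeconds = ttlSeconds
    this.now = now
  }

  // Return the cached value while it is fresh; otherwise load and cache it.
  get(path: string, load: (path: string) => string): string {
    const hit = this.entries.get(path)
    if (hit && hit.expiresAt > this.now()) return hit.value
    const value = load(path)
    this.entries.set(path, { value, expiresAt: this.now() + this.ttlSeconds * 1000 })
    return value
  }

  // Call this after writing to a file so stale content is never served.
  invalidate(path: string): void {
    this.entries.delete(path)
  }
}
```

Whatever the real cache does internally, the key design point is the same: any write or edit to a path needs to invalidate its cached read, otherwise the agent reasons about stale file contents.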

MCP Connection Pooling

Reuse MCP connections instead of spawning new processes:

typescript
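import { MCPClient } from '@opensin/sdk' // assumed export, matching the other examples on this page

const mcp = new MCPClient({
  poolSize: 4,             // maintain 4 warm connections
  reuseConnections: true,
  idleTimeout: 60_000,     // presumably milliseconds (60s), like the agent loop's idleTimeout
})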
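
Connection reuse can be sketched with a generic pool that hands back idle connections before creating new ones. The `Pool` class below is hypothetical and deliberately simplified: it throws when the pool is exhausted and has no idle timeout, whereas a real pool would more likely queue callers and close connections that sit unused.

```typescript
// Hypothetical generic pool; not the MCPClient implementation.
class Pool<T> {
  private idle: T[] = []
  private created = 0
  private create: () => T
  private maxSize: number

  constructor(create: () => T, maxSize: number) {
    this.create = create
    this.maxSize = maxSize
  }

  // Prefer an idle connection; only create a new one while under the size limit.
  acquire(): T {
    const conn = this.idle.pop()
    if (conn !== undefined) return conn
    if (this.created >= this.maxSize) throw new Error('pool exhausted')
    this.created++
    return this.create()
  }

  // Return a connection so the next caller can reuse it.
  release(conn: T): void {
    this.idle.push(conn)
  }

  get size(): number {
    return this.created
  }
}
```

With process-based MCP servers, each avoided `create()` call is a process spawn and handshake that does not happen, which is where the latency saving comes from.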

Cost Tracking

Monitor spend in real time with the UsagePricing module:

typescript
import { UsagePricing } from '@opensin/sdk'

const pricing = new UsagePricing()

// Record each LLM call
pricing.record({
  model: 'claude-sonnet-4-6',
  inputTokens: 2500,
  outputTokens: 800,
})

// Get session summary
const summary = pricing.getSummary()
// { totalCost: 0.0234, totalTokens: 3300, calls: 1 }
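
The arithmetic behind such a summary is simple enough to sketch. The version below is standalone and hypothetical: the `summarize` helper and the price table are illustrative, and the per-million-token prices are placeholders, not real provider rates or the values UsagePricing uses.

```typescript
// Hypothetical cost summary; not the UsagePricing implementation.
interface Usage {
  model: string
  inputTokens: number
  outputTokens: number
}

interface Price {
  inputPerMTok: number  // USD per million input tokens
  outputPerMTok: number // USD per million output tokens
}

// Placeholder prices for illustration only; replace with your provider's rates.
const prices: Record<string, Price> = {
  'example-model': { inputPerMTok: 2, outputPerMTok: 10 },
}

function summarize(calls: Usage[]): { totalCost: number; totalTokens: number; calls: number } {
  let totalCost = 0
  let totalTokens = 0
  for (const c of calls) {
    const p = prices[c.model]
    if (!p) throw new Error(`no price configured for ${c.model}`)
    // Input and output tokens are billed at different rates.
    totalCost += (c.inputTokens * p.inputPerMTok + c.outputTokens * p.outputPerMTok) / 1_000_000
    totalTokens += c.inputTokens + c.outputTokens
  }
  return { totalCost, totalTokens, calls: calls.length }
}
```

Because output tokens usually cost several times more than input tokens, tracking the two counts separately gives a much better estimate than a single blended rate.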

Agent Loop Optimization

Reduce Round-Trips

  • Batch independent tool calls in a single turn
  • Use glob + read together instead of multiple grep calls
  • Prefer edit over write for partial file changes

Early Exit

Configure the agent loop to stop early when the task is clearly complete:

typescript
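import { AgentLoop } from '@opensin/sdk' // assumed export, matching the other examples on this page

const agent = new AgentLoop({
  maxTurns: 20,
  earlyExit: true,         // stop when a turn requests no more tool calls
  idleTimeout: 10_000,     // stop after 10s of no activity
})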
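
The interaction between `maxTurns` and `earlyExit` can be sketched as a plain loop. This is an assumption about the control flow, not the AgentLoop source: the `runLoop` function and the `Turn` shape are hypothetical, and the idle timeout is left out to keep the example synchronous.

```typescript
// Hypothetical agent loop skeleton; not the AgentLoop implementation.
interface Turn {
  toolCalls: string[] // tools the model asked to run on this turn
}

// Runs `step` until the turn limit is hit or, with earlyExit, until a turn
// requests no tool calls. Returns the number of turns executed.
function runLoop(step: (turn: number) => Turn, maxTurns: number, earlyExit: boolean): number {
  let turns = 0
  while (turns < maxTurns) {
    const result = step(turns)
    turns++
    if (earlyExit && result.toolCalls.length === 0) break
  }
  return turns
}

// Simulated model: asks for a tool on the first two turns, then answers directly.
const simulatedStep = (turn: number): Turn => ({ toolCalls: turn < 2 ? ['read'] : [] })

console.log(runLoop(simulatedStep, 20, true))  // 3
console.log(runLoop(simulatedStep, 20, false)) // 20
```

Without early exit the simulated run burns all 20 turns, each of which would be a paid LLM call in practice.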

Infrastructure Sizing

| Component | Recommended | Notes |
| --- | --- | --- |
| n8n (OCI VM) | 2 OCPU, 12 GB RAM | Handles 50+ concurrent workflows |
| Supabase (OCI VM) | 200 GB storage | Connection pooling enabled |
| HF Spaces | Free tier | Use keep-alive pings |
| MCP servers | 256 MB per process | Pool and reuse connections |

Benchmarks

Typical performance for an OpenSIN agent on a standard workload (read 10 files, edit 3, run tests):

| Metric | Without Optimization | With Optimization |
| --- | --- | --- |
| Total time | 45s | 18s |
| LLM calls | 12 | 6 |
| Tokens used | 85,000 | 32,000 |
| Estimated cost | $0.42 | $0.11 |
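
To read these figures as relative savings, the reductions can be computed directly from the table. The snippet below uses only the numbers above; the `reductionPercent` helper is an illustrative name.

```typescript
// Benchmark figures copied from the table above.
const benchmarks = [
  { metric: 'Total time (s)', before: 45, after: 18 },
  { metric: 'LLM calls', before: 12, after: 6 },
  { metric: 'Tokens used', before: 85_000, after: 32_000 },
  { metric: 'Estimated cost ($)', before: 0.42, after: 0.11 },
]

// Percentage reduction from `before` to `after`, rounded to a whole percent.
function reductionPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100)
}

for (const { metric, before, after } of benchmarks) {
  console.log(`${metric}: ${reductionPercent(before, after)}% lower`)
}
// Total time (s): 60% lower
// LLM calls: 50% lower
// Tokens used: 62% lower
// Estimated cost ($): 74% lower
```

The cost reduction of about 74% sits inside the 60-80% range quoted for model routing, and it exceeds the token reduction because cheaper models are used as well as fewer tokens.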

Released under the Apache 2.0 License.