Overview
SuperHarness is a complete AI Agent development framework built with Rust + Python, benchmarked against Claude Code and LangChain.
- Python SDK: Clean API for building AI applications
- Rust CLI/TUI: High-performance terminal interface
- Multi-LLM Support: Anthropic, OpenAI, Gemini, custom endpoints
- Complete Toolchain: File operations, code editing, shell execution
Six-Layer Architecture
┌─────────────────────────────────────────────┐
│ Layer 5: Application Layer (CLI/TUI) │ ← User Interaction
├─────────────────────────────────────────────┤
│ Layer 4: Agent Runtime │ ← Agent Runtime
├─────────────────────────────────────────────┤
│ Layer 3: Tool System │ ← Tool System
├─────────────────────────────────────────────┤
│ Layer 2: Session & Memory │ ← Session & Memory
├─────────────────────────────────────────────┤
│ Layer 1: Core Services │ ← Core Services
├─────────────────────────────────────────────┤
│ Layer 0: Security Foundation │ ← Security Foundation
└─────────────────────────────────────────────┘
Python SDK
from superharness import Agent
# Create Agent
agent = Agent(
api_key="your-api-key",
provider="anthropic", # or "openai", "gemini", "custom"
model="claude-sonnet-4-6"
)
# Simple conversation
response = agent.run("Hello, how are you?")
print(response)
# Streaming output
for chunk in agent.run_stream("Tell me a story"):
print(chunk, end="", flush=True)
# Register tool
@agent.tool
def calculate(expression: str) -> float:
"""Evaluate a mathematical expression."""
return eval(expression)
# Use tool
response = agent.run("What is 123 * 456?")CLI / TUI
# Enter TUI directly
superharness
# Show help
superharness --help
# Show version
superharness --versionDevelopment Progress
Phase 0: SDK/TUI Basic Features ✅ Complete
| Module | Status | Tests |
|---|---|---|
| SDK LLM Calls | ✅ | 82 passed |
| SDK Streaming | ✅ | |
| SDK Tool Calling | ✅ | |
| SDK Session Management | ✅ | |
| TUI Core Features | ✅ | 110 passed |
| TUI Code Editor | ✅ | |
| UI Components | ✅ | 8/8 scenarios |
Phase 1: Production Features ⏳ In Progress
| Feature | Status | Description |
|---|---|---|
| Complete Toolchain | ⏳ | Bash/Read/Write/Edit/Grep |
| Agent Planning | ⏳ | Task decomposition, self-correction |
| Git Integration | ⏳ | diff/commit/PR |
| MCP Protocol | ⏳ | MCP client support |
Design Philosophy
Core Principles
- No MVP/Demo - Build complete product directly
- Benchmark Claude Code/LangChain - Match or exceed
- Precise Tasks - Clear completion criteria and acceptance conditions
- Real Validation - Validate with real user workflows
Transparency Three Principles
- No Hidden Behavior - All operations visible
- State Visualization - Real-time progress display
- Explainable Decisions - Explain tool selection and execution
Current Status
Phase 0 complete, 192 tests passed, Phase 1 in progress.