13. Architecture Comparison and What’s Next
Full Architecture Comparison
| Component | Claude Code | mini-claude | Difference |
|---|---|---|---|
| Agent Loop | 7 continue reasons | Only checks tool_use | Simplified loop control |
| Tool count | 66+ tools | 13 tools (6 core + web_fetch + tool_search + skill + agent + 2 plan mode) | Removed specialized tools |
| Tool execution | Concurrent execution + streaming early start | Parallel execution + streaming early start | Architecture aligned |
| API backend | Anthropic only | Anthropic + OpenAI compatible | Added OpenAI |
| System Prompt | static/dynamic split + API caching | No cache optimization | Removed caching |
| Permission system | 7 layers + AST analysis + 8-level rule sources | 5 modes + rule config + regex + confirmation | Layer alignment |
| Context management | 4-level compression pipeline | 4 layers (budget + snip + microcompact + summary) | Architecture aligned |
| Memory system | 4 types + semantic recall + MEMORY.md index | 4 types + semantic recall + MEMORY.md + async prefetch | Architecture aligned |
| Skills system | 6 sources + lazy loading + inline/fork | 2 sources + preloading + inline/fork | Removed advanced loading |
| Multi-Agent | Sub-Agent + custom + Coordinator + Swarm | Sub-Agent (3 built-in + custom) | Removed Coordinator/Swarm |
| MCP integration | mcpClient.ts + dynamic tool discovery | McpManager + JSON-RPC over stdio | Architecture aligned |
| Budget control | USD/turns/abort three-dimensional budget | USD + turn limits | Removed abort signal |
| Edit validation | 14-step pipeline | Quote normalization + uniqueness + diff output | Kept core steps |
File Mapping Table
| mini-claude (TypeScript) | mini-claude (Python) | Claude Code Source | Description |
|---|---|---|---|
src/agent.ts | python/mini_claude/agent.py | src/query.ts + src/QueryEngine.ts | Agent loop + session management |
src/tools.ts | python/mini_claude/tools.py | src/Tool.ts + src/tools/ (66 directories) | Tool definitions and execution |
src/prompt.ts | python/mini_claude/prompt.py | src/constants/prompts.ts + src/utils/claudemd.ts | Prompt construction |
src/cli.ts | python/mini_claude/__main__.py | src/entrypoints/cli.tsx + src/commands/ | Entry point and commands |
src/ui.ts | python/mini_claude/ui.py | src/components/ (React/Ink components) | UI rendering |
src/session.ts | python/mini_claude/session.py | src/utils/sessionStorage.ts + src/history.ts | Session persistence |
src/memory.ts | python/mini_claude/memory.py | src/utils/memory.ts + system prompt injection | Memory system |
src/skills.ts | python/mini_claude/skills.py | src/utils/skills.ts + src/tools/SkillTool/ | Skills system |
src/subagent.ts | python/mini_claude/subagent.py | src/tools/AgentTool/ (built-in types) | Sub-agent type configuration |
src/mcp.ts | python/mini_claude/mcp.py | src/services/mcpClient.ts | MCP client |
What We Didn’t Implement
Hooks (Hook System)
Claude Code has 25 hook events and 6 hook types, allowing custom logic to be inserted before and after tool execution — intercepting dangerous operations, recording audit logs, automatically running lint checks. It’s the key mechanism that transforms Claude Code from a “tool” into a “platform.”
Why we didn’t implement it: The core challenge isn’t “calling a function” but hook discovery and loading, error isolation, and the stdin/stdout JSON data protocol. These engineering details amount to about 500-800 lines but don’t help with understanding agent principles.
Coordinator / Swarm Multi-Agent Modes
We implemented Sub-Agent (fork-return). Claude Code has two additional modes: Coordinator breaks large tasks into pieces for multiple specialized Agents, and Swarm allows multiple Agents to communicate as peers and explore in parallel. Both modes solve the task decomposition problem when a single Agent’s context isn’t enough.
Why we didn’t implement them: The core challenge is task decomposition accuracy and inter-Agent communication protocol design — more of a prompt engineering problem than a code architecture problem. The implementation itself isn’t complex, but making it truly useful requires extensive prompt tuning.
LSP Integration
LSP gives the agent millisecond-level type error feedback after editing files, without waiting for a full compile/test cycle. In large projects, this can reduce the number of iterations needed to fix a bug by 30-50%.
Why we didn’t implement it: Requires managing LSP server processes, implementing the client protocol (initialization handshake, capability negotiation, incremental sync) — 1000+ lines and depends on deep understanding of the LSP protocol. Getting error feedback through shell commands (tsc --noEmit, python -m py_compile) is sufficient for tutorial scenarios.
Prompt Caching
The Anthropic API supports caching system prompts — Claude Code puts the unchanging parts (role definition, tool specs) first and the changing parts (git status, current file) last. Cache hits can reduce input token cost by 90%.
Why we didn’t implement it: The code change is minimal (20-30 lines), but requires careful design of the prompt partitioning strategy. If your agent is going to production, this should be the first optimization you add.
Bash AST Security Analysis
Claude Code uses tree-sitter to parse shell command ASTs, performing 23 static security checks that can analyze dangerous commands within pipe combinations — something pure regex can’t do.
Why we didn’t implement it: tree-sitter is a native C/C++ library requiring a node-gyp build environment, creating too high an environmental barrier. Regex matching covers 80% of common dangerous patterns, and the risk is acceptable for tutorial scenarios.
Progressive Enhancement Roadmap
Phase 1: Performance and Cost Optimization (1-2 days)
| Enhancement | Problem Solved | Estimated Code |
|---|---|---|
| Prompt Caching | Wasted tokens resending system prompt | ~30 lines |
Prompt Caching is the optimization with the best return on investment: add cache_control: { type: "ephemeral" } markers to the static portions of the system prompt, saving 50%+ input token cost across multi-turn conversations.
Phase 2: Extensibility (3-5 days)
| Enhancement | Problem Solved | Estimated Code |
|---|---|---|
| Hook system | Customizing agent behavior requires modifying source code | ~300 lines |
| Tool type system | switch/case doesn’t scale to 20+ tools | ~200 lines |
The core transition is from hardcoded to plugin-based. The current switch/case works fine at 10 tools, but beyond 20 you need to introduce a Tool interface (or Python’s Protocol/ABC), making each tool an independent module.
Phase 3: Reliability and Security (1-2 weeks)
| Enhancement | Problem Solved | Estimated Code |
|---|---|---|
| 7 error recovery strategies | Currently crashes on errors | ~400 lines |
| Bash AST security analysis | Regex misses complex dangerous commands | ~600 lines |
Claude Code’s query.ts has 1728 lines, most of which handle edge cases: auto-compress and retry on Prompt Too Long, exponential backoff on API overload, feed tool failures back to the model so it can self-repair.
Phase 4: Advanced Agent Capabilities (2-4 weeks)
| Enhancement | Problem Solved | Estimated Code |
|---|---|---|
| Coordinator mode | Large tasks exceed single Agent context capacity | ~500 lines |
| Swarm mode | Exploratory tasks need multi-path parallelism | ~600 lines |
| LSP integration | Type errors can only be found through compilation | ~1000 lines |
Extension Directions
1. Hooks System
The simplest approach is command hooks — spawn a shell child process before executeTool, pass tool information via stdin JSON, and parse stdout JSON to decide allow/deny.
Configuration example:
{
"hooks": {
"PreToolUse": [
{ "matcher": "run_shell", "command": "./hooks/pre-shell.sh" }
]
}
}Core logic: iterate over matching hooks, spawn child processes passing JSON, and decide whether to continue execution based on {"action": "allow"} / {"action": "deny", "reason": "..."}. About 300 lines; the most time-consuming part is handling child process timeouts and crashes.
2. Error Self-Repair
Feed tool execution errors back to the model as tool results instead of breaking the loop. The model can often self-repair: wrong path, try a different path; wrong command arguments, fix the arguments.
try {
result = await executeToolImpl(name, input);
} catch (e) {
result = `Error: ${e.message}\n\nPlease try a different approach.`;
}
// Return result as tool_result to the modelAbout 50-80 lines, but significantly improves the agent’s real-world usability — this is one of Claude Code’s smartest designs.
Core Insights
1. An Agent is essentially a while loop
while true:
response = llm.call(messages)
if no tool_calls in response: break
for tool_call in response.tool_calls:
result = execute(tool_call)
messages.append(result)
All the complexity — permissions, context management, memory, multi-agent — is enhancement and protection built around this loop.
2. Prompts are the cheapest code
A single sentence in the system prompt has the same effect as an if statement, with an implementation cost of 0 lines of code. In agent development, the optimal solution for many behavioral issues isn’t writing more code but writing better prompts — more flexible, easier to modify, and readable by non-technical people.
3. Tool design determines the capability ceiling
Let the model do what it’s good at (understanding intent, generating code), and let tools do what the model isn’t good at (exact string matching, filesystem operations, process management). edit_file is the classic example: the model generates the content to replace, and the tool handles precisely locating and replacing it in the file.
4. Context management is the agent’s “memory”
Context management is to an agent what memory management is to an operating system — using limited resources to provide the illusion of “infinite.” The 4-layer compression pipeline lets the agent maintain memory of long conversations within a finite window.
5. Security is not an afterthought
Permission checking is a step in the agent loop, not a bolted-on middleware. No tool can bypass it. More importantly, it uses a fail-closed design: if a new tool forgets to declare its permission level, it’s automatically treated as “requires confirmation” — the system guarantees safety through defaults.
6. The gap from 3,000 lines to 500,000 lines is edge cases
Most of Claude Code’s additional code handles: cross-environment compatibility, network and API unreliability, user input diversity, enterprise-grade auditing and access control. These “boring” pieces of code don’t appear in architecture diagrams, yet they’re the key to whether a tool can run reliably in the real world. From prototype to product, 80% of the distance is here.
7. The collaboration boundary between LLM and code
The most essential skill in building a coding agent: designing the right collaboration boundary between the LLM and code. What does the LLM decide, and what does the code decide? When the boundary is well-drawn, the agent is both flexible and reliable. Every design decision in this tutorial reflects this principle: the model decides “what to do,” and the code ensures “it’s done safely.”
Cross-Reference
Want to dive deeper into the design principles of each Claude Code module? Check out the detailed documentation in the companion project:
| Topic | This Tutorial | how-claude-code-works |
|---|---|---|
| Agent loop | Ch1: Agent Loop | System Main Loop |
| Tool system | Ch2: Tool System | Tool System |
| Context management | Ch7: Context Management | Context Engineering |
| Permission security | Ch6: Permissions and Security | Permissions and Security |
| Memory system | Ch8: Memory System | Memory System |
| Skills system | Ch9: Skills System | Skills System |
| Plan Mode | Ch10: Plan Mode | — |
| Multi-Agent | Ch11: Multi-Agent | Multi-Agent Architecture |
| MCP integration | Ch12: MCP Integration | — |
Conclusion
~4300 lines of code (TS) / ~3800 lines (Python), 12 files, covering the core components and advanced capabilities of a coding agent:
Phase 1 — Core Components: Agent Loop, Tool System (13 tools + mtime protection + lazy loading + parallel execution), System Prompt (Markdown template + @include + environment injection), CLI / Session (REPL + JSON persistence), Streaming Output (Anthropic + OpenAI dual backend + streaming tool execution), Permission Security (5 modes + declarative rules + regex + confirmation), Context Management (4-layer compression + large result persistence)
Phase 2 — Advanced Capabilities: Memory System (semantic recall + async prefetch), Skills System (inline/fork dual mode), Plan Mode (read-only planning + 4-option approval), Multi-Agent (Sub-Agent + 3 built-in types + custom), MCP Integration (JSON-RPC over stdio), Budget Control
A huge amount of the code in Claude Code’s 500,000 lines is edge case handling and enterprise-grade reliability. But the core agent capabilities — understand user intent → call tools to manipulate code → iterate until complete — are exactly what these ~3400 lines do.
Now you have a feature-rich coding agent, and you understand the design intent behind every line of code. Go extend it.