How Claude Code Works

An In-Depth Analysis of the Source Code Architecture of the Most Successful AI Coding Agent

Want to build one yourself? Check out the companion project Claude Code From Scratch: about 3,000 lines of TypeScript across 11 step-by-step tutorial chapters that walk you through building your own Claude Code.


Claude Code is among the most widely used AI coding agents today and, in our opinion, the best AI coding tool available. It can understand entire code repositories, autonomously execute multi-step programming tasks, and safely run commands, all powered by the engineering wisdom distilled in 500,000+ lines of TypeScript source code.

Anthropic open-sourced (sort of) this source code. But with 500,000 lines of code, where do you even start reading?

That is why we created this project. Neither of us could read a codebase this large unaided, so we read it together with Claude Code and had it write documentation to help us understand the source. We also wanted to document that process, and this project is the result.

Together with Claude Code, working overtime, we distilled 15 topic-specific documents from the source code, covering every key design decision from the core loop to security defenses. Whether you want to build your own AI Agent or understand and use Claude Code more deeply, this is probably the shortest path. Even if it is not, we will keep updating the project.

System Architecture

graph TB
    User[User Input] --> QE[QueryEngine Session Management]
    QE --> Query[query Main Loop]
    Query --> API[Claude API Call]
    API --> Parse{Parse Response}
    Parse -->|Text| Output[Streaming Output]
    Parse -->|Tool Call| Tools[Tool Execution Engine]
    Tools --> ReadTool[Read File]
    Tools --> EditTool[Edit File]
    Tools --> ShellTool[Shell Execution]
    Tools --> SearchTool[Search Tools]
    Tools --> MCPTool[MCP Tools]
    Tools -->|Results Fed Back| Query

    Context[Context Engineering] --> Query
    Context --> SysPrompt[System Prompt]
    Context --> GitStatus[Git Status]
    Context --> ClaudeMD[CLAUDE.md]
    Context --> Compact[Compression Pipeline]

    Perm[Permission System] --> Tools
    Perm --> Rules[Rule Layer]
    Perm --> AST[Bash AST Analysis]
    Perm --> Confirm[User Confirmation]

Why Is This Source Code Worth Studying In Depth?

Most AI Agent frameworks are “demo-level”: they declare success after getting one scenario to work. Claude Code is different. It is a production system used daily by millions of developers, and it deals with problems far more complex than any demo:

  • When a conversation reaches millions of tokens and no longer fits in the context window, what do you do? (This comes down to how memory management and compression are designed, which is critical.)
  • With 66 built-in tools available at once, how do you coordinate them? (Handing the full context of every tool to the model would blow up the prompt.)
  • How do you make the tool feel “fast” to the user when model inference itself takes tens of seconds? (This comes down to pipeline design.)
  • When a user tells the AI to execute rm -rf /, how do you stop it? (Safety guardrails are critical.)

The solutions to these problems are hidden in the source code.

Key Designs Discovered from the Source Code

Everything below comes from analysis of the actual source code, not speculation.

Why Does Claude Code Feel So Fast to Use?

It actually does three clever things:

  1. Full-pipeline streaming output — Instead of waiting for the model to finish thinking before displaying, it shows each token immediately as it’s generated. The entire pipeline from API calls to terminal rendering is streaming.
  2. Tool pre-execution — When the model says “I want to read a certain file”, that file is actually already being read. The system starts parsing and executing tool calls while the model is still outputting, using the 5-30 second window of model generation to hide the ~1 second tool latency.
  3. 9-stage parallel startup — Unrelated initialization tasks are executed in parallel during startup, compressing the critical path to approximately 235ms.
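
To make the tool pre-execution idea in item 2 more concrete, here is a minimal sketch. It is not taken from the Claude Code source: the `StreamEvent` shape, `consumeStream`, and the `ToolRunner` signature are all hypothetical names chosen for illustration. The point is only that a tool is started the moment its call appears in the stream, without awaiting it, so its latency overlaps with the rest of the model's generation.

```typescript
// Hypothetical event shapes for illustration; not the real API stream format.
type StreamEvent =
  | { kind: "text"; delta: string }
  | { kind: "tool_call"; id: string; name: string; input: string };

type ToolRunner = (name: string, input: string) => Promise<string>;

async function consumeStream(
  events: AsyncIterable<StreamEvent>,
  runTool: ToolRunner,
  onText: (delta: string) => void,
): Promise<Map<string, string>> {
  const pending = new Map<string, Promise<string>>();

  for await (const event of events) {
    if (event.kind === "text") {
      onText(event.delta); // render each token as soon as it arrives
    } else {
      // Start the tool immediately and keep reading the stream.
      // Failures become result strings so no rejection goes unhandled.
      const run = runTool(event.name, event.input).catch(
        (err: unknown) => `tool failed: ${String(err)}`,
      );
      pending.set(event.id, run);
    }
  }

  // By the time the model stops generating, most tools have already finished.
  const results = new Map<string, string>();
  for (const [id, run] of pending) {
    results.set(id, await run);
  }
  return results;
}
```

Because the tool promises are only awaited after the stream ends, a roughly 1 second file read that starts early in a 5-30 second generation adds little or nothing to the total wait.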

What Happens When Things Go Wrong? — Silent Recovery

Ordinary programs show errors to users when they encounter them. Claude Code’s strategy is different: the user never sees recoverable errors at all.

For example, when a conversation gets too long and exceeds the context window, it doesn’t pop up an error dialog asking you to handle it manually. Instead, it quietly compresses the context and automatically retries. When the output hits the token limit, it automatically raises the limit from 4K to 64K and retries. The entire Agent loop has 7 different “continue” strategies, each corresponding to a different failure recovery path.

This is why you rarely encounter errors when using Claude Code: errors still occur, but most of them are handled internally.
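
The recovery pattern described above can be sketched as a loop that inspects each outcome and either returns, repairs the state and continues, or gives up. This is a simplified illustration under our own assumptions: it covers only the two recovery paths named in the text (not all 7), it is synchronous for brevity, and `Outcome`, `TurnState`, `runTurn`, and the `callModel` and `compact` parameters are hypothetical names.

```typescript
type Outcome =
  | { status: "ok"; text: string }
  | { status: "context_overflow" }
  | { status: "output_truncated" }
  | { status: "fatal"; message: string };

interface TurnState {
  messages: string[];
  maxOutputTokens: number;
}

const DEFAULT_OUTPUT_TOKENS = 4_000; // the "4K" starting limit from the text
const RAISED_OUTPUT_TOKENS = 64_000; // the "64K" limit used on retry

function runTurn(
  initial: TurnState,
  callModel: (state: TurnState) => Outcome, // hypothetical model call
  compact: (messages: string[]) => string[], // hypothetical context compressor
  maxAttempts = 5,
): string {
  let state = initial;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const outcome = callModel(state);
    switch (outcome.status) {
      case "ok":
        return outcome.text;
      case "context_overflow":
        // Recover silently: shrink the context, then retry.
        state = { ...state, messages: compact(state.messages) };
        continue;
      case "output_truncated":
        if (state.maxOutputTokens >= RAISED_OUTPUT_TOKENS) {
          throw new Error("output still truncated at the raised limit");
        }
        // Recover silently: raise the output limit, then retry.
        state = { ...state, maxOutputTokens: RAISED_OUTPUT_TOKENS };
        continue;
      case "fatal":
        // Only unrecoverable errors reach the user.
        throw new Error(outcome.message);
    }
  }
  throw new Error("gave up after too many recovery attempts");
}
```

The attempt cap is our addition: without some bound, a recovery path that never frees enough space would loop forever.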

What About Conversations That Are Too Long? — 4-Level Progressive Compression

This is one of the most elegant designs in the entire system. When the context is about to exceed its limit, the system does not compress everything in one shot. It works through 4 progressive levels:

  1. Trimming — First, truncate large content blocks (old tool outputs) from historical messages
  2. Deduplication — Remove duplicate content at nearly zero cost
  3. Folding — Fold inactive conversation segments, but without modifying the original content (can be unfolded to restore)
  4. Summarization — As a last resort, launch a sub-Agent to summarize the entire conversation

Each level may release enough space so that subsequent levels don’t need to execute. Moreover, after compression, the system automatically restores the content of the 5 most recently edited files, preventing the model from forgetting what it was just working on.
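
A minimal sketch of the progressive idea follows: run the cheapest stage first and stop as soon as the context fits. The stage bodies are deliberately toy versions and all names (`Msg`, `Stage`, `compress`, `KEEP_RECENT`) are hypothetical; the token estimate of about 4 characters per token is a rough assumption, and the summarize stage is a stand-in for the sub-Agent call. The post-compression file restore is not shown.

```typescript
interface Msg {
  id: number;
  kind: "user" | "assistant" | "tool";
  text: string;
}
interface Stage {
  name: string;
  apply: (msgs: Msg[]) => Msg[];
}

// Crude estimate (assumption: roughly 4 characters per token).
const estimateTokens = (msgs: Msg[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.text.length / 4), 0);

const KEEP_RECENT = 4; // hypothetical: messages that folding leaves untouched
const folded = new Map<number, Msg[]>(); // originals kept so a fold can be undone

const trim: Stage = {
  name: "trim",
  apply: (msgs) =>
    msgs.map((m) =>
      m.kind === "tool" && m.text.length > 200
        ? { ...m, text: m.text.slice(0, 200) + " [truncated]" }
        : m,
    ),
};

const dedupe: Stage = {
  name: "dedupe",
  apply: (msgs) => {
    const seen = new Set<string>();
    return msgs.filter((m) => {
      const key = m.kind + ":" + m.text;
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
  },
};

const fold: Stage = {
  name: "fold",
  apply: (msgs) => {
    if (msgs.length <= KEEP_RECENT) return msgs;
    const old = msgs.slice(0, -KEEP_RECENT);
    const placeholder: Msg = { id: old[0].id, kind: "assistant", text: `[folded ${old.length} messages]` };
    folded.set(placeholder.id, old); // original content is preserved, not modified
    return [placeholder, ...msgs.slice(-KEEP_RECENT)];
  },
};

const summarize: Stage = {
  name: "summarize",
  // Stand-in for launching a sub-Agent that writes a real summary.
  apply: (msgs) => [{ id: 0, kind: "assistant", text: `[summary of ${msgs.length} messages]` }],
};

function compress(msgs: Msg[], budget: number, stages: Stage[] = [trim, dedupe, fold, summarize]) {
  const applied: string[] = [];
  let current = msgs;
  for (const stage of stages) {
    if (estimateTokens(current) <= budget) break; // an earlier level freed enough space
    current = stage.apply(current);
    applied.push(stage.name);
  }
  return { messages: current, applied };
}
```

Ordering the stages from cheapest to most expensive means the summarization step, which costs a model call, only runs when nothing cheaper was enough.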

How to Prevent AI from Executing Dangerous Operations? — 5-Layer Defense in Depth

Claude Code lets AI run commands directly on your computer, so the security design must be rock-solid. Instead of relying on a single “Are you sure?” dialog, it builds a 5-layer defense system:

  1. Permission modes: Different trust levels that limit the scope of executable operations
  2. Rule matching: Whitelist/blacklist based on command patterns
  3. Bash command deep analysis: This is the hardest part. It uses syntax tree analysis (not regex matching) to dissect the true intent of Shell commands, including 23 security checks covering command injection, environment variable leakage, special character attacks, etc.
  4. User confirmation: Dangerous operations trigger a confirmation dialog, with 200ms debounce protection to prevent accidental confirmation from rapid keystrokes
  5. Hook validation: Allows users to define custom security rules, and can even dynamically modify tool input parameters (e.g., automatically adding a dry-run flag to a destructive command)

If any of these five layers blocks it, the operation won’t execute. Defense in depth.
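
The "any layer can block" behavior can be sketched as a chain of checks that stops at the first denial. Everything here is illustrative: `ToolRequest`, `Layer`, `evaluate`, and `buildLayers` are hypothetical names, the rules are made up, and the analysis layer is a crude string test standing in for real syntax tree analysis. The hook layer shows how a layer might rewrite the input, using the real `--dry-run` flag of `git clean` as the example.

```typescript
interface ToolRequest {
  tool: string;
  command: string;
}

type LayerResult =
  | { ok: true; request?: ToolRequest } // a layer may pass on a rewritten request
  | { ok: false; reason: string };

interface Layer {
  name: string;
  check: (req: ToolRequest) => LayerResult;
}

type Verdict =
  | { decision: "allow"; request: ToolRequest }
  | { decision: "deny"; layer: string; reason: string };

function evaluate(req: ToolRequest, layers: Layer[]): Verdict {
  let current = req;
  for (const layer of layers) {
    const result = layer.check(current);
    if (!result.ok) {
      // The first layer that objects ends the evaluation.
      return { decision: "deny", layer: layer.name, reason: result.reason };
    }
    current = result.request ?? current;
  }
  return { decision: "allow", request: current };
}

// Toy layers in the order listed above. All rules here are invented.
function buildLayers(
  mode: "readOnly" | "default",
  confirm: (req: ToolRequest) => boolean,
): Layer[] {
  return [
    {
      name: "mode",
      check: (r) =>
        mode === "readOnly" && r.tool === "bash"
          ? { ok: false, reason: "shell is disabled in read-only mode" }
          : { ok: true },
    },
    {
      name: "rules",
      check: (r) =>
        r.command.startsWith("rm -rf")
          ? { ok: false, reason: "matches deny rule: rm -rf" }
          : { ok: true },
    },
    {
      // Crude stand-in for syntax tree analysis: flag command substitution.
      name: "analysis",
      check: (r) =>
        /\$\(|`/.test(r.command)
          ? { ok: false, reason: "command substitution detected" }
          : { ok: true },
    },
    {
      name: "confirm",
      check: (r) => (confirm(r) ? { ok: true } : { ok: false, reason: "user declined" }),
    },
    {
      // A hook may rewrite the input, here forcing a dry run of git clean.
      name: "hook",
      check: (r) =>
        r.command.startsWith("git clean") && !r.command.includes("--dry-run")
          ? { ok: true, request: { ...r, command: r.command + " --dry-run" } }
          : { ok: true },
    },
  ];
}
```

With these toy rules, `evaluate({ tool: "bash", command: "rm -rf /" }, buildLayers("default", () => true))` is denied at the rule layer and never reaches the confirmation dialog.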

How Do 66 Tools Work Together?

All tools — reading files, writing files, running commands, searching, even third-party MCP tools — follow the same interface specification. This means:

  • Third-party tools and built-in tools go through the exact same execution pipeline, enjoying the same security checks and permission controls
  • Read-only tools automatically execute in parallel and write operations automatically serialize, so there is no need to manage concurrency manually
  • When tool output exceeds 100K characters, it’s automatically saved to disk; the model gets only a summary and the file path, and reads the full content when needed
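
The last two points can be sketched as follows. This is one plausible way to get the described behavior, not the actual scheduler: `ToolCall`, `planBatches`, `runAll`, and `capOutput` are hypothetical names, the `save` callback stands in for writing to disk, and using the first 2,000 characters as the "summary" is our simplification.

```typescript
interface ToolCall {
  id: string;
  name: string;
  readOnly: boolean;
}

// Group consecutive read-only calls into one batch; every write gets its own batch.
function planBatches(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (call.readOnly && last !== undefined && last.every((c) => c.readOnly)) {
      last.push(call);
    } else {
      batches.push([call]);
    }
  }
  return batches;
}

// Batches run one after another; calls inside a batch run concurrently.
async function runAll(
  calls: ToolCall[],
  exec: (call: ToolCall) => Promise<string>,
): Promise<string[]> {
  const outputs: string[] = [];
  for (const batch of planBatches(calls)) {
    outputs.push(...(await Promise.all(batch.map(exec))));
  }
  return outputs;
}

const MAX_INLINE_CHARS = 100_000; // the 100K threshold from the text

// Large outputs are persisted via `save` (which returns a path) and replaced by a preview.
function capOutput(
  output: string,
  save: (full: string) => string,
  previewChars = 2_000,
): string {
  if (output.length <= MAX_INLINE_CHARS) return output;
  const path = save(output);
  return `${output.slice(0, previewChars)}\n[output truncated: ${output.length} chars, full text saved to ${path}]`;
}
```

Preserving call order while batching means a read that follows a write always sees the result of that write, which is the property manual concurrency management would otherwise have to guarantee.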

How Do Multiple Agents Collaborate?

Claude Code supports three multi-Agent modes:

  • Sub-Agent: The main Agent dispatches tasks to sub-Agents and waits for their results
  • Coordinator: A pure commander mode in which the coordinator can only assign tasks and cannot read files or write code itself, which enforces division of labor
  • Swarm: Multiple named Agents communicate point-to-point, each working independently

To prevent conflicts from multiple Agents modifying the same file simultaneously, the system uses Git Worktree to give each Agent an independent copy of the code.
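
As a rough illustration of worktree isolation, the sketch below plans one `git worktree add` invocation per Agent. The directory layout (`.worktrees/<name>`), the branch naming (`agent/<name>`), and the `planWorktrees` function are our own assumptions, not what Claude Code does internally; only the `git worktree add -b <branch> <path> <commit-ish>` command form is standard Git.

```typescript
interface WorktreePlan {
  agent: string;
  path: string;
  branch: string;
  gitArgs: string[]; // arguments to pass to the git executable
}

function planWorktrees(
  repoRoot: string,
  agents: string[],
  baseRef = "HEAD",
): WorktreePlan[] {
  const seen = new Set<string>();
  return agents.map((agent) => {
    // Turn the agent name into something safe for a path and a branch name.
    const slug = agent
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, "");
    if (slug === "" || seen.has(slug)) {
      throw new Error(`agent name "${agent}" is empty or not unique`);
    }
    seen.add(slug);

    const path = `${repoRoot}/.worktrees/${slug}`; // hypothetical layout
    const branch = `agent/${slug}`; // hypothetical naming scheme
    return {
      agent,
      path,
      branch,
      gitArgs: ["worktree", "add", "-b", branch, path, baseRef],
    };
  });
}
```

Each Agent then edits files only under its own path, so two Agents changing the same file produce two branches to merge rather than one corrupted working copy.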

Deep Dive Topics

| # | Document | What You’ll Learn |
| --- | --- | --- |
| 1 | Overview | Thinking behind technology choices (why Bun/React/Zod), 6 core design principles, 9-stage 235ms startup process, complete data flow panorama |
| 2 | Agent Loop | Dual-layer architecture of the Agent loop, 7 Continue Sites for failure recovery, tool pre-execution, StreamingToolExecutor concurrency mechanism |
| 3 | Context Engineering | Complete details of the 4-level compression pipeline, post-compression auto-recovery mechanism (5 files + skill reactivation), prompt caching strategy and cache break detection |
| 4 | Tool System | Registration and concurrency control for 66 tools, detailed MCP 7 transport types, connection state machine, OAuth 2.0 + PKCE authentication flow |
| 5 | Skills System | 6-layer skill sources and priority, lazy loading and token budget allocation, Inline/Fork dual execution modes, whitelist permission model, skill retention after compression |
| 6 | Memory System | 4 memory types and closed taxonomy, Sonnet semantic recall and async prefetch, background memory extraction Agent, memory drift defense, team memory |
| 7 | Hooks & Extensibility | Complete 23+ Hook events panorama, 5 Hook types, 6-stage execution pipeline, PermissionRequest 4 capabilities, trust model and security |
| 8 | Multi-Agent Architecture | Sub-Agent 4 execution modes and Worktree isolation, coordinator pure orchestration design, Swarm 3 execution backends and mailbox communication |
| 9 | Plan Mode | Two entry paths, 5-phase and iterative dual workflows, attachment throttling mechanism, Phase 4 four experimental variants, plan file management and recovery, approval and permission restoration |
| 10 | Code Editing Strategy | Why search-and-replace is better than full file rewrite, uniqueness constraints and anti-hallucination design, code-level implementation of mandatory pre-edit reading |
| 11 | Task Management System | File-level storage with concurrency locking, 3-layer change detection, dependency tracking and atomic claiming, multi-agent task coordination, verification nudge |
| 12 | Permissions & Security | 5-layer defense-in-depth system, tree-sitter AST analysis + 23 security checks, race confirmation mechanism and 200ms anti-misclick |
| 13 | System Prompt Design | 7-layer progressive prompt architecture, anti-pattern inoculation, blast radius risk framework, 7 agent prompt design principles |
| 14 | User Experience Design | Custom Ink renderer architecture, Yoga Flexbox layout, virtual scrolling and object pool optimization, Vim mode |
| 15 | Minimal Essential Components | 7 minimal essential component framework, item-by-item comparison of minimal vs production implementation, evolution path from 500 lines to 500,000 lines |

Who Should Read This?

| You Are | What You’ll Get |
| --- | --- |
| A developer who wants to build AI Agent products | An architecture reference validated by millions of users, helping you avoid detours |
| A Claude Code user | Understanding why it works this way, learning to deeply customize with Hooks and CLAUDE.md |
| Someone interested in AI safety | Practical security design for production AI systems, not just theories in papers |
| A student or AI researcher | First-hand material on large-scale engineering practices, more real than any textbook |

Key Metrics

| Metric | Value |
| --- | --- |
| Total source code lines | 512,000+ |
| TypeScript files | 1,884 |
| Built-in tools | 66+ |
| Compression pipeline levels | 4 |
| Permission defense layers | 5 |

Reading Recommendations

Only have 10 minutes?
→ Read Quick Start

Want to understand core principles?
→ Read in order: Agent Loop → Context Engineering → Tool System

Want to build your own AI Agent?
→ First read Minimal Essential Components, then follow the 11-chapter tutorial in claude-code-from-scratch to build it hands-on — ~3000 lines of code, each step explained against the source code

Want to customize Claude Code?
→ Read Hooks & Extensibility + Memory System + Skills System

Concerned about security?
→ Read Permissions & Security + Code Editing Strategy

Changelog

| Date | Changes |
| --- | --- |
| 2026-04-09 | Comprehensive review and fix of all 13 chapters: corrected inaccurate numbers/references (line counts, percentages, event counts, chapter numbering), added high-level overviews to chapters that lacked them, restructured sections for better readability (ch05 split/swap, ch08 reorder/merge), synchronized Chinese and English versions |
| 2026-04-03 | Added Chapter 14: System Prompt Design Philosophy, in-depth analysis of prompt content design principles and engineering practices |
| 2026-04-03 | Added dark mode, reading progress bar, back-to-top button, context-aware language switching, and other UI improvements |
| 2026-04-03 | Completed English translations for all 13 documents, supporting bilingual Chinese-English switching |
| 2026-04-01 | Split Memory & Skills into separate chapters (11→12 articles), renumbered 01-12 by sidebar grouping |
| 2026-04-01 | Major expansion of all 12 chapters (doubled in length), added source-level implementation details, Mermaid architecture diagrams, and code examples |
| 2026-03-31 | Added 3 chapters: Hooks & Extensibility, Multi-Agent Architecture, Memory & Skills System |
| 2026-03-31 | Launched Docsify documentation site with search, Mermaid rendering, and chapter navigation |
| 2026-03-31 | Initial release: 8 core architecture analysis documents |

License

MIT