Claude Code looks like a chat interface, but underneath it's a layered extension system: memory files, rules, skills, custom commands, MCP servers, subagents, hooks, agent teams, and the Agent SDK. Each solves a different problem. Each has tradeoffs the docs don't make obvious. And the real power comes from how they compose.

I've been building with this system daily: skills that drive my workflow, subagents that handle parallel research, hooks that enforce quality gates, MCP servers that bridge Claude to external APIs. Here's what I've learned about each layer and when it actually matters.

Memory: The Foundation Layer

Everything starts with CLAUDE.md. This is a markdown file at your project root that Claude reads at the start of every session. Think of it as persistent context that survives across conversations.

```text
project-root/
├── CLAUDE.md                    # Project-level (checked into git)
└── .claude/
    └── settings.json            # Project settings
```

```text
~/.claude/
├── CLAUDE.md                # User-level (all projects)
└── projects/<project>/
    └── memory/
        └── MEMORY.md        # Auto memory (Claude writes this)
```

Three scopes, three purposes. Project-level CLAUDE.md holds conventions your whole team shares: build commands, architecture decisions, coding standards. User-level CLAUDE.md holds your personal preferences. Auto memory is where Claude writes notes for itself as it discovers patterns in your codebase.

The auto memory system is worth understanding in detail. Claude maintains a MEMORY.md file where it records debugging insights, architectural patterns, and workflow preferences it discovers across sessions. The first 200 lines load into the system prompt automatically. Tell Claude "remember that we use pnpm, not npm" and it persists across every future session in that project.

The constraint that matters: keep CLAUDE.md concise. Every token in that file competes with your actual conversation for context space. I treat mine like infrastructure code. No comments that don't earn their bytes, no aspirational guidelines, only things Claude needs on every single interaction.
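A concise CLAUDE.md in that spirit might look like the following sketch; the project name, commands, and conventions are all hypothetical placeholders:

```markdown
# acme-api (hypothetical project)

## Commands
- Install: pnpm install
- Test: pnpm test
- Build: pnpm build

## Conventions
- TypeScript strict mode; avoid `any` in src/
- All database access goes through src/db/queries.ts
- Conventional Commits for every commit message
```

Every line earns its place: commands Claude runs constantly, and conventions it must never violate.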

For larger projects, .claude/rules/ offers a modular alternative. Instead of one monolithic file, drop individual markdown files into this directory — code style, testing conventions, security requirements — and Claude loads them alongside CLAUDE.md. Rules also support path-scoped frontmatter: add paths: ["src/api/**/*.ts"] and that rule only activates when Claude works on matching files. Your frontend guidelines don't consume tokens when Claude is editing backend code.
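A path-scoped rule file might look like this sketch — the file name, glob, and conventions are invented for illustration:

```markdown
---
paths: ["src/api/**/*.ts"]
---

# API conventions
- Validate request bodies before any handler logic runs
- Return structured error objects, never bare strings
```

Saved as something like .claude/rules/api-conventions.md, it costs no tokens until Claude touches a matching file.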

Skills: Progressive Disclosure in Practice

Skills are the most architecturally interesting feature in Claude Code. They solve a real problem: you want Claude to have deep expertise across dozens of domains, but loading all that knowledge into every conversation wastes context and money.

The solution is progressive disclosure. Each skill has a SKILL.md file with YAML frontmatter:

```markdown
---
name: deploy
description: Deploy the application to production
disable-model-invocation: true
allowed-tools: Bash
---

Deploy $ARGUMENTS to production:

1. Run the test suite
2. Build for the target environment
3. Push to the deployment target
4. Verify health checks pass
```

Here's how it works in practice. Claude loads only the skill descriptions (~100 tokens each) into context at session start. When a conversation matches a description, Claude pulls in the full skill content. Your deployment skill with 5,000 tokens of detailed instructions only loads when you're actually deploying.

Two invocation modes matter here:

Auto-invoked skills load when Claude recognizes relevance from the conversation. Good for conventions and reference material like API patterns, error handling guidelines, and code review checklists. Set user-invocable: false for background knowledge Claude should know about but that doesn't make sense as a slash command.

User-invoked skills trigger when you type /skill-name. Good for workflows with side effects: deploying, committing, sending notifications. Set disable-model-invocation: true so Claude doesn't decide to deploy because your code looks ready.
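An auto-invoked background skill in that style might look like this sketch (the name and content are hypothetical):

```markdown
---
name: error-handling-patterns
description: Conventions for raising, wrapping, and logging errors in this codebase
user-invocable: false
---

Wrap low-level errors with context before rethrowing. Never swallow
exceptions silently; log at the boundary, not at every layer.
```

No /error-handling-patterns command appears, but the description sits in context, and the full content loads whenever error handling comes up in conversation.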

Skills can bundle entire directories of supporting files:

```text
blog-content/
├── SKILL.md              # Entry point with instructions
├── references/
│   ├── practical-guide.md    # Loaded when writing guides
│   ├── thought-leadership.md # Loaded when writing opinion pieces
│   └── writing-quality.md    # Quality checklist
└── scripts/
    └── validate.sh           # Executable script
```

The SKILL.md references these files, but Claude only reads them when the skill logic requires it. This is genuine progressive disclosure, not a flat dump of everything into context.

Skills also support dynamic context injection with the !`command` syntax. Shell commands run before the skill content reaches Claude, replacing the placeholder with actual output:

```markdown
---
name: pr-summary
description: Summarize changes in a pull request
context: fork
---

## Pull request context
- PR diff: !`gh pr diff`
- PR comments: !`gh pr view --comments`

Summarize this pull request...
```

This means your skill can pull live data (git status, API responses, environment variables) and inject it directly into the prompt. Claude receives a fully rendered context, not a command to execute.
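Under the hood this is plain textual substitution performed before the prompt is assembled. A minimal sketch of the concept (an illustration, not Claude Code's actual implementation):

```typescript
// Sketch of !`command` expansion: replace each placeholder with the
// trimmed stdout of its shell command before the text reaches the model.
import { execSync } from "node:child_process";

function renderSkill(body: string): string {
  return body.replace(/!`([^`]+)`/g, (_match, cmd: string) =>
    execSync(cmd, { encoding: "utf8" }).trim()
  );
}

console.log(renderSkill("Current branch: !`echo main`"));
// prints "Current branch: main"
```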

Custom Commands: Lightweight Slash Commands

Alongside skills, Claude Code supports custom commands — markdown files in .claude/commands/ that become slash commands directly. Where skills are full progressive disclosure systems with reference directories and auto-invocation, commands are simpler: a prompt template with optional arguments.

```markdown
---
name: review
description: Review recent changes for quality issues
argument-hint: "[scope]"
---

Review the following changes for quality, security, and correctness.

Focus on: $ARGUMENTS

If no scope was provided, review all staged changes via `git diff --cached`.
```

Type /review auth module and Claude receives the rendered content with $ARGUMENTS replaced by "auth module". For multiple positional arguments, use $0, $1, $2 or the longer $ARGUMENTS[0] form.
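A hypothetical command using positional arguments (the name and behavior are invented for illustration) might look like:

```markdown
---
name: migrate
description: Draft a database migration
argument-hint: "[table] [change]"
---

Draft a migration that alters table $0 to $1.
Show the up and down migrations before writing any files.
```

Typed as /migrate users "add soft-delete column", $0 becomes "users" and $1 the change description.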

The interesting pattern is composition: a command can load a skill. A /cook command might reference a broader engineering methodology skill, pulling in its detailed references only when you explicitly invoke the workflow. The command is the entry point; the skill provides the depth.

Project-level commands (.claude/commands/) live alongside your code. User-level commands (~/.claude/commands/) follow you across projects. Both become available as slash commands in every session. Custom commands have been merged into the skills system — a file at .claude/commands/review.md and a skill at .claude/skills/review/SKILL.md both create /review and work the same way. Existing command files keep working, but skills are the recommended path forward since they support reference directories and auto-invocation.

MCP Servers: The Connectivity Layer

Model Context Protocol servers bridge Claude to external systems. Without MCP, Claude can only work with files on disk and shell commands. With MCP, it can query databases, call APIs, interact with SaaS tools, and read from data sources in real time.

The architecture is important: MCP is a protocol (the specification), and MCP servers are the implementations. Claude speaks one standard protocol; each server translates that into whatever the external system needs. This decouples Claude's intelligence from your data sources.

In practice, you configure MCP servers in your settings and Claude gets new tools automatically:

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": {
        "DATABASE_URL": "postgresql://localhost:5432/mydb"
      }
    }
  }
}
```

Claude now has tools like mcp__postgres__query available. It can read your schema, write queries, and return results without you writing any glue code.

The real value is composability. An MCP server for GitHub plus an MCP server for your issue tracker means Claude can read a bug report, check the relevant code, write a fix, and open a PR, all through standardized tool calls. I wrote about the security considerations of production MCP servers in a previous post, and those concerns haven't gotten less relevant as adoption has grown.

Subagents: Isolated Execution

Subagents are where Claude Code starts behaving less like a chatbot and more like an orchestration system. A subagent is a separate Claude instance with its own context window, system prompt, tool permissions, and model selection. The main conversation spawns it, it does work independently, and it returns a summary.

Claude Code ships with three built-in subagents:

  • Explore runs on Haiku with read-only tools. Fast codebase searches that don't pollute your main context.
  • Plan researches codebases during plan mode. Same model as your conversation, read-only.
  • General-purpose gets all tools and the conversation's model. For complex multi-step work.

Custom subagents are where it gets powerful. Create a markdown file in .claude/agents/:

```markdown
---
name: code-reviewer
description: Reviews code for quality, security, and best practices.
  Use proactively after code changes.
tools: Read, Grep, Glob, Bash
model: sonnet
---

You are a senior code reviewer. When invoked:
1. Run git diff to see recent changes
2. Focus on modified files
3. Provide feedback organized by priority:
   critical (must fix), warnings (should fix), suggestions
```

The description field drives automatic delegation. Include "use proactively" and Claude will spawn this subagent after code changes without you asking. The tools field restricts what the subagent can do. A reviewer that can only read is safer than one that can also edit.

Subagents can preload skills, giving them domain expertise without discovery overhead:

```yaml
---
name: api-developer
description: Implement API endpoints following team conventions
skills:
  - api-conventions
  - error-handling-patterns
---
```

The full skill content gets injected at startup. The subagent doesn't need to discover and load skills during execution; it already knows your conventions.

The critical constraint: subagents cannot spawn other subagents. This prevents runaway recursion but means multi-level delegation requires chaining from the main conversation or using agent teams.

Hooks: The Deterministic Layer

Hooks deserve more attention than they get. While everything else is probabilistic (Claude decides what to do based on context), hooks are deterministic. They fire every time a specific event occurs and execute exactly the code you specify.

Configure them in settings.json:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "npx eslint --fix \"$(jq -r '.tool_input.file_path')\""
          }
        ]
      }
    ]
  }
}
```

This runs ESLint on every file Claude writes or edits. Not "usually." Not "when Claude remembers." Every single time.

The event system is extensive: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop, SubagentStart, SubagentStop, and more. PreToolUse hooks can block operations. A hook that rejects rm -rf commands will prevent Claude from running them regardless of what the prompt says.
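What such a guard could look like as a command hook script — a sketch assuming the convention that a PreToolUse command hook blocks the tool call by exiting with code 2; the payload shape and pattern list are illustrative:

```typescript
// Hypothetical PreToolUse guard. Claude Code pipes the hook payload as
// JSON on stdin; this returns the exit code the script should use,
// where 2 blocks the tool call and 0 allows it.
function guard(raw: string): number {
  const payload = JSON.parse(raw) as { tool_input?: { command?: string } };
  const cmd = payload.tool_input?.command ?? "";
  if (/\brm\s+-rf\b/.test(cmd)) {
    console.error("Blocked: destructive rm -rf detected");
    return 2; // block the tool call
  }
  return 0; // allow
}

// In the real hook: process.exit(guard(readFileSync(0, "utf8")))
console.log(guard('{"tool_input":{"command":"rm -rf node_modules"}}'));
// prints 2
```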

Three hook types serve different needs:

Command hooks run shell scripts. Fast, predictable, scriptable. Your linter, your test runner, your deployment validator.

Prompt hooks run a single-turn LLM evaluation. Ask a fast model "should this tool call be allowed?" and get a JSON yes/no decision. More flexible than regex matching, cheaper than a full subagent.

Agent hooks spawn a subagent that can use tools to verify conditions. "Run the test suite and check if all tests pass before allowing Claude to stop." This is your quality gate with teeth.

The Stop hook deserves special attention. It fires when Claude is about to finish responding. Return {"decision": "block", "reason": "Tests not passing"} and Claude continues working instead of stopping. Combine this with an agent hook that actually runs your tests, and you get an AI developer that can't claim "done" until the code works.
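A sketch of such a Stop-hook script, with a stand-in `node -e` call simulating a failing test suite in place of a real runner like `npm test` (an assumption for illustration):

```typescript
// Hypothetical Stop-hook script: run the test command and emit the
// block decision when it fails, so Claude keeps working.
import { spawnSync } from "node:child_process";

function stopDecision(exitCode: number): Record<string, unknown> {
  return exitCode === 0
    ? {} // empty decision: Claude may stop
    : { decision: "block", reason: "Tests not passing" };
}

const tests = spawnSync("node", ["-e", "process.exit(1)"]); // simulated failing suite
console.log(JSON.stringify(stopDecision(tests.status ?? 1)));
// prints {"decision":"block","reason":"Tests not passing"}
```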

Agent Teams: Multi-Session Orchestration

Agent teams are the newest capability, still experimental. Where subagents work within a single session, agent teams coordinate multiple independent Claude Code sessions.

One session leads, assigning tasks and synthesizing results. Teammates work independently with their own context windows and can communicate directly with each other, not just back to the lead. This matters for large tasks where a single context window isn't enough.

Some examples of where this shines: parallel research where teammates investigate different aspects of a problem simultaneously, cross-layer changes where frontend, backend, and test changes each get their own teammate, and debugging with competing hypotheses where teammates test different theories in parallel.

Agent teams require explicit opt-in and have real limitations around session resumption and shutdown. Other multi-agent tools exist, but most trade architectural security for capability — prompt-level guardrails instead of OS-level sandboxing, open skill marketplaces with minimal vetting, instances exposed to the public internet by default. Agent teams inherit Claude Code's permission model and sandbox, which means parallel execution without opening new attack surface.

Add to settings.json:

```json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}
```

Agent SDK: The Programmatic Interface

Everything above works through Claude Code's interactive CLI. The Agent SDK (@anthropic-ai/claude-agent-sdk for TypeScript, claude-agent-sdk for Python) exposes the same engine programmatically. You call a query() function, pass a prompt and options, and consume a stream of messages as Claude autonomously reads files, runs commands, and edits code.

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Find and fix the failing test in auth.py",
  options: {
    allowedTools: ["Read", "Edit", "Bash"],
    maxTurns: 10,
    maxBudgetUsd: 0.50,
  }
})) {
  if (message.type === "result") {
    console.log(`Cost: $${message.total_cost_usd}`);
  }
}
```

The critical design decision: the SDK loads nothing from the filesystem by default. No CLAUDE.md, no settings, no skills. This ensures predictable, reproducible behavior in CI/CD and production environments. You opt into project context explicitly when you need it.

Where the CLI is for interactive development, the SDK is for automation: code review bots that run on every PR, refactoring pipelines, research agents, or custom development tools built on Claude Code's agent loop. It supports programmatic subagents, in-process MCP servers, session management, and budget controls — the full capability set, accessible from code.

How the Layers Compose

Each layer solves one problem. The real architecture emerges from composition.

A practical example: building a new API endpoint in a monorepo. CLAUDE.md holds the project structure, naming conventions, and test commands so Claude knows the codebase on every session. A skill (/api-scaffold) contains your team's patterns for request validation, error handling, and response formatting. An MCP server connects Claude to your staging database so it can inspect the actual schema. A code-reviewer subagent runs automatically after changes, checking for security issues and style violations in its own context. A PostToolUse hook runs your linter after every file edit. Each layer handles one concern; none of them could replace the others.

The mental model:

| Layer | Solves | Persistence | Activation |
|---|---|---|---|
| CLAUDE.md | "What does Claude need to know every session?" | Permanent | Always loaded |
| Auto Memory | "What has Claude learned over time?" | Permanent | First 200 lines auto-loaded |
| Rules | "What instructions apply to specific files?" | Permanent | On path match or always |
| Skills | "How should Claude do specific tasks?" | Permanent | On relevance match or /invoke |
| Custom Commands | "What workflows need a quick entry point?" | Permanent | On /command-name |
| MCP Servers | "What external systems can Claude access?" | Session | Always available |
| Subagents | "What work should run in isolation?" | Task | On delegation |
| Hooks | "What must happen deterministically?" | Permanent | On matching event |
| Agent Teams | "What work needs parallel sessions?" | Session | On explicit coordination |
| Agent SDK | "How do I use this from code?" | Permanent | On query() call |

The key is matching the right layer to the right problem. Workflow instructions belong in skills, not CLAUDE.md. File-scoped conventions belong in rules, not skills. Real-time data belongs in MCP, not skills. Verbose operations belong in subagents, not your main context. Things that must always happen belong in hooks, not prompts. And when you need to run the whole thing from code, that's the SDK.

Start with the Foundation

If you're not using Claude Code's extension system yet, start with CLAUDE.md. Get your project conventions, build commands, and architecture decisions documented. That alone will transform your sessions.

Then add one skill for your most repeated workflow. Then an MCP server for the external system you reference most. Then a hook for the quality check you always forget.

Each layer is independently valuable. Together, they turn Claude Code from an AI-powered text editor into a programmable development platform — one where your accumulated knowledge, workflows, and quality standards persist and compound across every session.