
Context Architecture: The Complete System

You learned HOW to create CLAUDE.md files, Skills, Subagents, and Hooks in Chapter 3. This lesson teaches WHY each exists and WHEN to use each one—as parts of a complete context management system.

Four Tools, Four Loading Patterns

Each tool has a different relationship with your context window:

| Tool | When It Loads | What Loads | Context Cost |
|---|---|---|---|
| CLAUDE.md | Session start | Full content | Every request |
| Skills | Descriptions at start; content when invoked | ~100 tokens per description; full content on use | Low until needed |
| Subagents | When spawned | Fresh, isolated context | Only the returned summary reaches the main session |
| Hooks | On trigger | Nothing (runs externally) | Zero unless the hook returns a message |

Understanding this table is understanding context architecture.

The Loading Timeline

At session start, Claude loads:

  1. System prompt (you don't control this)
  2. Your CLAUDE.md (full content)
  3. Skill descriptions (names and one-line summaries)
  4. MCP tool definitions (if any)
  5. Git status and workspace info

During the session, Claude loads:

  • Full skill content (when you invoke /skill-name or Claude decides the skill is relevant)
  • Subagent results (summaries returned, not full work)
  • Hook output (only if the hook returns messages)

What this means: CLAUDE.md consumes context from turn 1. Skills consume context only when needed. Subagents do their work in a separate context and add only their returned summary to yours. Hooks run outside the context entirely unless they return a message.
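
The loading timeline can be pictured as a ledger of what sits in the main context at each point. The sketch below is a toy model only, not how Claude Code is implemented: the class and function names are hypothetical, the token figures are illustrative, and the system prompt, MCP tool definitions, and workspace info are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class MainContext:
    """Toy ledger of what occupies the main session's context window."""

    entries: list = field(default_factory=list)  # (label, tokens) pairs

    def load(self, label: str, tokens: int) -> None:
        self.entries.append((label, tokens))

    @property
    def total(self) -> int:
        return sum(tokens for _, tokens in self.entries)


def start_session(claude_md_tokens: int, skill_description_tokens: int) -> MainContext:
    """Session start: CLAUDE.md and the skill descriptions load in full."""
    ctx = MainContext()
    ctx.load("CLAUDE.md", claude_md_tokens)
    ctx.load("skill descriptions", skill_description_tokens)
    return ctx


def invoke_skill(ctx: MainContext, name: str, content_tokens: int) -> None:
    """A skill's full content enters the context only when it is invoked."""
    ctx.load(f"skill:{name}", content_tokens)


def run_subagent(ctx: MainContext, work_tokens: int, summary_tokens: int) -> int:
    """The subagent spends work_tokens in its own context; only the summary returns."""
    ctx.load("subagent summary", summary_tokens)
    return work_tokens  # spent outside the main context


def run_hook(ctx: MainContext, message_tokens: int = 0) -> None:
    """A hook runs externally; it adds to the context only if it returns a message."""
    if message_tokens:
        ctx.load("hook message", message_tokens)


# Illustrative numbers only.
ctx = start_session(claude_md_tokens=400, skill_description_tokens=150)
print("after session start:", ctx.total)
invoke_skill(ctx, "competitor-analysis", 1_500)
print("after invoking one skill:", ctx.total)
outside = run_subagent(ctx, work_tokens=20_000, summary_tokens=300)
print("after subagent:", ctx.total, "(spent outside:", outside, ")")
run_hook(ctx)  # silent hook: no change
print("after silent hook:", ctx.total)
```

The point of the model is that only `load` calls grow the main context: the subagent's 20,000 tokens of work never pass through it.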

The Decision Framework

Use this framework to choose the right tool:

| Information Type | Best Tool | Why |
|---|---|---|
| Always needed, stable | CLAUDE.md | Pay the cost once, available everywhere |
| Sometimes needed, stable | Skill | On-demand loading saves context |
| Needs fresh analysis | Subagent | Isolated context prevents pollution |
| Must happen every time | Hook | Deterministic, no LLM variance |
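
The framework can be read as a small decision function. The sketch below is one possible encoding: the `Info` fields and the order in which the checks run are assumptions (the table does not rank the rows), and stability is not modeled because both stable rows differ only in how often the information is needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Info:
    """One piece of information or work, described by the table's criteria."""

    name: str
    must_happen_every_time: bool = False  # deterministic, no LLM judgment
    needs_fresh_analysis: bool = False    # research or an unbiased second look
    always_needed: bool = False           # relevant to every task


def choose_tool(info: Info) -> str:
    """Map an item to a tool; the precedence here is an assumption."""
    if info.must_happen_every_time:
        return "Hook"
    if info.needs_fresh_analysis:
        return "Subagent"
    if info.always_needed:
        return "CLAUDE.md"
    return "Skill"  # sometimes needed: load on demand


examples = [
    Info("pnpm, not npm", always_needed=True),
    Info("Code review checklist"),
    Info("Find all usages of deprecated API", needs_fresh_analysis=True),
    Info("Lint check after every file edit", must_happen_every_time=True),
]
for item in examples:
    print(f"{item.name}: {choose_tool(item)}")
```

Checking the Hook condition first reflects the idea that anything that must run without exception should not depend on the model choosing to do it.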

When to Use CLAUDE.md

Put information in CLAUDE.md when:

  • Claude needs it for EVERY task (project conventions, build commands)
  • It rarely changes (architectural decisions, team agreements)
  • Removing it would cause Claude to make mistakes

Examples:

  • pnpm, not npm (always relevant)
  • Run tests with pytest -v (needed whenever testing)
  • Use snake_case for Python, camelCase for JavaScript (affects all code)

When to Use Skills

Put information in Skills when:

  • Claude needs it SOMETIMES (domain-specific workflows)
  • It's substantial (more than a few lines)
  • You might invoke it manually (/skill-name)

Examples:

  • Code review checklist (only when reviewing)
  • Deployment procedures (only when deploying)
  • API documentation (only when integrating)

When to Use Subagents

Use Subagents when:

  • Work requires reading many files or extensive research
  • You need a fresh perspective without accumulated bias
  • The work should happen in parallel

Examples:

  • Research task: "Find all usages of deprecated API"
  • Analysis task: "Review security across all auth files"
  • Parallel work: Three agents tackle three modules simultaneously

When to Use Hooks

Use Hooks when:

  • Something must happen EVERY time, no exceptions
  • It's deterministic (no LLM judgment needed)
  • It should run externally without consuming context

Examples:

  • Lint check after every file edit
  • Format validation before every commit
  • Logging for audit purposes

Context Architecture in Practice

Example: A Marketing Consultant

A marketing consultant uses Claude Code for campaign analysis:

CLAUDE.md (~50 lines, always loaded):

# Project Context

- Client: TechStartup Inc.
- Brand voice: Professional but approachable
- Avoid: Industry jargon, corporate speak

# Workflow

- All reports in Markdown
- Include metrics with sources
- Weekly summary format: Executive → Details → Recommendations

Skills (loaded when relevant):

  • /competitor-analysis — Framework for analyzing competitor campaigns
  • /metrics-dashboard — Standard metrics definitions and benchmarks
  • /campaign-brief — Template for new campaign proposals

Subagent (isolated, returns summary):

  • Research agent scans 50 competitor social posts, returns "Top 5 patterns"
  • The main context never sees the 50 posts, only the summary

Hook (zero context cost):

  • After every report edit, hook validates required sections exist
  • Returns only pass/fail, so the validation logic itself never enters the context
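
The check such a hook runs could look roughly like the sketch below. It covers only the validation logic; how the script is registered as a hook and how it receives the edited file is not shown in this lesson and is left out. The section names come from the weekly summary format above, and treating each as a Markdown heading is an assumption, as are the function names.

```python
import re

# Section names taken from the weekly summary format in the example CLAUDE.md.
REQUIRED_SECTIONS = ("Executive", "Details", "Recommendations")

# Matches Markdown ATX headings such as "## Executive Summary".
HEADING = re.compile(r"^#{1,6}[ \t]+(.+)$", flags=re.MULTILINE)


def missing_sections(report_text: str, required=REQUIRED_SECTIONS) -> list:
    """Return the required section names that appear in no heading."""
    headings = [m.group(1).strip().lower() for m in HEADING.finditer(report_text)]
    return [name for name in required if not any(name.lower() in h for h in headings)]


def validate_report(report_text: str) -> tuple:
    """Return (passed, message); the message is all a hook would need to report."""
    missing = missing_sections(report_text)
    if missing:
        return False, "FAIL: missing " + ", ".join(missing)
    return True, "PASS"


report = """# Weekly Summary

## Executive Summary
Engagement rose this week.

## Details
Recommendations are mentioned here, but only in body text.
"""
print(validate_report(report))
```

Because the check is a plain string match over headings, it gives the same answer every time, which is the property that makes it a Hook candidate and not a task for Claude.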

The Math

Without architecture (everything in CLAUDE.md):

  • 500-line CLAUDE.md = ~4,000 tokens
  • Competitor analysis framework = ~1,500 tokens
  • Metrics definitions = ~1,000 tokens
  • Campaign template = ~800 tokens
  • Total baseline: ~7,300 tokens every request

With architecture:

  • 50-line CLAUDE.md = ~400 tokens (always)
  • 3 skill descriptions = ~150 tokens (always)
  • Skill content = ~3,300 tokens (only when invoked)
  • Research via subagent = 0 baseline tokens (only the returned summary enters the main context)
  • Total baseline: ~550 tokens every request

Result: roughly a 13x reduction in baseline context load (7,300 / 550 ≈ 13.3). The saved tokens go to your actual work instead of always-loaded content.
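
The arithmetic above can be checked with a few lines of Python. The token figures are the lesson's own estimates; the variable names are just labels for this sketch.

```python
# Everything in CLAUDE.md: all of this loads on every request.
without_architecture = {
    "500-line CLAUDE.md": 4_000,
    "Competitor analysis framework": 1_500,
    "Metrics definitions": 1_000,
    "Campaign template": 800,
}

# With architecture: only these two items are always loaded.
always_loaded = {
    "50-line CLAUDE.md": 400,
    "3 skill descriptions": 150,
}

# Skill content loads only when a skill is invoked.
on_demand_skill_content = 1_500 + 1_000 + 800

baseline_before = sum(without_architecture.values())
baseline_after = sum(always_loaded.values())
reduction = baseline_before / baseline_after

print(f"Baseline without architecture: {baseline_before} tokens")
print(f"Baseline with architecture:    {baseline_after} tokens")
print(f"Reduction:                     {reduction:.1f}x")
print(f"All three skills invoked:      {baseline_after + on_demand_skill_content} tokens")
```

Even in the worst case, where all three skills are invoked in one session, the load is 550 + 3,300 = 3,850 tokens, still well under the 7,300-token baseline of the all-in-one file.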

Common Architecture Mistakes

Mistake 1: Everything in CLAUDE.md

Symptom: CLAUDE.md runs to 300+ lines and Claude ignores important instructions

Problem: Attention diluted across content that's only sometimes relevant

Fix: Move domain-specific content to Skills, keep CLAUDE.md under 60 lines

Mistake 2: Never Using Subagents

Symptom: Context fills quickly during research tasks, quality degrades

Problem: All file reads and searches accumulate in main context

Fix: Delegate research to Subagents, receive summaries instead of raw data

Mistake 3: Skills for Everything

Symptom: Many skills exist but Claude rarely invokes them correctly

Problem: Skill descriptions don't clearly signal when to use them

Fix: Write clear descriptions, or set disable-model-invocation: true for manual-only skills

Mistake 4: Forgetting Hooks Exist

Symptom: Repetitive validation tasks consume LLM calls

Problem: Using Claude for deterministic checks it doesn't need to reason about

Fix: Move deterministic validations to Hooks, save context for actual reasoning
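
The four mistakes can be turned into a quick self-audit. The sketch below is a rough heuristic, not a tool that ships with Claude Code: the function name and inputs are hypothetical, the 60-line limit comes from the fix for Mistake 1, and treating an empty description as "unclear" is an assumption that catches only the most obvious case.

```python
def audit_setup(claude_md_lines: int, skill_descriptions: dict,
                uses_subagents: bool, has_hooks: bool) -> list:
    """Return one finding per architecture mistake detected in a setup."""
    findings = []
    if claude_md_lines >= 60:
        findings.append(
            f"CLAUDE.md has {claude_md_lines} lines: move domain-specific content to Skills"
        )
    if not uses_subagents:
        findings.append("No Subagents: delegate research and receive summaries")
    for name, description in skill_descriptions.items():
        if not description.strip():
            findings.append(f"Skill '{name}' has no description signaling when to use it")
    if not has_hooks:
        findings.append("No Hooks: move deterministic validations out of the LLM")
    return findings


# Hypothetical setup for illustration.
for finding in audit_setup(
    claude_md_lines=400,
    skill_descriptions={"deploy": "", "code-review": "Use when reviewing a pull request"},
    uses_subagents=False,
    has_hooks=False,
):
    print("-", finding)
```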

Lab: Map Your Context Architecture

Objective: Design a context architecture for your project or domain.

Step 1: Inventory Your Information

List everything Claude needs to know for your work:

# Information Inventory

## Always Needed

- [List items Claude needs every single time]

## Sometimes Needed

- [List domain-specific workflows, templates, procedures]

## Research-Heavy

- [List tasks requiring extensive file reading or analysis]

## Deterministic Checks

- [List validations that don't require reasoning]

Step 2: Apply the Framework

For each item, assign the appropriate tool:

| Information | Tool | Rationale |
|---|---|---|
| [Item 1] | CLAUDE.md / Skill / Subagent / Hook | [Why this tool] |
| [Item 2] | ... | ... |

Step 3: Calculate the Cost

Estimate token impact:

# Context Cost Analysis

## Without Architecture

- All content in CLAUDE.md: ~[X] tokens every request

## With Architecture

- CLAUDE.md baseline: ~[Y] tokens
- Skill descriptions: ~[Z] tokens
- Average skill invocation: ~[W] tokens (only when needed)

## Savings

- Baseline reduction: [percentage]
- Context available for work: [additional tokens]
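
To fill in the template, a rough estimator is enough. The sketch below assumes about 8 tokens per line, the ratio implied by this lesson's own figures (500 lines ≈ 4,000 tokens, 50 lines ≈ 400 tokens); real files will vary, and the function names and sample inputs are hypothetical.

```python
TOKENS_PER_LINE = 8  # implied by the lesson's figures; an approximation only


def estimate_tokens(lines: int, tokens_per_line: int = TOKENS_PER_LINE) -> int:
    """Rough token estimate for a file with the given number of lines."""
    return lines * tokens_per_line


def cost_analysis(current_lines: int, trimmed_lines: int,
                  description_tokens: int, avg_skill_tokens: int) -> dict:
    """Compute the X, Y, Z, W values and savings for the template above."""
    if current_lines <= 0:
        raise ValueError("current_lines must be positive")
    x = estimate_tokens(current_lines)   # all content in CLAUDE.md
    y = estimate_tokens(trimmed_lines)   # trimmed CLAUDE.md baseline
    baseline_after = y + description_tokens
    return {
        "X": x,
        "Y": y,
        "Z": description_tokens,
        "W": avg_skill_tokens,           # paid only when a skill is invoked
        "baseline_reduction_pct": round(100 * (x - baseline_after) / x, 1),
        "context_freed_tokens": x - baseline_after,
    }


# Hypothetical inputs: a 400-line file trimmed to 55 lines plus skill descriptions.
result = cost_analysis(current_lines=400, trimmed_lines=55,
                       description_tokens=200, avg_skill_tokens=900)
for key, value in result.items():
    print(f"{key}: {value}")
```

For a real measurement, replace the per-line estimate with a count from an actual tokenizer; the structure of the calculation stays the same.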

Step 4: Implement One Piece

Choose the highest-impact change and implement it:

  • Move one section from CLAUDE.md to a Skill, OR
  • Create one Subagent for research tasks, OR
  • Add one Hook for deterministic validation

What You Learned

  1. Four tools have four loading patterns — CLAUDE.md always loads, Skills load on-demand, Subagents use isolated context, Hooks run externally
  2. The decision framework maps information type to appropriate tool — always-needed → CLAUDE.md, sometimes-needed → Skill, fresh-analysis → Subagent, deterministic → Hook
  3. Context architecture dramatically reduces baseline load — 10x+ reduction is achievable by distributing information appropriately
  4. Common mistakes include overloading CLAUDE.md, avoiding Subagents, unclear skill descriptions, and forgetting Hooks

Try With AI

Prompt 1: Architecture Audit

Review my current context setup:
- CLAUDE.md has [X] lines
- I have [Y] skills
- I never use subagents
- I have no hooks

Analyze where I'm wasting context.
What should I move to Skills?
What should become Subagent tasks?
What deterministic checks could be Hooks?

What you're learning: Identifying architecture inefficiencies in your own setup.

Prompt 2: Design Challenge

I'm a [your profession] working on [your project type].
My recurring tasks are:
1. [Task 1]
2. [Task 2]
3. [Task 3]

Design a context architecture:
- What goes in CLAUDE.md (under 60 lines)?
- What Skills should I create?
- What Subagent patterns would help?
- What Hooks would reduce context waste?

What you're learning: Applying the framework to your actual work.

Prompt 3: Migration Plan

I have a 400-line CLAUDE.md that I need to refactor.
Here's the current content: [paste content]

Create a migration plan:
1. What stays in CLAUDE.md? (under 60 lines)
2. What becomes Skills? (list with descriptions)
3. What moves to Subagent patterns?
4. What becomes Hooks?

Include rationale for each decision.

What you're learning: Practical migration from overloaded setup to proper architecture.