Chapter Quiz: Claude Agent SDK Mastery
You've now learned all eight unique features that differentiate Claude Agent SDK from other agent development frameworks. This quiz tests your understanding of those features and your ability to apply them in production scenarios.
The quiz has 20 questions spanning Bloom's taxonomy, from Remember (foundational knowledge) through Evaluate (judgment about design choices). Each question maps to specific SDK features or production patterns covered in this chapter.
How to use this quiz:
- Read each question carefully
- Think through your answer before revealing the solution
- Review the explanation even if you answered correctly—it reinforces learning
- Track your score using the scoring guide below
- Use results to identify areas for deeper review
Scoring Guide
| Score Range | Assessment | Recommended Action |
|---|---|---|
| 18-20 (90-100%) | Master | Ready to build production Digital FTEs; consider advanced optimization patterns |
| 15-17 (75-89%) | Proficient | Ready for production with senior review; revisit 1-2 concepts for deepening |
| 12-14 (60-74%) | Developing | Return to lessons covering missed concepts; practice with simple agents first |
| 9-11 (45-59%) | Emerging | Work through lessons sequentially; focus on Layer 2 collaboration patterns before Layer 4 production |
| Below 9 (under 45%) | Foundation Building | Start with Lesson 1 and build systematically; ensure each feature is understood before moving forward |
Questions
Question 1 (Remember)
Which tool allows you to capture and restore file state during agent execution?
Show answer
Answer: rewindFiles()
Explanation: File checkpointing with rewindFiles() is a unique Claude Agent SDK feature that allows you to restore files to a previous checkpoint. OpenAI SDK and Google ADK lack this capability, making it essential for building resilient agents that can safely explore multiple approaches.
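A minimal sketch of the recovery flow (the snake_case method name rewind_files and the placeholder checkpoint ID are illustrative, not confirmed API):

```python
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

async def explore_safely():
    # Checkpointing must be enabled for recovery to work (see Question 5)
    options = ClaudeAgentOptions(enable_file_checkpointing=True)
    async with ClaudeSDKClient(options) as client:
        await client.query("Refactor the payment module")
        checkpoint_id = "checkpoint-uuid"  # illustrative: captured after the good state
        await client.query("Try a riskier optimization")
        # If the second attempt breaks things, restore the earlier file state
        await client.rewind_files(checkpoint_id)
```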
Chapter Reference: Lesson 8 - File Checkpointing
Question 2 (Remember)
What is the name of the Claude Agent SDK's built-in permission system that lets you decide whether to allow or deny tool execution at runtime?
Show answer
Answer: canUseTool() callback
Explanation: canUseTool() is a runtime permission decision function that evaluates each tool call and returns true/false to allow or deny execution. This is unique to Claude Agent SDK and provides fine-grained control over agent actions, preventing unauthorized operations even when the agent requests them.
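At its simplest, the callback receives each proposed tool call and returns a verdict; a minimal read-only sketch (fuller policies appear in Question 11):

```python
def canUseTool(tool_name: str, tool_input: dict) -> bool:
    # Read-only policy: permit Read, deny every other tool
    return tool_name == "Read"
```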
Chapter Reference: Lesson 4 - Permission Modes and Security
Question 3 (Remember)
What file path pattern does Claude Agent SDK use to load reusable Agent Skills?
Show answer
Answer: .claude/skills/*.md (SKILL.md files from filesystem)
Explanation: Claude Agent SDK automatically loads SKILL.md files from the .claude/skills/ directory through the settingSources parameter. This filesystem-based skills loading is unique to Claude Agent SDK and enables agents to access accumulated organizational knowledge from your project's skill library.
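A wiring sketch mirroring the setting_sources usage shown later in Question 17 (the path format is illustrative):

```python
from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions()
# Illustrative: point the agent at a skill in the project's library
options.setting_sources = ["/project/.claude/skills/security.md"]
```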
Chapter Reference: Lesson 5 - Agent Skills and Code
Question 4 (Remember)
What do slash commands in Claude Agent SDK represent?
Show answer
Answer: Custom command patterns defined in .claude/commands/*.md files
Explanation: Slash commands (like /debug, /test, /review) are custom commands you define in .claude/commands/ that agents can recognize and execute. This pattern is unique to Claude Agent SDK and allows you to create domain-specific workflows that agents can invoke, building reusable command patterns for your team.
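For illustration, a minimal command file (a hypothetical /test command; the frontmatter fields mirror Question 13's fuller example):

```markdown
---
command: /test
category: development
description: "Run the test suite and summarize failures"
---
Run all tests, then report each failing test with its probable root cause.
```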
Chapter Reference: Lesson 6 - Custom Slash Commands
Question 5 (Remember)
Which ClaudeAgentOptions parameter must be set to True to enable file checkpointing?
Show answer
Answer: enable_file_checkpointing=True
Explanation: The enable_file_checkpointing parameter in ClaudeAgentOptions activates checkpoint tracking for all file operations. Without this enabled, the agent's file modifications are not captured as checkpoints, preventing recovery operations.
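In context, that looks like (minimal sketch):

```python
from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions(
    enable_file_checkpointing=True  # every file operation now records a checkpoint
)
```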
Chapter Reference: Lesson 8 - File Checkpointing
Question 6 (Understand)
Explain the difference between ClaudeSDKClient and the query() function in Claude Agent SDK.
Show answer
Answer:
- ClaudeSDKClient is a streaming client that maintains connection state, allows multiple sequential queries, and provides access to lifecycle hooks and checkpoint recovery
- query() is a convenience function that runs a single query with default options and returns results
The choice depends on whether you need persistent state (ClaudeSDKClient) or simple one-off execution (query()).
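A side-by-side sketch (prompt strings are illustrative):

```python
from claude_agent_sdk import query, ClaudeSDKClient, ClaudeAgentOptions

async def one_off():
    # query(): single execution, default options, no persistent state
    async for message in query(prompt="Summarize README.md"):
        print(message)

async def stateful():
    # ClaudeSDKClient: one session, multiple queries, shared context
    async with ClaudeSDKClient(ClaudeAgentOptions()) as client:
        await client.query("Analyze the codebase")
        async for msg in client.receive_response():
            print(msg)
        await client.query("Now propose a refactor")  # same session state
```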
Chapter Reference: Lesson 12 - SDK Client and Streaming
Question 7 (Understand)
How do Agent Skills differ from generic AI prompts or system instructions?
Show answer
Answer: Agent Skills are structured SKILL.md files with:
- Persona - Expert identity and reasoning stance
- Logic - Decision trees for when/how to apply the skill
- Context - Prerequisites and setup requirements
- MCP - Tool integrations specific to the skill
- Data - Patterns and knowledge the skill encodes
- Safety - Guardrails and what to avoid
Generic prompts are text instructions without this structure. Skills are executable knowledge that agents can load, reason about, and apply systematically.
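A skeletal SKILL.md showing those sections (all content here is illustrative):

```markdown
---
name: security-review
description: "Apply security engineering judgment when reviewing code"
---

## Persona
You are a security engineer. Reason about trust boundaries before style.

## Logic
1. Identify inputs that cross a trust boundary
2. Verify validation and authorization at each boundary

## Safety
Never recommend disabling authentication to simplify a fix.
```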
Chapter Reference: Lesson 5 - Agent Skills and Code
Question 8 (Understand)
Why is canUseTool() different from simple allowlist/blocklist permission systems?
Show answer
Answer: Allowlists/blocklists make simple yes/no decisions: "Agent can use Read tool, cannot use Bash."
canUseTool() makes context-aware decisions:
- "Agent can use Bash to run tests, but only in /test/ directory, not /production/"
- "Agent can use Edit on .md files, but not on .json config files"
- "Agent can only use cost-tracking tools if total_cost_usd < $10"
This runtime evaluation of each specific tool call enables security policies that simple allowlists cannot express.
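A sketch of those context-aware checks (directory names and patterns are illustrative):

```python
def canUseTool(tool_name: str, tool_input: dict) -> bool:
    if tool_name == "Bash":
        command = tool_input.get("command", "")
        # Same tool, different verdicts: test directory yes, production no
        return command.startswith("pytest /test/") and "/production/" not in command
    if tool_name == "Edit":
        # Markdown yes, JSON config no
        return tool_input.get("file_path", "").endswith(".md")
    return tool_name == "Read"
```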
Chapter Reference: Lesson 4 - Permission Modes and Security
Question 9 (Understand)
What are lifecycle hooks and what problems do they solve in agent systems?
Show answer
Answer: Lifecycle hooks are callbacks that fire at key moments during agent execution:
- onSessionStart() - Initialize state when session begins
- onMessageReceived() - React to incoming messages
- onToolCall() - Intercept tool calls before execution
- onToolResult() - Process tool results before sending to agent
- onSessionEnd() - Cleanup when session completes
They solve:
- Stateful operations - Initialize databases, load context
- Monitoring - Track execution without modifying agent code
- Validation - Inspect tool calls/results before processing
- Integration - Connect to external systems (logging, analytics)
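A registration sketch, assuming the keyword-argument style used for on_tool_result in Question 15 (the hook parameter names here are illustrative):

```python
from claude_agent_sdk import ClaudeAgentOptions

def on_session_start(session_id: str) -> None:
    print(f"Session {session_id} started")  # e.g., open DB connections here

def on_tool_call(tool_name: str, tool_input: dict) -> None:
    print(f"Tool requested: {tool_name}")  # monitoring without touching agent logic

options = ClaudeAgentOptions(
    on_session_start=on_session_start,
    on_tool_call=on_tool_call,
)
```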
Chapter Reference: Lesson 10 - Lifecycle Hooks
Question 10 (Understand)
How does context compaction solve the long-running agent problem?
Show answer
Answer: Long-running agents accumulate message history, causing:
- Increasing context window usage (more tokens, higher cost)
- Slower response times (more messages to process)
- Memory pressure on the system
Context compaction solves this by:
- Summarizing old message history into a concise recap
- Replacing verbose message chains with summaries
- Keeping recent messages (last 5-10) for immediate context
- Allowing agents to continue for days/weeks without context overflow
This enables persistent agents that handle thousands of interactions.
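A configuration sketch, using the same parameters that appear in Question 20's example:

```python
from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions(
    enable_context_compaction=True,
    max_context_before_compaction=50000,  # summarize history past this token count
)
```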
Chapter Reference: Lesson 7 - Session Management and Context
Question 11 (Apply)
You're building an agent that reads customer data files. Design a canUseTool() policy that prevents the agent from accidentally deleting customer data while still allowing legitimate read/edit operations.
Show answer
Answer:
```python
def canUseTool(tool_name: str, tool_input: dict) -> bool:
    if tool_name == "Read":
        # Allow reading customer files
        return True
    if tool_name == "Edit":
        # Allow editing only .csv and .txt files (not .py or .sql)
        file_path = tool_input.get("file_path", "")
        return file_path.endswith((".csv", ".txt"))
    if tool_name == "Bash":
        # Allow safe read commands, block all delete/remove commands
        command = tool_input.get("command", "")
        dangerous_patterns = ["rm ", "delete", "truncate", "drop"]
        return not any(pattern in command for pattern in dangerous_patterns)
    # Block all other tools
    return False
```
Why this works:
- Read is always safe, so allow it
- Edit restricted to data formats, not code
- Bash allowed only for safe commands, blocking dangerous patterns
- Other tools blocked by default
Chapter Reference: Lesson 4 - Permission Modes and Security
Question 12 (Apply)
Your agent is tracking API call costs. Write the configuration that enables cost tracking in ClaudeAgentOptions and explain how you'd use the cost data.
Show answer
Answer:
```python
from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient

options = ClaudeAgentOptions(
    model="claude-opus-4-5-20251101",  # Required for cost tracking
    # Cost tracking is automatic in ClaudeSDKClient
)

async with ClaudeSDKClient(options) as client:
    await client.query("Your task here")
    async for message in client.receive_response():
        if hasattr(message, 'total_cost_usd'):
            print(f"Cost so far: ${message.total_cost_usd:.4f}")
            # Stop if costs exceed budget
            if message.total_cost_usd > 5.00:
                break
```
How to use cost data:
- Budget enforcement - Stop execution when costs exceed threshold
- Economic modeling - Calculate unit economics for Digital FTE pricing
- Optimization - Identify expensive operations to optimize
- Customer billing - Charge customers based on actual agent costs
Chapter Reference: Lesson 13 - Cost Tracking and Billing
Question 13 (Apply)
You need to implement a custom slash command /debug that agents can invoke to run debugging. Write the command definition file (.claude/commands/debug.md) for this command.
Show answer
Answer:
Create .claude/commands/debug.md:
```markdown
---
command: /debug
category: development
description: "Run debugging for the current codebase"
---

# /debug Command

## Persona
You are a debugging expert. Think like a systems diagnostician:
identify root causes, not just symptoms.

## When to Use
- Exception traces appear
- Tests fail mysteriously
- Performance issues occur
- Integration fails unexpectedly

## Logic
1. Examine error messages for root cause signals
2. Check related code sections for logical errors
3. Run targeted tests to isolate the problem
4. Propose minimal fix (not refactoring)

## Safe Operations
- Read error logs
- Run test suite
- Inspect configuration
- NOT modify production code without approval

## Example Usage
Agent receives: `/debug Exception in auth service`

Agent:
1. Reads auth service code
2. Runs auth tests to reproduce
3. Identifies root cause
4. Proposes minimal fix
```
Why this works:
- Defines when command applies (persona + logic)
- Constrains dangerous operations (safety)
- Gives examples agents can learn from
- Makes debugging reproducible across sessions
Chapter Reference: Lesson 6 - Custom Slash Commands
Question 14 (Apply)
You're implementing session forking to explore two different approaches. Write pseudocode showing how to fork a session, pursue an alternative approach, and optionally merge findings back.
Show answer
Answer:
```python
import asyncio
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions

async def session_forking_pattern():
    # Main session
    main_options = ClaudeAgentOptions(
        allowed_tools=["Read", "Edit", "Bash"],
        enable_file_checkpointing=True
    )
    async with ClaudeSDKClient(main_options) as main_session:
        # Do initial work
        await main_session.query("Analyze the codebase")

        # Get checkpoint ID after analysis
        analysis_checkpoint = "checkpoint-uuid-here"

        # Fork a parallel session for the alternative approach
        fork_options = ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Bash"],
            enable_file_checkpointing=True
        )
        async with ClaudeSDKClient(fork_options) as fork_session:
            # Fork explores a different implementation
            await fork_session.query("Try refactoring with async/await")
            fork_results = []
            async for msg in fork_session.receive_response():
                fork_results.append(msg)

        # Back in the main session - either keep the fork's results or revert
        fork_results_better = evaluate(fork_results)  # pseudocode: your comparison logic
        if fork_results_better:
            # Keep fork results
            print("Fork results adopted")
        else:
            # Revert to main approach
            await main_session.rewind_files(analysis_checkpoint)
            print("Reverted to main approach")
```
When to use:
- Testing multiple implementation strategies
- Risk mitigation (try risky approach in fork, keep main stable)
- Performance comparison (measure both approaches)
- Architectural decision making
Chapter Reference: Lesson 7 - Session Management and Context
Question 15 (Apply)
Design a callback function for the onToolResult() lifecycle hook that validates agent outputs before processing them. What problems does this solve?
Show answer
Answer:
```python
from claude_agent_sdk import ClaudeAgentOptions

def on_tool_result(tool_name: str, result: dict) -> dict:
    """Validate and sanitize tool results before returning to agent."""
    if tool_name == "Bash":
        # Validate command output
        output = result.get("stdout", "")
        # Check for error signals
        if "error" in output.lower() or result.get("exit_code") != 0:
            return {
                "output": output,
                "validation_warning": "Command output indicates an error or non-zero exit code"
            }
        # Truncate massive outputs that could inflate token usage
        if len(output) > 10000:
            return {
                "output": output[:10000] + "\n[... truncated ...]",
                "validation_warning": "Output truncated to prevent context overflow"
            }
    if tool_name == "Read":
        # Validate file contents
        content = result.get("content", "")
        if len(content) > 50000:  # Very large file
            return {
                "content": content[:50000] + "\n[... file truncated ...]",
                "validation_warning": "Large file truncated"
            }
    return result

options = ClaudeAgentOptions(
    on_tool_result=on_tool_result
)
```
Problems this solves:
- Token overflow - Truncate results that would bloat context
- Error detection - Alert agent to failed operations
- Data validation - Ensure results meet expected format
- Sanitization - Remove sensitive data before sending to agent
- Performance - Short-circuit expensive operations based on validation
Chapter Reference: Lesson 10 - Lifecycle Hooks
Question 16 (Analyze)
Compare the three permission modes (default, acceptEdits, bypassPermissions). For each, identify a scenario where it's appropriate and a risk if misused.
Show answer
Answer:
| Mode | When Appropriate | Risk if Misused |
|---|---|---|
| Default (canUseTool() every call) | Building agents that will access production systems or user data | Slowdown from repeated permission checks; bottleneck if checks are expensive |
| acceptEdits (human pre-approves, agent executes) | Collaborative workflows where human wants to see plan before execution; code review agents | Human approves without fully reading; agent executes wrong instructions anyway |
| bypassPermissions (no checks, execute directly) | Ephemeral agents in development/testing; fully isolated environments | Agent inadvertently breaks things; security breach if credentials exposed; no audit trail |
Architectural principle:
- Production → Default (every action checked)
- Collaborative → acceptEdits (human reviews plan)
- Ephemeral/Test → bypassPermissions (speed over safety)
Red flag: Using bypassPermissions in production means agent actions are unguarded.
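A sketch of selecting a mode per environment (the permission_mode parameter and string values follow the three modes above):

```python
from claude_agent_sdk import ClaudeAgentOptions

MODE_BY_ENV = {
    "production": "default",         # every call checked via canUseTool()
    "collaborative": "acceptEdits",  # human reviews the plan first
    "test": "bypassPermissions",     # isolated sandboxes only
}

options = ClaudeAgentOptions(permission_mode=MODE_BY_ENV["production"])
```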
Chapter Reference: Lesson 4 - Permission Modes and Security
Question 17 (Analyze)
Describe the relationship between Agent Skills, Slash Commands, and Subagents. How do they differ and when would you use each?
Show answer
Answer:
| Component | What It Is | Use When | Example |
|---|---|---|---|
| Agent Skill | Reusable knowledge + decision logic | You want primary agent to know domain patterns | Security skill: "Think like security engineer when reviewing code" |
| Slash Command | Specialized workflow agent can invoke | You want to trigger domain-specific operations | /debug - run debugging workflow |
| Subagent | Autonomous agent managing single responsibility | You need independent decision-making on a task | Code review subagent that evaluates pull requests autonomously |
How they compose:
```
Main Agent
├── Loads Skills (knows patterns)
├── Invokes Slash Commands (delegates to workflows)
└── Manages Subagents (delegates to specialists)
```
Decision framework:
- Knowledge needed? → Create Skill
- Workflow to trigger? → Create Slash Command
- Independent authority? → Create Subagent
Real example:
```python
from claude_agent_sdk import AgentDefinition, Task

# 1. Main agent loads the security skill
options.setting_sources = ["/project/.claude/skills/security.md"]

# 2. Main agent can invoke a specialized workflow:
#    receiving "/debug-security" triggers that slash command

# 3. Main agent delegates to a code review subagent
review_subagent = AgentDefinition(
    name="code-reviewer",
    role="evaluate pull requests"
)
task = Task(subagent=review_subagent, goal="review PR #123")
```
Chapter Reference: Lessons 5, 6, 9 - Skills, Commands, Subagents
Question 18 (Analyze)
An agent is building a microservice with 8 files. Explain how file checkpointing helps you handle the following scenarios:
- Agent makes good progress but then introduces a critical bug
- You want to compare two different architectural approaches
- You need to audit what changes the agent made
Show answer
Answer:
Scenario 1: Critical bug partway through
Without checkpointing: Manually revert 8 files one by one, re-run first part of work
With checkpointing:
```python
# After initial analysis (good state)
good_checkpoint = "uuid-after-analysis"

# Agent works, encounters bug after refactoring 4 files
# You notice the bug
await client.rewindFiles(good_checkpoint)  # Back to good state

# Agent continues with different approach
```
Scenario 2: Compare architectures
Without checkpointing: Try architecture A, manually revert, try architecture B
With checkpointing:
```python
checkpoint_before = await client.getCurrentCheckpoint()

# Agent tries monolithic approach (Architecture A)
await client.query("Build monolithic auth system")

# Check results, decide to compare
await client.rewindFiles(checkpoint_before)

# Agent tries microservices approach (Architecture B)
await client.query("Build microservices auth system")

# Compare both approaches' results
```
Scenario 3: Audit trail
Checkpointing provides:
- Timestamp - when each checkpoint was created
- File state - exact content at each checkpoint
- Sequence - progression of changes
- Recovery points - where agent succeeded/failed
```python
# Query checkpoint history
history = await client.getCheckpointHistory()
for checkpoint in history:
    print(f"{checkpoint.id}: {checkpoint.timestamp}")
    print(f"  Files changed: {checkpoint.file_list}")
    print(f"  Agent state: {checkpoint.agent_status}")
```
Why this matters: Checkpointing enables safe exploration without fear of permanent damage.
Chapter Reference: Lesson 8 - File Checkpointing
Question 19 (Evaluate)
You're designing a Digital FTE that manages sensitive customer data. Choose between three architectural approaches:
- Default permissions with a canUseTool() callback
- acceptEdits mode with human approval at each step
- Subagent approach with limited tool access
Justify your choice considering: security, throughput, user experience, and auditability.
Show answer
Answer:
Recommended: Hybrid Approach (Default + Subagent separation)
Use Default permissions with canUseTool() for data access, delegating sensitive operations to restricted subagents:
```python
from claude_agent_sdk import ClaudeAgentOptions, AgentDefinition, Task

# Main agent: read-only access, every call vetted
main_options = ClaudeAgentOptions(
    allowed_tools=["Read"],         # Main agent can only read
    can_use_tool=strict_canUseTool  # policy callback defined elsewhere
)

# Sensitive operations delegated to a restricted subagent
delete_subagent = AgentDefinition(
    name="data-deletion",
    allowed_tools=["Edit", "Bash"],  # Subagent can modify
    description="Handles data deletion with audit logging"
)

# Main agent delegates to the subagent for sensitive ops
task = Task(subagent=delete_subagent, goal="Delete customer #123 data")
```
Comparison:
| Approach | Security | Throughput | UX | Auditability |
|---|---|---|---|---|
| Default + canUseTool() | High (per-call checks) | Medium (checks add latency) | Poor (no human oversight) | Good (all actions logged) |
| acceptEdits | Highest (human reviews) | Low (human bottleneck) | Good (humans see plans) | Best (human + system audit) |
| Subagent isolation | Very High (compartmentalization) | High (parallel execution) | Fair (less transparency) | Excellent (subagent logs) |
Why hybrid works best:
- Security: Compartmentalization (subagents can't access data they don't need)
- Throughput: No human bottleneck; parallel subagent execution
- Auditability: Subagent logs + main agent logs create audit trail
- User experience: Humans see high-level summaries, not every decision
Trade-off: Less transparency than acceptEdits, but acceptable for automated workflows
Chapter Reference: Lessons 4, 9 - Permission Modes, Subagents
Question 20 (Evaluate)
You need to decide between deploying your agent as: (a) ephemeral short-lived process, (b) long-running service with context compaction, or (c) session-fork architecture with persistent state. Evaluate based on: cost, latency, reliability, and state management.
Show answer
Answer:
Decision Framework:
| Pattern | Best For | Cost | Latency | Reliability | State |
|---|---|---|---|---|---|
| Ephemeral | One-off tasks, low volume, development | Lowest (spin up, run, shut down) | High (cold starts) | Medium (stateless, simple) | None (fresh start each time) |
| Long-Running + Compaction | Persistent service, high volume, 24/7 | Medium (always running) | Low (warm connection) | High (recovery from failures) | Built-in (context summary) |
| Session Forking | Exploratory work, A/B testing, learning | Higher (multiple sessions) | Medium (parallel branches) | Highest (multiple paths) | Complex (branch tracking) |
Recommendation by Use Case:
Use Ephemeral if:
- Task runs once/day or less
- Each invocation is independent
- Example: Scheduled data export, nightly report generation
- Pricing model: Pay-per-execution
Use Long-Running + Compaction if:
- Service is always available (24/7)
- High request volume throughout day
- State persists across requests (user context, preferences)
- Example: Customer support agent, personal assistant
- Pricing model: Subscription ($199/month)
Use Session Forking if:
- You need to explore multiple solutions
- Cost of human review justifies complexity
- Testing architectural decisions
- Example: Code architecture exploration, feature design
- Pricing model: Enterprise hourly billing
Real Example - Customer Support Agent:
```python
# Use long-running + compaction
options = ClaudeAgentOptions(
    enable_context_compaction=True,
    max_context_before_compaction=50000,
    persistence_backend="redis"  # Persist state across requests
)

# Agent handles 100+ customer requests per day
# Context compaction prevents token overflow
# State retained for multi-turn conversations
```
Cost Analysis:
If handling 100 requests/day:
- Ephemeral: 100 cold starts = overhead
- Long-running: Single warm connection = efficient
- Forking: Only for exploration (add 20% overhead for research phase)
Conclusion: Choose based on access pattern:
- Sparse (few times/day) → Ephemeral
- Dense (constant requests) → Long-running
- Exploratory (testing) → Forking during development, then switch to production pattern
Chapter Reference: Lessons 7, 14 - Session Management, Production Patterns
What's Next
If you scored 18-20: Congratulations! You have mastery-level understanding of Claude Agent SDK. You're ready to:
- Design and implement production Digital FTEs
- Make sophisticated architecture decisions
- Mentor others on SDK patterns
Consider exploring advanced topics:
- Multi-model orchestration (routing tasks to different models)
- Custom MCP server integration
- Advanced permission policies for team workflows
If you scored 15-17: Solid proficiency. You can build and deploy agents, but deepen your understanding of 1-2 features. Consider:
- Picking one SDK feature you scored lower on
- Building a small project using that feature
- Reviewing the corresponding lesson again with fresh eyes
If you scored 12-14: Good foundation, but more practice needed. Build a simple agent that uses 3-4 SDK features:
- Start with Read, Edit, Bash tools
- Add one permission mode (canUseTool)
- Add one lifecycle hook
- Complete a working project before moving to complex patterns
If you scored below 12: Return to foundational lessons and build incrementally. The SDK has eight features; try mastering one at a time:
- Start with basic query() and built-in tools
- Add canUseTool() for permission control
- Add Agent Skills for reusable knowledge
- Add lifecycle hooks for observability
- Add file checkpointing for recovery
- Continue with remaining features
Whatever your score, hands-on practice is what turns conceptual understanding into working knowledge.
Ready to build? Create a small Digital FTE that serves a specific domain. Use this quiz as a reference for which features to include. Start simple, then add complexity as you build confidence.