
Teams, CI/CD & Advanced Configuration Exercises

You understand CLAUDE.md hierarchies, path-scoped rules, custom skills with frontmatter, plan mode, iterative refinement, CI/CD pipelines, multi-pass review, and session management. That is eight lessons of configuration architecture and workflow design. But configuring and debugging are different skills, and both need practice.

These exercises close the gap between "I understand team configuration" and "I can build and debug a complete team infrastructure." Each exercise gives you a realistic engineering scenario. Three skills run through every exercise: configuration engineering (building hierarchies that scope correctly), pipeline design (automating Claude Code in CI/CD), and review architecture (designing multi-pass workflows that catch what single passes miss).

Download Exercise Files

Download Teams & CI/CD Exercises (ZIP)

After downloading, unzip the file. Each exercise has its own folder with an INSTRUCTIONS.md and any starter files you need.

If the download link doesn't work, visit the repository releases page directly.


How to Use These Exercises

Start each module after completing the corresponding lessons:

| After Lesson... | Do Module... |
|---|---|
| L01-L02: Configuration + Path Rules | Module 1: Configuration Architecture |
| L03-L04: Skills + Plan Mode | Module 2: Skills & Execution Strategy |
| L05: Iterative Refinement | Module 3: Iterative Refinement |
| L06: CI/CD Pipelines | Module 4: CI/CD Pipeline |
| L07: Multi-Pass Review | Module 5: Review Architecture |
| L08: Session Management | Module 6: Session Mastery |
| All of the above | Module 7: Integration Capstones |

The workflow for every exercise:

  1. Open the exercise folder from the claude-code-teams-cicd-exercises/ directory
  2. Read the INSTRUCTIONS.md inside the folder for setup steps and starter files
  3. Read the walkthrough below for context on what you are practicing and why
  4. Start Claude Code and point it at the exercise folder
  5. Work through the exercise using your own prompts, not just copying the starters
  6. Reflect on the results using the questions at the end of each exercise

The Configuration Engineering Framework

Before touching any exercise, internalize this six-step process. Every configuration task, whether building from scratch or debugging a broken setup, follows the same cycle:

  1. Identify Scope: Who needs this instruction? One user, the whole project, a specific directory, or only files matching a pattern?
  2. Choose Mechanism: CLAUDE.md, .claude/rules/, .claude/skills/, or settings.json? Each has different loading behavior and persistence.
  3. Implement: Write the configuration files in the correct locations with correct syntax.
  4. Verify: Use /memory or -p to confirm which files Claude actually loaded.
  5. Test Boundaries: Edit both matching and non-matching files to check that scoping works as intended.
  6. Document: Explain your reasoning so teammates understand why each file exists and where it lives.

This framework applies to Modules 1 and 2 directly, and its diagnostic steps (Verify, Test Boundaries) are relevant in every module that follows.
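As a concrete sketch of steps 1-2, a path-scoped rule file pairs YAML frontmatter (which controls when the file loads) with markdown instructions (which Claude reads). The glob patterns and rule text below are illustrative, not taken from any exercise:

```markdown
---
# .claude/rules/testing.md -- loads only when a matching file is edited
paths:
  - "**/*.test.*"
  - "**/*.spec.*"
---

# Test Conventions

- Name test files after the module under test.
- Every bug fix gets a regression test before the fix merges.
```

Steps 4-5 then check this from the other side: `/memory` confirms the file loaded, and editing one matching and one non-matching file confirms the glob scopes correctly.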


Assessment Rubric

After completing an exercise, evaluate your work on five criteria using a 1-4 scale:

| Criteria | Beginner (1) | Developing (2) | Proficient (3) | Advanced (4) |
|---|---|---|---|---|
| Configuration Accuracy | Settings don't take effect | Settings work but wrong scope | Correct scope and precedence | Optimized for team with docs |
| Tool Selection | Uses wrong tool for task | Correct tool, wrong config | Appropriate tool, correct config | Combines tools into workflows |
| Diagnostic Skill | Cannot identify issues | Finds issues, wrong root cause | Correct diagnosis using /memory | Proactively prevents issues |
| Pipeline Quality | Pipeline doesn't run | Runs but wrong output | Correct structured output | Production-ready, cost-optimized |
| Review Design | Single-pass review only | Multiple passes, no isolation | Per-file + cross-file with isolation | Confidence-calibrated routing |

Scoring targets:

  • Module exercises (1.1 through 6.2): 12/20 or higher per exercise
  • Capstones (A, B, C): 15/20 or higher for integration mastery

Record your scores after each exercise. If you score below target, re-read the corresponding lesson and the reflection questions for improvement ideas.


Module 1: Configuration Architecture (L01-L02)

Core Skill: Building CLAUDE.md hierarchies, path-scoped rules, and verifying configuration loading


Exercise 1.1 -- Build a Team Monorepo Configuration (Build)

The Problem: Open the module-1-configuration-architecture/exercise-1.1-monorepo-hierarchy/ folder. You are the lead engineer for a team of three developers working in a monorepo with three packages: frontend/ (React), backend/ (Python/FastAPI), and shared/ (TypeScript utilities). Each package has different coding conventions, but the team shares common standards for commit messages, PR descriptions, and documentation. Currently, all instructions live in a single 400-line CLAUDE.md that every developer complains about.

Your Task: Build a complete CLAUDE.md hierarchy for this monorepo. Create the project-level CLAUDE.md with shared standards, directory-level CLAUDE.md files for each package, and at least two path-scoped .claude/rules/ files for cross-cutting concerns (like test conventions that apply to all *.test.* files regardless of package). Verify everything loads correctly using /memory. Test that path-scoped rules activate only when editing matching files.

What You'll Learn:

  • How the three-level hierarchy (user, project, directory) organizes instructions by scope
  • That path-scoped rules with glob patterns handle cross-cutting concerns better than directory-level files
  • How /memory confirms which configuration files Claude actually loaded into context

Starter Prompt:

"Set up a monorepo configuration for our three packages."

Better Prompt (Build Toward This): "Create a CLAUDE.md hierarchy for a 3-package monorepo. The project-level CLAUDE.md should contain shared standards: commit message format (conventional commits), PR description template, and documentation style. Create directory-level CLAUDE.md files in frontend/ (React hooks, functional components, CSS modules), backend/ (FastAPI patterns, Pydantic models, async/await), and shared/ (strict TypeScript, barrel exports). Add .claude/rules/testing.md with paths matching all test files across packages. Add .claude/rules/security.md with paths matching all API route files. After creating everything, run /memory to verify all files load correctly."
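One plausible layout for the finished hierarchy (the package names come from the scenario; the exact file set and comments are up to you):

```
repo/
├── CLAUDE.md                 # shared: commit format, PR template, docs style
├── .claude/
│   └── rules/
│       ├── testing.md        # paths: all *.test.* files across packages
│       └── security.md       # paths: API route files only
├── frontend/
│   └── CLAUDE.md             # React hooks, functional components, CSS modules
├── backend/
│   └── CLAUDE.md             # FastAPI patterns, Pydantic models, async/await
└── shared/
    └── CLAUDE.md             # strict TypeScript, barrel exports
```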

Reflection Questions:

  1. When you ran /memory, did all configuration files appear? If any were missing, what was the cause?
  2. Did the path-scoped testing rules load when you edited a test file in frontend/ but not when you edited a component file? How did you verify this?
  3. If a fourth package (mobile/) were added next month, how many configuration files would you need to create or update?


Exercise 1.2 -- Debug a Broken Configuration (Debug)

The Problem: Open the module-1-configuration-architecture/exercise-1.2-debug-broken-config/ folder. A teammate set up the project's Claude Code configuration, but developers report two problems: (1) Python linting rules are not loading when editing backend files, and (2) the security rules meant only for API routes are loading for every file in the project. The folder contains five configuration files. Two of them have bugs.

Your Task: Trace the configuration precedence chain to identify both root causes. The first bug is a broken glob pattern in a .claude/rules/ file that never matches backend Python files. The second bug is a missing paths field in the security rules file, causing it to load unconditionally instead of only for API routes. Fix both issues and verify your fixes with /memory.

What You'll Learn:

  • How to diagnose configuration loading issues by tracing the precedence chain from user to project to directory to path-scoped
  • That a missing paths field in a .claude/rules/ file makes it load for every session, not just matching files
  • That glob pattern syntax errors silently prevent rules from loading, producing no error messages
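The two bug classes can be pictured side by side. These are annotated fragments for illustration, not the actual exercise files or one valid YAML document:

```yaml
# Bug 1: a glob that never matches backend Python files
# (hypothetical broken pattern -- no recursive wildcard)
paths:
  - "backend/*.py"       # matches only the top level of backend/
# Fix:
paths:
  - "backend/**/*.py"    # matches Python files at any depth

# Bug 2: security rules with no paths field load for every session.
# Fix: scope them to API route files, e.g.:
paths:
  - "**/routes/**/*.py"
```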

Reflection Questions:

  1. How did you discover which glob pattern was broken? Did /memory help, or did you need to inspect the YAML frontmatter directly?
  2. What is the difference between a rules file with no paths field (always loads) and one with paths: ["**/*"] (also always loads)? When would you use each?
  3. If you had ten .claude/rules/ files and one was silently failing to load, what systematic debugging approach would you follow?

Module 2: Skills & Execution Strategy (L03-L04)

Core Skill: Creating skills with frontmatter and choosing the right execution mode for each task


Exercise 2.1 -- Create Skills with Frontmatter (Build)

The Problem: Open the module-2-skills-execution-strategy/exercise-2.1-skills-with-frontmatter/ folder. Your team needs three custom skills, each requiring a different frontmatter configuration: an architecture explorer that reads the entire codebase without polluting the main conversation, a security auditor that must never modify files, and a test generator that accepts a target directory as a parameter.

Your Task: Build all three skills in .claude/skills/ with correct SKILL.md files. The architecture explorer needs context: fork so its verbose file reads stay isolated. The security auditor needs allowed-tools: [Read, Grep, Glob] to enforce read-only access. The test generator needs argument-hint: "path to the directory containing source files" so users know what parameter to provide. Test each skill by invoking it and verifying the frontmatter behavior works as expected.

What You'll Learn:

  • How context: fork isolates verbose operations in a sub-agent that returns only a summary to the main conversation
  • How allowed-tools enforces hard restrictions on which tools a skill can use, creating genuinely read-only workflows
  • How argument-hint improves the developer experience by communicating expected parameters without enforcing them

Starter Prompt:

"Create three skills for our team."

Better Prompt (Build Toward This): "Create three skills in .claude/skills/. First: architecture-explorer/SKILL.md with context: fork, description explaining it maps codebase structure, and instructions to read all source directories and produce an architecture summary. Second: safe-reviewer/SKILL.md with allowed-tools: [Read, Grep, Glob], description explaining it performs read-only security audits, and instructions to scan for common vulnerability patterns. Third: test-generator/SKILL.md with argument-hint: 'path to the directory containing source files', description explaining it generates tests for a given directory, and instructions to read source files and create corresponding test files. After creating all three, invoke each one to verify the frontmatter behavior."
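A sketch of one of the three files, the read-only auditor. The frontmatter keys are the ones named in the exercise; the description and instruction wording are illustrative:

```markdown
---
name: safe-reviewer
description: Performs a read-only security audit of the codebase.
allowed-tools: [Read, Grep, Glob]
---

Scan the codebase for common vulnerability patterns: SQL built by string
concatenation, secrets committed to source, and unvalidated input at API
boundaries. Report each finding with file, line, and severity.
Do not modify any files.
```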

Reflection Questions:

  1. When you invoked the architecture explorer, did the main conversation receive a summary or the full verbose output? How could you tell the fork worked?
  2. Try making the security auditor write a file. What happened? Does the allowed-tools restriction produce an error message or silently prevent the operation?
  3. When you invoked the test generator without providing the argument-hint parameter, what happened? Is the hint enforced or advisory?


Exercise 2.2 -- Execution Mode Classifier (Measure)

The Problem: Open the module-2-skills-execution-strategy/exercise-2.2-execution-mode-classifier/ folder. You have ten task scenarios ranging from simple bug fixes to complex architectural migrations. Each scenario needs the right execution mode: plan mode, direct execution, or an "explore then execute" combination. Choosing wrong wastes time (plan mode for trivial fixes) or causes rework (direct execution for complex migrations).

Your Task: Read each scenario and classify it into the appropriate execution mode. Write your classification and reasoning into the provided decision-tree template. After classifying all ten, compare your answers against the expert classification in the answers/ subfolder. For any disagreements, document why you chose differently and whether you would change your answer.

What You'll Learn:

  • The three decision factors that determine execution mode: scope clarity, architectural risk, and number of valid approaches
  • That "explore then execute" (using the Explore subagent followed by direct execution) is often the best choice for medium-complexity tasks
  • That the cost of choosing the wrong mode is asymmetric: plan mode on a trivial task wastes minutes, but direct execution on a complex task can waste hours

Reflection Questions:

  1. How many of your ten classifications matched the expert answers? For the ones that differed, was your reasoning defensible or did you underestimate complexity?
  2. Which decision factor was hardest to assess: scope clarity, architectural risk, or number of valid approaches?
  3. Write a one-sentence heuristic that captures your personal rule for choosing between plan mode and direct execution.

Module 3: Iterative Refinement (L05)

Core Skill: Applying I/O examples, test-driven iteration, and the interview pattern to get precise results


Exercise 3.1 -- Build a Data Transformer (Apply)

The Problem: Open the module-3-iterative-refinement/exercise-3.1-data-transformer/ folder. You have a messy CSV export from a legacy system. Dates appear in five different formats (MM/DD/YYYY, DD-Mon-YY, YYYY.MM.DD, Unix timestamps, and ISO 8601). Phone numbers mix country codes, parentheses, dashes, and spaces. Currency values use both comma and period decimal separators. Your task is to normalize everything into a clean, consistent format.

Your Task: Write 2-3 concrete I/O examples for each messy field showing the exact input and expected output. Provide these to Claude and have it build the transformation function. Then write a test suite covering edge cases (null values, empty strings, malformed entries). Iterate by sharing test failures with Claude until all tests pass. The exercise combines I/O examples (for specification clarity) with test-driven iteration (for edge case coverage).

What You'll Learn:

  • That I/O examples eliminate ambiguity far more effectively than prose descriptions like "normalize the dates"
  • That test-driven iteration creates an automated feedback loop: fail, fix, rerun, repeat
  • That combining I/O examples with tests is more powerful than either technique alone; examples specify the common cases, tests catch the edges

Reflection Questions:

  1. How many iterations did it take to pass all tests? Were most failures from edge cases you anticipated or from cases you missed?
  2. Compare the function Claude built from your I/O examples alone (before tests) versus the final version after test-driven iteration. What changed?
  3. If you had skipped the I/O examples and gone straight to tests, would the result have been different? What value did the examples add?


Exercise 3.2 -- Technique Comparison (Measure)

The Problem: Open the module-3-iterative-refinement/exercise-3.2-technique-comparison/ folder. You have three different problems, each suited to a different refinement technique. Problem A is a string transformation with clear input/output patterns. Problem B is a complex validation function with dozens of edge cases. Problem C is a cache invalidation strategy in an unfamiliar domain.

Your Task: Apply a different primary technique to each problem: I/O examples for Problem A, test-driven iteration for Problem B, and the interview pattern for Problem C. After completing all three, document which technique worked best for each problem and why. Then try applying the "wrong" technique to one problem (for example, I/O examples on Problem C) and observe how the results differ.

What You'll Learn:

  • Each refinement technique has a sweet spot: I/O examples for transformations, test-driven iteration for edge-case-heavy logic, interview pattern for unfamiliar domains
  • Applying the wrong technique does not necessarily fail, but it produces weaker results that require more rework
  • Building a personal decision framework for choosing techniques saves time on every future task

Reflection Questions:

  1. Which technique produced the best first-attempt result? Which required the most iterations before you were satisfied?
  2. When you applied the "wrong" technique, what specifically went worse? Was the output incorrect, or just less efficient to reach?
  3. Write a three-row decision table: "When I see [problem type], I start with [technique], because [reason]."

Module 4: CI/CD Pipeline (L06)

Core Skill: Building automated Claude Code pipelines with structured output and duplicate avoidance


Exercise 4.1 -- Build a GitHub Actions Workflow (Build)

The Problem: Open the module-4-cicd-pipeline/exercise-4.1-github-actions-workflow/ folder. Your team wants automated code review on every pull request. The review should produce structured JSON output conforming to a schema (file, line, severity, message, suggestion), avoid posting duplicate comments when the pipeline re-runs after a fix, and restrict Claude to read-only tools so the review cannot modify code.

Your Task: First, test your review prompt locally using claude -p to verify it produces correct output. Then build the complete GitHub Actions workflow file. Include the -p flag for non-interactive mode, --output-format json with --json-schema for structured output, and --allowedTools to restrict to read-only operations. Add duplicate avoidance by feeding previous review comments back into subsequent runs. The exercise folder includes a sample PR diff and expected output format.

What You'll Learn:

  • That the -p flag is the single most important detail for CI/CD integration; without it, the pipeline hangs waiting for interactive input
  • That --output-format json combined with --json-schema produces machine-parseable output that downstream tools can process reliably
  • That duplicate comment avoidance requires feeding prior findings back as context, not just checking for matching text
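A sketch of the core workflow pieces, using the flags named above. This is illustrative only: the checkout action is standard GitHub Actions, but the exact argument forms of the Claude flags (whether `--json-schema` takes a file path, how `--allowedTools` lists tools) are assumptions to verify against the CLI help before relying on them:

```yaml
# Illustrative fragment -- adapt names and setup to your repository
on: pull_request          # not push: review the PR diff, not every commit

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Claude Code review
        run: |
          claude -p "Review the changed files in this PR for bugs and security issues" \
            --output-format json \
            --json-schema review-schema.json \
            --allowedTools "Read,Grep,Glob" > findings.json
```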

Reflection Questions:

  1. What happened when you first tested your prompt with claude -p? Did the output match the expected JSON schema on the first try, or did you need to iterate?
  2. How does your duplicate avoidance strategy handle the case where a developer partially fixes an issue? The original comment may no longer apply exactly, but the underlying problem persists.
  3. If this workflow runs on 50 PRs per day, what is the approximate monthly cost? How would you reduce it?


Exercise 4.2 -- Debug a Broken Pipeline (Debug)

The Problem: Open the module-4-cicd-pipeline/exercise-4.2-debug-broken-pipeline/ folder. You have three GitHub Actions workflow files, each with bugs. Workflow A hangs indefinitely (missing -p flag). Workflow B produces unstructured text instead of JSON (wrong output format flags). Workflow C posts duplicate comments on every re-run (no duplicate avoidance), uses the wrong trigger event (runs on push instead of pull_request), and grants Claude write access to the codebase (missing --allowedTools restriction). Five bugs total across three files.

Your Task: Diagnose each bug by reading the workflow files and tracing the execution flow. For each bug, document: what the symptom would be, what the root cause is, and how to fix it. Then apply all five fixes and verify the corrected workflows would function correctly.

What You'll Learn:

  • That the -p flag omission is the most common CI/CD bug, and the symptom (pipeline hangs) does not clearly indicate the cause
  • That output format issues produce silent failures: the pipeline "succeeds" but downstream tools cannot parse the output
  • That duplicate avoidance, trigger configuration, and tool restrictions are operational concerns that are easy to overlook during initial setup

Reflection Questions:

  1. If you saw a pipeline hanging in a CI log with no error message, how quickly would you identify the missing -p flag? What other causes could produce the same symptom?
  2. Workflow C had three bugs in one file. In a real team, how would you prevent this kind of bug accumulation? Would a review checklist help?
  3. The certification exam includes a question about the -p flag. After debugging Workflow A, do you feel confident you could answer it without hesitation?

Module 5: Review Architecture (L07)

Core Skill: Designing multi-pass review workflows with session isolation and specific criteria


Exercise 5.1 -- Build a Multi-Pass Review Script (Build)

The Problem: Open the module-5-review-architecture/exercise-5.1-multi-pass-review-script/ folder. You have a simulated pull request with 8 changed files. The files contain deliberate bugs: a SQL injection vulnerability in a query builder, a null dereference in an error handler, a broken interface contract between two services, inconsistent error response formats across three API endpoints, and several less severe issues. A single-pass review consistently misses at least two of these.

Your Task: Write a shell script that implements a multi-pass review architecture. The first pass runs per-file local analysis on each of the 8 files individually, checking for bugs, security issues, and code quality. The second pass runs a cross-file integration analysis examining data flow between files, API contract consistency, and pattern uniformity. Both passes should produce structured JSON output with confidence scores. Aggregate the findings into a single report that distinguishes per-file issues from cross-file issues.

What You'll Learn:

  • That per-file passes catch local bugs (null dereferences, SQL injection) with consistent depth across all files
  • That cross-file passes catch integration issues (broken contracts, inconsistent patterns) that per-file analysis cannot see
  • That confidence scores enable routing: high-confidence findings can be auto-posted, while low-confidence findings should go to human review
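The routing idea in the last bullet fits in a few lines. This Python sketch assumes each pass emits its findings as a JSON array of objects with a `confidence` field between 0 and 1; the field names and threshold are illustrative, not mandated by the exercise:

```python
import json

AUTO_POST_THRESHOLD = 0.8  # illustrative cutoff

def route_findings(per_file_json: str, cross_file_json: str):
    """Merge both passes, tag each finding with its source pass,
    and split into auto-postable vs. needs-human-review buckets."""
    findings = []
    for source, raw in (("per-file", per_file_json), ("cross-file", cross_file_json)):
        for f in json.loads(raw):
            findings.append({**f, "pass": source})
    auto = [f for f in findings if f["confidence"] >= AUTO_POST_THRESHOLD]
    manual = [f for f in findings if f["confidence"] < AUTO_POST_THRESHOLD]
    return auto, manual

per_file = '[{"file": "query.py", "line": 42, "severity": "high", "message": "SQL injection", "confidence": 0.95}]'
cross_file = '[{"file": "api.py", "line": 10, "severity": "medium", "message": "inconsistent error format", "confidence": 0.6}]'
auto, manual = route_findings(per_file, cross_file)
assert [f["message"] for f in auto] == ["SQL injection"]
assert manual[0]["pass"] == "cross-file"
```

High-confidence findings go straight to PR comments; everything else lands in a digest for a human reviewer.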

Reflection Questions:

  1. How many of the deliberate bugs did your per-file pass find? How many required the cross-file pass? Were any missed by both?
  2. Did any per-file finding get contradicted by the cross-file pass (something that looked like a bug in isolation but was correct in context)?
  3. If you added a third pass focused specifically on security, would it catch anything the first two passes missed?


Exercise 5.2 -- Self-Review vs Independent Review (Measure)

The Problem: Open the module-5-review-architecture/exercise-5.2-self-vs-independent-review/ folder. You have a code module with 5 known bugs that range from obvious (syntax error) to subtle (race condition). The exercise asks you to generate code and review it in two different ways, then compare results.

Your Task: Phase 1: In Session A, ask Claude to generate an authentication module. Then, in the same session, ask Claude to review the generated code for security issues. Record every finding. Phase 2: Open a fresh Session B (a completely new Claude instance with no prior context). Provide the same generated code and ask for the same security review. Record every finding. Compare the two sets of findings and quantify the gap.

What You'll Learn:

  • That self-review (reviewing in the same session that generated the code) consistently finds fewer issues because the session retains its reasoning context
  • That independent review (a fresh session) approaches code without the generator's biases and catches issues the generator would rationalize
  • That the self-review gap is not a flaw to work around; it is a fundamental property of session context that the multi-pass architecture exploits

Reflection Questions:

  1. How many findings did Session A (self-review) produce versus Session B (independent review)? What types of bugs did the self-review miss?
  2. Did Session A flag any issues that Session B missed, or was the independent review strictly better?
  3. Based on this experiment, would you ever trust a self-review in a production workflow? Under what conditions?

Module 6: Session Mastery (L08)

Core Skill: Managing long-running investigations with resume, fork, inform, and compact


Exercise 6.1 -- Multi-Day Investigation Workflow (Apply)

The Problem: Open the module-6-session-mastery/exercise-6.1-multi-day-investigation/ folder. You are investigating a simulated performance regression in a web application. The investigation spans multiple sessions over several "days" (simulated by the exercise). Between sessions, teammates make changes to the codebase that affect your investigation.

Your Task: Practice the full session management lifecycle. Day 1: Start a named session (claude --resume perf-investigation), investigate the performance issue, identify candidate root causes, and exit. Day 2: Resume the session, inform it about teammate changes since yesterday, continue the investigation, and use fork to explore two competing hypotheses in parallel. Day 3: Resume, evaluate both forked results, use /compact to reclaim context space while preserving key findings, and write your final report.

What You'll Learn:

  • That named sessions survive terminal closure and preserve full conversation context across days
  • That informing a resumed session about external changes prevents stale analysis based on outdated file reads
  • That /compact with targeted instructions ("preserve the root cause candidates and performance measurements") reclaims context without losing critical findings
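The three-day lifecycle, compressed into the commands involved. The slash commands and inform/fork steps are shown as comments because they happen inside the Claude Code session, not at the shell:

```shell
# Day 1: start (or later resume) a named session
claude --resume perf-investigation
#   ...investigate, record candidate root causes, exit

# Day 2: resume with full Day 1 context, then inside the session:
claude --resume perf-investigation
#   inform: "Since yesterday, the team refactored the cache layer..."
#   fork the session to chase two hypotheses in parallel

# Day 3: resume, compare the forks, then reclaim context:
claude --resume perf-investigation
#   /compact preserve the root cause candidates and performance measurements
```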

Reflection Questions:

  1. When you resumed on Day 2, did the session remember your Day 1 findings without prompting? How detailed was its recall?
  2. After forking to explore two hypotheses, did the forked sessions diverge significantly? Was one hypothesis clearly better, or did both produce useful insights?
  3. After compacting on Day 3, what information was lost? Did the custom preservation instructions keep everything you needed?


Exercise 6.2 -- Debug Stale Context (Debug)

The Problem: Open the module-6-session-mastery/exercise-6.2-debug-stale-context/ folder. You have a session transcript showing a developer who resumed a session after a two-day absence. During those two days, a teammate refactored the database module extensively. The transcript shows the resumed session making recommendations based on the old file structure, suggesting changes to functions that no longer exist, and missing new patterns introduced by the refactor.

Your Task: Read the session transcript alongside the "before" and "after" versions of the database module. Identify every instance of stale context: recommendations referencing deleted functions, assumptions about file structure that no longer hold, and analysis based on outdated code. For each instance, determine whether inform (telling the session about changes) or a fresh start (new session with a written summary) would be the better remedy. Write a recommendation document explaining your diagnosis.

What You'll Learn:

  • That stale context produces confident but wrong recommendations, because the session has no way to know its cached file reads are outdated
  • That inform works well for targeted changes (one module refactored) but a fresh start is better when changes are pervasive
  • That the decision between resume and fresh start depends on how much of the session's prior analysis is still valid

Reflection Questions:

  1. How many instances of stale context did you find in the transcript? Were any subtle enough that a developer might not notice them?
  2. For the stale references you found, would informing the session about the refactor fix all of them? Or would some require re-reading files that the session already "knows" about?
  3. Write a personal rule: "I should start fresh instead of resuming when [condition]."

Module 7: Integration Capstones

Choose one or more. These combine configuration, skills, CI/CD, review, and session management with no guided prompts.

Capstones are different from the exercises above. There are no starter prompts or better prompts. You design the entire approach yourself. Each project requires integrating concepts from multiple modules into a complete working system.


Capstone A -- Full Team Infrastructure (Integration)

Open the module-7-capstones/capstone-A-full-team-infrastructure/ folder. You are setting up Claude Code infrastructure for a 4-package monorepo: a React frontend, a Python API, a Go microservice, and a shared protobuf definitions package. Build the complete infrastructure:

  • CLAUDE.md hierarchy with project-level shared standards and directory-level package conventions
  • At least 6 path-scoped .claude/rules/ files covering testing, security, API design, documentation, error handling, and logging
  • 3 custom skills: a codebase explorer (context: fork), a security scanner (allowed-tools: read-only), and a test generator (argument-hint)
  • A CI review pipeline using GitHub Actions with -p, structured JSON output, and duplicate avoidance
  • A multi-pass review script with per-file and cross-file passes

All components must work together. Verify the complete system using the checklist provided in the exercise folder.

What You'll Learn:

  • How all the configuration mechanisms (CLAUDE.md, rules, skills, CI, review) compose into a cohesive team infrastructure
  • That building the configuration hierarchy first and the CI pipeline second prevents rework
  • That a complete infrastructure is more than the sum of its parts: skills reference rules, CI uses skills, review builds on CI output


Capstone B -- Production CI Pipeline (Real-world)

Open the module-7-capstones/capstone-B-production-ci-pipeline/ folder. You have a realistic project with 15 files (8 Python, 7 TypeScript) and a set of requirements: automated review on every PR, test generation for uncovered code, structured JSON output for both, duplicate comment avoidance, and a cost budget of $5/day for a team averaging 10 PRs/day.

Build a production-ready CI pipeline that meets all requirements within the cost constraint. Include a CLAUDE.md that provides CI context for test generation. Track the approximate token usage of each workflow step. Document cost optimization decisions: which files to skip, when to use batch API versus real-time, and how to minimize redundant analysis.

What You'll Learn:

  • That production pipelines must balance thoroughness with cost, and the cost constraint forces real engineering tradeoffs
  • That batch API (50% cheaper, up to 24 hours latency) is appropriate for non-blocking workflows like nightly reports, while real-time API is necessary for blocking PR checks
  • That providing existing test files as context prevents duplicate test suggestions, which is both a quality and cost optimization


Capstone C -- Audit Your Setup (Personal)

Open the module-7-capstones/capstone-C-audit-your-setup/ folder for a self-assessment template. Audit your own Claude Code configuration across four dimensions:

  1. Configuration: Do you have a CLAUDE.md? Path-scoped rules? Are instructions in the right scope?
  2. Skills: Do you have custom skills for recurring tasks? Do they use appropriate frontmatter (fork, allowed-tools, argument-hint)?
  3. CI/CD: Are you using Claude Code in any pipelines? If not, identify one workflow that would benefit.
  4. Session Management: Do you use named sessions for multi-day work? Do you inform resumed sessions about changes?

For each dimension, score yourself 1-4 using the assessment rubric. Identify one concrete improvement per dimension. Implement at least one improvement during the exercise.

What Makes This Special: This is the only exercise where the output is changes to your real working environment, not a practice project. The improvement you implement here pays dividends in every future Claude Code session.


What's Next

You have practiced configuration engineering, pipeline design, and review architecture across 15 exercises. These skills compound: every exercise builds intuition for how configuration scopes, how pipelines should be structured, and where multi-pass review catches issues single-pass misses. Next in Lesson 10 is the Chapter Quiz, where you test your mastery of all Chapter 18 concepts for the certification exam.