Chapter 69: Multi-Agent Reliability — Errors, Escalation, Provenance & Quality

A multi-agent system is only as reliable as its weakest error path. This chapter teaches the five disciplines of multi-agent reliability: structured error propagation, escalation calibration, context management at scale, information provenance, and human review integration. These patterns separate prototype-grade systems from production-grade ones.

What You'll Learn

By the end of this chapter, you'll be able to:

  • Design structured error responses with the MCP isError pattern and error category taxonomy
  • Implement subagent error recovery with partial results propagation and coordinator decision-making
  • Calibrate escalation triggers using explicit criteria and few-shot examples (not sentiment or confidence scores)
  • Preserve critical information across long sessions using the "case facts" pattern
  • Build claim-source mapping pipelines that maintain provenance through multi-agent synthesis
  • Design human review workflows with field-level confidence calibration and stratified sampling
  • Orchestrate coordinator-subagent architectures with parallel execution and iterative refinement

Chapter Structure

  1. Structured Error Propagation — The MCP isError Pattern — error categories (transient, validation, business, permission), structured metadata, access-failure vs valid-empty-result
  2. Subagent Error Recovery and Coordinator Decision-Making — local recovery first, partial results, coverage annotations in synthesis
  3. Escalation Calibration — When to Escalate vs Resolve — what works (explicit criteria + few-shot), what doesn't (confidence scores, sentiment, "be conservative")
  4. Context Management at Scale — progressive summarization risks, the "case facts" pattern, "lost in the middle" effect, scratchpad files
  5. Information Provenance in Multi-Source Synthesis — claim-source mappings, conflicting sources, temporal data, scoped verification tools
  6. Human Review Workflows & Confidence Calibration — aggregate accuracy trap, field-level scores, threshold calibration with labeled sets, stratified sampling
  7. Coordinator-Subagent Orchestration Patterns — hub-and-spoke, context isolation, Task tool, parallel execution, iterative refinement (Capstone)
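
As a preview of section 1, here is a minimal sketch of a structured error response. The `isError` flag and the `content` list follow the shape of an MCP tool result; the `errorCategory`, `isRetryable`, and `message` field names, the helper functions, and the rule that only transient errors are retryable are assumptions made for illustration, not part of the MCP specification. The sketch also shows how an access failure differs from a valid empty result.

```python
import json
from enum import Enum


class ErrorCategory(str, Enum):
    TRANSIENT = "transient"    # e.g. timeouts, rate limits
    VALIDATION = "validation"  # malformed or missing input
    BUSINESS = "business"      # request violates a domain rule
    PERMISSION = "permission"  # caller is not allowed to do this


# Assumption: only transient failures are worth retrying automatically.
RETRYABLE = {ErrorCategory.TRANSIENT}


def error_result(category: ErrorCategory, message: str) -> dict:
    """Build a tool result that marks the call as failed (hypothetical helper)."""
    payload = {
        "errorCategory": category.value,
        "isRetryable": category in RETRYABLE,
        "message": message,
    }
    return {"isError": True, "content": [{"type": "text", "text": json.dumps(payload)}]}


def success_result(rows: list) -> dict:
    """Build a successful tool result; an empty list is still a success."""
    payload = {"rows": rows, "count": len(rows)}
    return {"isError": False, "content": [{"type": "text", "text": json.dumps(payload)}]}


# Access failure: the search never ran, so the caller learns nothing about the data.
timeout = error_result(ErrorCategory.TRANSIENT, "search backend timed out")

# Valid empty result: the search ran and found no matches.
empty = success_result([])
```

Keeping these two cases distinct lets a coordinator retry the first and report "no matches" for the second, instead of treating both as "nothing found".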

Running Project

You'll build a multi-agent research system (certification exam Scenario 3) with a coordinator agent, a web search subagent, a document analysis subagent, and a synthesis subagent, adding reliability engineering to each layer as the chapter progresses.
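
To give a feel for the project's shape, the sketch below runs two subagents in parallel and keeps partial results when one of them fails. Everything in it is a stand-in: `web_search`, `document_analysis`, and `coordinate` are hypothetical stubs rather than Agent SDK calls, the synthesis subagent is omitted, and the `coverage` dictionary is one assumed way to record which sources actually contributed.

```python
import asyncio


async def web_search(topic: str) -> list[str]:
    # Hypothetical stub standing in for the web search subagent.
    return [f"web finding about {topic}"]


async def document_analysis(topic: str) -> list[str]:
    # Hypothetical stub that simulates a subagent failure.
    raise TimeoutError("document store timed out")


async def coordinate(topic: str) -> dict:
    subagents = {"web_search": web_search, "document_analysis": document_analysis}

    # Run all subagents concurrently; return_exceptions=True keeps one
    # failure from discarding the results of the others.
    outcomes = await asyncio.gather(
        *(run(topic) for run in subagents.values()),
        return_exceptions=True,
    )

    findings, coverage = [], {}
    for name, outcome in zip(subagents, outcomes):
        if isinstance(outcome, Exception):
            coverage[name] = f"failed: {outcome}"  # annotate the gap
        else:
            coverage[name] = "ok"
            findings.extend(outcome)
    return {"findings": findings, "coverage": coverage}


report = asyncio.run(coordinate("battery recycling"))
```

The coverage annotations travel with the findings, so a later synthesis step can state which sources were unavailable rather than silently presenting an incomplete answer as complete.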

Prerequisites

  • Chapter 64: The Claude API — Agentic Loops
  • Chapter 65: Anthropic Claude Agent SDK
  • Chapters 66-67: MCP Fundamentals and Custom MCP Servers

Certification Exam Coverage

This chapter covers Claude Certified Architect — Foundations exam domains:

  • Domain 2 (18%): Task Statement 2.2 — Structured error responses
  • Domain 5 (15%): Task Statements 5.1-5.6 — Context management, escalation, error propagation, provenance, human review, confidence calibration
  • Directly covers Sample Questions 3, 7, 8, 9
  • Covers exam Scenario 3 (Multi-Agent Research System) end-to-end