
Payment-Enabled Agents: ACP, AP2, x402, and MPP in Production

A crash course on the four protocols that let OpenAI Agents SDK systems spend money: at merchants, against APIs, with other agents, across the open economy. For engineers who have shipped agents and now need them to pay.

19 Concepts. 5 Decisions. 3 diagrams. Four learning tracks. The Reader track is 2-3 hours of pure reading: the four-layer stack, each protocol in depth, the composition rules, no setup. The Beginner, Intermediate, and Advanced tracks add increasing hands-on depth on wiring agents to protocols, running the composition durably, and managing identity, spend, and disputes. They run about 1 day, 2-3 days, and 4-5 days. Honest estimate: 2-3 hours to read it, 4-5 days for a team to make the stack a working habit. Pick your track before the decision lab in Part 5.

Here is the one idea this course is built on: the four protocols are not rivals. They are layers. Most articles ask "ACP or x402?" That question is a mistake about what kind of thing these protocols are. A real system shipping in 2026 uses several of them together, because each one solves a different layer of the agent-commerce problem. A consumer shopping agent uses ACP at the commerce layer and card rails at settlement. An API-paying agent uses x402 at both, because for machine-to-machine micropayments those two layers collapse into one. An enterprise procurement agent uses AP2 mandates to prove the human authorized the spend, then Stripe MPP at settlement. You will learn to read the use case and reach for the right composition.

If you haven't taken the other Agent Factory courses

This course names a few siblings: the Build AI Agents crash course (SDK basics), the Production Worker crash course (running agents durably with Inngest), the Eval-Driven Development crash course, and the Choosing Agentic Architectures crash course. You can read this one without them. The four protocols, the layering, the SDK code, and the decision discipline all stand on their own. The one thing that helps: the Part 3 code assumes you can read Agent, Runner.run(), and @function_tool. If those are new, skim the Build AI Agents crash course or the OpenAI Agents SDK docs first, then come back.

What each protocol maps to if you're on a different stack

If you're not on OpenAI Agents SDK plus Stripe, Coinbase, and Cloudflare, this table maps each reference implementation to common alternatives. The protocol specs are stack-agnostic; only the primitive names differ.

| Protocol | Primary reference SDK (2026) | Common alternatives | License / governance |
| --- | --- | --- | --- |
| ACP (Agentic Commerce Protocol) | Stripe SDK (stripe) plus OpenAI Agents SDK; PayPal ACP server tools; Worldpay credentials | Adyen ACP, Shopify-native checkout API | Apache 2.0, OpenAI plus Stripe at github.com/agentic-commerce-protocol/agentic-commerce-protocol |
| AP2 (Agent Payments Protocol) | Google ADK plus reference implementations in Python, TypeScript, Kotlin, Go | LangGraph plus custom mandate signing, AutoGen plus a2a-x402 | Apache 2.0, Google plus 60+ partners at github.com/google-agentic-commerce/AP2 |
| x402 | x402-client (Python), @x402/client (JS/TS), Coinbase Developer Platform, Cloudflare withX402Client, AgentPay MCP | Direct EIP-3009 implementations; Lobster.cash; Crossmint agent wallets | Apache 2.0, created by Coinbase, now under the Linux Foundation's x402 Foundation |
| MPP (Machine Payments Protocol) | Stripe PaymentIntents plus MPP extensions; Tempo blockchain SDK | Direct Lightning Network; Tempo native SDKs | Apache 2.0, Stripe plus Tempo; specs at mpp.dev |
| A2A (Agent2Agent, the layer AP2 extends) | Google ADK | Custom A2A implementations | Apache 2.0, Google plus Linux Foundation |
| MCP (Model Context Protocol, the discovery layer) | Anthropic MCP servers and clients; openai-agents MCP support | LangChain MCP | MIT, Anthropic |

To use the table: when the course says "wire the agent's @function_tool to a Stripe ACP endpoint" and you're on Adyen plus LangGraph, read it as "wire the equivalent LangGraph tool to an Adyen ACP endpoint." The argument is the same; the names change. You do not need to learn the Stripe stack just to read this course. Map the primitives, follow the framework, apply it to your own stack.

Glossary

📖 The terms this course uses. Skim them on first read and refer back later.

The four headline protocols

  • ACP (Agentic Commerce Protocol). The consumer-shopping protocol, built by OpenAI and Stripe. It governs how an agent completes checkout at a real merchant on a person's behalf. Powers ChatGPT Instant Checkout. Lives mostly at the commerce layer. Apache 2.0.
  • AP2 (Agent Payments Protocol). The authorization protocol, built by Google with 60+ partners. It produces signed "mandates" that prove a human allowed the agent to spend. It does not move money itself; it proves the spending was authorized. Apache 2.0.
  • x402. The HTTP-native settlement protocol, built by Coinbase and now governed by the Linux Foundation. It revives the unused HTTP 402 "Payment Required" status code so an agent can pay for an API call in one to two seconds with a stablecoin. Apache 2.0.
  • MPP (Machine Payments Protocol). Stripe and Tempo's settlement protocol. Its trick is the "session": the agent pre-approves a spending cap, then streams many small payments against it. Multi-rail (stablecoin, Lightning, cards). Apache 2.0.

The layers (the spine of the whole course)

  • Discovery layer. The layer where the agent finds what it can buy. Answered by MCP servers, A2A, agent directories, or AI shopping surfaces. The question: "what's available?"
  • Authorization layer (also Identity and Authorization). The layer that proves two things before money moves: the human allowed this spending, and the agent is who it claims to be. The question: "am I allowed to spend this?"
  • Commerce layer. The layer that runs the full purchase: cart, checkout, fulfillment, dispute, refund. The question: "what's the whole purchase lifecycle?" Skipped entirely for plain API calls.
  • Settlement layer. The layer where money actually changes hands. The question: "where does the money really move?"
  • Settlement rail (or rail). The actual pipe the money travels through: card networks (Visa/Mastercard via Stripe), stablecoins on a blockchain, bank transfer (ACH/SEPA), or Lightning. "Pick a rail" means "pick how the money moves."

Authorization primitives

  • Mandate (AP2). A signed digital proof that a human authorized a specific kind of spending. AP2 has three: Intent, Cart, and Payment (below). Together they form a chain you can audit later.
  • Intent Mandate. The first mandate, signed by the user before the agent starts: the rules of the task. Example: "buy shoes under $120." It sets the limits the agent must stay inside.
  • Cart Mandate. The middle mandate, signed by the user after the agent has built a specific cart: "yes, these exact items at this exact price." Used in flows where a human is present to approve.
  • Payment Mandate. The last mandate, signed (or auto-generated against the Intent Mandate) at the moment of payment: "authorize this exact payment on this exact rail."
  • SPT (Shared Payment Token). ACP's primitive. A one-time token from the payment processor (Stripe) locked to one merchant, one amount cap, and one short time window. If an agent cleared for $50 tries to spend $1,000, the SPT just fails. The card-rail cousin of an AP2 Payment Mandate.
  • Non-repudiable. A signed record the signer cannot later deny making. The AP2 mandate chain is non-repudiable: "I never authorized this" does not hold up against the user's own cryptographic signature.
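To make the SPT scoping concrete, here is a minimal sketch of the checks a scoped, one-time token implies: right merchant, inside the time window, under the cap, not already used. The `ScopedToken` class and `try_charge` function are hypothetical names invented for this illustration; they are not Stripe's API or the ACP wire format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass
class ScopedToken:
    """Hypothetical stand-in for an SPT: one merchant, one cap, one window, one use."""
    merchant_id: str
    max_amount: Decimal
    expires_at: datetime
    used: bool = False


def try_charge(token: ScopedToken, merchant_id: str, amount: Decimal,
               now: datetime) -> tuple[bool, str]:
    """Return (ok, reason). Every scope check must pass before the token is consumed."""
    if token.used:
        return False, "token already used"
    if merchant_id != token.merchant_id:
        return False, "wrong merchant"
    if now >= token.expires_at:
        return False, "token expired"
    if amount > token.max_amount:
        return False, "amount exceeds cap"
    token.used = True  # one-time: a successful charge consumes the token
    return True, "charged"


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
token = ScopedToken("merchant_123", Decimal("50.00"), now + timedelta(minutes=10))

over_cap = try_charge(token, "merchant_123", Decimal("1000.00"), now)  # rejected
in_scope = try_charge(token, "merchant_123", Decimal("42.50"), now)    # accepted
replay = try_charge(token, "merchant_123", Decimal("1.00"), now)       # rejected
```

The point of the sketch is that the limits live in the token, not in the agent's prompt: an agent that asks for too much simply gets a failed charge.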

Settlement and crypto primitives

  • Stablecoin. A cryptocurrency pegged to a stable value, usually one US dollar. Agents use it so a "$0.05 payment" stays worth five cents between sending and settling.
  • USDC. The specific dollar-pegged stablecoin most x402 payments use. One USDC is meant to equal one US dollar. Issued by Circle.
  • HTTP 402 Payment Required. An HTTP status code reserved since 1997 but unused until x402 revived it. The server replies "402" plus payment requirements; the client retries with signed proof of payment attached.
  • EIP-3009 (transferWithAuthorization). The Ethereum standard x402 is built on. It lets a buyer sign a payment off-chain that someone else submits on-chain, so the buyer never pays gas fees or touches the blockchain directly.
  • Facilitator (x402). An optional third party that checks the signature and submits the payment on-chain for the merchant, so the merchant doesn't have to run its own blockchain plumbing. Coinbase and Cloudflare both run facilitators.
  • CAIP-2. A standard way to name a blockchain so a protocol can stay chain-agnostic. x402 writes chains in CAIP-2 form.
  • EIP-155 / chain id. The numbering scheme inside CAIP-2 for Ethereum-style chains. eip155:8453 means the Base chain; the number is the chain id. (You will see eip155:8453 in x402 code; it just means "Base.")
  • Smart-contract wallet. A crypto wallet whose spending rules (per-transaction cap, daily cap, per-recipient cap) are enforced by code on the blockchain itself. Because the chain enforces them, these caps hold even if the agent's own code goes haywire.
  • MPP session. MPP's signature move. Instead of signing every single payment, the agent opens a "session" with a spending cap and a time limit, then streams many small metered payments against it until it closes. Think "a prepaid tab for the agent."
  • Sessions model (MPP). The general name for the above pattern: pre-authorize a cap and duration, then meter many small charges against it. Cheaper than signing every micropayment when calls are frequent.
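The "prepaid tab" idea behind an MPP session can be sketched in a few lines: open a session with a cap and an expiry, then meter small charges against it until either limit is hit. The `Session` class below is a hypothetical illustration of the accounting only; it is not the MPP wire format or a Stripe or Tempo API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass
class Session:
    """Hypothetical session: a spending cap plus a time limit."""
    cap: Decimal
    expires_at: datetime
    spent: Decimal = Decimal("0")
    charges: list = field(default_factory=list)

    def charge(self, amount: Decimal, now: datetime) -> bool:
        """Meter one small payment; refuse it if the session expired or the cap would be exceeded."""
        if now >= self.expires_at:
            return False
        if self.spent + amount > self.cap:
            return False
        self.spent += amount
        self.charges.append(amount)
        return True


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
session = Session(cap=Decimal("1.00"), expires_at=start + timedelta(hours=1))

# Four metered calls at $0.30 each: the fourth would push the tab past $1.00.
results = [session.charge(Decimal("0.30"), start) for _ in range(4)]
# A charge after the time limit is refused even though the cap has room.
late = session.charge(Decimal("0.05"), start + timedelta(hours=2))
```

One approval covers many charges, which is why the sessions model tends to be cheaper than signing every micropayment when calls are frequent.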

Commerce and money concepts

  • Settlement. The moment money actually moves from buyer to seller. Everything before settlement is just choreography; the deal is only done when settlement completes.
  • Merchant of record (MoR). The business legally on the hook for a transaction: it handles tax, disputes, and customer support. In ACP the merchant stays the MoR. In plain machine-to-machine x402 calls there often is no MoR.
  • Chargeback. When a buyer's bank reverses a card payment, usually after a dispute. Card rails support chargebacks; pure x402 does not. The need for chargebacks often decides which protocol you use.
  • Dispute. A buyer's formal challenge to a charge ("I didn't authorize this" / "the item never arrived"). Different protocols resolve disputes differently; that difference often forces the protocol choice.
  • Idempotency. A property where doing the same operation twice has the same effect as doing it once. It matters because Stripe retries webhooks with the same event id; without an idempotency key your code could refund or charge twice.
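The idempotency entry is easiest to see in code. The sketch below deduplicates webhook deliveries by event id so a retried delivery does not refund twice. The event shape and the in-memory set are assumptions for illustration; a production handler would keep the processed ids in a durable store and record the id in the same transaction as the side effect.

```python
from decimal import Decimal

# Assumption: an in-memory set stands in for a durable "processed events" table.
processed_event_ids: set[str] = set()
refunded_total = Decimal("0")


def handle_refund_event(event: dict) -> str:
    """Apply a refund once per event id, no matter how many times it is delivered."""
    global refunded_total
    event_id = event["id"]
    if event_id in processed_event_ids:
        return "duplicate_ignored"
    processed_event_ids.add(event_id)
    refunded_total += Decimal(event["amount"])
    return "processed"


event = {"id": "evt_001", "amount": "19.99"}  # hypothetical event payload
first = handle_refund_event(event)   # first delivery
retry = handle_refund_event(event)   # the processor retries with the same id
```

After both deliveries the refund total is still 19.99, which is the whole point: the second call has the same effect as not calling at all.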

The agent stack and adjacent protocols

  • Agent commerce. Any transaction where an autonomous AI agent is the buyer, the seller, or both, with no human clicking buy at that moment. Different from "AI-assisted" shopping where a person still presses the button.
  • OpenAI Agents SDK. The Python/JavaScript toolkit for building agent loops with Agent, Runner.run(), @function_tool, and guardrails. In this course it is the "universal client": each payment protocol becomes one or more tools the agent can call.
  • MCP (Model Context Protocol). Anthropic's open standard for exposing tools and context to agents. In agent commerce it is often the discovery layer: agents find buyable services through MCP servers. MIT licensed.
  • A2A (Agent2Agent). Google's protocol for agents to talk to and discover each other. AP2 is built on top of A2A: a mandate travels as an A2A message. Apache 2.0.
  • UCP (Universal Commerce Protocol). Google's commerce-layer protocol, the peer to ACP, built around Google's shopping surfaces (Gemini, Google AI Mode). Competes with ACP at the commerce layer.
  • TAP (Trusted Agent Protocol). Visa and Cloudflare's protocol for proving an agent's identity inside HTTP request headers. It verifies who the agent is, not what it's allowed to spend, so it usually adds to another auth protocol rather than replacing it.
  • ERC-8004. An on-chain standard for agent identity and reputation: a public registry of agents plus their transaction history, so an agent with no prior relationship can check another's track record before trusting it.
  • tool_input_guardrail. An OpenAI Agents SDK guardrail that runs before a tool executes and can reject the call. It is the SDK-native way to stop a payment before it happens, and it recurs throughout this course: it appears in seven payment tools. (Contrast output_guardrail, which runs on the agent's final reply, too late to stop a payment.)

Prerequisites

You'll get the most from this course if you have:

  1. The Build AI Agents crash course, or equivalent SDK experience. The protocol integrations are shown as SDK code, so you need to read Agent, Runner.run(), and @function_tool comfortably. See the Build AI Agents crash course.
  2. The Choosing Agentic Architectures crash course, or equivalent design sense. The "which composition for which use case" framework in Part 5 builds on that pattern-selection discipline. See the Choosing Agentic Architectures crash course.
  3. Basic HTTP. Status codes, the request/response cycle, headers. x402 in particular works at the HTTP level.
  4. Basic payment vocabulary. Merchant, settlement, dispute, chargeback. The course explains the agent-specific parts but assumes you know what a "merchant of record" is.

You do not need blockchain or smart-contract experience (the course teaches enough x402 and EIP-3009 to follow along; no Solidity), and you do not need prior experience with any of the four protocols. They're all taught from primary sources.

The four learning tracks

This course works at four depths. Pick your track before Part 5.

| Track | Time | What you'll do | Best for |
| --- | --- | --- | --- |
| Reader | 2-3 hours | Read all Concepts and Decisions; skip running code. | Engineers deciding whether to invest deeper. PMs and architects who need the framework to judge vendor proposals. |
| Beginner | ~1 day | Reader plus run the x402 client examples plus one ACP test transaction in Stripe test mode. | Engineers new to agent commerce who want hands-on time with the simplest protocol (x402) and the most production-ready (ACP). |
| Intermediate | 2-3 days | Beginner plus build an agent that uses ACP for one kind of transaction and x402 for another, plus wire AP2 Intent Mandate checks. | Engineers shipping a real system that has to compose multiple protocols. |
| Advanced | 4-5 days | Intermediate plus run the composed system durably (the Inngest envelope), wire spend limits and human-approval gates, measure trace, cost, and dispute metrics, and handle one full refund cycle. | Engineers responsible for production systems moving real money for real users. |

A useful self-check: "For the use case I have in mind, what's the smallest protocol composition that ships value?" If you can't answer that after Part 4, re-read Part 4. If you can, your track is just a question of how far you want to go from "smallest composition" to "production-grade composition."

The four-layer stack

This is the diagram everything else hangs on.

Four-layer stack for agent commerce, top to bottom. Layer 1 Discovery: agents find what's available, using MCP, A2A, and agent directories. Layer 2 Identity and Authorization: agents prove they're allowed to spend, using AP2 mandates, ACP shared payment tokens, TAP, and ERC-8004. Layer 3 Commerce: the full purchase lifecycle of cart, dispute, and refund, using ACP, UCP, or a direct API. Layer 4 Settlement: money actually moves, using x402, MPP, card rails, or bank and Lightning. Two example paths on the right: a consumer shopping agent runs all four layers; an API-paying agent skips commerce and collapses to just discovery and settlement. The protocols are layers, not alternatives.

Every agent-commerce use case maps onto these four layers, but different use cases compose different protocols at each one, and some skip a layer entirely. A consumer shopping agent: MCP for discovery, an ACP shared payment token for authorization, ACP for commerce, card rails for settlement. An API-paying agent: an agent directory for discovery, an EIP-3009 signature for authorization, no commerce layer at all, x402 for settlement. An enterprise procurement agent: an A2A directory, an AP2 mandate, ACP or UCP for commerce, Stripe MPP for settlement. The rule is simple: pick one protocol per layer, and let the use case justify each pick.
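One way to hold the three example compositions in your head is as plain data: one slot per layer, with the commerce slot allowed to be empty. The `Composition` class and the dictionary keys below are hypothetical names for this sketch; the protocol picks are the ones listed in the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Composition:
    """One protocol choice per layer; commerce may be skipped."""
    discovery: str
    authorization: str
    commerce: Optional[str]  # None means the use case has no commerce layer
    settlement: str

    def layers_used(self) -> list[str]:
        chosen = {
            "discovery": self.discovery,
            "authorization": self.authorization,
            "commerce": self.commerce,
            "settlement": self.settlement,
        }
        return [layer for layer, protocol in chosen.items() if protocol is not None]


COMPOSITIONS = {
    "consumer_shopping": Composition("MCP", "ACP shared payment token", "ACP", "card rails"),
    "api_paying": Composition("agent directory", "EIP-3009 signature", None, "x402"),
    "enterprise_procurement": Composition("A2A directory", "AP2 mandate", "ACP or UCP", "Stripe MPP"),
}

api_layers = COMPOSITIONS["api_paying"].layers_used()
shopping_layers = COMPOSITIONS["consumer_shopping"].layers_used()
```

Writing the composition down this way forces the question the course keeps asking: for each layer, which protocol, and why?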

Part 1: Why agent commerce needs new protocols

Reading Part 1 lightly?

Part 1 introduces six names (ACP, AP2, x402, MPP, MCP, A2A) and four layers fast. That's on purpose: this part is the briefing, not the deep dive. The least you need to hold: ACP is consumer shopping (Stripe plus OpenAI), AP2 is authorization mandates (Google plus 60+ partners), x402 is per-request stablecoin payment over HTTP (Coinbase), MPP is session-based multi-rail payment (Stripe plus Tempo). MCP and A2A are how agents find and talk to each other, not payment protocols. Hold the names loosely. Part 2 returns to each in depth and they'll stick on the second pass.

Concept 1: The assumption that broke

In one line: Payment systems assumed a human was clicking buy, and agents break that assumption three ways at once.

Payment systems were built on one quiet assumption: a human is at the keyboard, clicking buy. Every screen, every fraud check, every dispute process, every signup form was designed for people. AI agents break that assumption in three ways at the same time.

Break 1: agents don't have email addresses. Consumer payment flows want an account. An account wants an email, a phone number, often a name. An autonomous agent has none of these. Fake them, and you've created an entity that fails KYC the moment fraud detection looks at it. The signup flow assumes a human applying for a relationship; an agent needs something else.

Break 2: agents act at machine speed. Fraud detection flags odd behavior by rate, location, and pattern. An agent making 1,000 API calls in a minute looks exactly like a credential-stuffing attack. Behavior that is normal for an agent trips alarms designed for humans, and the rails are tuned for humans.

Break 3: agents can't pick up the phone. Dispute resolution assumes the buyer can be reached: "did you authorize this?" An agent that authorized a charge can't answer, and the human behind it may not even know the charge happened. Disputes need a different model when the buyer is software.

Each break needs a fix at the protocol level, not a coat of paint on the UI:

| Break | What the human-rail workaround does (and why it fails) | What an agent-rail fix gives you |
| --- | --- | --- |
| No email or account | Force agents to create fake accounts | Cryptographic identity (TAP, ERC-8004) or scoped tokens (ACP SPT, AP2 Mandate) |
| High-frequency behavior | Block traffic that looks like an attack | HTTP-native per-request payment (x402) or pre-authorized sessions (MPP) |
| No phone for disputes | Email disputes the agent can't read | Mandate-based authorization (AP2) with a non-repudiable audit trail |
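The per-request fix named in the table (x402) follows a simple handshake: the server answers 402 with its payment requirements, and the client retries with signed proof attached. The sketch below simulates that handshake entirely in memory. Everything in it is an assumption for illustration: the `X-PAYMENT` header name, the requirement field names, and especially the HMAC, which only stands in for the EIP-3009 signature a real x402 client would produce.

```python
import hashlib
import hmac
import json
from decimal import Decimal

# Assumption: a shared demo key stands in for the buyer's wallet key.
# Real x402 uses an EIP-3009 signature checked by the merchant or a facilitator.
DEMO_KEY = b"demo-wallet-key"
PRICE = {"amount_usdc": "0.05", "network": "eip155:8453", "pay_to": "0xMerchant"}


def sign(requirements: dict) -> str:
    """Produce a stand-in 'proof of payment' over the exact requirements."""
    payload = json.dumps(requirements, sort_keys=True).encode()
    return hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()


def paid_endpoint(headers: dict) -> tuple[int, dict]:
    """Toy server: answer 402 with requirements until valid proof arrives."""
    proof = headers.get("X-PAYMENT")
    if proof is None or not hmac.compare_digest(proof, sign(PRICE)):
        return 402, {"accepts": [PRICE]}
    return 200, {"data": "premium result"}


def fetch_with_payment(max_usdc: Decimal) -> tuple[int, dict]:
    """Toy client: try unpaid, read the price, pay only if it fits the cap, retry."""
    status, body = paid_endpoint({})
    if status != 402:
        return status, body
    requirements = body["accepts"][0]
    if Decimal(requirements["amount_usdc"]) > max_usdc:
        return 402, {"error": "price exceeds client cap"}
    return paid_endpoint({"X-PAYMENT": sign(requirements)})


paid = fetch_with_payment(Decimal("0.10"))     # cap covers the $0.05 price
refused = fetch_with_payment(Decimal("0.01"))  # cap too low, client never signs
```

No account, no signup, no session: the payment rides along with the request, which is why high call volume stops looking like an attack.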

The "just wrap old payments in nicer agent UX" path already failed. Several startups tried it in 2024 and 2025: give agents human-like accounts with made-up identity. Fraud detection caught them, chargebacks piled up, merchant relationships fell apart. Protocol-level fixes turned out to be required, not optional. That's why ACP, AP2, x402, and MPP all showed up inside the same twelve months.

Bottom line of Concept 1: Payment systems assumed a human clicks buy. Agents break that three ways: no traditional identity, alarming behavior patterns, no channel to clarify a dispute. Each break needs a protocol-level fix. Wrapping old rails in nicer UX failed. The four protocols each cover a different mix of these breaks.

Concept 2: Why one protocol can't win

In one line: The breaks happen at four different layers with four different incumbents, so the protocols split the work by layer instead of one swallowing all of it.

A fair question: if all four protocols showed up to fix the same three breaks, why didn't one of them just win? The answer is structural. The breaks happen at different layers, and a single protocol that tried to fix all of them would be too big for anyone to adopt.

Think about what one unified protocol would have to specify:

  1. How agents find available merchants and services. That's the discovery layer.
  2. How agents prove who they are and that a human authorized the spend. That's the authorization layer.
  3. How agents run a full purchase, disputes and refunds included. That's the commerce layer.
  4. How money actually moves between parties. That's the settlement layer.

Each layer already has strong incumbents. Discovery belongs to search engines and APIs. Identity belongs to OAuth and certificate authorities. Commerce belongs to Stripe, Adyen, and Shopify. Settlement belongs to Visa, Mastercard, and ACH, plus the new crypto rails. A unified protocol would need every incumbent at every layer to agree on it. That was never going to happen.

What happened instead is that each protocol picked the one layer where its sponsor had the most leverage:

| Protocol | Where its sponsor has leverage | The layer it took |
| --- | --- | --- |
| ACP | OpenAI owns the shopping channel (ChatGPT); Stripe owns merchant integration | Commerce, for human-buyer-via-AI flows |
| AP2 | Google has Android wallets and a 60-partner coalition | Authorization, mandates as signed credentials |
| x402 | Coinbase has stablecoin infrastructure; Cloudflare has the HTTP edge | Settlement, for machine-to-machine micropayments |
| MPP | Stripe has merchant relationships; Tempo has the blockchain | Settlement, for enterprise and multi-rail flows |

So the protocols don't fight across layers; each competes to define its own layer, and only against the protocols that share it. At settlement, x402 and MPP genuinely compete, and most "x402 vs MPP" pieces miss that this is the only place they really overlap. At commerce, ACP and UCP compete. At authorization, AP2, TAP, and ERC-8004 compete.

That gives us the one idea this whole course rests on. Within a layer, you pick one protocol. Across layers, you compose several. The four protocols are not alternatives to choose between; they are layers to stack. A consumer shopping agent runs ACP at commerce and card rails at settlement. An API-paying agent runs x402 at settlement and skips commerce entirely. An enterprise agent runs AP2 mandates at authorization and MPP at settlement. The decision tree in Part 5 walks this layer by layer. Hold onto this: every section after this one assumes it.

Bottom line of Concept 2: One protocol can't fix all three breaks because the fixes live at four layers, each with strong incumbents. The protocols specialize: ACP at commerce, AP2 at authorization, x402 and MPP at settlement. Within a layer you pick one; across layers you compose several.

Concept 3: The OpenAI Agents SDK as the universal client

In one line: You don't talk to four protocols four different ways; each one becomes a tool the agent calls, and the SDK is the single client that wires them all in.

A practical question: given four protocols spread across the stack, how does an agent actually use them? The answer in 2026 is that the agent's framework becomes the universal client. Each protocol exposes an SDK or an HTTP endpoint, and the framework wires it in as a tool. This is true for OpenAI Agents SDK, LangGraph, AutoGen, and CrewAI alike. We'll use the OpenAI Agents SDK throughout.

For the SDK, every protocol integration follows the same shape: wrap the protocol in one or more @function_tool functions, hand them to an Agent, and let Runner.run drive the loop.

The stripe.PaymentTokens.create, X402Client(wallet=...), and from ap2 import ... calls below are illustrative: they show the shape a protocol integration takes. The Agent, Runner, and @function_tool scaffolding around them is real and runs. See the note after the block for what's a stand-in.

import asyncio
from decimal import Decimal

from agents import Agent, Runner, function_tool
import stripe  # for ACP/MPP
from x402_client import X402Client  # for x402 (illustrative client)
from ap2 import IntentMandate, CartMandate  # for AP2 (illustrative import)
from .models import PaymentToolResult, X402PaymentResult  # the shared result models

# Each protocol becomes one or more @function_tool decorated functions
@function_tool
async def acp_checkout(merchant_id: str, items: list, max_amount: Decimal) -> PaymentToolResult:
    """Complete an ACP checkout at a merchant with the given items."""
    # Mint a one-time payment token scoped to this merchant, then POST the order.
    # stripe.PaymentTokens.create is a stand-in, not a real Stripe call.
    spt = stripe.PaymentTokens.create(
        amount=int(max_amount * 100),  # cents as int
        currency="usd",
        merchant_id=merchant_id,
        max_uses=1,
    )
    # acp_post is a stand-in helper (defined elsewhere) that submits the order.
    response = await acp_post(merchant_id, items, spt.token)
    return PaymentToolResult(
        status="success" if response.status == "confirmed" else "failed",
        details={"order_id": response.order_id, "merchant_status": response.status},
    )

@function_tool
async def x402_fetch(url: str, max_payment_usdc: Decimal) -> X402PaymentResult:
    """Fetch a URL that may require x402 payment up to max_payment_usdc."""
    # agent_wallet is a stand-in for the agent's wallet handle, defined elsewhere.
    client = X402Client(wallet=agent_wallet, max_per_request=max_payment_usdc)
    response = await client.get(url)
    return X402PaymentResult(
        content=response.content,
        amount_paid_usdc=response.amount_paid_usdc,
        tx_hash=response.payment_proof,
    )

# Compose the tools into an agent
shopping_agent = Agent(
    name="ShoppingAgent",
    instructions="Help the user find and purchase items. Use acp_checkout for retail goods, x402_fetch for paid APIs.",
    tools=[acp_checkout, x402_fetch],
    model="gpt-5.5",
)

async def main() -> None:
    # Run the agent. The SDK handles tool selection, the loop, and retries.
    result = await Runner.run(shopping_agent, "Buy me a red t-shirt under $30")
    print(result.final_output)

asyncio.run(main())

What runs and what's a stand-in here: the Agent, Runner.run, and @function_tool wiring is the real SDK and works as written. The payment clients are stand-ins. stripe.PaymentTokens.create is not a real Stripe call (production ACP integrates through live Stripe ACP endpoints), and the rich X402Client(wallet=..., max_per_request=...) constructor is illustrative too. The real buyer-side package is x402-client, and a live call needs a funded account and a real 402 endpoint, which this course does not do. Part 3 gives each protocol a runnable mock backend so you can see the harness work end to end without moving real money.

The shape is identical across all four protocols. The SDK is the universal client; each protocol is just a tool the agent reasons about and calls when it needs to. Three parts of the SDK's structure matter for payments:

  1. A typed return value for payment results. When a payment tool returns a Pydantic model, the agent's reasoner gets clean type information about what succeeded, what failed, and what to do next. Part 3 defines these shared result models once.
  2. Runner.run(..., context=...) for payment context. The agent often needs the user's identity, spending limits, and wallet handle. Pass these through the SDK's context parameter instead of baking them into instructions. context is per-run and per-user.
  3. tool_input_guardrail for spend limits. A tool input guardrail runs before each tool executes and can reject the call. It's the SDK-native way to block a payment before it happens, not after. Concept 15 walks the full three-level enforcement. The agent-level output_guardrail does not solve this, because it fires on the agent's final reply, after any payment tool has already run.
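The decision a spend-limit guardrail makes can be sketched without the SDK at all: read the tool's arguments before the tool runs, then allow or reject. The function below is a hedged, SDK-agnostic illustration of that check. The `check_spend` name, the `max_amount` argument key, and the return shape are assumptions for this sketch; wiring the same logic into a real tool_input_guardrail is left to the SDK's own guardrail types.

```python
import json
from decimal import Decimal, InvalidOperation


def check_spend(tool_arguments_json: str, per_call_cap: Decimal) -> tuple[bool, str]:
    """Decide, before the payment tool runs, whether the requested amount is allowed."""
    try:
        args = json.loads(tool_arguments_json or "{}")
        requested = Decimal(str(args["max_amount"]))
    except (json.JSONDecodeError, KeyError, TypeError, InvalidOperation):
        # Fail closed: if the amount cannot be read, the payment does not run.
        return False, "missing or malformed max_amount"
    if not requested.is_finite() or requested <= 0:
        return False, "amount must be a positive number"
    if requested > per_call_cap:
        return False, f"{requested} exceeds per-call cap {per_call_cap}"
    return True, "ok"


cap = Decimal("50")
allowed = check_spend('{"merchant_id": "m1", "max_amount": "25.00"}', cap)
blocked = check_spend('{"max_amount": "1000"}', cap)
malformed = check_spend('{"items": []}', cap)
```

The design choice worth noticing is failing closed: an unreadable amount is treated the same as an over-limit one, because the check runs before money moves and a false rejection is cheap compared with a wrong payment.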

Bottom line of Concept 3: The OpenAI Agents SDK is the universal client for all four protocols. Each one becomes a @function_tool the agent calls. The SDK's typed returns, context, and tool_input_guardrail map cleanly to payment concerns: structured results, per-user context, and stopping a payment before it happens. The integration shape is the same across all four; Part 3 fills in each one.


Four-layer stack for agent commerce: Discovery, Identity and Authorization, Commerce, and Settlement, top to bottom. A consumer shopping agent runs all four layers; an API-paying agent skips commerce and collapses to discovery and settlement. The protocols are layers, not alternatives.


Part 2: The four layers in depth

You met the four layers in Part 1; here is each one up close. Every layer answers a different question, has its own competing protocols, and forces one decision: for this use case, which protocol best fits this layer? Read Part 2 before Part 3. The layer-by-layer framing here is what makes the protocol details in Part 3 cohere instead of reading like a list of rivals.

Concept 4: Layer 1, Discovery (how agents find what they can buy)

In one line: Discovery is where the agent finds out what's even available to buy, and the right mechanism depends on where those services actually live.

Before an agent can transact, it has to find what's out there. A shopping agent needs merchants that carry the product. An API-paying agent needs to know which endpoints have the data and what they cost. A procurement agent needs suppliers that pass its compliance rules. Discovery answers "what's available?"

Four serious options compete here in 2026:

| Protocol | How it works | Best for |
| --- | --- | --- |
| MCP (Anthropic) | Tool servers expose callable functions; the agent connects, lists the tools, and calls them | Programmatic access to specific services the developer wired in; high-volume agent work; the dominant agent-tooling discovery layer |
| A2A (Google) | Agents publish what they offer in standard envelopes; other agents discover them | Multi-agent ecosystems where agents need to find peer agents; the discovery layer AP2 extends |
| Agent directories (Agent.market, lobster.cash, Tenzro) | Public marketplaces listing paid APIs and services; agents query them like a catalog | Third-party services discovered at runtime; the "Yellow Pages" of agent commerce |
| AI shopping surfaces (ChatGPT Instant Checkout, Google AI Mode, Walmart-in-ChatGPT) | Consumer AI products with product discovery plus ACP checkout built in | Consumer flows where the user is talking to the AI and it surfaces products inline |

The SDK has first-class support for MCP: wire an MCP server to an agent in a few lines and all its tools become available to the agent's reasoning. For non-MCP discovery, wrap it as an ordinary @function_tool that queries the directory and returns structured listings.

The MCPServerStreamableHttp wiring below is real and importable. The agent_market_client call is illustrative: it stands in for a directory client. The @function_tool and Agent scaffolding is real.

from decimal import Decimal

from agents import Agent, function_tool
from agents.mcp import MCPServerStreamableHttp

# Wire an MCP discovery server: all its tools become available.
# The server typically has to be connected before the agent runs,
# for example by using it as an async context manager.
research_mcp = MCPServerStreamableHttp(
    name="research-services",
    params={"url": "https://research-services.example.com/mcp"},
)

# Wire a non-MCP directory as a regular tool
@function_tool
async def search_agent_market(query: str, max_price_usdc: Decimal) -> list[dict]:
    """Search Agent.market for x402-paid services matching the query."""
    # agent_market_client is a stand-in for a directory client defined elsewhere.
    return await agent_market_client.search(query, max_price_usdc=max_price_usdc)

agent = Agent(
    name="ResearchAgent",
    instructions="Find and use research services. Prefer MCP-discovered tools; fall back to Agent.market for niche needs.",
    mcp_servers=[research_mcp],
    tools=[search_agent_market],
)

The choice you make here is not "MCP vs A2A vs directories." It's: where do my agent's services actually live? Internal to your org, use MCP. Across a network of partner agents, use A2A. Third-party APIs you discover at runtime, use directories. Consumer products, use an AI shopping surface (and ACP at the commerce layer). These aren't mutually exclusive; a real agent often uses several.
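The "where do the services live?" rule above is small enough to write as a lookup. This is only a sketch of the decision, with hypothetical location labels chosen for the example; it also shows that one agent can end up with more than one discovery mechanism.

```python
# Hypothetical labels for where a service lives, mapped to the mechanism named in the text.
DISCOVERY_BY_LOCATION = {
    "internal": "MCP",
    "partner_agents": "A2A",
    "third_party_runtime": "agent directory",
    "consumer_products": "AI shopping surface",
}


def discovery_plan(service_locations: list[str]) -> list[str]:
    """Return the distinct discovery mechanisms an agent needs, in first-seen order."""
    plan: list[str] = []
    for location in service_locations:
        mechanism = DISCOVERY_BY_LOCATION[location]
        if mechanism not in plan:
            plan.append(mechanism)
    return plan


# An agent with internal tools plus runtime-discovered third-party APIs.
plan = discovery_plan(["internal", "third_party_runtime", "internal"])
```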

Bottom line of Concept 4: Discovery answers "what's available?" Four options compete: MCP for internal tool servers, A2A for multi-agent ecosystems, directories for third-party services, AI surfaces for consumer products. The SDK has first-class MCP support and wires the rest as @function_tool functions. Pick by where the services live; they're not mutually exclusive.

Concept 5: Layer 2, Identity and Authorization (proving the agent is allowed)

In one line: Authorization is where the agent proves the human allowed this spending and that the agent is who it claims to be, before any money moves.

Before money can move, two things have to be true: the agent is who it claims to be, and the human authorized this spending. They're different problems with different solutions, and Layer 2 settles both before settlement happens. Skip Layer 2 and you get one of two failures: fraud (anyone's agent can spend anyone's money) or paralysis (every transaction needs a human to click confirm).

This is the most contested layer in 2026. Four options, four philosophies:

| Protocol | How it works | Strongest where |
| --- | --- | --- |
| AP2 Mandates (Google) | Signed credentials: Intent Mandate ("buy shoes under $120"), Cart Mandate ("this cart, this price"), Payment Mandate ("authorize this rail") | Audit-heavy flows that need non-repudiable proof of consent; multi-agent flows where the merchant has never seen the buyer's agent before |
| ACP SPT (OpenAI plus Stripe) | Stripe mints a Shared Payment Token scoped to one merchant, amount, and time window; the agent presents it; the merchant verifies and charges | Consumer shopping where Stripe is the processor and card rails carry chargeback discipline |
| TAP (Visa plus Cloudflare) | The agent's identity signature rides in HTTP headers; merchants verify it against Visa's directory | Identity verification specifically (not authorization); usually added to another auth protocol, not used alone |
| ERC-8004 plus on-chain reputation | An on-chain registry of agent identities and transaction history, with a reputation score from past deals | Pure multi-agent flows with no prior trust; high-stakes B2B where reputation is worth checking |

The two questions Layer 2 must answer, and how each protocol answers them:

  1. "Did the human authorize this?" AP2 answers with a mandate the user signed before delegating. ACP answers with an SPT Stripe minted only after the user authorized at the account level. TAP doesn't answer this; it's identity-only. ERC-8004 answers with signed on-chain transactions.
  2. "Is the agent who it claims to be?" AP2 answers with the signing key (only the real agent can sign). ACP answers with the SPT being merchant-scoped (only the authorized merchant can redeem it). TAP answers with Visa's directory lookup. ERC-8004 answers with the on-chain identity record.

The SDK gives you two integration points: a tool input guardrail that runs before the payment tool, and the run context that carries per-user state into both.

The guardrail, @function_tool, and Agent wiring below is the real SDK and runs (the attribute paths data.context.tool_arguments and data.context.context are confirmed against the installed SDK). The stripe.PaymentTokens.create call inside the tool is illustrative; production ACP uses live Stripe ACP endpoints.

from agents import Agent, function_tool, RunContextWrapper
from agents.tool_guardrails import (
    tool_input_guardrail,
    ToolInputGuardrailData,
    ToolGuardrailFunctionOutput,
)
from decimal import Decimal
import json
import stripe
from .models import PaymentToolResult  # shared result model, defined in Part 3

# Pattern 1: a tool input guardrail. Runs BEFORE the payment tool executes.
# This is the SDK-native way to block a payment before it happens.
@tool_input_guardrail
def block_over_user_cap(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    """Reject any payment tool call where the request would exceed the user's per-run cap."""
    args = json.loads(data.context.tool_arguments or "{}")  # raw JSON args -> dict
    requested = Decimal(str(args.get("max_amount_usd", 0)))
    ctx = data.context.context  # the run context (a dict)
    user_cap = Decimal(str(ctx["user_session"].per_run_spend_cap_usd))
    run_spent = Decimal(str(ctx.get("run_spend_usd", 0)))
    if run_spent + requested > user_cap:
        return ToolGuardrailFunctionOutput.reject_content(
            f"Refusing payment tool: would spend ${run_spent + requested}, exceeds run cap ${user_cap}"
        )
    return ToolGuardrailFunctionOutput.allow()

# Pattern 2: the payment tool itself, guarded at the function-tool level
@function_tool(tool_input_guardrails=[block_over_user_cap])
async def purchase_with_acp(
    ctx: RunContextWrapper,
    merchant_id: str,
    items: list,
    max_amount_usd: Decimal,
) -> PaymentToolResult:
    """Use ACP to buy items from the merchant up to max_amount_usd.
    The guardrail has already verified spend is within bounds before we reach here."""
    user_session = ctx.context["user_session"]
    spt = stripe.PaymentTokens.create(  # illustrative, not a real Stripe method
        amount=int(max_amount_usd * 100),  # Stripe expects cents as int
        currency="usd",
        merchant_id=merchant_id,
        user_session_id=user_session.id,
        max_uses=1,
    )
    # acp_post is an illustrative helper that submits the order to the merchant
    response = await acp_post(merchant_id, items, spt.token)
    return PaymentToolResult(
        status="success" if response.status == "confirmed" else "failed",
        details={"order_id": response.order_id, "merchant_status": response.status},
    )

agent = Agent(
    name="ShoppingAgent",
    instructions="Help the user shop. Always verify authorization before any purchase.",
    tools=[purchase_with_acp],
    # No output_guardrails for spend control. Those run on the agent's FINAL output,
    # not on individual tool calls. See Concept 15 for the full three-level enforcement.
)

Which guardrail stops a payment

The SDK has three kinds of guardrail and they fire at different moments. input_guardrail runs on the first agent's initial input. output_guardrail runs on the final agent's response to the user. The tool guardrails, tool_input_guardrail and tool_output_guardrail, run on every function-tool call, before and after it executes. For payment safety you need the tool input guardrail: it fires before the payment tool runs and can reject the call. An output guardrail fires too late to stop the payment; it's useful for cleaning up the final reply (say, redacting sensitive data), not for blocking a tool. Concept 15 returns to this, because reaching for an output guardrail to cap spend is the single most common mistake.
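The timing difference is easy to see in a toy model that uses no SDK code at all. The sketch below is plain Python with hypothetical names (charge_card, cap_guardrail, and the two run_* functions are invented for illustration). It only mimics the ordering described above: a check that runs before the tool can stop the side effect, and a check on the final reply cannot.

```python
from decimal import Decimal

ledger: list[Decimal] = []  # records every charge that actually happened


def charge_card(amount_usd: Decimal) -> str:
    """Hypothetical payment tool: the side effect is the ledger append."""
    ledger.append(amount_usd)
    return f"charged ${amount_usd}"


def cap_guardrail(amount_usd: Decimal, cap_usd: Decimal) -> bool:
    """Return True when the call is within the cap."""
    return amount_usd <= cap_usd


def run_with_tool_input_check(amount_usd: Decimal, cap_usd: Decimal) -> str:
    # The check runs BEFORE the tool: a rejection means the tool never executes.
    if not cap_guardrail(amount_usd, cap_usd):
        return "rejected before payment"
    return charge_card(amount_usd)


def run_with_output_check(amount_usd: Decimal, cap_usd: Decimal) -> str:
    # The check runs on the FINAL reply: the tool has already executed by then.
    reply = charge_card(amount_usd)
    if not cap_guardrail(amount_usd, cap_usd):
        return "reply blocked, but the charge already happened"
    return reply


cap = Decimal("50")
print(run_with_tool_input_check(Decimal("1000"), cap))
print(len(ledger))  # nothing was charged
print(run_with_output_check(Decimal("1000"), cap))
print(len(ledger))  # one charge went through despite the blocked reply
```

The second path leaves a charge in the ledger even though the reply was blocked, which is the failure the tool input guardrail exists to prevent.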

The choice you make here comes down to your trust model. If the user is signed into your app and you can mint SPTs through Stripe, ACP gives you the most production-ready story. If you need non-repudiable audit trails, as in regulated industries or B2B procurement, AP2 mandates fit. If you need cryptographic identity on its own, separate from authorization, add TAP. In a pure multi-agent setting with no shared trust, ERC-8004 fills the gap. These are not interchangeable.

Bottom line of Concept 5: Authorization answers two questions: "did the human authorize this?" and "is the agent who it claims to be?" Four protocols compete: AP2 mandates (audit-heavy), ACP SPT (Stripe-native), TAP (identity-only), ERC-8004 (multi-agent trust). The SDK integrates via the run context for per-tool checks and tool_input_guardrail (which runs before each tool and can reject it) for spend limits. Pick by trust model.

Concept 6: Layer 3, Commerce (the full purchase lifecycle)

In one line: Commerce is everything in a real purchase that isn't authorization or settlement: cart, order, fulfillment, dispute, refund. A plain API call skips it entirely.

Authorization and settlement together cover "money moves with permission." Commerce covers everything else in a purchase: structured carts, order confirmation, fulfillment tracking, dispute resolution, refunds, chargebacks, returns. This is the layer that separates a checkout from a money transfer. A plain machine-to-machine API call doesn't need a commerce layer; it's just an API call. A consumer purchase clearly needs one.

Three meaningfully different options:

| Protocol | What it does | Best for |
| --- | --- | --- |
| ACP (OpenAI plus Stripe) | A structured flow: cart format, order confirmation, fulfillment status, dispute escalation, refunds. The merchant stays the merchant of record. | Consumer shopping: retail goods, subscriptions, physical fulfillment. Powers ChatGPT Instant Checkout. |
| UCP (Google) | A similar lifecycle, built around Google's shopping surfaces (Gemini, Google AI Mode) and Google Pay | Merchants on Google Shopping; agents on Google AI surfaces |
| Direct API (machine-to-machine) | No commerce protocol at all, just an HTTP API. Payment via x402 or MPP. No cart, no disputes, no refunds. | API access, compute, data feeds: purchases where the thing bought is a stateless API response |

So ACP and UCP compete; "direct API" is the absence of this layer, not a third competitor. A consumer platform picks ACP or UCP (or both, if it spans ChatGPT and Gemini). An API marketplace doesn't pick at this layer at all, because its purchases have no lifecycle to manage.

The part most engineers underestimate is refunds and disputes. A customer who orders the wrong size t-shirt expects to send it back. The commerce protocol has to say how the return starts, how the agent hears about it, and how the refund flows back through settlement. ACP gets this right by keeping the merchant as merchant of record: existing dispute machinery (Stripe's chargeback flows, the retailer's return policy) just works. Direct-API approaches get it wrong by ignoring it. There's often no refund path at all, which is fine for a $0.0001 API call and wrong for $500 of API credits.

Commerce flows usually need several tools working in sequence:

The acp_client calls below are illustrative; they stand in for an ACP commerce backend. The Pydantic models, @function_tool, and Agent wiring are real.

from agents import Agent, function_tool
from pydantic import BaseModel
from decimal import Decimal
from .models import PaymentToolResult, RefundResult  # shared result models, defined in Part 3

class CartItem(BaseModel):
    sku: str
    quantity: int
    unit_price: Decimal

class OrderResult(BaseModel):
    order_id: str
    status: str  # "confirmed", "fulfilled", "shipped", "delivered"
    tracking_url: str | None = None
    estimated_arrival: str | None = None

@function_tool
async def acp_create_cart(merchant_id: str, items: list[CartItem]) -> PaymentToolResult:
    """Create a cart at an ACP merchant. Does NOT charge yet."""
    cart = await acp_client.cart.create(merchant_id=merchant_id, items=items)
    return PaymentToolResult(
        status="success",
        details={"cart_id": cart.id, "merchant_id": merchant_id, "item_count": len(items)},
    )

@function_tool
async def acp_checkout(cart_id: str, spt_token: str) -> OrderResult:
    """Complete the checkout for a previously-created cart."""
    return await acp_client.checkout.complete(cart_id=cart_id, spt_token=spt_token)

@function_tool
async def acp_check_order_status(order_id: str) -> OrderResult:
    """Get the current status of an order. The agent calls this to follow up."""
    return await acp_client.order.status(order_id=order_id)

@function_tool
async def acp_initiate_refund(order_id: str, reason: str) -> RefundResult:
    """Start a refund for an order. Returns refund_id for follow-up."""
    response = await acp_client.refund.create(order_id=order_id, reason=reason)
    return RefundResult(
        refund_id=response.refund_id,
        order_id=order_id,
        status=response.status,
        amount_refunded_usd=response.amount_refunded_usd,
    )

shopping_agent = Agent(
    name="ShoppingAgent",
    instructions="Help the user shop. Create the cart first, confirm items with the user, then check out. Handle refund requests with an ACP refund.",
    tools=[acp_create_cart, acp_checkout, acp_check_order_status, acp_initiate_refund],
)

The question to ask here is whether your use case needs a commerce lifecycle at all. If it does (cart, refund, dispute), pick ACP for ChatGPT reach or UCP for Google reach, or both. If it doesn't, skip this layer and go straight from authorization to settlement. The two ways to get this wrong: forcing a consumer-commerce protocol onto a machine-to-machine call (too much), or skipping commerce on a real consumer purchase and then re-implementing disputes badly later (too little).

Bottom line of Concept 6: Commerce handles the full purchase lifecycle: cart, checkout, fulfillment, dispute, refund. ACP and UCP compete for consumer flows; machine-to-machine access has no commerce layer at all. Refund and dispute mechanics are the hidden weight that separates a real commerce protocol from a payment-only one. The SDK wires commerce as a sequence of tools (cart, checkout, status, refund). Pick by use case: lifecycle needed (ACP/UCP) or skipped (direct API).

Concept 7: Layer 4, Settlement (money actually moves)

In one line: Settlement is where dollars actually change custody, and the pick is mostly about the transaction's economics.

Settlement has one job: move value from buyer to seller. Everything above this layer is choreography; settlement is where dollars (or stablecoins, or whatever the unit is) actually change hands. The agent has only completed a transaction once settlement completes.

Four serious options, each with its own economics and limits:

| Protocol | How it works | Economics | Best for |
| --- | --- | --- | --- |
| x402 | HTTP-native; settles via a stablecoin transfer on Base, Solana, or other EVM chains; on EVM chains the transfer is authorized with an EIP-3009 signature | Sub-cent gas, 1-2 second finality, no protocol fees | Machine-to-machine micropayments; high-frequency, low-value (API access, per-call billing) |
| MPP | Sessions: the agent pre-authorizes a cap and duration, then streams metered payments. Multi-rail (stablecoin on Tempo, Lightning, cards) | Stripe fees on card rails; near-zero on stablecoin; subscription-friendly | Enterprise and multi-rail flows; recurring subscriptions; cases needing both fiat and crypto in one envelope |
| Card rails (Stripe/Adyen/Worldpay) | Visa, Mastercard, Amex via Stripe; the agent presents an SPT (ACP) or Payment Mandate (AP2); the processor charges the card | Around 2.9% plus $0.30 for cards; established dispute machinery | Consumer flows; transactions with chargeback exposure; international card acceptance |
| Bank transfer / Lightning | ACH, SEPA, Bitcoin Lightning | ACH ~$0.25 fixed; Lightning sub-cent; SEPA ~€0.20 | High-value flows where 2.9% card fees hurt; cross-border micropayments via Lightning |

The settlement pick is mostly about money. Sub-dollar payments go to x402, where stablecoin gas is sub-cent. Consumer purchases up to about $1,000 go to card rails via ACP, where chargeback protection is worth the 2.9%. Recurring subscriptions go to MPP sessions. Large B2B transfers go to bank rails or Lightning. And the layers above usually force the pick: ACP at commerce mostly means card rails at settlement; an x402-paywalled MCP server at discovery mostly means x402 at settlement.

Watch for the headline-number trap. Pieces citing x402's transaction volume or MPP's integration count can mislead. The right question isn't "which has the most volume?" It's "which fits this transaction's economics?" A $0.001 API call settled on card rails costs more in fees than the call itself. A $5,000 procurement settled on x402 stablecoin throws away the chargeback protection a card would give you.
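The fee arithmetic behind that claim is short enough to run. This is a minimal sketch that assumes the table's approximate card pricing (2.9% plus $0.30); the constants and the card_fee_usd helper are illustrative, not any processor's actual rate card.

```python
from decimal import Decimal

CARD_PERCENT = Decimal("0.029")   # 2.9%, the table's approximate card rate
CARD_FIXED_USD = Decimal("0.30")  # fixed per-transaction fee


def card_fee_usd(amount_usd: Decimal) -> Decimal:
    """Approximate card-rail fee for a single transaction."""
    return amount_usd * CARD_PERCENT + CARD_FIXED_USD


for amount in (Decimal("0.001"), Decimal("5000")):
    fee = card_fee_usd(amount)
    ratio = fee / amount  # fee as a multiple of the payment itself
    print(f"payment ${amount}: fee ${fee} ({ratio:.2f}x the payment)")
```

At these assumed rates the $0.001 call carries about $0.30 in fees, roughly 300 times the payment, while the $5,000 purchase pays $145.30, about 2.9%. The fixed component is what makes card rails unworkable for micropayments.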

Settlement usually fires as a side effect of a higher-layer tool: the agent rarely calls settle_payment() directly; settlement happens inside acp_checkout() or x402_fetch(). But spend limits at the SDK level still matter:

The x402_client.get call below is illustrative; it stands in for a buyer-side x402 fetch. The @function_tool wiring and the spend check are real Python.

from agents import function_tool, RunContextWrapper
from decimal import Decimal
from .models import X402PaymentResult, PaymentToolResult

@function_tool
async def x402_fetch(
    ctx: RunContextWrapper,
    url: str,
    max_payment_usdc: Decimal,
) -> X402PaymentResult | PaymentToolResult:
    """Fetch a paid URL via x402. Settlement is automatic if cost <= max_payment_usdc."""
    # Check the SDK-level spend tracker before initiating the request
    spent_so_far = Decimal(str(ctx.context.get("session_x402_spend_usdc", Decimal(0))))
    session_cap = ctx.context["user_session"].x402_session_cap_usdc
    if spent_so_far + max_payment_usdc > session_cap:
        return PaymentToolResult(
            status="rejected",
            error=f"Would exceed session spend cap (already spent ${spent_so_far})",
        )

    # Initiate the x402 flow: the server returns 402, the agent retries with a signed payment
    response = await x402_client.get(url, max_payment_usdc=max_payment_usdc)

    # Update the spend tracker, kept in ctx.context for cross-tool visibility
    ctx.context["session_x402_spend_usdc"] = spent_so_far + response.amount_paid_usdc
    return X402PaymentResult(
        content=response.content,
        amount_paid_usdc=response.amount_paid_usdc,
        tx_hash=response.tx_hash,
    )

One thing to be clear about: the if spent_so_far + ... check is a soft guard inside the tool body, useful for a fast, friendly failure but not the real safety. The safety that actually protects you is the agent's smart-contract wallet. Even if you deleted the in-tool check, the wallet caps from Concept 15 would still reject the on-chain transfer. The tool's check is for UX; the wallet caps are the safety.
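A toy model makes the two levels concrete. The sketch below is plain Python, not wallet or SDK code: CappedWallet, WalletCapExceeded, and pay are hypothetical stand-ins for a smart-contract wallet with a hard cap and a tool with a soft check in front of it.

```python
from decimal import Decimal


class WalletCapExceeded(Exception):
    """Raised by the (hypothetical) wallet when a transfer would break its cap."""


class CappedWallet:
    """Toy stand-in for a smart-contract wallet with a hard session cap."""

    def __init__(self, cap_usdc: Decimal) -> None:
        self.cap_usdc = cap_usdc
        self.spent_usdc = Decimal("0")

    def transfer(self, amount_usdc: Decimal) -> None:
        # Hard limit: enforced here no matter what the caller checked first.
        if self.spent_usdc + amount_usdc > self.cap_usdc:
            raise WalletCapExceeded(f"cap of {self.cap_usdc} USDC would be exceeded")
        self.spent_usdc += amount_usdc


def pay(wallet: CappedWallet, amount_usdc: Decimal, soft_check: bool = True) -> str:
    # Soft guard: a fast, friendly failure that is easy to delete or bypass.
    if soft_check and wallet.spent_usdc + amount_usdc > wallet.cap_usdc:
        return "rejected by tool check"
    try:
        wallet.transfer(amount_usdc)
    except WalletCapExceeded:
        return "rejected by wallet"
    return "paid"


wallet = CappedWallet(cap_usdc=Decimal("1.00"))
print(pay(wallet, Decimal("0.40")))                    # within the cap
print(pay(wallet, Decimal("0.90")))                    # the soft guard catches it
print(pay(wallet, Decimal("0.90"), soft_check=False))  # the wallet still refuses
```

Turning the soft check off changes only which layer says no. The transfer is refused either way, and the wallet's balance of spent funds stays at 0.40.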

The move here is to pick the rail that matches the transaction's economics, then confirm it's compatible with your commerce choice. Sub-dollar machine-to-machine goes to x402. Consumer purchases go to card rails via ACP. Enterprise subscriptions go to MPP. Don't pick settlement in isolation; pick it as the bottom of a composed stack.

Bottom line of Concept 7: Layer 4 (Settlement) is where money actually moves. Four options compete. x402 for machine-to-machine stablecoin payments. MPP for multi-rail sessions. Card rails for consumer purchases that need chargebacks. Bank or Lightning for high-value or very-low-fee cases. The choice is mostly about money: sub-dollar goes to x402, consumer goes to cards, enterprise goes to MPP. The SDK runs settlement as a side effect of higher-layer tools. It caps spend at the run and session level with RunContextWrapper, and the wallet's on-chain caps are the layer nothing can bypass.


Part 3: The four protocols in depth, with OpenAI Agents SDK integration

Parts 1 and 2 set the framing: four layers, several protocols per layer, the SDK as the universal client. Part 3 walks each of the four headline protocols up close. For each one you get what it is, its key primitives, the SDK integration code, and short notes on how teams deploy and run it.

Read these as four parallel deep-dives. Each protocol has the same shape, so you can compare them side by side.

How the OpenAI Agents SDK wires into the four protocol layers. At the top, the SDK with three tools: the function-tool decorator, the run-context wrapper, and the tool input guardrail. In the middle, four boxes show which SDK piece connects to each layer: discovery through MCP servers and tools, authorization through function tools and run context, commerce through sequences of function tools, settlement as a side effect of commerce tools plus guardrail spend caps. At the bottom, the three-level spend-limit stack: Level 1 wallet and payment-method caps that nothing can bypass, Level 2 the SDK tool input guardrail that runs before the tool executes (not the output guardrail, which fires too late), Level 3 application business rules. The SDK is not a protocol; it is the orchestrator that composes protocols cleanly.

Every protocol in Concepts 8 through 11 wires into the SDK through one of the patterns in this diagram. The three-level spend-limit stack at the bottom previews Concept 15 in Part 6. Keep it in the corner of your eye as you read: the safety discipline it shows is what turns this code into something you can actually deploy.

🧰 Pydantic as the contract layer: read once, applies to every protocol below

Every code sample in Part 3 uses Pydantic models, not plain Python dicts, for protocol payloads, tool returns, FastAPI request and response bodies, and Inngest event payloads. This is the part you can't skip. It is what holds the whole system together.

Four boundaries get crossed in a typical agent-commerce flow:

  1. The SDK's @function_tool returns a value to the agent's reasoner.
  2. That value crosses the wire to a protocol endpoint (ACP, AP2, x402, MPP).
  3. The protocol returns a response that crosses back.
  4. Sometimes a webhook arrives later and crosses into a FastAPI handler.

At each boundary, an untyped dict can quietly lose fields, drop coercions, or ship the wrong shape. Pydantic models catch these failures at all four boundaries, with errors that point at the exact field.

Here is the pattern that recurs in every concept below:

from pydantic import BaseModel, Field
from decimal import Decimal
from typing import Literal
from agents import function_tool, RunContextWrapper

class CartItem(BaseModel):
    sku: str
    quantity: int = Field(ge=1)
    unit_price_usd: Decimal

class CheckoutRequest(BaseModel):
    merchant_id: str
    items: list[CartItem]
    max_total_usd: Decimal = Field(gt=0)

class CheckoutResult(BaseModel):
    order_id: str
    status: Literal["confirmed", "failed", "pending_user_confirmation"]
    total_charged_usd: Decimal
    estimated_delivery: str | None = None

@function_tool
async def acp_checkout(ctx: RunContextWrapper, request: CheckoutRequest) -> CheckoutResult:
    # Pydantic has already validated the request shape before this line runs.
    # Returning a CheckoutResult means the agent's reasoner gets typed feedback.
    ...

Three concrete reasons it matters at this scale:

  1. The reasoner uses the return type as feedback. When acp_checkout returns a typed CheckoutResult, the agent's next reasoning step gets clean field names and types, not a stringified dict. Tool-selection accuracy goes up measurably.
  2. FastAPI uses Pydantic natively. Stripe webhooks, AP2 mandate callbacks, and MPP session events all deserialize into Pydantic models in the FastAPI handler: the same models the agent's tools return. One contract, two endpoints.
  3. Inngest events carry Pydantic payloads. When a FastAPI webhook handler fires an Inngest event, the payload is a Pydantic model. The suspended step.wait_for_event receives the typed payload directly. No JSON parsing in the workflow.

Decimal for money, always. Every monetary amount in this course uses Decimal, never float. Floating-point math on money loses precision in ways that compound across thousands of micropayments. The Stripe API takes and returns amounts as integers in the smallest currency unit (cents for USD), so convert Decimal to integer cents on the way out and back to Decimal on the way in. x402 amounts arrive as on-chain integers (USDC has 6 decimal places) that you wrap in Decimal. Money stays Decimal everywhere it isn't being serialized to a wire format.
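A short sketch of what that looks like at the edges, using only the standard library. The helper names (usd_to_cents, usdc_from_base_units) are hypothetical, and half-up rounding is an assumption; use whatever rounding rule your finance policy specifies.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent most decimal fractions exactly.
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

USDC_DECIMALS = 6  # USDC amounts on-chain are integers with 6 decimal places


def usdc_from_base_units(units: int) -> Decimal:
    """Wrap an on-chain integer amount in a Decimal number of USDC."""
    return Decimal(units) / Decimal(10**USDC_DECIMALS)


def usd_to_cents(amount_usd: Decimal) -> int:
    """Serialize a Decimal dollar amount to integer cents for a wire format."""
    return int((amount_usd * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(usdc_from_base_units(1_500))    # 1,500 base units of USDC
print(usd_to_cents(Decimal("19.99")))
```

Build Decimal values from strings or integers, never from floats: Decimal(0.1) inherits the float's representation error, while Decimal("0.1") is exact.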

The shared result models. Rather than redefine result types in every concept, the course defines a small set of result models once. Every code block below imports them by name. Put these in your models.py:

from pydantic import BaseModel, Field
from decimal import Decimal
from typing import Literal
from datetime import datetime

# --- Tool-result models (returned by @function_tool functions) ---

class PaymentToolResult(BaseModel):
    """Generic envelope for any payment-related tool action."""
    status: Literal["success", "failed", "rejected", "pending"]
    error: str | None = None
    details: dict | None = None  # protocol-specific details

class MandateResult(BaseModel):
    """Result of creating an AP2 mandate (Intent / Cart / Payment)."""
    mandate_id: str | None = None
    status: Literal["signed", "declined", "pending", "failed"]
    expires_at: str | None = None  # ISO 8601
    error: str | None = None

class OrderStatusResult(BaseModel):
    """Result of fetching ACP order status."""
    order_id: str
    status: Literal["confirmed", "shipped", "delivered", "cancelled", "refunded", "pending"]
    tracking_url: str | None = None
    estimated_delivery: str | None = None

class RefundResult(BaseModel):
    """Result of initiating an ACP refund."""
    refund_id: str
    order_id: str
    status: Literal["initiated", "processing", "completed", "failed"]
    amount_refunded_usd: Decimal | None = None

class DiscoveryResult(BaseModel):
    """Result of an Agent.market or similar agent-directory search."""
    service_id: str
    name: str
    description: str
    price_per_call_usdc: Decimal
    endpoint_url: str

class X402PaymentResult(BaseModel):
    """Result of an x402-paid fetch."""
    content: str
    amount_paid_usdc: Decimal
    tx_hash: str | None = None

class MPPSessionResult(BaseModel):
    """Result of creating or closing an MPP session."""
    session_id: str
    status: Literal["active", "closed", "expired", "failed"]
    total_charged_usd: Decimal | None = None  # only populated on close
    expires_at: datetime | None = None
    rail_breakdown: dict[str, Decimal] | None = None  # only on close

class MPPMeteredCallResult(BaseModel):
    """Result of a metered call within an active MPP session."""
    session_id: str
    cost_usd: Decimal
    response_payload: dict
    accumulated_session_spend_usd: Decimal

# --- FastAPI handler models (request/response bodies for webhooks/callbacks) ---

class WebhookAck(BaseModel):
    """Generic webhook handler ack."""
    received: bool = True
    event_id: str | None = None

# --- Inngest workflow result models (return types of @inngest_client functions) ---

class WorkflowResult(BaseModel):
    """Generic workflow completion envelope."""
    status: Literal["completed", "abandoned", "failed", "partial"]
    reason: str | None = None
    output: dict | None = None

These models recur through Part 3 and Part 6. The code blocks below import them by name and never redefine them. You will see the import pattern repeated in every concept; this sidebar does not repeat.

Concept 8: ACP (Agentic Commerce Protocol), the consumer-shopping protocol

In one line: ACP is how an agent completes a real checkout at a real merchant on a person's behalf, with the merchant still on the hook for the sale.

What it is. ACP is an open specification built by OpenAI and Stripe, launched September 29, 2025 with Etsy and Shopify as launch partners. By early 2026 it powers ChatGPT Instant Checkout across Shopify-integrated merchants and several large retail brands. Merchant counts and brand lists vary by source, so verify against the ACP integration directory for current adoption. The protocol is Apache 2.0, governed through a Specification Enhancement Proposal process at github.com/agentic-commerce-protocol/agentic-commerce-protocol. The repository marks the spec as beta, so expect drift between course examples and the live spec.

Where it sits. ACP lives mostly at Layer 3 (Commerce) and reaches into Layer 2 (Authorization) through its token mechanism. It covers cart formation, checkout, order management, fulfillment status, and refund mechanics. The merchant stays the merchant of record: chargebacks, returns, and customer service all flow through the merchant's existing systems. (Merchant of record is the business legally on the hook for a transaction.)

The two primitives that matter.

  1. Shared Payment Token (SPT). A one-time token from the payment processor (Stripe in the reference build), locked to one merchant, one amount cap, one short time window, and usually a single use. If an agent cleared for $50 tries to spend $1,000, the SPT just fails at the protocol level. The SPT is how ACP keeps the agent inside the user's authorized scope.
  2. Cart Mandate (via the AP2 extension). ACP can compose with AP2 for extra audit rigor: the user signs a Cart Mandate before the agent submits the SPT. This is optional in ACP but increasingly common in regulated flows. (A mandate is a signed proof that a human authorized a specific kind of spending.)
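To make the SPT's scoping concrete, here is a minimal sketch of the checks a scoped, single-use token implies. ScopedToken and redeem are hypothetical names in plain Python; they model the rules described above and are not the real Stripe object or the ACP wire format.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class ScopedToken:
    """Hypothetical model of an SPT's scope; not the real Stripe object."""
    merchant_id: str
    amount_cap_usd: Decimal
    expires_at: float  # Unix timestamp
    uses_left: int = 1


def redeem(token: ScopedToken, merchant_id: str, amount_usd: Decimal, now: float) -> str:
    """Return "ok" and consume one use, or name the scope rule that failed."""
    if merchant_id != token.merchant_id:
        return "wrong_merchant"
    if now > token.expires_at:
        return "expired"
    if token.uses_left < 1:
        return "already_used"
    if amount_usd > token.amount_cap_usd:
        return "over_cap"
    token.uses_left -= 1
    return "ok"


# A token cleared for $50 at one merchant, valid for ten minutes from t=1000.
spt = ScopedToken("shoe-store", Decimal("50"), expires_at=1_000.0 + 600)
print(redeem(spt, "shoe-store", Decimal("1000"), now=1_000.0))  # over the $50 cap
print(redeem(spt, "shoe-store", Decimal("45"), now=1_000.0))    # inside every bound
print(redeem(spt, "shoe-store", Decimal("45"), now=1_000.0))    # the single use is spent
```

A rejected attempt does not consume the token in this sketch; whether a real processor burns the token on a failed redemption is a detail to confirm against the live spec.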

The OpenAI Agents SDK integration. ACP is the most production-ready protocol for the SDK, because OpenAI co-built it. The Stripe Python SDK plus a thin ACP client wrapper give you the full integration.

The acp_client calls and stripe.PaymentTokens.create(...) below are illustrative: they show the shape an ACP integration takes. The real SPT minting goes through live Stripe ACP endpoints, not a stripe.PaymentTokens method. The agent wiring, the guardrail, and the result models are real and run today. See the note after the block.

from agents import Agent, Runner, function_tool, RunContextWrapper
from agents.tool_guardrails import (
    tool_input_guardrail,
    ToolInputGuardrailData,
    ToolGuardrailFunctionOutput,
)
from pydantic import BaseModel
from decimal import Decimal
from typing import Literal
import json
import stripe
import time
from .models import OrderStatusResult, RefundResult

class CartItem(BaseModel):
    sku: str
    name: str
    quantity: int
    unit_price_usd: Decimal

class CheckoutResult(BaseModel):
    order_id: str
    status: Literal["confirmed", "failed", "pending_user_confirmation"]
    total_charged_usd: Decimal
    estimated_delivery: str | None = None

# Tool input guardrail: refuse the checkout if the user can't authorize the spend
@tool_input_guardrail
def verify_user_can_spend(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    args = json.loads(data.context.tool_arguments or "{}")
    max_total = Decimal(str(args.get("max_total_usd", 0)))
    merchant_id = args.get("merchant_id", "")
    user_session = data.context.context["user_session"]
    if not user_session.can_spend(max_total, merchant_id):
        return ToolGuardrailFunctionOutput.reject_content(
            f"User cannot authorize ${max_total} at merchant {merchant_id}"
        )
    return ToolGuardrailFunctionOutput.allow()

@function_tool
async def acp_browse_merchant(merchant_id: str, query: str) -> list[dict]:
    """Search a merchant's catalog via their ACP catalog endpoint."""
    response = await acp_client.catalog.search(
        merchant_id=merchant_id,
        query=query,
        limit=20,
    )
    return [item.model_dump() for item in response.items]

@function_tool(tool_input_guardrails=[verify_user_can_spend])
async def acp_create_cart_and_checkout(
    ctx: RunContextWrapper,
    merchant_id: str,
    items: list[CartItem],
    max_total_usd: Decimal,
) -> CheckoutResult:
    """Create a cart at the merchant and complete checkout in one transaction.
    Mints an SPT scoped to max_total_usd; the merchant verifies and charges.
    The verify_user_can_spend guardrail above has already validated authorization."""

    user_session = ctx.context["user_session"]

    # Mint the Shared Payment Token via Stripe (Decimal to cents via quantize)
    cents = int((max_total_usd * 100).quantize(Decimal("1")))
    spt = stripe.PaymentTokens.create(
        amount=cents,
        currency="usd",
        merchant_id=merchant_id,
        user_session_id=user_session.id,
        max_uses=1,
        expires_at=int(time.time()) + 600,  # 10-minute window
    )

    # Submit the cart and SPT to the merchant's ACP endpoint
    result = await acp_client.checkout.complete(
        merchant_id=merchant_id,
        items=[i.model_dump() for i in items],
        spt_token=spt.token,
    )

    # Update the user's spend tracker (Decimal in, Decimal out)
    user_session.record_spend(
        amount_usd=Decimal(str(result.total_charged_usd)),
        merchant_id=merchant_id,
        order_id=result.order_id,
    )

    return CheckoutResult(**result.model_dump())

@function_tool
async def acp_check_order(order_id: str) -> OrderStatusResult:
    """Get the current status of an ACP order (fulfillment, shipping, delivery)."""
    raw = await acp_client.orders.get(order_id=order_id)
    return OrderStatusResult(
        order_id=raw.order_id,
        status=raw.status,
        tracking_url=raw.tracking_url,
        estimated_delivery=raw.estimated_delivery,
    )

@function_tool
async def acp_refund(
    order_id: str,
    reason: str,
    amount_usd: Decimal | None = None,
) -> RefundResult:
    """Start a refund for an ACP order. Returns refund_id for follow-up.
    If amount_usd is None, a full refund is requested."""
    raw = await acp_client.refunds.create(
        order_id=order_id,
        reason=reason,
        amount_usd=amount_usd,
    )
    return RefundResult(
        refund_id=raw.refund_id,
        order_id=order_id,
        status=raw.status,
        amount_refunded_usd=raw.amount_refunded_usd,
    )

shopping_agent = Agent(
    name="ShoppingAgent",
    instructions="""Help the user shop at ACP-enabled merchants. Workflow:
1. Use acp_browse_merchant to find products matching the user's request
2. Present the matched items to the user (via reasoning, not a tool)
3. When the user confirms, use acp_create_cart_and_checkout to complete the purchase
4. Use acp_check_order to report order status when the user asks
5. Use acp_refund only when the user explicitly requests a return""",
    tools=[acp_browse_merchant, acp_create_cart_and_checkout, acp_check_order, acp_refund],
    model="gpt-5.5",
)

What is real here: the Agent, Runner, @function_tool, and tool_input_guardrail scaffolding, plus the typed result models. That is the part you reuse for every protocol. What is a stand-in: the acp_client and the stripe.PaymentTokens.create call, which represent the ACP-Stripe backend. In production you swap that backend for live Stripe ACP endpoints and the rest stays put. Here is the same shape with a runnable mock so you can see it work:

class MockACPClient:
    """Illustrative backend. Real ACP uses live Stripe ACP endpoints, not stripe.PaymentTokens."""
    async def checkout(self, merchant_id: str, amount_usd) -> dict:
        return {"order_id": "ord_mock_1", "status": "confirmed", "total_charged_usd": str(amount_usd)}

acp_client = MockACPClient()  # stands in for the ACP/Stripe backend

The standard harness (stated once here, referenced by the next three concepts). ACP runs on the plain cloud harness with no extra parts. The Stripe SDK runs inside the FastAPI handler; ACP calls are outbound HTTPS; SPTs live in the agent's run-scoped context, passed through RunContextWrapper. The one rule: keep Stripe API keys in a secret store (a key vault), not in environment variables, and rotate them on Stripe's schedule. No sandbox is needed (ACP runs no code) and no special storage is needed (orders persist in Stripe and the merchant's system, with optional shadow records for audit). This deployment baseline is the same for AP2, x402, and MPP. The next three concepts say only what each one adds. Cloud deployment is covered in the deploying-agents crash course (in progress).

Running it durably (Inngest). ACP transactions are short, usually 5 to 30 seconds end to end. They fit cleanly into Inngest step.run blocks. The step.wait_for_event primitive earns its keep when the agent must pause for the user to confirm the cart between cart creation and checkout (the human-in-the-loop pattern). The Production Worker crash course covers this durability layer in depth; here is the shape:

import inngest
from datetime import timedelta
from .models import WorkflowResult

@inngest_client.create_function(
    fn_id="shopping-workflow",
    trigger=inngest.TriggerEvent(event="shopping/checkout.requested"),
    concurrency=[inngest.Concurrency(limit=5, key="event.data.user_id")],
)
async def shopping_workflow(ctx: inngest.Context) -> dict:
    # Run the agent to produce the cart proposal
    cart = await ctx.step.run(
        "agent-builds-cart", build_cart_fn, ctx.event.data["user_query"],
    )

    # Wait for the user to confirm the proposed cart (human-in-the-loop)
    confirmation = await ctx.step.wait_for_event(
        "wait-for-user-confirm",
        event="shopping/cart.confirmed",
        if_exp=f"async.data.cart_id == '{cart['cart_id']}'",
        timeout=timedelta(minutes=15),
    )
    if confirmation is None:  # timeout returns None
        return {"status": "abandoned", "reason": "user did not confirm in time"}

    # The user confirmed; complete checkout with the now-valid SPT
    result = await ctx.step.run("complete-checkout", complete_checkout_fn, cart["cart_id"])
    return {"status": "completed", "output": result}

A couple of details that trip people up: wait_for_event matches the incoming payload with if_exp (a small expression), and the awaited fields live under the async. prefix. A timeout returns None, so check for it before continuing.

Where teams get this wrong. They skip the user-confirmation step. ACP supports both "user confirms each cart" and "user pre-authorized this kind of purchase." Teams often pick the second one for speed, then find that a small SPT misconfiguration lets the agent buy slightly wrong items with no way to recover. Default to confirming the cart for the first month in production. Relax it only once you have measured how often the agent gets the cart right.

Bottom line of Concept 8: ACP is the production-ready commerce protocol for consumer shopping with the OpenAI Agents SDK. The SPT scopes the agent's spending; the merchant stays the merchant of record. You integrate it as four @function_tool functions (browse, checkout, status, refund) over the Stripe SDK and a thin ACP client. Inngest's step.wait_for_event builds the user-confirmation gate, and the plain cloud harness is enough. Default to confirming carts until you have measured the agent's cart accuracy.

Concept 9: AP2 (Agent Payments Protocol), the authorization layer

In one line: AP2 produces signed proofs that a human allowed the spending; it does not move the money itself, it proves the money was allowed to move.

What it is. AP2 is an open specification from Google with 60+ partners, launched September 2025 (latest version v0.2.0, April 2026). Apache 2.0, maintained at github.com/google-agentic-commerce/AP2, with reference implementations in Python, TypeScript, Kotlin, and Go. AP2 is the authorization layer, not a commerce or settlement protocol. It produces signed mandates that prove an agent is authorized to spend, then leaves the actual settlement to whatever rail fits (cards, bank, or x402 through the a2a-x402 extension).

Where it sits. AP2 lives at Layer 2 (Identity and Authorization). It builds on two protocols underneath: A2A (Agent2Agent, for agent-to-agent messaging) and MCP (for tool exposure). An AP2 mandate travels as a signed credential over A2A or attached to an MCP tool call.

The three primitives that matter: the three mandate types.

| Mandate | When it's created | What it proves |
| --- | --- | --- |
| Intent Mandate | At the start of the task, signed by the user in their UI | The user allowed the agent to act within set rules (price limits, time windows, allowed merchants) |
| Cart Mandate | After the agent has built a specific cart, signed by the user before checkout (human-present flows) | The user approved this exact cart at this exact price |
| Payment Mandate | At the moment of payment, signed by the user or auto-generated against the Intent Mandate | The user authorized this exact payment on this exact rail |

The audit trail. The three mandates form a chain the signer cannot later deny: Intent ("buy shoes under $120") leads to Cart ("these shoes at $110") leads to Payment ("charge this stablecoin wallet"). Each mandate points back to the one before it. If any step is challenged later in a dispute or a fraud claim, the whole chain is auditable. That property is called non-repudiation: "I never authorized this" does not hold up against the user's own signature. This is why AP2 fits regulated industries like healthcare and financial services, where "did the user actually authorize this?" carries legal weight.

What changes for the SDK. AP2 has no first-class place in the OpenAI Agents SDK; its reference builds use Google's Agent Development Kit. You wire it in as @function_tool functions that create, sign, validate, and dispatch mandates. The harness is the same one from Concept 8. The one thing AP2 adds: a signing surface where the user actually signs each mandate.

The from ap2 import ... imports, the MandateSigner, and the ap2_x402 calls below are illustrative. AP2's real Python package is ap2, with mandate models under ap2.types.mandate, and its real fields differ from the simplified principal_did / agent_did / rules shown here. There is no MandateSigner class; real signing uses verifiable credentials over A2A. The mandate-chain concept is real; this exact code is a teaching stand-in. The SDK scaffolding and the typed results are real.

from agents import Agent, function_tool, RunContextWrapper
from agents.tool_guardrails import (
    tool_input_guardrail,
    ToolInputGuardrailData,
    ToolGuardrailFunctionOutput,
)
from ap2 import IntentMandate, CartMandate, PaymentMandate, MandateSigner
from pydantic import BaseModel
from decimal import Decimal
from datetime import datetime, timezone
from .models import MandateResult, PaymentToolResult

class IntentRules(BaseModel):
    max_total_usd: Decimal
    allowed_merchants: list[str] | None = None
    allowed_categories: list[str] | None = None
    expires_at: str  # ISO 8601 datetime

# Guardrail: refuse any cart that has no preceding Intent Mandate
@tool_input_guardrail
def require_intent_mandate(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    intent = data.context.context.get("intent_mandate")
    if not intent:
        return ToolGuardrailFunctionOutput.reject_content(
            "No Intent Mandate found. Create one via ap2_create_intent_mandate first."
        )
    return ToolGuardrailFunctionOutput.allow()

@function_tool
async def ap2_create_intent_mandate(
    ctx: RunContextWrapper,
    task_description: str,
    rules: IntentRules,
) -> MandateResult:
    """Create an Intent Mandate at the start of a purchasing task.
    Needs the user's signature, so call this BEFORE the agent shops."""

    user_session = ctx.context["user_session"]
    mandate = IntentMandate(
        principal_did=user_session.did,  # decentralized identifier
        agent_did=ctx.context["agent_did"],
        task=task_description,
        rules=rules.model_dump(),
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
        issued_at=datetime.now(timezone.utc).isoformat(),
    )

    # Send to the user's signing UI; block until the user signs or rejects
    signed = await user_session.signer.request_signature(
        mandate,
        ui_prompt="Approve this shopping task?",
        timeout_seconds=300,
    )
    if not signed:
        return MandateResult(status="declined", error="User declined to sign Intent Mandate")

    # Store the mandate for later reference by Cart/Payment mandates
    ctx.context["intent_mandate"] = signed
    return MandateResult(mandate_id=signed.id, status="signed", expires_at=rules.expires_at)

@function_tool(tool_input_guardrails=[require_intent_mandate])
async def ap2_create_cart_mandate(
    ctx: RunContextWrapper,
    cart_items: list[dict],
    total_usd: Decimal,
    merchant_id: str,
) -> MandateResult:
    """Create a Cart Mandate that references the current Intent Mandate.
    Needs the user's signature in human-present flows.
    The require_intent_mandate guardrail above has verified an Intent Mandate exists."""

    intent = ctx.context["intent_mandate"]  # guaranteed present by the guardrail

    # Check that the cart fits inside the Intent Mandate's rules
    if total_usd > Decimal(str(intent.rules["max_total_usd"])):
        return MandateResult(
            status="failed",
            error=f"Cart total ${total_usd} exceeds Intent Mandate cap ${intent.rules['max_total_usd']}",
        )

    cart_mandate = CartMandate(
        parent_intent_id=intent.id,
        cart_items=cart_items,
        total_usd=str(total_usd),  # serialize Decimal as a string on the wire
        merchant_id=merchant_id,
        issued_at=datetime.now(timezone.utc).isoformat(),
    )

    user_session = ctx.context["user_session"]
    signed = await user_session.signer.request_signature(
        cart_mandate,
        ui_prompt=f"Approve this cart for ${total_usd}?",
        timeout_seconds=300,
    )
    if not signed:
        return MandateResult(status="declined", error="User declined to sign Cart Mandate")

    ctx.context["cart_mandate"] = signed
    return MandateResult(mandate_id=signed.id, status="signed")

@function_tool
async def ap2_settle_via_x402(
    ctx: RunContextWrapper,
    merchant_x402_url: str,
) -> PaymentToolResult:
    """Use the AP2 a2a-x402 extension to settle the current Cart Mandate via an x402 stablecoin payment."""
    cart = ctx.context.get("cart_mandate")
    if not cart:
        return PaymentToolResult(status="failed", error="No Cart Mandate to settle")

    # The a2a-x402 extension generates a Payment Mandate authorizing the x402 transfer
    payment_mandate = await ap2_x402.create_payment_mandate(
        cart_mandate=cart,
        rail="x402",
        chain="eip155:8453",  # Base
        asset="USDC",
    )

    # Dispatch the x402 payment with the Payment Mandate attached as proof
    result = await x402_client.pay(
        url=merchant_x402_url,
        amount_usdc=Decimal(str(cart.total_usd)),
        payment_mandate=payment_mandate,
    )
    return PaymentToolResult(status="success", details=result.model_dump())

procurement_agent = Agent(
    name="ProcurementAgent",
    instructions="""Enterprise procurement workflow:
    1. ALWAYS create an Intent Mandate first via ap2_create_intent_mandate
    2. Search merchants for matching items
    3. Build a cart and create a Cart Mandate via ap2_create_cart_mandate
    4. Settle via x402 (stablecoin) using ap2_settle_via_x402, or via card rails
    5. Record every mandate ID in the procurement system for audit""",
    tools=[ap2_create_intent_mandate, ap2_create_cart_mandate, ap2_settle_via_x402],
    model="gpt-5.5",
)

What is real: the same SDK scaffolding and typed results as Concept 8. What is a stand-in: the ap2 mandate classes, the MandateSigner, and the ap2_x402 extension. The mandate chain itself is a real idea you can build; the exact API here is simplified for teaching. Here is a runnable mock so the pattern executes:

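class MockSignedMandate:
    def __init__(self, mid, rules=None, total_usd=None):
        self.id, self.rules, self.total_usd = mid, (rules or {}), total_usd

class MockSigner:
    """Illustrative. Real AP2 signing uses verifiable credentials over A2A; no MandateSigner class exists."""
    async def request_signature(self, mandate, ui_prompt="", timeout_seconds=300):
        # Carry rules and total through so the later cap check and settlement can read them
        return MockSignedMandate(getattr(mandate, "id", "mandate_mock_1"),
                                 getattr(mandate, "rules", None), getattr(mandate, "total_usd", None))

class MockAP2X402:
    @staticmethod
    async def create_payment_mandate(**kwargs):  # async because ap2_settle_via_x402 awaits it
        return MockSignedMandate("pm_mock_1")
ap2_x402 = MockAP2X402()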

What it adds to the harness. Two things beyond the Concept 8 baseline: a signing surface for the user to sign mandates (a web app, a mobile app, or a notification-based signing tool), and durable storage for signed mandates that lasts as long as your dispute windows (seven years is a common retention period for financial records; confirm what your own regulator requires). The signing surface is the real delta from ACP: ACP can reuse existing login sessions, but AP2 needs a dedicated signature flow.

Running it durably (Inngest). Mandate signing is the textbook use of step.wait_for_event. The agent's function fires a "signing requested" event and suspends. When the user signs in the UI, a "signing signed" event fires and the function resumes. No compute is burned while it waits, which matters because an enterprise signature can take hours. The Production Worker crash course goes deep on this resume pattern.

Where teams get this wrong. They treat mandate creation as a checkout-time concern and sign the mandate after the agent has already done a lot of work. Create the Intent Mandate first, before the agent shops at all. That catches a scope mismatch early (the user wanted shoes but the Intent Mandate only allows office supplies) instead of after the agent has spent compute building a cart it can never fund.
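One way to catch the scope mismatch early is to check the task against the signed rules before the agent does any shopping. The sketch below assumes a simplified rule set (a cap, allowed categories, an expiry); IntentScope and scope_problems are hypothetical names, not part of AP2.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal

@dataclass(frozen=True)
class IntentScope:
    """Simplified stand-in for the rules carried by a signed Intent Mandate."""
    max_total_usd: Decimal
    allowed_categories: frozenset[str]
    expires_at: datetime

def scope_problems(scope: IntentScope, category: str, estimate_usd: Decimal, now: datetime) -> list[str]:
    """Return the reasons a task falls outside the signed intent; an empty list means in scope."""
    problems = []
    if category not in scope.allowed_categories:
        problems.append(f"category '{category}' is not allowed")
    if estimate_usd > scope.max_total_usd:
        problems.append(f"estimate ${estimate_usd} exceeds cap ${scope.max_total_usd}")
    if now >= scope.expires_at:
        problems.append("intent mandate has expired")
    return problems

scope = IntentScope(
    max_total_usd=Decimal("120"),
    allowed_categories=frozenset({"office-supplies"}),
    expires_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
now = datetime(2029, 6, 1, tzinfo=timezone.utc)
```

With this scope, scope_problems(scope, "shoes", Decimal("110"), now) reports the category mismatch before any cart exists, which is the cheap failure the paragraph above argues for.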

Bottom line of Concept 9: AP2 is the authorization layer for audit-heavy agent commerce. Three mandate types (Intent, Cart, Payment) form a chain the user cannot later deny, proving consent at every step. Reference implementations ship in Python, TypeScript, Kotlin, and Go via Google's Agent Development Kit; you integrate with the OpenAI Agents SDK as @function_tool functions that create, sign, and validate mandates. Inngest's step.wait_for_event is the natural fit for the signing wait. Create Intent Mandates first, before shopping begins, never as a checkout afterthought.

Concept 10: x402, the HTTP-native settlement protocol

In one line: x402 lets an agent pay for an API call in one to two seconds with a stablecoin, by reviving the old HTTP 402 "Payment Required" status code.

What it is. x402 turns the dormant HTTP 402 status code into a working payment layer for APIs and machine-to-machine commerce. It was created by Coinbase (May 2025), with V2 launched in December 2025, and is now stewarded under the Linux Foundation's x402 Foundation (April 2026), with Cloudflare, Stripe, AWS, Google, and others as members. Apache 2.0. As of early 2026, x402 reports over 100 million payments to date across Base and Solana, with a growing facilitator ecosystem; numbers move fast, so verify current figures at x402.org and Coinbase's x402 launch pages.

Where it sits. x402 lives mostly at Layer 4 (Settlement) for machine-to-machine flows, but it reaches into Layer 1 (through Agent.market and similar directories) and Layer 3 (it works as a full commerce layer for plain API access where there is no real purchase lifecycle). For pure machine-to-machine flows, x402 is often the only protocol you need.

The four primitives that matter.

  1. The HTTP 402 status code. When an unpaid client requests a paid resource, the server returns 402 Payment Required plus a header carrying the payment requirements: the scheme (exact for a fixed amount), the network (in CAIP-2 form like eip155:8453, which just means "Base"), the asset (USDC), the recipient address, the max amount, and an expiry. (CAIP-2 is a standard way to name a blockchain so the protocol stays chain-agnostic.)
  2. The payment authorization header. The client retries with a header carrying a signed payment authorization. The signature is made off-chain, so the buyer pays no gas.
  3. EIP-3009 (transferWithAuthorization). The Ethereum standard x402 is built on. It lets the buyer sign a payment off-chain that someone else submits on-chain, so the buyer never touches the blockchain directly or pays gas.
  4. Facilitator. An optional third party that checks the signature and submits the payment on-chain for the merchant, so the merchant runs no blockchain plumbing. Coinbase and Cloudflare both run facilitators.
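Since the network field arrives as a CAIP-2 chain ID, a buyer usually validates it before signing anything. A minimal sketch, assuming the CAIP-2 grammar of a short lowercase namespace, a colon, and a reference; the KNOWN_CHAINS lookup is a small illustrative table, not an official registry.

```python
import re

# CAIP-2 chain ID: "<namespace>:<reference>"
CAIP2_PATTERN = re.compile(r"^(?P<namespace>[-a-z0-9]{3,8}):(?P<reference>[-_a-zA-Z0-9]{1,32})$")

# Illustrative lookup only; extend it with the chains your agent is allowed to pay on
KNOWN_CHAINS = {"eip155:1": "Ethereum mainnet", "eip155:8453": "Base"}

def parse_caip2(chain_id: str) -> tuple[str, str]:
    """Split a chain ID into (namespace, reference), rejecting malformed input."""
    match = CAIP2_PATTERN.match(chain_id)
    if match is None:
        raise ValueError(f"not a CAIP-2 chain ID: {chain_id!r}")
    return match["namespace"], match["reference"]

def describe_chain(chain_id: str) -> str:
    """Return a human-readable name, or a description of the unknown chain."""
    namespace, reference = parse_caip2(chain_id)
    return KNOWN_CHAINS.get(chain_id, f"unknown chain {reference} in namespace {namespace}")
```

Rejecting any chain ID that is not on your own allowlist is a cheap guard against signing an authorization for a network the agent's wallet was never meant to use.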

A note on header naming: verify against current docs

x402 went through V1 and V2, and header names differ across spec versions, facilitators, and SDKs. Cloudflare's current docs use PAYMENT-REQUIRED on the 402 response, PAYMENT-SIGNATURE on the client's retry, and PAYMENT-RESPONSE on the success reply. Some earlier examples use X-PAYMENT and X-PAYMENT-PROOF. The flow is identical; only the wire names differ. Check the exact names against x402.gitbook.io and your facilitator's docs before writing code. The trace below uses the roles (request, 402 with requirements, signed retry, success with proof) without picking one naming convention.

The flow in one trace.

1. Agent: GET https://api.example.com/data
2. Server: 402 Payment Required
   <payment-required header>: { network: "eip155:8453", asset: USDC,
                                recipient: 0xMerchant..., max_amount: 100000,
                                expiry: 1716304800 }
3. Agent (signs an EIP-3009 authorization off-chain):
   GET https://api.example.com/data
   <payment-signature header>: <base64-encoded signed authorization>
4. Server: 200 OK
   <payment-response header>: <transaction hash>
   { data: ... }

The whole transaction takes one to two seconds. No account creation, no API key, no session, no human in the loop.
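The four steps of the trace can be simulated in-process to make the retry logic concrete. This is a sketch under loud assumptions: the server is a plain function, the header names are the role placeholders from the trace rather than any spec version's wire names, and an HMAC stands in for the EIP-3009 wallet signature. No chain, facilitator, or funds are involved.

```python
import base64
import hashlib
import hmac
import json

BUYER_KEY = b"demo-buyer-key"  # hypothetical; real x402 uses an EIP-3009 wallet signature
PRICE = 100000                 # smallest USDC units, matching max_amount in the trace

def server(headers: dict) -> tuple[int, dict, str]:
    """Return (status, headers, body). Unpaid or badly signed requests get 402 plus requirements."""
    requirements = {"network": "eip155:8453", "asset": "USDC", "max_amount": PRICE}
    token = headers.get("payment-signature")
    if token is None:
        return 402, {"payment-required": json.dumps(requirements)}, ""
    authorization = json.loads(base64.b64decode(token))
    message = json.dumps(authorization["payload"], sort_keys=True).encode()
    expected = hmac.new(BUYER_KEY, message, hashlib.sha256).hexdigest()
    paid_enough = authorization["payload"]["amount"] >= PRICE
    if not (hmac.compare_digest(expected, authorization["signature"]) and paid_enough):
        return 402, {"payment-required": json.dumps(requirements)}, ""
    return 200, {"payment-response": "0xmocktx"}, '{"data": "..."}'

def fetch_with_payment(max_amount: int) -> tuple[int, str]:
    """Client side: try unpaid, then sign and retry once if the price fits the budget."""
    status, headers, body = server({})
    if status != 402:
        return status, body
    requirements = json.loads(headers["payment-required"])
    if requirements["max_amount"] > max_amount:
        return 402, ""  # too expensive: do not sign
    payload = {"amount": requirements["max_amount"], "network": requirements["network"]}
    message = json.dumps(payload, sort_keys=True).encode()
    signature = hmac.new(BUYER_KEY, message, hashlib.sha256).hexdigest()
    token = base64.b64encode(json.dumps({"payload": payload, "signature": signature}).encode()).decode()
    status, headers, body = server({"payment-signature": token})
    return status, body
```

The design point worth noticing is the budget check before signing: the client reads the requirements from the 402 and refuses to produce a signature at all when the price exceeds what it was told it may spend.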

What changes for the SDK. x402 has the simplest story of the four. Cloudflare ships a helper that wraps an MCP client with x402 payment ability; for non-MCP use, a Python buyer-side library plus the canonical pattern below covers it. The harness is the same Concept 8 baseline.

The from x402_client import ... constructor shown here, the from cloudflare_agents import withX402Client import, and the response.amount_paid_usdc field are illustrative. The real buyer-side package is x402-client (from x402_client import X402Client, real constructor X402Client(account=...), .get(url) returns an httpx.Response). The richer wallet constructor shown here is a teaching stand-in, and a live call needs a funded account and a real 402 endpoint, which this course does not do. Cloudflare's withX402Client is a JavaScript helper; there is no Python cloudflare_agents package. The MCP server (MCPServerStreamableHttp) is real Python; wrapping it with payment is illustrative here. No real funds move.

from agents import Agent, function_tool, RunContextWrapper
from agents.mcp import MCPServerStreamableHttp
from agents.tool_guardrails import (
    tool_input_guardrail,
    ToolInputGuardrailData,
    ToolGuardrailFunctionOutput,
)
# Illustrative names (see the note above); X402PaymentRequired is assumed to ship with the client
from x402_client import X402Client, X402Wallet, X402PaymentRequired
from cloudflare_agents import withX402Client
from decimal import Decimal
import json
from .models import DiscoveryResult, X402PaymentResult, PaymentToolResult

# Pattern 1: wrap an MCP server's tools with x402 payment ability
research_mcp = MCPServerStreamableHttp(
    name="research-services",
    params={"url": "https://research-services.example.com/mcp"},
)
research_mcp_with_payments = withX402Client(
    research_mcp,
    wallet=agent_wallet,  # smart-contract wallet the agent controls
    max_per_call_usdc=Decimal("0.10"),
    max_per_session_usdc=Decimal("10.00"),
)

# Pattern 2: direct x402 calls as @function_tool
x402_client = X402Client(wallet=agent_wallet)

# Guardrail: refuse any x402 call that would exceed the session cap
@tool_input_guardrail
def enforce_x402_session_cap(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    args = json.loads(data.context.tool_arguments or "{}")
    max_payment = Decimal(str(args.get("max_payment_usdc", 0)))
    ctx = data.context.context
    spent = Decimal(str(ctx.get("session_x402_spend_usdc", 0)))
    cap = Decimal(str(ctx["user_session"].x402_session_cap_usdc))
    if spent + max_payment > cap:
        return ToolGuardrailFunctionOutput.reject_content(
            f"x402 session cap would be exceeded: ${spent} spent + ${max_payment} requested > ${cap}"
        )
    return ToolGuardrailFunctionOutput.allow()

@function_tool(tool_input_guardrails=[enforce_x402_session_cap])
async def x402_fetch(
    ctx: RunContextWrapper,
    url: str,
    max_payment_usdc: Decimal = Decimal("0.10"),
) -> X402PaymentResult | PaymentToolResult:
    """Fetch a URL that may require an x402 payment up to max_payment_usdc.
    Signs the EIP-3009 authorization and retries with the payment-signature header.
    The enforce_x402_session_cap guardrail above has already checked the session bound."""

    try:
        response = await x402_client.get(url, max_payment_usdc=max_payment_usdc)
        # Update the spend tracker for later guardrail checks
        current = Decimal(str(ctx.context.get("session_x402_spend_usdc", 0)))
        ctx.context["session_x402_spend_usdc"] = current + Decimal(str(response.amount_paid_usdc))
        return X402PaymentResult(
            content=response.content,
            amount_paid_usdc=Decimal(str(response.amount_paid_usdc)),
            tx_hash=response.tx_hash,
        )
    except X402PaymentRequired as e:
        return PaymentToolResult(
            status="rejected",
            error=f"Resource requires ${e.required_usdc}, exceeds max ${max_payment_usdc}",
        )

@function_tool
async def x402_search_agent_market(
    query: str,
    max_price_per_call_usdc: Decimal = Decimal("0.05"),
) -> list[DiscoveryResult]:
    """Search Agent.market for x402-paywalled services matching the query."""
    results = await agent_market_client.search(
        query=query,
        max_price_per_call_usdc=max_price_per_call_usdc,
    )
    return [
        DiscoveryResult(
            service_id=r.service_id,
            name=r.name,
            description=r.description,
            price_per_call_usdc=Decimal(str(r.price_per_call_usdc)),
            endpoint_url=r.endpoint_url,
        )
        for r in results
    ]

research_agent = Agent(
    name="ResearchAgent",
    instructions="""Research user queries by paying for data sources via x402.
    Workflow:
    1. Use x402_search_agent_market to find relevant paid services
    2. Use x402_fetch to pull data from selected services (max $0.10 per call)
    3. Synthesize the findings into a final report
    Stay under $10 per research session.""",
    tools=[x402_fetch, x402_search_agent_market],
    mcp_servers=[research_mcp_with_payments],
)

What is real: the SDK scaffolding, the typed results, and MCPServerStreamableHttp (the MCP server is genuine Python). What is a stand-in: the buyer-side x402 client as written here, the withX402Client wrapper (JavaScript only in reality), and agent_market_client. Here are runnable mocks so the harness pattern executes without real funds:

from decimal import Decimal

class MockX402Response:
    def __init__(self, content, amount, tx):
        self.content, self.amount_paid_usdc, self.tx_hash = content, amount, tx

class MockX402Client:
    """Illustrative. Real buyer-side package: x402-client (X402Client(account=...), .get() -> httpx.Response)."""
    async def get(self, url: str, max_payment_usdc: Decimal = Decimal("0.10")):
        return MockX402Response(content="<paid data>", amount=Decimal("0.02"), tx="0xmocktx")

x402_client = MockX402Client()
agent_wallet = object()  # stands in for the wallet handle

def withX402Client(mcp_server, **kwargs):
    """Illustrative: JavaScript-only in reality. Returns the server unchanged for the Python demo."""
    return mcp_server

What it adds to the harness. Almost nothing. x402 needs no extra infrastructure beyond the Concept 8 baseline. The agent's smart-contract wallet lives at an address on the chain it transacts on (Base in most cases), the wallet's signing key sits in the key vault, and the buyer-side library handles signing and the HTTP retry. (A smart-contract wallet is a crypto wallet whose spending caps are enforced by code on the blockchain itself, so they hold even if the agent's code goes haywire.)

Running it durably (Inngest). x402 calls are short (one to two seconds) and idempotent: the same request and signature always produce the same outcome. They fit cleanly into step.run blocks, and the memoization pays off here. If the run crashes after paying for 5 of 10 API calls, the 5 paid calls are memoized and the retry pays only for the remaining 5. Without it, the retry would pay for all 10 again.

The wallet cap is the safety that protects you

The trap is treating x402 like handing the agent a credit card. The safety that actually protects you is the wallet's on-chain spend limit, not the per-request max_payment_usdc. Skip the on-chain limit and use a hot wallet with no cap, and a single stuck agent loop can drain it. Configure on-chain spend limits per agent identity, per session, and per merchant, three independent layers, so a bug in one does not empty the wallet.

Bottom line of Concept 10: x402 is the production-ready HTTP-native settlement protocol for machine-to-machine micropayments. The 402 status code, a signed payment-authorization header, and the EIP-3009 signature flow settle in one to two seconds with no account creation. Cloudflare's helper wraps MCP servers with payment ability; direct calls work through the Python buyer-side library. Inngest's step.run memoization prevents duplicate payments on retry. The wallet's on-chain spend limit is the safety that protects you, not the per-request cap. Verify header names against the spec version and facilitator you integrate with.

Concept 11: MPP (Machine Payments Protocol), the sessions-based settlement protocol

In one line: MPP lets the agent open a prepaid tab with a spending cap, then stream many small payments against it across several rails until it closes the tab.

What it is. MPP is built by Stripe and Tempo, with public launch announcements in March 2026. Tempo is a layer-1 blockchain incubated by Stripe and Paradigm for high-frequency machine payments. MPP launched with partners across Stripe, Visa, Lightspark (the Lightning Network), and other ecosystem players; the partner list grows, so verify against mpp.dev and Cloudflare's MPP docs for current participants. Apache 2.0. MPP is Stripe's settlement-layer answer to x402: overlapping use case, different philosophy.

Where it sits. MPP lives at Layer 4 (Settlement) and competes directly with x402 for machine-to-machine payments. The key difference: MPP is multi-rail and supports both per-charge and session-based authorization, while x402 is stablecoin-only and per-request.

The HTTP shape. Where x402 uses the 402 code and custom headers, MPP layers payment onto standard HTTP authentication. The server returns WWW-Authenticate: Payment with the requirements, the client retries with Authorization: Payment <signed-payload>, and the server replies with Payment-Receipt carrying the settlement proof. This makes MPP feel like a familiar HTTP auth flow rather than a custom protocol.
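Because the exchange follows the usual HTTP auth shape, the client side reduces to parsing a challenge and building a retry header. The sketch below handles the generic scheme-plus-quoted-parameters form; the parameter names (intent, amount, currency) and the payload string are hypothetical placeholders, so check mpp.dev for the real challenge fields before relying on any of them.

```python
import re

AUTH_PARAM = re.compile(r'(\w+)="([^"]*)"')

def parse_challenge(header_value: str) -> dict[str, str]:
    """Split 'Payment key="value", ...' into a dict; raise if the scheme is not Payment."""
    scheme, _, params = header_value.partition(" ")
    if scheme.lower() != "payment":
        raise ValueError(f"unexpected auth scheme: {scheme!r}")
    return dict(AUTH_PARAM.findall(params))

def build_retry_headers(signed_payload: str) -> dict[str, str]:
    """Build the Authorization header for the retry request."""
    return {"Authorization": f"Payment {signed_payload}"}

# Hypothetical parameter names, for illustration only
challenge = 'Payment intent="charge", amount="0.05", currency="usd"'
requirements = parse_challenge(challenge)
retry = build_retry_headers("signed-payload-placeholder")
```

Checking the scheme first matters in practice: a server may answer with an ordinary Bearer or Basic challenge, and a payment client should not try to pay its way past that.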

The two intent types.

| Intent | Lifecycle | Best for |
| --- | --- | --- |
| charge | A one-off transfer, authorized and settled in one round-trip | Single purchases, one-time API access, replacing individual x402 calls when you need multi-rail |
| session | The agent pre-authorizes a cap and a duration, then streams metered micropayments until it closes | High-frequency micropayments where signing each request is expensive; recurring subscriptions; the "Stripe Subscription for agents" model |
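The session intent is easiest to picture as a meter with a cap and a deadline. SessionMeter below is a hypothetical local model of that lifecycle (open, charge repeatedly, close); in a real integration the cap is enforced server-side, so treat this as a picture of the rule rather than of any MPP API.

```python
from decimal import Decimal

class SessionMeter:
    """Minimal model of a session intent: a cap, a deadline, and a running total."""
    def __init__(self, cap_usd: Decimal, opened_at: float, duration_s: float):
        self.cap_usd, self.expires_at = cap_usd, opened_at + duration_s
        self.spent_usd, self.closed = Decimal("0"), False

    def charge(self, cost_usd: Decimal, now: float) -> bool:
        """Meter one call. Refuse it if the session is closed, expired, or would pass the cap."""
        if self.closed or now >= self.expires_at or self.spent_usd + cost_usd > self.cap_usd:
            return False
        self.spent_usd += cost_usd
        return True

    def close(self) -> Decimal:
        """Close the session and return the total that would be settled."""
        self.closed = True
        return self.spent_usd

session = SessionMeter(cap_usd=Decimal("0.20"), opened_at=0.0, duration_s=300.0)
accepted = [session.charge(Decimal("0.05"), now=float(i)) for i in range(6)]
total = session.close()
```

Four calls at $0.05 fit under the $0.20 cap and the last two are refused, so the total settled at close never exceeds what was pre-authorized. One authorization covered all four metered calls, which is the saving over per-request signing.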

The trade against x402.

| Dimension | x402 | MPP |
| --- | --- | --- |
| Authorization frequency | Per request (an EIP-3009 signature each time) | Per charge or per session (one auth, many metered calls) |
| HTTP shape | Custom 402 plus payment headers | Standard WWW-Authenticate / Authorization / Payment-Receipt |
| Rails | Stablecoin only (USDC on Base/Solana/EVM) | Multi-rail (Tempo stablecoin, Lightning, cards, ACH) |
| Fees | Zero protocol fee plus sub-cent gas | Stripe processing fees on card rails; near-zero on Tempo stablecoin |
| Recurring support | Limited (needs per-period signing) | Native via the session intent |
| Best fit | One-off API calls, public infrastructure | Recurring subscriptions, enterprise and Stripe-integrated merchants, flows needing a fiat fallback |

What changes for the SDK. MPP integrates through the Stripe Python SDK plus Stripe's MPP surface. The integration leans on session management. The harness is the same Concept 8 baseline.

The from stripe.mpp import MPPSession import and the stripe.MPPSession calls below are illustrative. stripe.MPPSession and stripe.mpp are not in the Stripe Python SDK; MPP is the Stripe and Tempo standard (mainnet March 18, 2026), integrated through Stripe's MPP surface. The session concept (pre-authorize a cap, stream metered calls) is real; the exact API here is a teaching stand-in. The SDK scaffolding and typed results are real.

from agents import Agent, function_tool, RunContextWrapper
from agents.tool_guardrails import (
    tool_input_guardrail,
    ToolInputGuardrailData,
    ToolGuardrailFunctionOutput,
)
from decimal import Decimal
import json
import stripe
from stripe.mpp import MPPSession
from .models import MPPSessionResult, MPPMeteredCallResult, PaymentToolResult

@tool_input_guardrail
def verify_mpp_session_authorized(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    """Refuse session creation if the user has not pre-authorized this service."""
    args = json.loads(data.context.tool_arguments or "{}")
    service_id = args.get("service_id", "")
    max_total = Decimal(str(args.get("max_total_usd", 0)))
    user_session = data.context.context["user_session"]
    if not user_session.can_authorize_mpp_session(max_total, service_id):
        return ToolGuardrailFunctionOutput.reject_content(
            f"User has not authorized MPP sessions of ${max_total} for service {service_id}"
        )
    return ToolGuardrailFunctionOutput.allow()

@function_tool(tool_input_guardrails=[verify_mpp_session_authorized])
async def mpp_create_session(
    ctx: RunContextWrapper,
    service_id: str,
    max_total_usd: Decimal,
    duration_seconds: int = 3600,
) -> MPPSessionResult:
    """Create an MPP session for one service with a spending cap and duration.
    Returns session_id for the metered calls that follow.
    The verify_mpp_session_authorized guardrail above has already validated consent."""

    user_session = ctx.context["user_session"]
    # Convert the dollar cap to integer cents before sending it
    cents = int((max_total_usd * 100).quantize(Decimal("1")))
    session = stripe.MPPSession.create(
        service_id=service_id,
        max_total_usd=cents,  # the stand-in API receives cents here, despite the parameter name
        duration_seconds=duration_seconds,
        user_session_id=user_session.id,
        # The MPP server picks the rail (stablecoin/Lightning/card) by service preference
    )

    ctx.context.setdefault("mpp_sessions", {})[service_id] = session.id
    return MPPSessionResult(
        session_id=session.id,
        status="active",
        expires_at=session.expires_at,
    )

@function_tool
async def mpp_metered_call(
    ctx: RunContextWrapper,
    service_url: str,
    payload: dict,
    cost_estimate_usd: Decimal,
) -> MPPMeteredCallResult | PaymentToolResult:
    """Make a metered call inside an active MPP session.
    The session's running spend updates automatically; the cap is enforced server-side."""

    service_id = extract_service_id(service_url)
    sessions = ctx.context.get("mpp_sessions", {})
    session_id = sessions.get(service_id)
    if not session_id:
        return PaymentToolResult(
            status="failed",
            error=f"No active MPP session for service {service_id}. Create one first.",
        )

    response = await mpp_client.metered_call(
        url=service_url,
        payload=payload,
        session_id=session_id,
        cost_estimate_usd=cost_estimate_usd,
    )
    return MPPMeteredCallResult(
        session_id=session_id,
        cost_usd=Decimal(str(response.cost_usd)),
        response_payload=response.payload,
        accumulated_session_spend_usd=Decimal(str(response.accumulated_session_spend_usd)),
    )

@function_tool
async def mpp_close_session(ctx: RunContextWrapper, session_id: str) -> MPPSessionResult:
    """Close an MPP session and finalize payment. Returns the total charged and the breakdown by rail."""
    closed = await stripe.MPPSession.close(session_id)
    return MPPSessionResult(
        session_id=session_id,
        status="closed",
        total_charged_usd=Decimal(str(closed.total_charged_usd)),
        rail_breakdown={
            rail: Decimal(str(amount))
            for rail, amount in closed.rail_breakdown.items()
        },
    )

api_consumer_agent = Agent(
    name="APIConsumerAgent",
    instructions="""Consume third-party APIs efficiently using MPP sessions.
    Workflow:
    1. Identify the service to consume
    2. Create an MPP session with mpp_create_session ($X cap, Y seconds duration)
    3. Make metered calls via mpp_metered_call
    4. Close the session with mpp_close_session when done
    Sessions are cheaper than per-request payment for high-frequency calls.""",
    tools=[mpp_create_session, mpp_metered_call, mpp_close_session],
    model="gpt-5.5",
)

What is real: the SDK scaffolding and the typed results. What is a stand-in: stripe.MPPSession and mpp_client. The session lifecycle is a real pattern; the exact Stripe API here is simplified for teaching. Here is a runnable mock:

from decimal import Decimal
from types import SimpleNamespace

class MockMPPSession:
    @staticmethod
    def create(**kw):  # sync, matching the un-awaited call in mpp_create_session
        return SimpleNamespace(id="mpp_sess_1", expires_at=None)

    @staticmethod
    async def close(session_id):
        return SimpleNamespace(total_charged_usd=Decimal("4.20"),
                               rail_breakdown={"stablecoin": Decimal("4.20")})

class MockMPPClient:
    @staticmethod
    async def metered_call(**kw):  # must be a coroutine: mpp_metered_call awaits it
        return SimpleNamespace(cost_usd=Decimal("0.05"), payload={},
                               accumulated_session_spend_usd=Decimal("0.05"))

# Bind the names the tools above reference.
stripe = SimpleNamespace(MPPSession=MockMPPSession)
mpp_client = MockMPPClient()

What it adds to the harness. One configuration step: your Stripe account needs MPP enabled. The Tempo blockchain integration is handled by Stripe's MPP server, so the agent never touches Tempo directly. Beyond the Stripe SDK and key management (same as ACP), there is nothing extra.

Running it durably (Inngest). MPP sessions map cleanly onto Inngest's long-running function pattern. The session lifecycle (create, use, close) becomes a sequence of step.run blocks, with step.sleep available for time-based expiry. The session model and durable execution compose well: both are built for stateful, multi-step work.

Where teams get this wrong. They create sessions that are too large or too long. The session cap is your loss limit if anything goes wrong. A $1,000 session set "for convenience" exposes you to $1,000 of agent-loop-gone-wrong losses. Right-size sessions to the actual expected work: for a 30-call task, a $5 session lasting 5 minutes is safer than a $50 session lasting an hour.
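The right-sizing rule can be sketched as a small helper. This is a minimal sketch, not part of Stripe's API: right_size_session, its parameters, and the 1.5x headroom factor are all hypothetical names and values chosen for illustration.

```python
from decimal import Decimal, ROUND_UP

def right_size_session(
    expected_calls: int,
    est_cost_per_call_usd: Decimal,
    est_seconds_per_call: int,
    headroom: Decimal = Decimal("1.5"),  # assumed safety margin, tune per workload
) -> tuple[Decimal, int]:
    """Return (cap_usd, duration_seconds) sized to the expected work plus headroom."""
    if expected_calls <= 0:
        raise ValueError("expected_calls must be positive")
    # The cap is the loss limit, so derive it from the work, not from convenience.
    cap = (expected_calls * est_cost_per_call_usd * headroom).quantize(
        Decimal("0.01"), rounding=ROUND_UP
    )
    duration = int(expected_calls * est_seconds_per_call * headroom)
    return cap, duration

# A 30-call task at roughly $0.10 and 6 seconds per call.
cap_usd, duration_seconds = right_size_session(30, Decimal("0.10"), 6)
print(cap_usd, duration_seconds)
```

With those inputs the helper lands at a $4.50 cap and a 270-second window, close to the $5, 5-minute session suggested above. Pass the two values to mpp_create_session instead of a round number picked by hand.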

Bottom line of Concept 11: MPP is Stripe's settlement-layer answer to x402, tuned for session-based metering and multi-rail dispatch. Two intents, charge for one-offs and session for pre-authorized streaming, cover both single purchases and recurring metering. The session primitive is cheaper than x402 for high-frequency calls, and the multi-rail dispatch handles stablecoin, Lightning, and cards in one envelope. The HTTP shape uses standard WWW-Authenticate: Payment and Authorization: Payment instead of x402's custom headers, which makes it feel like familiar auth. You integrate via the Stripe SDK and Stripe's MPP surface, and Inngest's durable execution composes well with the session lifecycle. Right-size session caps to the expected work, not to "convenience."


Part 4: Composition rules, when to use which protocols together

Part 3 walked the four protocols one at a time. Now we put them back together. You met the idea in Part 1: a real system uses several protocols at once, one per layer. Part 4 gives you the rules for which combinations work, which fail, and how to keep the stack as small as the job allows.

Concept 12: The minimum viable agent-payment stack

In one line: The right stack is the smallest set of protocols that ships value for one use case, not all four protocols at every layer.

Before you compose anything, ask one question: what is the smallest stack that ships value here? The answer is almost never "all four protocols, every layer." Most production systems start with one layer fully wired and add the rest only when the use case forces it.

Here is the smallest stack for each common use case. It also previews the five decisions in Part 5.

| Use case | Layer 1 (Discovery) | Layer 2 (Auth) | Layer 3 (Commerce) | Layer 4 (Settlement) | Why this is the MVP |
| --- | --- | --- | --- | --- | --- |
| Consumer shopping (agent buys retail goods for a person) | AI shopping surface, or an MCP server with a merchant catalog | ACP SPT (or an AP2 mandate in regulated fields) | ACP | Card rails via Stripe | Most buyers want chargeback cover. ACP plus Stripe is the path that ships today. |
| API-paying agent (agent calls third-party APIs that charge) | MCP server with x402 support, or an Agent.market directory | EIP-3009 signature (the x402 default) | None: direct API call | x402 on Base or Solana | For machine-to-machine, Layers 2 and 4 collapse. There is no purchase lifecycle. |
| Enterprise procurement (agent buys from approved suppliers under rules) | A2A discovery inside the partner network | AP2 Intent Mandate (required for audit) | ACP or UCP for catalog suppliers; direct API for service buys | MPP sessions for recurring; ACP SPT via card rails for one-off | The audit trail is the part you can't skip. AP2 mandates are required. |
| Multi-agent marketplace (agent hires other agents) | A2A or an agent directory | AP2 mandate plus an ERC-8004 reputation check | None: direct agent-to-agent | x402 (most common), or MPP if Stripe is already wired | Trust runs both ways. Each side needs verifiable identity and payment proof. |

The composition rule. Pick the protocol at each layer the use case demands. Do not add protocols at layers the use case never touches. A pure API-paying agent does not need ACP. A consumer-shopping agent should not reach for x402 on a $50 t-shirt, because the chargeback cover on cards is worth the 2.9% + $0.30 fee.
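One way to keep a team honest about "one stack per use case" is to write the four stacks down as data. The sketch below encodes the table above in a lookup; the Stack class, the key names, and protocols_for are hypothetical, and the strings are shortened from the table.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Stack:
    discovery: str
    auth: str
    commerce: Optional[str]  # None means the use case has no commerce layer
    settlement: str

# One entry per use case; values are abbreviated from the table above.
MVP_STACKS = {
    "consumer_shopping": Stack("AI shopping surface", "ACP SPT", "ACP", "card rails"),
    "api_paying": Stack("MCP or directory", "EIP-3009 signature", None, "x402"),
    "enterprise_procurement": Stack(
        "A2A discovery", "AP2 Intent Mandate", "ACP or UCP", "MPP sessions or cards"
    ),
    "multi_agent_marketplace": Stack("A2A or directory", "AP2 + ERC-8004", None, "x402"),
}

def protocols_for(use_case: str) -> Stack:
    """Return the stack for a use case, refusing to improvise a universal one."""
    try:
        return MVP_STACKS[use_case]
    except KeyError:
        raise ValueError(
            f"No MVP stack defined for {use_case!r}; add one deliberately"
        ) from None

print(protocols_for("api_paying"))
```

The lookup fails loudly for an unknown use case on purpose. Adding a second stack should be a reviewed change, not a fallback.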

The trap. Some teams try to wire ACP, AP2, x402, and MPP all at once "for flexibility." You get four times the integration surface and no clear answer to "which protocol fires when." Pick one stack. Ship it. Add a second stack only when a second use case demands it.

Bottom line of Concept 12: The minimum viable stack is the smallest protocol set that ships value for one use case. Four common stacks cover most of it. Consumer shopping is ACP plus card rails. API-paying is x402 alone. Enterprise procurement is AP2 plus ACP or UCP plus MPP or cards. A multi-agent marketplace is AP2 plus ERC-8004 plus x402. Do not build a universal stack; pick one composition per use case and ship it.

Concept 13: When protocols compose across layers and compete within one

In one line: Protocols at different layers are built to stack together; protocols at the same layer are built to replace each other, so the only question is whether two protocols sit at the same layer.

Two protocols relate in one of two ways. They either compose, because they sit at different layers and were built to stack, or they compete, because they sit at the same layer and were built to substitute. Most architectural confusion comes from reading the second case as the first.

Across layers, built to compose:

| Composition | Layer mapping | Where it ships |
| --- | --- | --- |
| AP2 + ACP | AP2 at Layer 2 (audit-grade auth), ACP at Layer 3 (commerce) | Regulated fields where ACP's normal flow needs extra audit |
| AP2 + x402 | AP2 at Layer 2 (mandate auth), x402 at Layer 4 (stablecoin settlement) | Crypto-native flows that still need audit, via the a2a-x402 extension |
| ACP + x402 | ACP at Layer 3 (commerce), x402 at Layer 4 for machine-to-machine sub-flows inside a consumer purchase | Hybrid platforms where a consumer buy includes some API spend |
| MCP + x402 (via withX402Client) | MCP at Layer 1 (discovery), x402 at Layer 4 (settlement) | Cloudflare's standard pattern for paid MCP tools |

Within one layer, built to compete:

| Competition | Layer | When each wins |
| --- | --- | --- |
| AP2 vs. ACP SPT vs. TAP | Layer 2 | AP2 for audit-grade flows; ACP SPT for Stripe-wired consumer flows; TAP for identity-only checks |
| ACP vs. UCP | Layer 3 | ACP for ChatGPT reach; UCP for Gemini reach; both for cross-surface sellers |
| x402 vs. MPP | Layer 4 | x402 for one-off micropayments and pure stablecoin flows; MPP for sessions, subscriptions, and multi-rail flows |

The test. When you are stuck between two protocols, ask one thing: are they at the same layer? If yes, you pick one, or you pay to support both for different sub-flows. If no, they probably compose, and the right design often uses both.
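The same-layer test is mechanical enough to write as a function. This is a sketch under the layer assignments used in this course; the LAYER mapping and the relation helper are illustrative names, not part of any protocol SDK.

```python
# Layer assignments as used in this course (1 = Discovery ... 4 = Settlement).
LAYER = {
    "MCP": 1, "A2A": 1,
    "AP2": 2, "ACP SPT": 2, "TAP": 2,
    "ACP": 3, "UCP": 3,
    "x402": 4, "MPP": 4,
}

def relation(a: str, b: str) -> str:
    """Same layer means the two compete; different layers means they can compose."""
    if a == b:
        raise ValueError("compare two different protocols")
    return "compete" if LAYER[a] == LAYER[b] else "compose"

print(relation("x402", "MPP"))   # same layer: pick one
print(relation("AP2", "x402"))   # different layers: likely stack together
```

"Compose" here only means the pair is not mutually exclusive. Whether a given composition is worth wiring is still a use-case question.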

A worked composition: the AP2 + x402 stack. This is the crypto-native pattern, common now in B2B and agent-to-agent flows:

Layer 1 (Discovery): A2A directory inside the partner network
Layer 2 (Auth): AP2 Intent + Cart + Payment Mandates
Layer 3 (Commerce): Often none (direct service request), or ACP for a catalog
Layer 4 (Settlement): x402 via the a2a-x402 extension

The SDK assembly just composes the per-protocol tools from Part 3. These tools (a2a_discover_partners, ap2_create_intent_mandate, and the rest) are defined back in Concepts 8 through 11; here you only wire them onto one agent.

agent = Agent(
    name="EnterpriseB2BAgent",
    instructions="...",
    tools=[
        # Layer 1: Discovery
        a2a_discover_partners,
        # Layer 2: Authorization
        ap2_create_intent_mandate,
        ap2_create_cart_mandate,
        # Layer 4: Settlement (a2a-x402 composes Layers 2 and 4)
        ap2_settle_via_x402,
    ],
    model="gpt-5.5",
    # No commerce tools: direct B2B procurement.
)

Bottom line of Concept 13: Protocols at different layers compose: AP2 plus x402, ACP plus x402, MCP plus x402. Protocols at the same layer compete: AP2 against ACP SPT, ACP against UCP, x402 against MPP. The test is "are these at the same layer?" If yes, pick one. If no, they likely compose. The AP2 plus x402 stack is the standard crypto-native B2B composition.

Concept 14: Cost and latency, what forces the choice

In one line: Transaction size and the wait you can accept decide the settlement protocol, because card fees crush small payments and slow checkouts break tight loops.

Every composition has a price and a speed. A consumer-shopping flow on ACP plus card rails costs about 2.9% + $0.30 per transaction and takes 5 to 30 seconds end to end. An API-paying agent on x402 costs under a cent and takes 1 to 2 seconds. The right composition is partly set by what cost and what wait your use case can absorb.

Cost per transaction by composition.

| Composition | Typical cost per transaction | Typical latency |
| --- | --- | --- |
| ACP + card rails (consumer shopping) | 2.9% + $0.30 (Stripe rate) | 5 to 30 seconds |
| ACP + MPP sessions (subscriptions) | 2.9% on cards, or about 0.5% on Tempo stablecoin, per session | 1 to 3 seconds per metered call |
| AP2 + x402 (B2B stablecoin) | Sub-cent gas, zero protocol fees | 2 to 5 seconds (mandate signing adds 1 to 3) |
| x402 only (API-paying) | Sub-cent gas, zero protocol fees | 1 to 2 seconds |
| MPP sessions only (recurring API) | Near zero on Tempo stablecoin; Stripe rate on cards | 50 to 500 ms per metered call inside an active session |

What money forces. Fees above about 5% of the transaction are a problem. A $0.05 API call that pays 2.9% + $0.30 in card fees costs more in fees than the call is worth, which is a clear signal to use x402 or MPP stablecoin instead. A $50 t-shirt paying the same 2.9% + $0.30 is fine. The dollar line where card rails stop making sense is around $5 to $10. Below it, machine-payment rails win; above it, the chargeback cover on cards is usually worth the fee.
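The fee arithmetic is worth running once. The sketch below uses the 2.9% + $0.30 card rate quoted above and treats the "about 5%" line as an exact threshold, which is a simplification; the function names are illustrative.

```python
from decimal import Decimal

CARD_PERCENT = Decimal("0.029")    # 2.9%, the card rate quoted above
CARD_FIXED_USD = Decimal("0.30")   # fixed per-transaction fee

def card_fee_usd(amount_usd: Decimal) -> Decimal:
    return amount_usd * CARD_PERCENT + CARD_FIXED_USD

def card_fee_share(amount_usd: Decimal) -> Decimal:
    """Fee as a fraction of the transaction value."""
    return card_fee_usd(amount_usd) / amount_usd

for amount in (Decimal("0.05"), Decimal("5"), Decimal("10"), Decimal("50")):
    print(f"${amount}: fee ${card_fee_usd(amount):.4f} "
          f"= {card_fee_share(amount):.1%} of the transaction")

# Value at which the fee share equals exactly 5%: fixed / (0.05 - percent).
break_even_usd = CARD_FIXED_USD / (Decimal("0.05") - CARD_PERCENT)
print(f"5% break-even: ${break_even_usd:.2f}")
```

On a $0.05 call the fee is about six times the call's value; on a $50 order it is 3.5%. With a strict 5% threshold the break-even sits near $14, a little above the $5 to $10 rule of thumb, so treat the line as a band rather than a precise cutoff.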

What latency forces. A wait above 5 seconds per user-facing step is a problem. AP2 mandate signing can add 1 to 3 seconds, and much longer if it waits on a human signature; ACP checkout adds 5 to 30 seconds. For agent-to-agent flows with no human in the loop, the budget is tighter still, often sub-second, which makes x402 on Base and MPP on Tempo the default picks.

The decision tree, compressed.

What is the transaction value?
├── Sub-dollar (per-call API, per-token billing)
│ → x402 only, or MPP sessions
├── $1 to $10 (small, low-stakes buys)
│ → x402 or MPP, with AP2 audit if you need it
├── $10 to $1,000 (consumer purchases)
│ → ACP + card rails (chargeback cover is worth the fee)
└── $1,000+ (B2B, enterprise procurement)
→ AP2 + ACP/UCP + MPP sessions, or bank rails

What is the latency budget?
├── Sub-second (multi-agent loops)
│ → MPP sessions on Tempo, or x402 on Base
├── 1 to 5 seconds (interactive)
│ → x402 or MPP; AP2 only if mandates are pre-signed
└── 5+ seconds (an acceptable user wait)
→ Full ACP checkout works
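The two trees above translate directly into code. This is a sketch: the function names are hypothetical, and the boundaries are treated as half-open ranges (a $10 order falls in the $10 to $1,000 band), which is an assumption the tree itself leaves open.

```python
from decimal import Decimal

def settlement_by_value(value_usd: Decimal) -> str:
    """First tree: pick a composition from the transaction value."""
    if value_usd < 1:
        return "x402 only, or MPP sessions"
    if value_usd < 10:
        return "x402 or MPP, with AP2 audit if you need it"
    if value_usd < 1000:
        return "ACP + card rails"
    return "AP2 + ACP/UCP + MPP sessions, or bank rails"

def settlement_by_latency(budget_seconds: float) -> str:
    """Second tree: pick a composition from the latency budget."""
    if budget_seconds < 1:
        return "MPP sessions on Tempo, or x402 on Base"
    if budget_seconds < 5:
        return "x402 or MPP; AP2 only if mandates are pre-signed"
    return "full ACP checkout works"

print(settlement_by_value(Decimal("0.05")))
print(settlement_by_latency(0.3))
```

When the two answers disagree, the tighter constraint wins: a $50 purchase inside a sub-second agent loop cannot wait for a full ACP checkout.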

Bottom line of Concept 14: Composition choices have real cost and latency. Card rails cost 2.9% + $0.30 but give you chargeback cover; stablecoin rails cost under a cent but need crypto-native plumbing. Card rails stop making sense around $5 to $10 per transaction. ACP checkout gets too slow around 5 seconds for user-facing flows. Let cost and latency force the composition, not taste.


Part 5: The decision lab, five worked examples

Parts 2 through 4 gave you the framework: four layers, a few protocols per layer, composition across layers. Part 5 walks five real decisions. Each one shows the full reasoning: what layers the use case touches, which protocol at each layer, why, and what the agent code looks like.

Four use cases mapped to the four layers. Rows: consumer shopping, API-paying agent, enterprise procurement, multi-agent marketplace. Columns: Discovery, Authorization, Commerce, Settlement. Consumer shopping uses an AI shopping surface, ACP shared payment token, ACP, and card rails. API-paying agent uses MCP and a directory, an EIP-3009 signature, no commerce layer, and x402 on Base. Enterprise procurement uses an internal MCP, AP2 mandates, ACP plus direct B2B, and MPP sessions plus cards. Multi-agent marketplace uses A2A, AP2 plus ERC-8004, no commerce layer, and x402. No single protocol works across all four rows; no row uses fewer than three protocols. The right question is not "which protocol" but "which protocol at each layer for this use case."

Read across a row to see one use case's full stack. Read down a column to see what changes at one layer as the use case changes. The five decisions below walk the rows in detail. The first four cover the matrix; the fifth rebuilds one of them in a completely different toolchain to prove the framework is not tied to any one vendor.

Decision 1: Consumer shopping agent (the ChatGPT Instant Checkout pattern)

The use case. You are building an agent that helps people shop on ACP-enabled merchants like Walmart, Etsy, and Shopify sellers. The user says what they want; the agent searches catalogs, shows options, and checks out on confirmation. Single buyer to single merchant, $5 to $500 per order, refunds and chargebacks required.

Walking the four layers.

  • Layer 1 (Discovery). The agent has to find products across many merchants. You could integrate each merchant's MCP catalog, or use the ChatGPT Shopping surface that already aggregates ACP merchants. Choice: the AI shopping surface, because the discovery work is already done and wiring up a million catalogs yourself is not viable.
  • Layer 2 (Authorization). The user is signed into the surface and confirms each buy, so authorization is simple. Choice: ACP SPT, minted by Stripe per purchase, scoped to one merchant, one amount, and a 10-minute window.
  • Layer 3 (Commerce). The full lifecycle matters: cart, checkout, fulfillment, disputes, refunds. Choice: ACP, the protocol built for exactly this.
  • Layer 4 (Settlement). Values of $5 to $500 sit right in the card-rail sweet spot, and chargeback cover is worth the fee. Choice: card rails via Stripe, the rail ACP assumes by default.
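The Layer 2 scoping (one merchant, one amount, a 10-minute window) can be pictured as a simple check. The sketch below is a local model for reasoning only: ScopedToken and allows are hypothetical, it treats the amount as an exact match (an assumption), and the real enforcement happens on Stripe's side, not in your agent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal

@dataclass(frozen=True)
class ScopedToken:
    """Hypothetical local model of a per-purchase token scope."""
    merchant_id: str
    amount_usd: Decimal
    issued_at: datetime
    ttl: timedelta = timedelta(minutes=10)  # the 10-minute window from the text

    def allows(self, merchant_id: str, amount_usd: Decimal, now: datetime) -> bool:
        # All three scopes must hold at once: merchant, amount, and time window.
        return (
            merchant_id == self.merchant_id
            and amount_usd == self.amount_usd
            and self.issued_at <= now < self.issued_at + self.ttl
        )

issued = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
token = ScopedToken("merchant_etsy_1", Decimal("42.00"), issued)
print(token.allows("merchant_etsy_1", Decimal("42.00"), issued + timedelta(minutes=9)))
print(token.allows("merchant_other", Decimal("42.00"), issued + timedelta(minutes=1)))
```

The value of the narrow scope is that a leaked or replayed token is close to worthless: it cannot be pointed at another merchant, raised to a larger amount, or used after the window closes.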

The implementation. This is Concept 8's code; the tools come from Concept 8.

shopping_agent = Agent(
    name="ShoppingAgent",
    instructions="""Help the user shop. Workflow:
1. Use acp_browse_merchant to find products matching the request
2. Show matched items; wait for the user to confirm
3. On confirm, use acp_create_cart_and_checkout to buy
4. Use acp_check_order for status
5. Use acp_refund only when the user asks""",
    tools=[acp_browse_merchant, acp_create_cart_and_checkout, acp_check_order, acp_refund],
    model="gpt-5.5",
)

Most likely failure in production. Cart mismatches: the agent builds a cart that does not match the request ("I asked for red, got pink"). Fix: require the user to confirm the cart before the SPT is minted, log cart-accuracy, and tune the instructions when accuracy drops below 95%.

Running it durably (Inngest). Use the step.wait_for_event pattern from Concept 8 for the cart-confirmation gate, plus a per-user concurrency cap.

Choose this one first. This is the live use case today: ChatGPT Instant Checkout, the ACP ecosystem, every Shopify merchant with ACP turned on. If you can only ship one composition, ship this one.

Decision 2: API-paying research agent (the x402-only pattern)

The use case. You are building a research agent that pays third-party APIs for data: financial feeds, news, specialized search. It discovers paid APIs at runtime through a directory like Agent.market, weighs cost against value, and pays per call. High-frequency micropayments of $0.001 to $0.50, no human in the loop after the task starts, no commerce lifecycle.

Walking the four layers.

  • Layer 1 (Discovery). Agent.market and similar x402-paywalled directories let the agent find services at runtime; MCP servers with x402 support handle pre-integrated ones. Choice: Agent.market plus MCP-via-Cloudflare, runtime discovery with a pre-wired fallback set.
  • Layers 2 and 4 (Authorization and Settlement, collapsed). No human after task start. The wallet's on-chain caps bound per-transaction spend; user-level caps run via an SDK tool_input_guardrail on each paying tool. The EIP-3009 signature is both the authorization and the settlement. Choice: x402 on Base (USDC), with no separate mandate protocol.
  • Layer 3 (Commerce). The "purchase" is just an API call. Choice: none, direct API access via x402.

The implementation. This is Concept 10's code; the tools come from Concept 10.

research_agent = Agent(
    name="ResearchAgent",
    instructions="""Research the user's query by paying for data via x402.
1. Use x402_search_agent_market to find relevant paid services
2. Use x402_fetch to pull data (max $0.10 per call)
3. Write up the findings
Stay under $10 per session.""",
    tools=[x402_fetch, x402_search_agent_market],
    mcp_servers=[research_mcp_with_payments],
    model="gpt-5.5",
    # Spend caps live on x402_fetch via tool_input_guardrails
    # (Concept 10's enforce_x402_session_cap), not on the agent.
)

Most likely failure in production. Runaway spend from a stuck loop: the agent re-fetches the same data and burns the budget. Fix: the wallet's on-chain caps (the safety that actually protects you), SDK session-spend guardrails, and a dedup cache so identical fetches do not re-pay.
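A dedup cache with a session spend check can be sketched in a few lines. Everything here is illustrative: PaidFetchCache and fake_paid_fetch are hypothetical stand-ins, not part of any x402 client, and the soft cap in this class does not replace the wallet's on-chain cap.

```python
import asyncio
from decimal import Decimal

class PaidFetchCache:
    """Wraps a paid fetch so identical URLs are paid for once per session."""

    def __init__(self, fetch, session_cap_usd: Decimal):
        self._fetch = fetch              # async callable: url -> (content, cost_usd)
        self._cap = session_cap_usd
        self.spent = Decimal("0")
        self._cache: dict[str, str] = {}

    async def get(self, url: str) -> str:
        if url in self._cache:           # a stuck loop lands here and pays nothing
            return self._cache[url]
        if self.spent >= self._cap:      # soft stop; the wallet cap is the hard stop
            raise RuntimeError("session spend cap reached")
        content, cost = await self._fetch(url)
        self.spent += cost
        self._cache[url] = content
        return content

async def fake_paid_fetch(url: str):
    # Stand-in for a real x402 call: returns content and the amount charged.
    return f"data for {url}", Decimal("0.05")

async def demo() -> Decimal:
    cache = PaidFetchCache(fake_paid_fetch, session_cap_usd=Decimal("0.10"))
    for _ in range(100):                 # simulate an agent re-fetching in a loop
        await cache.get("https://example.com/feed")
    return cache.spent

spent = asyncio.run(demo())
print(spent)
```

One hundred identical fetches cost one payment. Because the cost is only known after the call returns, the in-process check can overshoot by one call, which is another reason the on-chain cap stays the real limit.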

Running it durably (Inngest). step.run memoization pays off here. Crash mid-session and the retry resumes with the already-paid-for data intact. A per-user concurrency cap stops one burst from taking over.
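To see why memoized steps matter for paid calls, here is a toy model of the idea. ToyStepRunner is a hypothetical in-memory stand-in written for this example; it is not the Inngest API, which persists step results for you.

```python
import asyncio

class ToyStepRunner:
    """In-memory stand-in for a durable step runner: each step id runs at most once."""

    def __init__(self, store: dict):
        self._store = store              # survives across retries in this toy

    async def run(self, step_id: str, fn):
        if step_id in self._store:       # already completed: return the saved result
            return self._store[step_id]
        result = await fn()
        self._store[step_id] = result
        return result

payments: list[str] = []                 # records every time money would move

async def pay(name: str) -> str:
    payments.append(name)
    return f"{name}-data"

async def session(step: ToyStepRunner, crash_after_first: bool) -> list[str]:
    prices = await step.run("fetch-prices", lambda: pay("prices"))
    if crash_after_first:
        raise RuntimeError("simulated crash")
    news = await step.run("fetch-news", lambda: pay("news"))
    return [prices, news]

async def demo() -> list[str]:
    store: dict = {}
    try:
        await session(ToyStepRunner(store), crash_after_first=True)
    except RuntimeError:
        pass                             # the retry below reuses the same store
    return await session(ToyStepRunner(store), crash_after_first=False)

result = asyncio.run(demo())
print(result, payments)
```

The retry re-enters the function from the top, but the first paid step is served from the store, so the prices call is charged once across both attempts.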

Pure machine-to-machine. Layers 2 and 4 collapse into one signature, and Layer 3 is empty. This stack is structurally simpler than Decision 1: fewer protocols, fewer integration points, lower cost per call. The trade is no chargeback cover and no commerce semantics, which is fine because the use case needs neither.

Decision 3: Enterprise procurement agent (the AP2 plus composed-stack pattern)

The use case. You are building a procurement agent for a regulated enterprise in financial services. Buyers delegate tasks: "buy 50 ergonomic keyboards from our approved suppliers, under $5,000, by Friday." The audit trail is legally required, spend caps run at several levels, and suppliers are pre-approved, so there is no runtime discovery.

Walking the four layers.

  • Layer 1 (Discovery). The supplier list is known in advance. Choice: an internal MCP server with supplier catalogs, because the discovery scope is bounded.
  • Layer 2 (Authorization). A non-repudiable audit trail is the part you can't skip; a non-repudiable record is one the signer cannot later deny. AP2 mandates give exactly that. Choice: AP2, with an Intent Mandate at task creation, a Cart Mandate before checkout, and a Payment Mandate at settlement, each signed by the procurement officer.
  • Layer 3 (Commerce). Large suppliers expose ACP or UCP; smaller ones expose direct B2B APIs. Choice: ACP for ACP-enabled suppliers, direct API for the rest; the agent handles both.
  • Layer 4 (Settlement). Recurring suppliers favor MPP sessions; one-off buys favor card rails for chargeback cover at higher values. Choice: MPP sessions for recurring, ACP SPT plus card rails for one-off, picked by supplier history.

The implementation. This composes the tools from Concepts 8, 9, and 11.

procurement_agent = Agent(
    name="ProcurementAgent",
    instructions="""Run procurement under audit-grade compliance:
1. ALWAYS create an Intent Mandate first via ap2_create_intent_mandate
2. Search approved suppliers via the internal MCP server
3. Build the cart and create a Cart Mandate via ap2_create_cart_mandate
4. Recurring suppliers: use an MPP session.
   One-off buys: use ACP plus card rails via ap2_settle_via_acp
5. Record every mandate ID in the procurement audit log""",
    mcp_servers=[approved_suppliers_mcp],
    tools=[
        ap2_create_intent_mandate,
        ap2_create_cart_mandate,
        ap2_settle_via_acp,
        mpp_create_session,
        mpp_metered_call,
        mpp_close_session,
    ],
    model="gpt-5.5",
    # Caps and mandate rules run via tool_input_guardrails on the paying tools
    # (Concepts 9, 11, 15). require_intent_mandate on ap2_create_cart_mandate blocks
    # any cart with no prior Intent Mandate; enforce_per_run_spend_cap blocks any
    # payment over the user's run cap.
)

Most likely failure in production. Intent Mandate scope mismatches: the officer signs a mandate, the agent does the work, then the cart does not fit the mandate. Fix: validate the mandate scope before the agent starts shopping (Concept 9's "create Intent Mandates first"), and reject tasks whose scope exceeds what the user can authorize.
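The scope check itself is small. The sketch below assumes a mandate scope reduces to a spend ceiling and an approved-supplier set; IntentScope, CartLine, and scope_violations are hypothetical names, and real AP2 mandates carry different fields.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class IntentScope:
    """Hypothetical reduction of an Intent Mandate to the fields checked here."""
    max_total_usd: Decimal
    approved_suppliers: frozenset[str]

@dataclass(frozen=True)
class CartLine:
    supplier_id: str
    unit_price_usd: Decimal
    quantity: int

def scope_violations(scope: IntentScope, cart: list[CartLine]) -> list[str]:
    """Return every reason the cart falls outside the mandate; empty means it fits."""
    problems = []
    total = sum((line.unit_price_usd * line.quantity for line in cart), Decimal("0"))
    if total > scope.max_total_usd:
        problems.append(f"total {total} exceeds mandate cap {scope.max_total_usd}")
    for line in cart:
        if line.supplier_id not in scope.approved_suppliers:
            problems.append(f"supplier {line.supplier_id} is not approved")
    return problems

scope = IntentScope(Decimal("5000"), frozenset({"sup_ergo", "sup_office"}))
ok_cart = [CartLine("sup_ergo", Decimal("89.00"), 50)]       # $4,450
bad_cart = [CartLine("sup_unknown", Decimal("120.00"), 50)]  # $6,000, unapproved
print(scope_violations(scope, ok_cart))
print(scope_violations(scope, bad_cart))
```

Run the check before requesting the Cart Mandate signature. Returning all violations at once, rather than the first, gives the officer one clear rejection instead of a back-and-forth.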

Running it durably (Inngest). AP2 mandate signing is the natural fit for step.wait_for_event. Multi-stage tasks use step.run per stage, with a per-user concurrency cap.

Regulated industry. Audit rules force AP2 at Layer 2; many suppliers force ACP and direct APIs together at Layer 3; recurring versus one-off forces MPP and cards together at Layer 4. This composition is heavier than Decisions 1 and 2, and the audit requirement is what justifies the weight.

Decision 4: Multi-agent marketplace (the AP2 plus x402 plus ERC-8004 pattern)

The use case. You are building a platform where agents hire other agents. Agent A needs research; Agent B sells research as an x402 service. Neither trusts the other yet, transactions must be verifiable, and payment is pure crypto-native with no cards. Agent to agent, $0.10 to $100 per transaction, both sides need verifiable identity.

Walking the four layers.

  • Layer 1 (Discovery). Agent B publishes its capability over A2A; Agent A finds it. Choice: A2A, the protocol built for this.
  • Layer 2 (Authorization). No human at transaction time, but trust must be verifiable both ways. AP2 mandates prove user consent; ERC-8004 gives Agent B an on-chain reputation Agent A can check first. Choice: AP2 plus ERC-8004, composed for full bilateral verification.
  • Layer 3 (Commerce). No carts, no refunds, just "do this task and deliver the report." Choice: none, direct A2A request and response.
  • Layer 4 (Settlement). Crypto-native, sub-second, no chargeback cover needed. Choice: x402 via the a2a-x402 extension to AP2.

The implementation. This composes the tools from Concepts 9 and 10 with A2A discovery.

researcher_hiring_agent = Agent(
    name="ResearcherHiringAgent",
    instructions="""Hire research-specialist agents to work for you:
1. Use a2a_discover_researchers to find available agents
2. Check ERC-8004 reputation (>50 successful jobs, no flagged disputes)
3. Create an Intent Mandate scoped by amount and recipient
4. Submit the task via A2A with the mandate attached
5. Receive the result; settle via ap2_settle_via_x402""",
    tools=[
        a2a_discover_researchers,
        erc8004_check_reputation,
        ap2_create_intent_mandate,
        a2a_submit_task_with_mandate,
        ap2_settle_via_x402,
    ],
    model="gpt-5.5",
)

Most likely failure in production. Trusting a reputation that has been gamed: ERC-8004 scores are auditable but an operator can inflate one with small successful jobs. Fix: combine reputation with other signals (operator identity, transaction-volume thresholds, dispute history), and add human review above a set amount for first-time counterparties.
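Combining signals can look like the sketch below. Only the ">50 successful jobs, no flagged disputes" rule comes from the agent instructions above; the Counterparty fields, the 10x job-size ratio, and the $25 first-time review threshold are hypothetical values to tune against your own dispute data.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Counterparty:
    successful_jobs: int
    flagged_disputes: int
    total_volume_usd: Decimal
    operator_verified: bool
    prior_jobs_with_us: int

def hiring_decision(
    cp: Counterparty,
    job_value_usd: Decimal,
    review_above_usd: Decimal = Decimal("25"),  # assumed first-time review line
) -> str:
    # Baseline reputation rule from the agent instructions.
    if cp.successful_jobs <= 50 or cp.flagged_disputes > 0:
        return "reject"
    if not cp.operator_verified:
        return "reject"
    # A score built on tiny jobs says little about a much larger one.
    avg_job_usd = cp.total_volume_usd / cp.successful_jobs
    if job_value_usd > avg_job_usd * 10:
        return "human_review"
    # First-time counterparties above the threshold get a human look.
    if cp.prior_jobs_with_us == 0 and job_value_usd > review_above_usd:
        return "human_review"
    return "hire"

established = Counterparty(120, 0, Decimal("2400"), True, 3)
inflated = Counterparty(500, 0, Decimal("50"), True, 0)   # 500 jobs at ten cents each
print(hiring_decision(established, Decimal("15")))
print(hiring_decision(inflated, Decimal("80")))
```

The inflated operator passes the raw job count easily but is routed to review, because an $80 job is far outside the size of work its reputation was built on.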

Running it durably (Inngest). Fan out when hiring several specialists at once; use step.wait_for_event for each result and step.run per stage. This decision touches every Inngest primitive.

Pure multi-agent economy. No human in the loop at transaction time; both agents act inside pre-authorized scopes. This is the shape the agent economy is being built around, and it has the most failure modes, because bilateral trust with no prior relationship is hard and the protocol stack only partly solves it.

Decision 5: A non-Stripe, non-OpenAI stack (proving the framework travels)

Every code sample so far used stripe.PaymentTokens.create(...) and the OpenAI Agents SDK. Stripe is the most mature integration for ACP, and the SDK is this course's runtime, but the four-layer architecture is stack-agnostic by design, and a course that only shows one stack has not proven that. So this decision rebuilds Decision 2, the API-paying research agent, on a completely different toolchain: Google's Agent Development Kit (ADK) for the runtime, a Coinbase smart-contract wallet for on-chain identity, AP2 mandates for authorization, and direct x402 for settlement. Zero Stripe, zero OpenAI.

The use case stays Decision 2's. A research agent paying $0.001 to $0.10 per call, capped at $10 per session. Sub-dollar, no human after authorization, machine-to-machine settlement. The architecture stays Decision 2's too: x402 collapses Layers 2 and 4, no commerce layer, MCP for discovery. Only the library changes.

The block below is illustrative. The Google ADK structure (Agent, the tool decorator) is real and google-adk is a real package. The payment pieces are stand-ins: AP2's real package is ap2 with very different mandate fields and no MandateSigner class, the Coinbase wallet is configured through coinbase_agentkit's SmartWalletProvider, and a live x402 call needs a funded account and a real 402 endpoint. No real money moves here. The point is the shape, not a runnable buyer.

from google.adk import Agent
from google.adk.tools import function_tool  # ADK's tool decorator
from coinbase_agentkit import AgentKit, SmartWalletProvider
from decimal import Decimal
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

# Shared result models come from the Pydantic sidebar in Part 3.
from .models import X402PaymentResult, PaymentToolResult, DiscoveryResult

# --- Illustrative stand-ins for the payment rails (real APIs differ) ---
class MockMandate:
    def __init__(self, mid, rules=None):
        self.id, self.rules = mid, (rules or {})

class MockSigner:
    async def sign(self, mandate):
        return mandate  # real AP2 signs over A2A

class MockAgentMarketClient:
    async def search(self, **kw):  # async, because search_agent_directory awaits it
        return []

class MockX402Client:  # real client: x402-client
    async def get(self, url, max_payment_usdc):
        return SimpleNamespace(content="", amount_paid_usdc=Decimal("0"), tx_hash="0xmock")

agent_market_client = MockAgentMarketClient()
x402_client = MockX402Client()
# -----------------------------------------------------------------------

# Layer 2 (Authorization): a Coinbase smart-contract wallet gives the agent its
# on-chain identity and its spend caps. This is the analog of a Stripe customer
# plus per-customer caps, but enforced by the chain.
wallet_provider = SmartWalletProvider(
    config={
        "chain": "base-mainnet",
        "spend_limits": {
            "per_transaction_usdc": Decimal("0.50"),
            "per_session_usdc": Decimal("10.00"),
            "per_day_usdc": Decimal("100.00"),
        },
    },
)
agent_kit = AgentKit(wallet_provider=wallet_provider)

# Layer 2 (Authorization): an AP2 Intent Mandate, signed by the user at session start.
# The analog of a signed, revocable Stripe authorization, but declarative.
async def create_research_intent_mandate(user_did, user_signer, session_cap_usdc):
    mandate = MockMandate(
        "intent_mock_1",
        rules={
            "max_total_usd": str(session_cap_usdc),
            "allowed_categories": ["data-api", "research-service"],
            # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12.
            "expires_at": (datetime.now(timezone.utc) + timedelta(hours=1)).isoformat(),
        },
    )
    return await user_signer.sign(mandate)

# Layer 1 (Discovery) + Layer 4 (Settlement) as one tool.
# ADK's @function_tool is the analog of the SDK's @function_tool.
@function_tool
async def x402_paid_fetch(url: str, max_payment_usdc: Decimal) -> X402PaymentResult | PaymentToolResult:
    """Fetch a URL that may need x402 payment up to max_payment_usdc.
    The wallet handles the signature; the on-chain cap is the safety that protects you."""
    # ADK has no tool_input_guardrail, so the check runs in-tool,
    # backed by the wallet's on-chain cap (the layer nothing can bypass).
    if max_payment_usdc > Decimal("0.50"):  # mirrors per_transaction_usdc on the wallet
        return PaymentToolResult(
            status="failed",
            error=f"Requested max {max_payment_usdc} USDC exceeds the 0.50 per-call cap.",
        )
    resp = await x402_client.get(url, max_payment_usdc=max_payment_usdc)
    return X402PaymentResult(
        content=resp.content,
        amount_paid_usdc=Decimal(str(resp.amount_paid_usdc)),
        tx_hash=resp.tx_hash,
    )

@function_tool
async def search_agent_directory(query: str, max_price_per_call_usdc: Decimal) -> list[DiscoveryResult]:
    """Search Agent.market for x402-paywalled services."""
    results = await agent_market_client.search(query=query, max_price_per_call_usdc=max_price_per_call_usdc)
    return [
        DiscoveryResult(
            service_id=r.service_id,
            name=r.name,
            description=r.description,
            price_per_call_usdc=Decimal(str(r.price_per_call_usdc)),
            endpoint_url=r.endpoint_url,
        )
        for r in results
    ]

# The agent itself: a Google ADK Agent, the direct analog of the SDK's Agent.
research_agent = Agent(
    name="research-agent",
    description="Research the user's query by paying for data via x402.",
    instructions="""Research the query.
1. Use search_agent_directory to find relevant paid services
2. Use x402_paid_fetch to pull data ($0.50 max per call)
3. Write up the findings
Stay under $10 per session.""",
    model="deepseek-v4-flash",  # illustrative; any ADK-compatible model id works
    tools=[search_agent_directory, x402_paid_fetch],
)

# Driver: the user signs the Intent Mandate once at session start;
# the agent then runs on its own inside the mandate's scope until the session ends.
async def run_research_session(user_did, user_signer, query):
    intent = await create_research_intent_mandate(
        user_did=user_did,
        user_signer=user_signer,
        session_cap_usdc=Decimal("10.00"),
    )
    return await research_agent.run_async(
        query,
        context={"intent_mandate": intent, "wallet": agent_kit},
    )

The line-by-line translation. Decision 2's OpenAI plus Stripe code on the left, Decision 5's Google ADK plus Coinbase code on the right. This table is the payoff: every row is the same concept under a different name.

| Decision 2 (OpenAI + Stripe) | Decision 5 (Google ADK + Coinbase) | Same concept, different library |
| --- | --- | --- |
| from agents import Agent, function_tool | from google.adk import Agent + from google.adk.tools import function_tool | Agent runtime |
| @function_tool (OpenAI Agents SDK) | @function_tool (Google ADK) | Tool decorator |
| RunContextWrapper | context={...} kwarg to run_async | Per-run state |
| stripe.Customer.modify(...) for caps | SmartWalletProvider(spend_limits={...}) | Spend caps, chain-native |
| tool_input_guardrail decorator | in-tool check + wallet caps | Pre-execution validation |
| Runner.run(agent, ...) | agent.run_async(...) | Agent execution |

What stays identical. The architecture: Layer 1 is MCP or a directory, Layer 2 is a mandate plus wallet caps, Layer 3 is none (machine-to-machine collapses commerce), Layer 4 is x402. The primitives: Intent Mandate, EIP-3009 signatures, 402 responses, payment-signature headers, on-chain spend caps. The framework is what survives the library swap.

Two real operational differences.

  1. Google ADK has no first-class tool input guardrail (as of mid-2026). The SDK's tool_input_guardrail is genuinely handy for pre-execution checks; ADK's tool decorator has no direct equivalent yet. The workaround is in-tool validation backed by the wallet's on-chain caps. The caps still protect you; the in-tool check just fails faster. Pick ADK and you give up the guardrail convenience in exchange for ADK's own approach to multi-agent work.
  2. AP2 mandate signing is more native here. Google built AP2, so the ADK ecosystem integrates the mandate-signing UI flows more cleanly. If your use case leans hard on mandate-rigorous authorization (Decisions 3 and 4), ADK plus AP2 is a real fit, not just an alternative.

Bottom line of Decision 5: The four-layer architecture is not a Stripe-and-OpenAI story in disguise. The discipline (Discovery, Authorization, Commerce, Settlement; pick one protocol per layer; justify it against the use case) survives any library swap. Decision 5 used Google ADK and a Coinbase wallet to build the same composition as Decision 2; the imports changed, along with the two operational differences noted above. If a framework cannot be expressed in a different stack, it is a library tutorial wearing a costume. This one is not.


Part 6: Production concerns: what kills you when the system goes live

Parts 1 through 5 built the framework and walked real decisions. Part 6 covers the things that decide whether your stack survives real users. These are the failures that do not show up in a demo and do show up at 2 a.m.

Reader track

If you are reading to understand, not to ship, you can skim Part 6. If you are building any of this for real, this is where it gets real. The four concepts here are the difference between a system that works and a system that drains a wallet.

Concept 15: Spend-limit enforcement at three architectural levels

In one line: Stop an agent from overspending by enforcing the limit in three independent places, so a bug in one is caught by the other two.

The single biggest way an agent-commerce system fails badly is simple: the agent spends more than it was allowed to. A stuck agent loop can drain a wallet in seconds. Each protocol has its own cap (ACP's SPT amount, MPP's session cap, x402's per-request max), but no single one of them is enough on its own. Production systems enforce spend limits at three separate levels.

Level 1: wallet and payment-method limits. This is the cap that actually protects you. The agent's smart-contract wallet (for x402) or its Stripe customer account (for ACP and MPP) carries spend caps set at the infrastructure level. The chain or Stripe enforces them no matter what the agent code does. This is the only level that holds when the agent loop fails completely.

For x402 with a smart-contract wallet:

The SmartContractWallet.deploy(...) call below is illustrative. It shows where Level 1 caps live, not a real pip package. In practice you set these caps on a real smart-contract wallet (for example through Coinbase AgentKit's SmartWalletProvider). The three-level discipline is the real lesson.

from decimal import Decimal

# Set ONCE when the wallet is deployed. The agent cannot change this.
wallet_spend_limits = {
    "max_per_transaction_usdc": Decimal("10.00"),  # cap per single transfer
    "max_per_day_usdc": Decimal("100.00"),         # rolling 24-hour cap
    "max_per_merchant_usdc": Decimal("50.00"),     # cap to any single recipient
}
agent_wallet = SmartContractWallet.deploy(
    owner=user_did,
    spend_limits=wallet_spend_limits,
    chain="eip155:8453",  # Base
)
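To make the three wallet caps concrete, here is a minimal off-chain model of the checks such a wallet would run. It is a sketch under stated assumptions, not a wallet implementation: WalletCaps and try_transfer are hypothetical names, and the per-merchant cap is assumed to be a lifetime total per recipient (the original dictionary does not say whether it resets).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass
class WalletCaps:
    """Off-chain model of the three caps; a real wallet enforces these on-chain."""
    max_per_transaction: Decimal
    max_per_day: Decimal
    max_per_merchant: Decimal
    # (timestamp, recipient, amount) for every transfer this model approved
    history: list = field(default_factory=list)

    def try_transfer(self, recipient: str, amount: Decimal, now: datetime) -> bool:
        """Approve and record the transfer only if all three caps still hold."""
        if amount > self.max_per_transaction:
            return False
        # Rolling 24-hour window: only transfers newer than window_start count.
        window_start = now - timedelta(hours=24)
        spent_today = sum(
            (a for t, _, a in self.history if t > window_start), Decimal("0")
        )
        if spent_today + amount > self.max_per_day:
            return False
        # Assumption: the per-merchant cap is a lifetime total per recipient.
        spent_at_recipient = sum(
            (a for _, r, a in self.history if r == recipient), Decimal("0")
        )
        if spent_at_recipient + amount > self.max_per_merchant:
            return False
        self.history.append((now, recipient, amount))
        return True
```

With the caps from the dictionary above, ten $10 transfers split across two recipients exhaust the daily cap, and an eleventh transfer is refused until the 24-hour window rolls forward.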

For ACP or MPP with Stripe, the cap lives on the Stripe customer:

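stripe.Customer.modify(...) is a real Stripe API call. It runs against your Stripe account. One caveat: Stripe stores customer metadata but does not act on it by itself, so the trusted backend that mints SPTs and MPP sessions (not the agent) has to read these values and refuse anything above them.

# Set once via the Stripe Dashboard or API. The values are stored in Stripe, outside the agent's reach.
stripe.Customer.modify(
    user_session.stripe_customer_id,
    metadata={
        "max_per_session_usd": "500",
        "max_per_day_usd": "2000",
    },
)
# The service that mints an SPT or MPP session reads these values and rejects any request above them.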

Level 2: SDK tool guardrails. The OpenAI Agents SDK's tool_input_guardrail runs before each payment tool executes and can reject the call. You met this in Concept 5: it is the SDK-native way to stop a payment before it happens, and it is the only guardrail family that runs in time. Here it gets the full treatment, because this is the canonical guardrail block for the whole course. The code below is real and runs.

import json
from decimal import Decimal
from agents import Agent, function_tool, RunContextWrapper
from agents.tool_guardrails import (
    tool_input_guardrail,
    tool_output_guardrail,
    ToolInputGuardrailData,
    ToolOutputGuardrailData,
    ToolGuardrailFunctionOutput,
)

# Tool INPUT guardrail: pre-payment check. Runs BEFORE the tool executes.
@tool_input_guardrail
def enforce_per_run_spend_cap(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    """Reject any payment tool call where the run's total spend would exceed the user's cap.
    Runs before the tool executes, the only guardrail family that can stop a payment in time."""
    args = json.loads(data.context.tool_arguments or "{}")  # raw JSON args string -> dict
    requested = Decimal(str(args.get("max_total_usd") or args.get("max_payment_usdc") or 0))
    ctx = data.context.context  # the run context (a dict)
    cap = Decimal(str(ctx["user_session"].per_run_spend_cap_usd))
    spent = Decimal(str(ctx.get("run_spend_usd", 0)))
    if spent + requested > cap:
        return ToolGuardrailFunctionOutput.reject_content(
            f"Refused: would spend ${spent + requested}, run cap is ${cap}"
        )
    return ToolGuardrailFunctionOutput.allow()

# Tool OUTPUT guardrail: post-payment check. Runs AFTER the tool executes.
# Useful for verifying receipts (paid more than expected, wrong amount, etc.).
@tool_output_guardrail
def verify_receipt_integrity(data: ToolOutputGuardrailData) -> ToolGuardrailFunctionOutput:
    output = data.output or {}
    if isinstance(output, dict) and "amount_paid_usdc" in output:
        # Cross-check the receipt against what we asked for.
        pass
    return ToolGuardrailFunctionOutput.allow()

# Attach BOTH guardrails to every payment-authorizing tool.
@function_tool(
    tool_input_guardrails=[enforce_per_run_spend_cap],
    tool_output_guardrails=[verify_receipt_integrity],
)
async def x402_fetch(
    ctx: RunContextWrapper,
    url: str,
    max_payment_usdc: Decimal,
) -> "X402PaymentResult":
    ...

# An agent-level output_guardrail is fine for final-reply safety
# (like redacting PII in the agent's answer), but it does NOT prevent payments.
agent = Agent(
    name="ShoppingAgent",
    tools=[x402_fetch],
    # output_guardrails=[response_safety_guardrail],  # different job, not payment safety
)

Use the right guardrail family

The SDK has three guardrail families. Input guardrails run on the user's first message to the agent. Output guardrails run on the agent's final reply. Tool guardrails (tool_input_guardrail and tool_output_guardrail) run on every custom tool call. For payment safety you want tool_input_guardrail specifically: it is the only family that fires before a payment tool runs and can block it. Reaching for output_guardrail to control spend is the single most common mistake in agent-commerce code. By the time it fires, the money is gone.
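The timing difference is easy to see without any framework. The sketch below is an analogy in plain Python, not the SDK's implementation: pre_execution_cap, pay, and final_reply_check are hypothetical stand-ins for a tool input guardrail, a payment tool, and an output-style check.

```python
from decimal import Decimal
from functools import wraps


class SpendRefused(Exception):
    """Raised by the pre-execution check; the wrapped tool never runs."""


def pre_execution_cap(cap_usd: Decimal):
    """Decorator: check the requested amount BEFORE calling the payment tool."""
    def decorate(tool):
        @wraps(tool)
        def guarded(ledger: dict, amount_usd: Decimal):
            if ledger["spent"] + amount_usd > cap_usd:
                raise SpendRefused(f"would exceed cap of ${cap_usd}")
            return tool(ledger, amount_usd)
        return guarded
    return decorate


def pay(ledger: dict, amount_usd: Decimal) -> Decimal:
    """Stand-in payment tool: 'moves money' by mutating the ledger."""
    ledger["spent"] += amount_usd
    return ledger["spent"]


# Same tool, wrapped so the cap is checked before it executes.
guarded_pay = pre_execution_cap(Decimal("20"))(pay)


def final_reply_check(ledger: dict, cap_usd: Decimal) -> bool:
    """Stand-in for an output-style check: it can observe overspend, not undo it."""
    return ledger["spent"] <= cap_usd
```

Calling guarded_pay twice with $15 and then $10 leaves the ledger at $15, because the second call is refused before pay runs. Calling the unguarded pay with the same amounts leaves the ledger at $25, and final_reply_check can only report the overspend after the fact.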

Level 3: application and business-logic limits. Your own code enforces the user-specific rules: per-user daily caps, per-category caps, allowed merchants. This is where business rules live. "This user can spend $500 a day at any merchant, but only $50 a day at unverified ones." That rule belongs here, not in a protocol. This code is plain Python and real.

from decimal import Decimal

class UserSession:
    def can_spend(self, amount_usd: Decimal, merchant_id: str) -> bool:
        # Per-day cap
        if self.today_spend_usd + amount_usd > self.daily_cap_usd:
            return False
        # Per-merchant cap
        merchant_cap = self._merchant_cap_for(merchant_id)
        if self.merchant_spend_usd[merchant_id] + amount_usd > merchant_cap:
            return False
        # Per-category cap (for example, "office supplies" vs "personal")
        category = self._category_for(merchant_id)
        if self.category_spend_usd[category] + amount_usd > self.category_caps[category]:
            return False
        return True
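The example rule quoted above ("$500 a day at any merchant, but only $50 a day at unverified ones") can be written as a small self-contained policy object. This is a sketch, not the course's UserSession: DailyPolicy and its fields are hypothetical, and it assumes the $50 limit applies to the combined spend across all unverified merchants.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class DailyPolicy:
    """Hypothetical Level 3 policy for one user and one day."""
    verified_merchants: frozenset
    daily_cap_usd: Decimal = Decimal("500")
    unverified_cap_usd: Decimal = Decimal("50")
    total_spent_usd: Decimal = Decimal("0")
    unverified_spent_usd: Decimal = Decimal("0")

    def can_spend(self, amount_usd: Decimal, merchant_id: str) -> bool:
        """Check both rules without changing any totals."""
        if self.total_spent_usd + amount_usd > self.daily_cap_usd:
            return False
        if merchant_id not in self.verified_merchants:
            # Assumption: the $50 limit is shared by all unverified merchants.
            if self.unverified_spent_usd + amount_usd > self.unverified_cap_usd:
                return False
        return True

    def record(self, amount_usd: Decimal, merchant_id: str) -> None:
        """Update the totals after a payment actually completes."""
        self.total_spent_usd += amount_usd
        if merchant_id not in self.verified_merchants:
            self.unverified_spent_usd += amount_usd
```

Keeping can_spend separate from record means the check can run inside a guardrail while the totals change only once a payment has gone through.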

Each level lives in different infrastructure. Level 1 is in the chain or Stripe. Level 2 is in the agent SDK. Level 3 is in your application code. A bug in one is caught by the other two. Skip Level 1 and one agent-loop bug can drain the whole wallet. Skip Level 2 and you lose the power to abort a run mid-flight. Skip Level 3 and you cannot enforce per-user or per-category policy.

The trap is trusting the protocol caps alone. ACP's SPT cap, MPP's session cap, and x402's per-request max are protocol-level limits. They stop specific protocol abuse, but they do not add up across protocols. A team using only ACP SPTs with a $50 cap each has no protection against an agent minting 100 of them in a row, for $5,000 total. The three levels above exist precisely because protocol caps do not aggregate.
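The arithmetic in that example is worth running once. The sketch below models token minting with a per-token cap only, then with an aggregate cap added. mint_tokens and the $500 run cap are hypothetical; the $50 per-token cap and the 100 requests come from the example above.

```python
from decimal import Decimal

PER_TOKEN_CAP_USD = Decimal("50")  # protocol-level cap on each token
RUN_CAP_USD = Decimal("500")       # hypothetical aggregate cap you enforce yourself


def mint_tokens(requests, aggregate_cap=None):
    """Approve token requests in order; return the list of approved amounts."""
    approved = []
    total = Decimal("0")
    for amount in requests:
        if amount > PER_TOKEN_CAP_USD:
            continue  # the protocol cap rejects this single token
        if aggregate_cap is not None and total + amount > aggregate_cap:
            continue  # the aggregate cap rejects it even though the token is valid
        approved.append(amount)
        total += amount
    return approved


# A stuck loop asks for 100 tokens at the maximum allowed size.
runaway = [Decimal("50")] * 100
protocol_only = sum(mint_tokens(runaway), Decimal("0"))
with_aggregate = sum(mint_tokens(runaway, RUN_CAP_USD), Decimal("0"))
```

Here protocol_only comes to 5000 and with_aggregate comes to 500: every individual token passes the protocol check, and only the running total stops the loop.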

Bottom line of Concept 15: Enforce spend limits at three independent levels: wallet or payment-method infrastructure, SDK tool guardrails (tool_input_guardrail specifically, since it runs before each tool and can block it), and application business logic. Each level uses different infrastructure, so a bug in one is caught by the others. The most common mistake is using agent-level output_guardrail for spend control. It runs on the final reply, which is too late: the payment has already happened. Protocol-level caps (SPT, MPP session, x402 per-request) are necessary but not enough; they do not aggregate across protocols or across runs. Production systems enforce at all three.

Concept 16: Agent identity hygiene: keys, wallets, and audit logs

In one line: The agent's signing key is the only thing separating authorized spending from fraud, so you protect the key, separate it per agent, rotate it, and log every spend to durable storage.

Agent commerce brings a failure that is unique to autonomous systems: the agent's cryptographic identity is all that stands between real spending and fraud. If the signing key leaks, the wallet (or Stripe customer, or AP2 mandate signer) can be drained or impersonated until you rotate the key. Identity hygiene is the set of habits that prevents this. There are four.

1. Per-agent wallet separation. Each agent, or each class of agent, gets its own wallet or payment handle. Never share signing keys across agents with different jobs. If your shopping agent and your procurement agent share one wallet, a compromise of either drains both. Separate wallets cost almost nothing (a one-time deployment cost) and the security gain is real.

SmartContractWallet.deploy(...) is illustrative, as in Concept 15. The pattern is what matters: one wallet per agent class, never shared.

# Wrong: one wallet shared across agents
shared_wallet = SmartContractWallet.deploy(...)
shopping_agent.wallet = shared_wallet
procurement_agent.wallet = shared_wallet
research_agent.wallet = shared_wallet  # one compromise drains all three

# Right: a separate wallet per agent class
shopping_agent.wallet = SmartContractWallet.deploy(
    spend_limits={"max_per_day_usdc": 100},
)
procurement_agent.wallet = SmartContractWallet.deploy(
    spend_limits={"max_per_day_usdc": 1000, "allowed_recipients": [...]},
)
research_agent.wallet = SmartContractWallet.deploy(
    spend_limits={"max_per_day_usdc": 50, "max_per_call_usdc": 0.50},
)

2. Key rotation, on a schedule and on demand. Rotate signing keys every 90 days as a baseline (matching Stripe's advice for API keys). Rotate them right away when an agent operator leaves the team, when a deployment touches the signing surface, or when something looks wrong. The habit of rotating matters more than the exact number of days.

The pattern below is real. The client name azure_key_vault is illustrative; use your provider's vault SDK. The point is that the key lives in a vault and you read the current version at use time.

# Read the current key version from the vault. The version changes when the key rotates.
def get_signing_key(agent_class: str) -> SigningKey:
    return azure_key_vault.get_latest_version(
        secret_name=f"agent-wallet-signing-key-{agent_class}",
    )

# Old transactions, signed with the previous version, stay valid until they expire.
# New transactions use the current version.
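The "schedule and on demand" rule can be expressed as one small predicate. This is a sketch: rotation_due and the trigger names are hypothetical labels for the three triggers listed above (an operator leaving, a deployment touching the signing surface, something looking wrong).

```python
from datetime import datetime, timedelta, timezone

ROTATION_BASELINE = timedelta(days=90)

# Hypothetical names for the on-demand triggers described in the text.
KNOWN_TRIGGERS = frozenset({"operator_left", "signing_surface_deploy", "anomaly"})


def rotation_due(last_rotated: datetime, now: datetime, triggers: set) -> bool:
    """True when the 90-day baseline has elapsed or any known trigger is active."""
    on_demand = bool(set(triggers) & KNOWN_TRIGGERS)
    on_schedule = (now - last_rotated) >= ROTATION_BASELINE
    return on_demand or on_schedule
```

A scheduled job could call this daily for each agent class and start the vault's rotation flow when it returns True.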

3. Audit logs that survive a crash. Every authorization decision gets logged to durable storage that lives apart from the agent's runtime: every SPT minted, every mandate signed, every x402 signature, every MPP session opened. If the agent crashes, the audit log must still be there. Neon Postgres plus Inngest's step memoization gives you this; you can also write straight to object storage (S3 or equivalent) for maximum durability.

The audit-log pattern is the teaching. The client name neon_client is illustrative; use your own database client. The rule is to log the decision before the payment happens.

from datetime import datetime, timezone
from decimal import Decimal
from uuid import uuid4

from agents import RunContextWrapper, function_tool

# Every payment-authorizing action logs to durable storage BEFORE the action completes.
@function_tool
async def acp_create_cart_and_checkout(
    ctx: RunContextWrapper,
    merchant_id: str,
    items: list["CartItem"],
    max_total_usd: Decimal,
) -> "CheckoutResult":
    audit_id = str(uuid4())

    # Log the authorization decision FIRST, before any payment happens.
    await neon_client.audit_log.insert({
        "audit_id": audit_id,
        "agent_class": ctx.context["agent_class"],
        "user_did": ctx.context["user_session"].did,
        "action": "acp_create_cart_and_checkout",
        "merchant_id": merchant_id,
        "max_total_usd": max_total_usd,
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12).
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": "initiated",
    })

    try:
        result = await _actually_complete_checkout(merchant_id, items, max_total_usd)
        await neon_client.audit_log.update(audit_id, {
            "status": "completed",
            "actual_total_usd": result.total_charged_usd,
            "order_id": result.order_id,
        })
        return result
    except Exception as e:
        await neon_client.audit_log.update(audit_id, {"status": "failed", "error": str(e)})
        raise
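The same log-first shape can be tried locally with nothing but the standard library. The sketch below uses SQLite as a stand-in for the durable store; open_audit_log and audited are hypothetical helpers, and the in-memory database used by default is of course not durable.

```python
import sqlite3
import uuid
from datetime import datetime, timezone


def open_audit_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit table. Pass a file path for real persistence."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log ("
        "audit_id TEXT PRIMARY KEY, action TEXT, max_total_usd TEXT, "
        "status TEXT, detail TEXT, created_at TEXT)"
    )
    return conn


def audited(conn: sqlite3.Connection, action: str, max_total_usd: str, pay):
    """Commit an 'initiated' row BEFORE calling pay(), then record the outcome."""
    audit_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?, 'initiated', NULL, ?)",
        (audit_id, action, max_total_usd, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()  # the decision is on disk before any money moves
    try:
        result = pay()
    except Exception as exc:
        conn.execute(
            "UPDATE audit_log SET status = 'failed', detail = ? WHERE audit_id = ?",
            (str(exc), audit_id),
        )
        conn.commit()
        raise
    conn.execute(
        "UPDATE audit_log SET status = 'completed', detail = ? WHERE audit_id = ?",
        (str(result), audit_id),
    )
    conn.commit()
    return audit_id, result
```

If the process dies between the first commit and the payment call, the row stays at "initiated", which is exactly the signal a reconciliation job needs.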

4. Distributed traces across the whole transaction. The audit log tells you what succeeded. Traces tell you what happened, including the calls that failed, retried, or stalled. In agent commerce the full trace is often the only way to debug a failed transaction, because one user request can fan out into an SDK run, 5 to 10 tool calls, 2 or 3 protocol HTTP requests, a Stripe webhook that arrives later, and an Inngest function that resumes hours afterward. Without one trace ID tying all of that together, a post-mortem is impossible. The OpenTelemetry code below is the real, stable OTel API.

from decimal import Decimal

from agents import RunContextWrapper, function_tool
from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer("agent-commerce")

@function_tool
async def acp_create_cart_and_checkout(
    ctx: RunContextWrapper,
    merchant_id: str,
    items: list["CartItem"],
    max_total_usd: Decimal,
) -> "CheckoutResult":
    # The span name is the protocol action; attributes capture what you filter by
    # in your observability tool (Datadog, Honeycomb, Grafana, and so on).
    with tracer.start_as_current_span(
        "acp.checkout",
        attributes={
            "agent.class": ctx.context["agent_class"],
            "user.did": ctx.context["user_session"].did,
            "acp.merchant_id": merchant_id,
            "acp.max_total_usd": float(max_total_usd),
            "acp.item_count": len(items),
        },
    ) as span:
        try:
            result = await _actually_complete_checkout(merchant_id, items, max_total_usd)
            span.set_attribute("acp.order_id", result.order_id)
            span.set_attribute("acp.actual_total_usd", float(result.total_charged_usd))
            span.set_status(Status(StatusCode.OK))
            return result
        except Exception as e:
            span.set_status(Status(StatusCode.ERROR, str(e)))
            span.record_exception(e)
            raise

The trace context flows automatically through httpx, openai-agents, and Inngest with the right instrumentation. When a Stripe webhook arrives 20 minutes later for a dispute on this order, the webhook handler joins the same trace through the trace_id carried in the order's metadata. One trace ID covers the whole life of the transaction, from the first request to dispute resolution.
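One way to carry the trace ID in order metadata is the W3C traceparent string (version, 32-hex trace ID, 16-hex parent span ID, flags). The sketch below builds and parses that string with the standard library only. The helper names and the "traceparent" metadata key are assumptions; in practice the OpenTelemetry propagators would usually do this work.

```python
import re
import secrets

# W3C Trace Context header format: 00-<trace-id>-<parent-id>-<flags>
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent() -> str:
    """Create a fresh traceparent value with the 'sampled' flag set."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def parse_traceparent(value: str):
    """Return (trace_id, parent_span_id), or None if the value is malformed."""
    match = _TRACEPARENT.match(value or "")
    if match is None:
        return None
    trace_id, span_id = match.group(1), match.group(2)
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None  # all-zero IDs are invalid in the W3C format
    return trace_id, span_id


def order_metadata(order_id: str) -> dict:
    """Metadata stored with the order at checkout time (hypothetical key names)."""
    return {"order_id": order_id, "traceparent": new_traceparent()}


def trace_id_for_webhook(metadata: dict):
    """What a webhook handler would read back, possibly weeks later."""
    parsed = parse_traceparent(metadata.get("traceparent", ""))
    return parsed[0] if parsed else None
```

The webhook handler can then start its span as a child of (or a link to) the recovered trace ID, so the dispute shows up in the same trace as the original checkout.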

Audit logs and traces answer different questions, and you need both. Audit logs answer business questions ("how much did this user spend on Tuesday?"). Traces answer debugging questions ("why did the checkout for order abc123 fail?"). The audit log records one row per authorization decision with its final status; it never captures the call that retried five times before working, or the protocol call that stalled for 30 seconds before timing out. Those failure shapes only appear in traces. Skipping traces because you have audit logs is a mistake about what each tool is for.

The most common identity mistake is treating the wallet's address as the agent's identity. The address is public: anyone can send to it and anyone can verify a transaction came from it. The private signing key is the identity, and you protect it as one. Teams that keep signing keys in environment variables (or worse, in source code) have handed the agent's identity to anyone with access to those secrets. Keys go in a vault, never in env vars.

Bottom line of Concept 16: Agent identity is cryptographic; the signing key is the only thing separating authorized spending from fraud. Four habits protect it. Per-agent wallet separation, so one compromise does not drain every agent. Key rotation on a schedule and on demand (90-day baseline, immediate on a trigger). Audit logs to durable storage that lives apart from the runtime (Neon Postgres or object storage). Distributed traces with one trace ID spanning the whole transaction (OpenTelemetry through the SDK, httpx, Inngest, and FastAPI). Signing keys go in a key vault, never in environment variables or source code. Audit logs answer business questions; traces answer debugging questions; you need both.

Concept 17: Dispute and refund mechanics across the four protocols

In one line: Each protocol handles disputes and refunds differently, and the dispute model your use case needs is often the strongest thing that decides which protocols you compose.

The four protocols each treat disputes and refunds in their own way, and that difference often decides the protocol choice for a use case. A use case that needs chargeback protection cannot run on pure x402. A use case where the seller has no customer-service setup cannot use ACP. Here is how each one handles a dispute, so you can match the protocol to the dispute model you actually need.

ACP: disputes through the card network. Because the merchant stays the merchant of record in ACP, every standard card-network dispute path works. The buyer's bank starts the chargeback; Stripe (or whichever processor) handles the merchant's defense; the merchant follows its existing refund policy. This is ACP's biggest practical advantage. Every retail buyer's expectation about returns simply works.

For a refund that the agent starts through ACP:

acp_client.refunds.create(...) is illustrative, like the other ACP client calls in this course. The result model and the tool wiring are real.

@function_tool
async def acp_refund(
    order_id: str,
    reason: str,
    amount_usd: Decimal | None = None,
) -> "RefundResult":
    """Start a refund through ACP. The merchant's standard refund policy applies."""
    raw = await acp_client.refunds.create(
        order_id=order_id,
        reason=reason,
        amount_usd=amount_usd,  # None means full refund
    )
    return RefundResult(
        refund_id=raw.refund_id,
        order_id=order_id,
        status=raw.status,
        amount_refunded_usd=raw.amount_refunded_usd,
    )

AP2: disputes settled through the audit trail. AP2's contribution to a dispute is the mandate chain (Intent, then Cart, then Payment). When a dispute comes up, that chain is evidence of what the user actually authorized, and it holds up legally. It does not replace the underlying rail's dispute path: if the AP2 mandate authorized a card payment through Stripe, Stripe's dispute process still applies. AP2 adds the proof of what the user agreed to.

The dispute flow:

1. User claims: "I never authorized this purchase."
2. Merchant retrieves the signed AP2 Cart Mandate from the transaction record.
3. Merchant presents the Cart Mandate (with the user's signature) to the
payment processor as part of the dispute defense.
4. The card network or processor checks the signature against the user's
registered public key. If it is valid, the dispute is resolved for the merchant.

x402: no formal dispute mechanism. Pure x402 payments are non-refundable by design. The payment settles on-chain in one to two seconds; there is no chargeback. This is x402's biggest practical limit. It is fine for a $0.001 API call (a dispute would cost more than the payment) and wrong for anything where the buyer might fairly want a refund.

Three ways to soften x402's no-refund property:

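  • Escrow. For higher-value x402 payments, use a smart-contract escrow that holds the funds until the buyer signals acceptance. ERC-8004 itself defines identity, reputation, and validation registries rather than escrow, but those registries can inform when an escrow contract releases funds in multi-agent transactions.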
  • Compose with AP2. If you need x402's speed but also need dispute evidence, the AP2 plus x402 composition gives you the mandate chain as evidence while x402 keeps settlement fast. The settlement itself is still non-reversible; the mandate just proves what was agreed.
  • Seller guarantees. For paid APIs, the seller's terms often include refund rules enforced off-chain (the seller voluntarily sends USDC back if the service failed). This works for reputable sellers and falls apart with anonymous ones.
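The escrow option is easiest to reason about as a small state machine. The sketch below models it off-chain in Python. Escrow, accept, and refund are hypothetical names, and the model deliberately leaves out the timeout or arbiter rule that a real contract would also need so that funds cannot sit in HELD forever.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class EscrowState(Enum):
    HELD = "held"
    RELEASED = "released"  # funds went to the seller
    REFUNDED = "refunded"  # funds went back to the buyer


@dataclass
class Escrow:
    buyer: str
    seller: str
    amount_usdc: Decimal
    state: EscrowState = EscrowState.HELD

    def _require_held(self) -> None:
        # Both transitions are one-way: a settled escrow cannot change again.
        if self.state is not EscrowState.HELD:
            raise ValueError(f"escrow already {self.state.value}")

    def accept(self, caller: str) -> None:
        """Buyer signals acceptance; funds are released to the seller."""
        if caller != self.buyer:
            raise PermissionError("only the buyer can accept")
        self._require_held()
        self.state = EscrowState.RELEASED

    def refund(self, caller: str) -> None:
        """Seller returns the funds before the buyer has accepted."""
        if caller != self.seller:
            raise PermissionError("only the seller can refund")
        self._require_held()
        self.state = EscrowState.REFUNDED
```

The point of the model is the asymmetry: only the buyer can release, only the seller can refund, and neither transition can be replayed.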

MPP: disputes through Stripe. MPP sessions settled on card rails inherit Stripe's standard dispute machinery, the same as ACP. MPP sessions settled by stablecoin on Tempo, or by Lightning, go through Stripe's seller-side dispute resolution. Stripe holds the merchant accountable for the outcome regardless of the rail.

The dispute model you need often drives the composition more than cost or latency do. A consumer-shopping platform needs chargebacks, so ACP fits. A pure machine-to-machine API marketplace needs no disputes at all, so x402 fits. An enterprise procurement platform needs audit-grade evidence, so AP2 with any settlement rail fits.
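Those three pairings fit in a lookup table. The sketch below only restates the paragraph above as data; DisputeNeed and suggest_composition are hypothetical names, and a real decision would also weigh cost, latency, and the other layers.

```python
from enum import Enum


class DisputeNeed(Enum):
    CHARGEBACKS = "chargebacks"        # consumer shopping
    NONE = "none"                      # machine-to-machine API calls
    AUDIT_EVIDENCE = "audit_evidence"  # enterprise procurement


# Starting point per dispute model, taken from the paragraph above.
COMPOSITION_BY_DISPUTE_NEED = {
    DisputeNeed.CHARGEBACKS: "ACP on card rails",
    DisputeNeed.NONE: "x402",
    DisputeNeed.AUDIT_EVIDENCE: "AP2 mandates plus any settlement rail",
}


def suggest_composition(need: DisputeNeed) -> str:
    """Return the starting composition for a given dispute requirement."""
    return COMPOSITION_BY_DISPUTE_NEED[need]
```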

Bottom line of Concept 17: Dispute and refund mechanics differ sharply across protocols. ACP inherits card-network chargebacks, so the merchant's existing return policy works. AP2 provides the mandate chain as legal evidence but does not replace the underlying rail's dispute path. x402 is non-refundable by design: fine for micropayments, wrong where refunds matter. MPP inherits Stripe's dispute machinery across rails. The dispute model your use case needs is often the strongest thing forcing the protocol composition.

Concept 18: The FastAPI and Inngest webhook plumbing: closing the request and response loop

In one line: Some payment events arrive on their own schedule (disputes, mandate signatures, seller-side payment requests), so you need a thin FastAPI handler to catch them and an Inngest event to carry the work into a durable workflow.

Parts 1 through 5, and Concepts 15 through 17, treated the agent as a buyer: it sends a request, the protocol replies, the SDK reasons about the result. But agent commerce runs both ways. Stripe sends charge.dispute.created webhooks. AP2 mandate signing happens off your server, on the user's device, and posts back later. x402 sellers need a server-side middleware that returns 402 Payment Required and checks X-PAYMENT headers. None of these fit inside a single Runner.run() call. They need FastAPI handlers as the HTTP boundary and Inngest events as the bridge back into durable workflows.

This concept walks the three patterns you will need in production. Without them, the system has cracks where async events fall through.

Pattern 1: a Stripe webhook flowing into a suspended Inngest function. When a user files a chargeback on an ACP order, Stripe sends charge.dispute.created to your endpoint. It might arrive five minutes after the order or 60 days later. The workflow that placed the order has long since exited, but the agent still needs to react: notify the merchant, log to audit, maybe build a defense. The FastAPI handler turns the webhook into an Inngest event, and an Inngest function picks it up and runs an agent to handle the dispute.

stripe.Webhook.construct_event(...) and stripe.error.SignatureVerificationError are real Stripe APIs. The Inngest wiring is real. Note the firing API: events go out as send(events=[inngest.Event(...)]), not a bare dict.

from decimal import Decimal

import inngest
import stripe
from agents import Agent, Runner
from fastapi import FastAPI, HTTPException, Request
from pydantic import BaseModel

app = FastAPI()
inngest_client = inngest.Inngest(app_id="agent-commerce", is_production=False)

class StripeDisputeEventPayload(BaseModel):
    order_id: str
    dispute_id: str
    amount_usd: Decimal
    reason: str
    raw_event_id: str

@app.post("/webhooks/stripe")
async def stripe_webhook(request: Request):
    # 1. Verify the Stripe signature. This security gate is required.
    signature = request.headers.get("Stripe-Signature")
    payload = await request.body()
    try:
        event = stripe.Webhook.construct_event(
            payload=payload, sig_header=signature, secret=settings.stripe_webhook_secret,
        )
    except stripe.error.SignatureVerificationError:
        raise HTTPException(status_code=400, detail="Invalid signature")

    # 2. Route by event type. This handler does NO business logic.
    # It only fires Inngest events so the durable workflow does the work.
    if event.type == "charge.dispute.created":
        await inngest_client.send(events=[
            inngest.Event(
                name="stripe/dispute.created",
                data=StripeDisputeEventPayload(
                    order_id=event.data.object.metadata.get("order_id"),
                    dispute_id=event.data.object.id,
                    amount_usd=Decimal(event.data.object.amount) / 100,
                    reason=event.data.object.reason,
                    raw_event_id=event.id,
                ).model_dump(mode="json"),  # JSON mode: Decimal becomes a string, so the event serializes
                id=event.id,  # idempotency seed
            ),
        ])

    # 3. ACK Stripe right away. The real work runs in Inngest, durably.
    return {"received": True, "event_id": event.id}

# The Inngest function that handles the dispute: fully durable and retryable.
@inngest_client.create_function(
    fn_id="handle-stripe-dispute",
    trigger=inngest.TriggerEvent(event="stripe/dispute.created"),
    # Idempotency by raw_event_id makes sure Stripe retries do not process twice.
    idempotency="event.data.raw_event_id",
)
async def handle_stripe_dispute(ctx: inngest.Context) -> dict:
    payload = StripeDisputeEventPayload(**ctx.event.data)

    # Log to audit immediately.
    await ctx.step.run("audit-dispute-received", log_dispute_to_neon, payload)

    # Run an agent to assemble the dispute defense.
    defense_agent = Agent(
        name="DisputeDefenseAgent",
        instructions="Assemble dispute defense materials: order receipt, AP2 mandate if any, "
                     "delivery confirmation, customer communication history.",
        tools=[fetch_order_details, fetch_mandate_chain, fetch_delivery_proof, submit_dispute_response],
    )

    async def build_defense() -> str:
        # Step results are memoized as JSON, so return the final text rather than the RunResult.
        result = await Runner.run(defense_agent, f"Build defense for dispute {payload.dispute_id}")
        return str(result.final_output)

    defense = await ctx.step.run("build-and-submit-defense", build_defense)
    return {"status": "completed", "output": {"defense_submitted": defense}}

The FastAPI handler stays thin (verify, then fire an event). The Inngest function is durable (idempotency, retries, step memoization). The agent's reasoning happens inside the Inngest function, never inside the webhook handler. This split matters because Stripe expects a 2xx response within about five seconds, and an agent run can take 30 or more.

Pattern 2: an AP2 mandate-signing callback that resumes a suspended step.wait_for_event. Concept 9 showed the AP2 signing tool asking for a signature and waiting. In production that signing happens off your server: the user opens their phone app, sees the mandate, taps Approve, and the signed mandate posts back. The agent's Inngest workflow was suspended on step.wait_for_event; the FastAPI callback fires the event that wakes it.

The signing callback is your own FastAPI route. The Inngest correlation uses if_exp= (a CEL expression), and the awaited payload is under the async. prefix. wait_for_event returns None on timeout. ap2_verify_signature and the persistence client are illustrative; the workflow shape is real.

from datetime import timedelta
from pydantic import BaseModel

class MandateSignedPayload(BaseModel):
    mandate_id: str
    user_did: str
    signature: str  # the user's signature over the mandate hash
    signed_at: str

@app.post("/callbacks/ap2/mandate-signed")
async def mandate_signed_callback(payload: MandateSignedPayload, request: Request):
    # 1. Verify the signature against this user's registered public key.
    is_valid = await ap2_verify_signature(
        mandate_id=payload.mandate_id,
        user_did=payload.user_did,
        signature=payload.signature,
    )
    if not is_valid:
        raise HTTPException(status_code=400, detail="Invalid mandate signature")

    # 2. Persist the signed mandate. Mandates have a 7-year retention requirement.
    await neon_client.mandates.insert({
        "mandate_id": payload.mandate_id,
        "user_did": payload.user_did,
        "signature": payload.signature,
        "signed_at": payload.signed_at,
        "status": "signed",
    })

    # 3. Fire the Inngest event that resumes the agent's workflow.
    await inngest_client.send(events=[
        inngest.Event(name="ap2/mandate.signed", data=payload.model_dump(), id=payload.mandate_id),
    ])
    return {"received": True, "event_id": payload.mandate_id}

# The Inngest function that was waiting. if_exp correlates the wait to this mandate.
@inngest_client.create_function(
    fn_id="agent-procurement-workflow",
    trigger=inngest.TriggerEvent(event="procurement/task.created"),
)
async def procurement_workflow(ctx: inngest.Context) -> dict:
    # ... agent creates the Intent Mandate, fires "ap2/mandate.signing.requested" ...

    # Suspend until the user signs. Zero compute is used during the wait.
    signed = await ctx.step.wait_for_event(
        "wait-for-intent-mandate-signature",
        event="ap2/mandate.signed",
        if_exp=f"async.data.mandate_id == '{ctx.event.data['mandate_id']}'",
        timeout=timedelta(hours=24),  # users can take real time
    )

    if signed is None:  # timeout returns None
        return {"status": "abandoned", "reason": "user did not sign within 24h"}

    # Resume with the signed mandate. The agent continues from exactly where it left off.
    cont = await ctx.step.run("continue-procurement", continue_with_signed_mandate, signed)
    return {"status": "completed", "output": {"procurement_continuation": cont}}

Inngest's step.wait_for_event with if_exp is built for this. The FastAPI handler is a one-way bridge from HTTP into Inngest's event bus. The workflow that suspended hours ago resumes with the signed payload, and the agent picks up where it stopped.
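The suspend, correlate, and time-out behavior can be imitated in a single process with asyncio, which makes the semantics easy to test. This is an analogy, not Inngest: EventBus is a hypothetical in-memory class, the wait holds a live coroutine rather than a durable suspension, and the match callable plays the role of the if_exp expression.

```python
import asyncio


class EventBus:
    """In-process stand-in for a durable event bus (nothing here survives a restart)."""

    def __init__(self):
        self._waiters = []  # entries of (event_name, match_fn, future)

    async def wait_for_event(self, name, match, timeout_s):
        """Suspend until a matching event arrives; return None on timeout."""
        future = asyncio.get_running_loop().create_future()
        entry = (name, match, future)
        self._waiters.append(entry)
        try:
            return await asyncio.wait_for(future, timeout_s)
        except asyncio.TimeoutError:
            return None
        finally:
            self._waiters.remove(entry)

    def send(self, name, data):
        """Deliver an event to every waiter whose name and predicate match."""
        for event_name, match, future in list(self._waiters):
            if event_name == name and not future.done() and match(data):
                future.set_result(data)


async def demo():
    bus = EventBus()
    waiter = asyncio.create_task(
        bus.wait_for_event(
            "ap2/mandate.signed", lambda d: d["mandate_id"] == "m-1", timeout_s=1.0
        )
    )
    await asyncio.sleep(0)  # let the waiter register before events are sent
    bus.send("ap2/mandate.signed", {"mandate_id": "m-2"})  # other mandate: ignored
    bus.send("ap2/mandate.signed", {"mandate_id": "m-1"})  # wakes the waiter
    signed = await waiter
    timed_out = await bus.wait_for_event("ap2/mandate.signed", lambda d: True, 0.01)
    return signed, timed_out
```

Running asyncio.run(demo()) returns the m-1 payload for the first wait and None for the second, mirroring the two branches the workflow above has to handle.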

Pattern 3: x402 seller-side middleware (when you expose a paid API, not just consume one). In a multi-agent marketplace (Decision 4), your agent is sometimes the buyer and sometimes the seller, where other agents pay yours for research, analysis, or code. The seller side needs a FastAPI middleware that returns 402 Payment Required, checks X-PAYMENT headers, and serves the resource only once a facilitator has verified payment.

The seller-side X402Middleware below is illustrative; there is no x402_server PyPI package. Real server-side x402 comes through the x402 package's server and facilitator helpers plus a framework integration. The symmetric buyer-and-seller idea is real, and the FastAPI route is real.

from decimal import Decimal

from agents import Agent, Runner
from fastapi import FastAPI, Request, Response
# Illustrative seller-side imports (see note above).
from x402_server import X402Middleware, PaymentRequirement

# Configure the middleware once at app startup.
app = FastAPI()
app.add_middleware(
    X402Middleware,
    payment_requirements_by_route={
        "/api/research": PaymentRequirement(
            scheme="exact",
            network="eip155:8453",  # Base
            asset=USDC_BASE_CONTRACT,
            recipient=settings.merchant_wallet_address,
            max_amount_usdc=Decimal("0.50"),
            expiry_seconds=300,
        ),
        "/api/code-review": PaymentRequirement(
            scheme="exact",
            network="eip155:8453",
            asset=USDC_BASE_CONTRACT,
            recipient=settings.merchant_wallet_address,
            max_amount_usdc=Decimal("2.00"),
            expiry_seconds=300,
        ),
    },
    facilitator_url="https://facilitator.cloudflare.com/x402",
)

# Your business logic. The middleware enforces payment before this runs.
@app.post("/api/research")
async def research_endpoint(request: Request):
    # By the time we reach here, the X-PAYMENT header has been verified and settled.
    # request.state.x402_proof carries the on-chain transaction hash for audit.
    query = (await request.json())["query"]

    research_agent = Agent(
        name="ResearchAgent",
        instructions="Conduct deep research on the query and return a structured report.",
        tools=[search_web, fetch_papers, summarize],
    )
    result = await Runner.run(research_agent, query)

    return Response(
        content=result.final_output,
        headers={"X-PAYMENT-PROOF": request.state.x402_proof.tx_hash},
    )

x402 is symmetric. The buyer-side code from Concept 10 and the seller-side middleware here are the two halves of the same protocol. A multi-agent marketplace runs both: its agents buy from outside services and sell to outside agents.
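Stripped of the framework, the seller-side gate is a three-way branch: no payment header means 402, an unverified payment means 402, and a verified payment means the resource is served. The sketch below shows that branch with plain functions. handle_request, the simplified 402 body, and the verify_payment callback (standing in for the facilitator call) are assumptions, not the x402 wire format.

```python
from decimal import Decimal

# Price list for the paid routes, mirroring the middleware config above.
PRICES_USDC = {
    "/api/research": Decimal("0.50"),
    "/api/code-review": Decimal("2.00"),
}


def handle_request(path: str, headers: dict, verify_payment, serve):
    """Return (status_code, body). verify_payment stands in for the facilitator."""
    price = PRICES_USDC.get(path)
    if price is None:
        return 200, serve(path)  # free route: no payment gate
    payment = headers.get("X-PAYMENT")
    if payment is None:
        # First contact: tell the buyer what to pay. The resource is NOT served.
        return 402, {"path": path, "amount_usdc": str(price)}
    if not verify_payment(payment, price):
        return 402, {
            "path": path,
            "amount_usdc": str(price),
            "error": "payment not verified",
        }
    return 200, serve(path)  # served only after verification succeeds
```

A buyer agent therefore makes two requests for a paid route: one that receives the 402 and the price, and a second that carries the payment header.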

Three failures recur in production:

  1. Webhook handlers doing business logic inline. A FastAPI handler that runs the agent inside the webhook response will blow past Stripe's five-second timeout. Stripe retries, the agent runs twice, and you charge the user twice. The handler stays thin; the Inngest function is durable.
  2. Forgetting webhook idempotency. Stripe retries failed deliveries with the same event.id. Without an idempotency key on the Inngest function, every retry creates a duplicate. Use "event.data.raw_event_id" as the idempotency key.
  3. No signature check on callbacks. AP2 mandate-signed callbacks must verify the user's signature against the registered public key. Otherwise any caller can forge mandate-signed events. A failed check returns a 400, not a logged warning.
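The first two failures share one fix: verify, deduplicate, enqueue, and return. The sketch below shows that shape with the standard library only. It reimplements the check behind Stripe's Stripe-Signature header (HMAC-SHA256 over "timestamp.payload") so the example runs anywhere; in a real handler you would call stripe.Webhook.construct_event instead. The function names, the "stripe/event.received" event name, the in-memory seen set, and the enqueue callable are all assumptions standing in for your FastAPI route, Inngest's idempotency key, and the Inngest client.

```python
import hashlib
import hmac
import json
import time
from typing import Callable, Optional, Set

TOLERANCE_SECONDS = 300  # reject stale timestamps to limit replay


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header of the form 't=<unix>,v1=<hex>'."""
    pairs = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in pairs if k.strip() == "t"), None)
    candidates = [v for k, v in pairs if k.strip() == "v1"]
    if timestamp is None or not timestamp.isdecimal() or not candidates:
        return False
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > TOLERANCE_SECONDS:
        return False
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)


def handle_webhook(payload: bytes, header: str, secret: str,
                   enqueue: Callable[[dict], None], seen: Set[str],
                   now: Optional[float] = None) -> int:
    """Thin handler: verify, deduplicate, enqueue, return. No business logic."""
    if not verify_stripe_signature(payload, header, secret, now):
        return 400  # a failed check is an error response, not a logged warning
    event = json.loads(payload)
    if event["id"] in seen:
        return 200  # a retry of an event already queued: acknowledge and stop
    seen.add(event["id"])
    enqueue({
        "name": "stripe/event.received",  # hypothetical event name
        "data": {"raw_event_id": event["id"], "type": event["type"]},
    })
    return 200
```

The handler never runs the agent. It returns as soon as the event is queued, and the raw_event_id it forwards is the value the durable function would use as its idempotency key.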

Bottom line of Concept 18: Agent commerce is bidirectional. Protocol responses arrive synchronously and the SDK handles them, but disputes, mandate signatures, refunds, and seller-side payment requests arrive asynchronously through webhooks and callbacks. Three patterns cover the operational need. A Stripe webhook handler fires an Inngest event into a durable agent workflow for disputes and refunds. An AP2 mandate-signed callback fires an Inngest event that resumes a suspended step.wait_for_event for out-of-band signing. An x402 seller-side FastAPI middleware exposes paid APIs with verified payment. FastAPI handlers stay thin (verify and fire); Inngest functions are durable (idempotent and retried). The buyer-side framing from Parts 1 to 5 is incomplete without these three; production systems run all three.


Part 7: Closing: what this course was really teaching

Concept 19: The discipline of layered composition

In one line: Everything in this course reduces to one job: read the use case, break it into four layers, and pick the right protocol at each layer.

This course has 19 Concepts and 5 Decisions. All of them are scaffolding for one claim: agent commerce in 2026 is not a single protocol but a layered architecture, and your job is to pick the right protocol at each layer for the use case in front of you. Everything else follows from that.

The same shape shows up at three scales.

At the protocol scale, the four headline protocols solve different problems at different layers: ACP at Layer 3, AP2 at Layer 2, x402 and MPP at Layer 4. Treating them as rivals at the same layer is the most common architectural mistake. They compete only where their layers overlap (x402 against MPP at settlement; AP2 against ACP's SPT at authorization for some flows). Everywhere else they compose.

At the system scale, a production system has one protocol from each layer, wired together through the OpenAI Agents SDK as the universal client. The SDK's @function_tool, RunContextWrapper, and tool_input_guardrail map cleanly to the concerns at each layer. The SDK is not a protocol. It is the orchestrator that lets you compose protocols cleanly.

At the discipline scale, your job is to read a use case, break it into the four layers, pick the right protocol at each one, and justify each choice against the use case's real constraints: transaction value, latency budget, dispute model, audit requirements. The job is not "pick a favorite protocol." It is "for this use case, what does each layer demand?"

The five Decisions in Part 5 walked real use cases through this discipline. Decision 1 (consumer shopping) landed on ACP plus Stripe card rails, because the use case needed chargeback protection. Decision 2 (an API-paying agent) landed on x402 only, because Layers 2 and 4 collapse for machine-to-machine flows. Decision 3 (enterprise procurement) reached the most complex composition (AP2 plus ACP plus MPP), because the audit and recurrence needs forced it. Decision 4 (a multi-agent marketplace) landed on AP2 plus ERC-8004 plus x402, because the bilateral-trust need could not be met any other way. Decision 5 rebuilt Decision 2 on a completely different stack (Google ADK plus a Coinbase wallet plus AP2 plus x402), and reached the same four-layer shape, proving the architecture is not a wrapper around Stripe and OpenAI.

The composition you reach for is set by the use case, not by taste. A team that picks x402 because "stablecoins are the future" but is building a consumer shopping experience has the wrong composition. A team that picks ACP because "Stripe is enterprise-grade" but is paying for API access at $0.001 a call has the wrong composition. Let the use case drive the choice.

Bottom line of Concept 19, and of this course: The four protocols (ACP, AP2, x402, MPP) are layers, not alternatives. Your job is to read the use case, break it into the four layers (discovery, authorization, commerce, settlement), pick the right protocol at each layer, and justify the choice against the use case's real constraints. The OpenAI Agents SDK is the universal client that composes the chosen protocols. The discipline survives whichever protocols win in the next 24 months. The layers are stable, even as the protocols at each layer evolve.


Cheat sheet: the framework in one page

Print this. Tape it to the wall. Use it when reviewing any agent-commerce design.

The four layers (memorize this stack)

Layer 1: DISCOVERY      ->  "What's available to buy?"
Layer 2: AUTHORIZATION  ->  "Am I allowed to spend this?"
Layer 3: COMMERCE       ->  "What's the full purchase lifecycle?"
Layer 4: SETTLEMENT     ->  "Where does the money actually move?"

The protocols at each layer (top picks in 2026)

| Layer | Top picks | Pick by |
| --- | --- | --- |
| Discovery | MCP, A2A, agent directories, AI shopping surfaces | Where the agent's needed services live |
| Authorization | AP2 mandates, ACP SPT, TAP, ERC-8004 | Trust model: audit-rigorous, Stripe-native, identity-only, or multi-agent |
| Commerce | ACP, UCP, direct API (none) | Does the use case need a commerce lifecycle? |
| Settlement | x402, MPP, card rails, bank/Lightning | Economics: transaction value drives the rail |

The four canonical compositions

| Use case | Stack |
| --- | --- |
| Consumer shopping | AI surface + ACP SPT + ACP + Stripe cards |
| API-paying agent | MCP/directory + EIP-3009 + (none) + x402 |
| Enterprise procurement | A2A/MCP + AP2 + ACP/UCP + MPP/cards |
| Multi-agent marketplace | A2A + AP2 + ERC-8004 + (none) + x402 |
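One way to keep a design review honest is to write the composition down as data, with one field per layer, so a missing layer is obvious. The sketch below encodes the four canonical stacks that way. The Composition class, the dictionary keys, and the describe helper are illustrative names, not part of any protocol SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Composition:
    discovery: str
    authorization: str
    commerce: Optional[str]  # None when the use case has no commerce lifecycle
    settlement: str


CANONICAL = {
    "consumer_shopping": Composition("AI surface", "ACP SPT", "ACP", "Stripe cards"),
    "api_paying_agent": Composition("MCP/directory", "EIP-3009", None, "x402"),
    "enterprise_procurement": Composition("A2A/MCP", "AP2", "ACP/UCP", "MPP/cards"),
    "multi_agent_marketplace": Composition("A2A", "AP2 + ERC-8004", None, "x402"),
}


def describe(use_case: str) -> str:
    """Render a stack in layer order: discovery, authorization, commerce, settlement."""
    c = CANONICAL[use_case]
    return " + ".join([c.discovery, c.authorization, c.commerce or "(none)", c.settlement])


print(describe("api_paying_agent"))
```

A design that cannot fill in all four fields, even with an explicit None at the commerce layer, has not finished walking the layers.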

Spend-limit enforcement at three levels (required)

Level 1: Wallet / payment-method limits  (smart-contract caps OR Stripe customer caps)
Level 2: SDK tool guardrails             (tool_input_guardrail on each payment tool)
Level 3: Application business logic      (per-user, per-category, per-merchant policies)

Skip any of these and you are one bug away from a total drain. The most common mistake is using agent-level output_guardrail for Level 2. It fires on the final agent reply, too late to stop a payment. Use tool_input_guardrail instead.
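The value of three levels is independence: a misconfigured cap at one level is still caught by the other two. The sketch below models each level as a separate plain function to make that visible. In a real system Level 1 lives in the wallet or payment provider, Level 2 is the body of a tool_input_guardrail, and Level 3 is application code; the function names and parameters here are assumptions for illustration only.

```python
from decimal import Decimal
from typing import Dict, List


def wallet_allows(amount: Decimal, wallet_remaining: Decimal) -> bool:
    """Level 1: the wallet or payment method refuses anything over its cap."""
    return amount <= wallet_remaining


def guardrail_allows(amount: Decimal, per_call_cap: Decimal) -> bool:
    """Level 2: the check a tool guardrail would run before the payment tool executes."""
    return amount <= per_call_cap


def policy_allows(amount: Decimal, merchant: str,
                  spent: Dict[str, Decimal], merchant_caps: Dict[str, Decimal]) -> bool:
    """Level 3: application policy. Merchants without a cap are denied by default."""
    cap = merchant_caps.get(merchant)
    if cap is None:
        return False
    return spent.get(merchant, Decimal("0")) + amount <= cap


def blocking_levels(amount: Decimal, merchant: str, *, wallet_remaining: Decimal,
                    per_call_cap: Decimal, spent: Dict[str, Decimal],
                    merchant_caps: Dict[str, Decimal]) -> List[str]:
    """Return the names of every level that would stop this payment."""
    blocked = []
    if not wallet_allows(amount, wallet_remaining):
        blocked.append("wallet")
    if not guardrail_allows(amount, per_call_cap):
        blocked.append("sdk_guardrail")
    if not policy_allows(amount, merchant, spent, merchant_caps):
        blocked.append("application")
    return blocked


# A bug sets the SDK cap far too high; the other two levels still stop the payment.
result = blocking_levels(
    Decimal("500"), "acme",
    wallet_remaining=Decimal("200"),
    per_call_cap=Decimal("1000000"),  # misconfigured
    spent={}, merchant_caps={"acme": Decimal("300")},
)
print(result)
```

An empty list means all three levels agree the payment may proceed. Any non-empty list means at least one level said no, which is the only outcome that should reach the payment tool as a refusal.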

The economic threshold (memorize this)

Transaction value
├── < $5 → x402 or MPP stablecoin (the fixed card fee dominates small transactions)
├── $5 - $1,000 → ACP + card rails (chargeback protection worth the 2.9%)
└── > $1,000 → AP2 + composed stack (audit + dispute defense + multi-rail)
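The thresholds follow from simple arithmetic on the card fee this course quotes, 2.9% plus $0.30 per transaction. The sketch below works that out for a few amounts. The function names are illustrative, and real processor pricing varies by region and contract, so treat the constants as the course's example figures rather than a quote.

```python
from decimal import Decimal

CARD_PERCENT = Decimal("0.029")  # 2.9%, the figure used in this course
CARD_FIXED = Decimal("0.30")     # fixed fee per transaction


def card_fee(amount: Decimal) -> Decimal:
    """Card fee in dollars for one transaction of the given amount."""
    return amount * CARD_PERCENT + CARD_FIXED


def fee_share(amount: Decimal) -> Decimal:
    """Fee as a fraction of the transaction amount."""
    return card_fee(amount) / amount


for amount in (Decimal("0.10"), Decimal("5"), Decimal("100"), Decimal("1000")):
    print(f"${amount}: fee ${card_fee(amount)}, share {fee_share(amount):.1%}")
```

At $0.10 the fee is about three times the payment. At $5 it is roughly 9%, at $100 it is 3.2%, and at $1,000 it approaches the 2.9% floor. The fee equals the payment itself at about $0.31, which is why per-call API pricing cannot ride card rails.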

The dispute model (often the strongest constraint)

Use case needs chargeback protection?
├── Yes → ACP + card rails (or MPP card mode)
└── No → x402 acceptable (faster, cheaper, no refunds)

Use case needs audit evidence?
├── Yes → AP2 at Layer 2 (mandates as legally-admissible evidence)
└── No → SPT or EIP-3009 sufficient

Production checklist before launch

  • Wallet/payment-method spend limits configured at Level 1
  • SDK tool_input_guardrail attached to every payment-authorizing tool (Level 2)
  • Application enforces per-user, per-category, and per-merchant caps (Level 3)
  • Signing keys in a key vault (Azure Key Vault or equivalent), never in env vars
  • Per-agent wallet separation (no shared keys across agent classes)
  • Key rotation schedule defined (90-day baseline)
  • Audit logs to durable storage independent of the agent runtime
  • OpenTelemetry traces span SDK, httpx, Inngest, and FastAPI; one trace ID per transaction
  • Pydantic models at every boundary (tool returns, protocol payloads, FastAPI bodies, Inngest events)
  • Decimal for money everywhere (never float)
  • Stripe webhook handler verifies the signature and fires an Inngest event (thin handler, durable function)
  • AP2 mandate-signed callback verifies the signature and resumes step.wait_for_event
  • If exposing paid APIs: x402 seller-side middleware configured per route
  • Inngest idempotency key set on every webhook-triggered function
  • Dispute and refund mechanism understood and documented per protocol
  • Human-in-the-loop confirmation gate for the first 30 days of production
  • Cart-accuracy metric measured before relaxing the confirmation gate
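The "Decimal for money" item is the easiest one on this list to demonstrate. The sketch below shows the classic float failure and a small parsing helper that refuses floats at the boundary. The parse_amount name is an assumption for illustration; in this course's stack the same rule would sit inside your Pydantic models.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
float_ok = (0.1 + 0.2 == 0.3)
decimal_ok = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))
print(float_ok, decimal_ok)


def parse_amount(value) -> Decimal:
    """Accept str, int, or Decimal amounts; refuse floats instead of guessing."""
    if isinstance(value, float):
        raise TypeError("money must not be a float; pass a string or Decimal")
    return Decimal(value)


# Sub-cent amounts survive intact, which matters for per-call pricing.
per_call = parse_amount("0.001")
print(per_call * 1000)
```

Refusing floats at the edge is stricter than converting them, and that is the point: Decimal(0.1) silently carries the float's rounding error into the ledger.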

Quick reference: the decision tree, compressed

When you have a new use case, walk this tree.

1. What's the agent buying?
├── Retail goods for a user → Decision 1 pattern (ACP + cards)
├── API access for itself → Decision 2 pattern (x402-only)
├── Supplier goods/services for an org → Decision 3 pattern (AP2 + composed)
└── Work from another agent → Decision 4 pattern (AP2 + ERC-8004 + x402)

2. What's the transaction value?
├── < $5 → settlement = x402 or MPP stablecoin
├── $5 - $1,000 → settlement = card rails via ACP/MPP
└── > $1,000 → settlement = MPP sessions OR bank rails

3. What's the dispute model?
├── Need chargeback protection → must include card rails somewhere
├── Need audit evidence → must include AP2 at Layer 2
└── Neither → x402 / direct rail sufficient

4. What's the latency budget?
├── < 1 sec → only stablecoin rails work
├── 1-5 sec → x402, MPP, or pre-signed AP2 mandates
└── > 5 sec → full ACP checkout works

5. Compose: pick one protocol from each layer, justified against the constraints above.
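Steps 2 through 4 of the tree are mechanical enough to write as code, which makes the thresholds easy to test and argue about. The sketch below is one hedged encoding of those three steps. The function names and the returned dictionary shape are assumptions, and step 1 and step 5 stay with the reviewer because they need judgment, not arithmetic.

```python
from decimal import Decimal
from typing import Dict, List


def settlement_by_value(amount: Decimal) -> str:
    """Step 2: transaction value drives the settlement rail."""
    if amount < Decimal("5"):
        return "x402 or MPP stablecoin"
    if amount <= Decimal("1000"):
        return "card rails via ACP/MPP"
    return "MPP sessions or bank rails"


def dispute_requirements(needs_chargeback: bool, needs_audit: bool) -> List[str]:
    """Step 3: the dispute model adds hard requirements to the stack."""
    required = []
    if needs_chargeback:
        required.append("must include card rails somewhere")
    if needs_audit:
        required.append("must include AP2 at Layer 2")
    return required or ["x402 / direct rail sufficient"]


def options_by_latency(budget_seconds: float) -> str:
    """Step 4: the latency budget rules rails in or out."""
    if budget_seconds < 1:
        return "only stablecoin rails work"
    if budget_seconds <= 5:
        return "x402, MPP, or pre-signed AP2 mandates"
    return "full ACP checkout works"


def walk_tree(amount: Decimal, needs_chargeback: bool, needs_audit: bool,
              budget_seconds: float) -> Dict[str, object]:
    """Collect the answers to steps 2-4 so a reviewer can compose from them."""
    return {
        "settlement": settlement_by_value(amount),
        "dispute": dispute_requirements(needs_chargeback, needs_audit),
        "latency": options_by_latency(budget_seconds),
    }


print(walk_tree(Decimal("0.01"), False, False, 0.5))
print(walk_tree(Decimal("250"), True, False, 20))
```

The function reports each step separately on purpose. When the answers pull in different directions, for example a $2 purchase that needs chargeback protection, that conflict is the design conversation to have, and code should surface it rather than resolve it silently.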

10-minute refresher: the 19 concepts in one paragraph each

Read this when you have forgotten what a concept said. Each entry is the bottom line that closes its concept, collected here for fast lookup.

1. The assumption that broke

Payment systems were built on the assumption that a human clicks the buy button. Agents break that in three ways: no traditional identity, behavior that looks anomalous, and no channel to clear up a dispute. Each break needs a protocol-level fix. Wrapping old payment rails in nicer agent UX did not work.

2. Why one protocol can't solve all the breaks

One protocol cannot solve all four breaks, because the breaks happen at different layers, each with its own strong incumbents. The four protocols specialize: ACP at commerce, AP2 at authorization, x402 and MPP at settlement. Within a layer you pick one; across layers you compose several.

3. The OpenAI Agents SDK as the universal client

The SDK is the universal client for all four protocols. Each protocol becomes one or more @function_tool functions the agent calls. The SDK's output_type, context, and tool_input_guardrail map cleanly to protocol concerns: structured payment results, per-user payment context, and pre-execution spend-limit checks.

4. Layer 1, Discovery (how agents find what they can buy)

Layer 1 answers "what's available?" Four options compete: MCP for internal tool servers, A2A for multi-agent ecosystems, agent directories for third-party services, AI shopping surfaces for consumer products. Pick the one that matches where the agent's needed services live; they are not mutually exclusive.

5. Layer 2, Identity and Authorization (proving the agent is allowed)

Layer 2 answers two questions: did the human authorize this, and is the agent who it claims to be? Four protocols compete: AP2 mandates (audit-rigorous), ACP SPT (Stripe-native), TAP (identity-only), ERC-8004 (multi-agent trust). Integrate through RunContextWrapper for per-tool auth and tool_input_guardrail for spend enforcement. Pick by trust model; they are not interchangeable.

6. Layer 3, Commerce (the full purchase lifecycle)

Layer 3 handles cart, checkout, fulfillment, dispute, and refund. ACP and UCP compete for consumer flows; machine-to-machine API access has no commerce layer. Refund and dispute mechanics are the hidden complexity that separates a real commerce protocol from a payment-only one. Pick by use case: lifecycle needed (ACP/UCP) or lifecycle skipped (direct API).

7. Layer 4, Settlement (money actually moves)

Layer 4 is where money moves. Four options compete: x402 for machine-to-machine stablecoin, MPP for multi-rail sessions, card rails for consumer purchases with chargebacks, bank or Lightning for high-value or very-low-fee cases. The choice is mostly about economics: sub-dollar payments go to x402, consumer purchases to cards, enterprise spend to MPP. Settlement is usually determined by your commerce choice, so choose it as the bottom of a whole stack, not as a rail in isolation.

8. ACP, the consumer-shopping protocol

ACP is the production-ready commerce protocol for consumer shopping. The SPT scopes the agent's spending; the merchant stays the merchant of record. You integrate it through the Stripe SDK plus a thin ACP client, wired in as four @function_tool functions (browse, checkout, status, refund). Default to user confirmation on carts until the agent's cart-accuracy is measured.

9. AP2, the authorization layer

AP2 is the authorization layer for audit-rigorous agent commerce. Three mandate types (Intent, Cart, Payment) form a non-repudiable chain proving the user consented. step.wait_for_event is the natural fit for mandate-signing flows. Create Intent Mandates first, before shopping begins, not as a checkout afterthought.

10. x402, the HTTP-native settlement protocol

x402 is the HTTP-native settlement protocol for machine-to-machine micropayments. The 402 status code plus a signed payment header plus an EIP-3009 signature settles in one to two seconds with no account creation. The wallet's smart-contract spend limits are the safety that actually protects you, not the per-request cap. Check header naming against the spec version and facilitator you integrate with; it differs across V1 and V2.

11. MPP, the sessions-based settlement protocol

MPP is Stripe's settlement answer to x402, built for sessions-based metering and multi-rail dispatch. Two intents, charge (one-off) and session (pre-authorized streaming), cover single purchases and recurring metering. The HTTP shape uses standard WWW-Authenticate: Payment, Authorization: Payment, and Payment-Receipt headers, so it feels like familiar HTTP auth. Right-size session caps to the expected work, not to convenience.

12. The minimum viable agent-payment stack

The minimum viable stack is the smallest composition that ships value for the use case. Four common ones: consumer shopping (ACP plus cards), API-paying (x402 only), enterprise procurement (AP2 plus ACP/UCP plus MPP/cards), multi-agent marketplace (AP2 plus ERC-8004 plus x402). Do not build universal stacks; pick one composition per use case and ship.

13. When protocols compose across layers vs compete within a layer

Protocols across layers compose (AP2 plus x402, ACP plus x402, MCP plus x402); protocols within a layer compete (AP2 vs ACP SPT, ACP vs UCP, x402 vs MPP). The test is "are these at the same layer?" If yes, pick one; if no, they likely compose.

14. Cost and latency implications of composition choices

Card rails cost 2.9% plus $0.30 per transaction but provide chargeback protection; stablecoin rails cost less than a cent per transaction but need crypto-native infrastructure. The transaction value below which card rails stop making sense is around $5 to $10. The latency point where ACP checkout becomes too slow is around five seconds for user-facing flows. Use cost and latency to decide the composition, not abstract preference.

15. Spend-limit enforcement at three architectural levels

Enforce spend limits at three independent levels: wallet or payment-method infrastructure, SDK tool guardrails (tool_input_guardrail specifically, since it runs before each tool and can block it), and application business logic. Each level uses different infrastructure, so a bug in one is caught by the others. Using output_guardrail for spend control is the most common mistake; it fires on the final reply, too late to stop a payment.

16. Agent identity hygiene: keys, wallets, and audit logs

Agent identity is cryptographic; the signing key is the only thing separating authorized spending from fraud. Four habits: per-agent wallet separation, key rotation on a schedule and on demand, audit logs to durable storage independent of the runtime, and distributed traces with one trace ID spanning the whole transaction. Signing keys go in a key vault, never in env vars or source code.

17. Dispute and refund mechanics across the four protocols

ACP inherits card-network chargebacks. AP2 provides the mandate chain as legal evidence but does not replace the underlying rail's dispute path. x402 is non-refundable by design: fine for micropayments, wrong where refunds matter. MPP inherits Stripe's dispute machinery across rails. The dispute model your use case needs is often the strongest thing forcing the protocol composition.

18. FastAPI and Inngest webhook plumbing: closing the request and response loop

Agent commerce is bidirectional: disputes, mandate signatures, refunds, and seller-side payment requests arrive asynchronously through webhooks. Three patterns cover the need: a Stripe webhook fires an Inngest event into a durable workflow; an AP2 mandate-signed callback fires an event that resumes step.wait_for_event; an x402 seller-side FastAPI middleware exposes paid APIs. FastAPI handlers stay thin; Inngest functions are durable. Production systems run all three.

19. The discipline of layered composition

The four protocols (ACP, AP2, x402, MPP) are layers, not alternatives. Your job is to read the use case, break it into the four layers, pick the right protocol at each one, and justify the choice against the use case's real constraints. The OpenAI Agents SDK is the universal client. The discipline survives whichever protocols win in the next 24 months; the layers are stable, even as the protocols at each layer evolve.


Design-review template: questions to ask when reviewing any agent-commerce architecture

When you review a colleague's design (or your own draft, with a day's distance), walk these questions in order. Each maps to a section of this course; if the answer does not satisfy the question, go back to that section.

Architecture questions

  1. What use case is this serving? If "multiple," what's the primary one? A design that tries to serve all four canonical use cases is the anti-pattern from Concept 12; flag it.
  2. Walk the four layers explicitly. Discovery? Authorization? Commerce? Settlement? Which protocol at each layer? Get the four-protocol composition stated out loud.
  3. Justify each layer's choice against the use case. Why this protocol at this layer? What would change with a different one? Concept 13's across-vs-within test applies here.
  4. Is there protocol overlap at any layer? Two protocols competing at Layer 4? Two at Layer 2? If yes, that is complexity that needs a use case to justify it.

Economic questions

  1. What's the transaction value distribution? Sub-dollar? $10 to $100? $1,000-plus? Concept 14's economic threshold should drive the settlement choice.
  2. What's the latency budget? Sub-second? 5 to 30 seconds? The latency budget should constrain settlement and mandate choices.
  3. What's the cost per transaction at expected volume? Add protocol fees plus processor fees plus per-transaction infrastructure cost. Use it to check the composition is economically viable.

Operational questions

  1. Spend-limit enforcement at all three levels? Wallet/payment-method, SDK, application business logic. Concept 15: skip any and you are exposed.
  2. Identity hygiene? Per-agent wallet separation, key rotation, audit logs to durable storage, distributed traces with one trace ID per transaction. Concept 16's four habits.
  3. Dispute and refund mechanics? Does the composition handle the disputes your use case will actually get? Concept 17: this is often the strongest constraint.
  4. Operational envelope? Where does Inngest (or equivalent durable execution) handle long-running flows, retries, and idempotency? Covered in the Production Worker crash course.
  5. Human-in-the-loop gates? Where in the flow does the user or operator confirm? Default to a confirmation gate for the first 30 days of production.

Webhook and async-callback questions (Concept 18)

  1. Stripe webhook handler exists and is thin? Verifies the signature, fires an Inngest event, ACKs in under five seconds. Business logic lives in the Inngest function, not the handler.
  2. AP2 mandate-signing callback exists and verifies user signatures? Without it, the agent's step.wait_for_event never resumes. Without signature verification, anyone can forge mandate-signed events.
  3. If exposing paid APIs: x402 seller-side middleware configured per route? Multi-agent marketplaces run both buyer-side and seller-side code; the seller side needs the middleware.
  4. Inngest idempotency keys on every webhook-triggered function? Stripe retries with the same event.id; without idempotency you charge or refund twice.
  5. Contract layer: Pydantic models at every boundary? Tool returns, protocol payloads, FastAPI request and response bodies, Inngest event payloads, all typed. Decimal for money, never float.

Failure-mode questions

  1. What's the most likely failure mode in production? Each canonical decision had one: cart mismatch (Decision 1), runaway spend (Decision 2), Intent Mandate scope mismatch (Decision 3), gamed reputation (Decision 4). What's yours?
  2. What's the mitigation? Each failure mode in Part 5 had an explicit mitigation. Does the design include it, or is it relying on "the protocol will handle it"?

Track-readiness questions (for production deployment)

  1. What learning track is the team operating from? Reader, Beginner, Intermediate, Advanced. Be honest. Production deployment needs Advanced-track depth, not Reader-track familiarity.
  2. What's the rollback plan if the agent misbehaves? Wallet kill switch? SPT revocation? Disable the agent at the SDK level? Have this defined before launch, not after the incident.

References

Primary sources for the four protocols. Prefer these over secondary commentary; the specs evolve faster than write-ups about them.

ACP (Agentic Commerce Protocol)

  • Specification repository: github.com/agentic-commerce-protocol/agentic-commerce-protocol (Apache 2.0)
  • Developer site: agenticcommerce.dev
  • Co-maintainers: OpenAI and Stripe
  • Stripe integration docs: stripe.com/docs/agentic-commerce
  • Latest spec version cited: 2026-04-17

AP2 (Agent Payments Protocol)

  • Specification repository: github.com/google-agentic-commerce/AP2 (Apache 2.0)
  • Documentation site: ap2-protocol.org
  • Coalition members: Google plus 60-plus partners (Salesforce, ServiceNow, Adobe, Shopee, Etsy, Adyen, American Express, JCB, UnionPay International, PayPal, Mastercard, Coinbase, others)
  • a2a-x402 extension repository: github.com/google-a2a/a2a-x402
  • Latest version cited: v0.2.0 (April 2026)

x402

  • Specification repository: github.com/coinbase/x402 (Apache 2.0)
  • Documentation site: x402.gitbook.io (also x402.org)
  • Created by Coinbase and now stewarded under the Linux Foundation's x402 Foundation (April 2026), with Cloudflare, Stripe, AWS, Google, and others.
  • Cloudflare OpenAI Agents SDK integration: developers.cloudflare.com/agents/x402
  • Adoption directory: agent.market
  • x402 Foundation members include: Adyen, AWS, American Express, Base, Circle, Cloudflare, Coinbase, Google, Mastercard, Microsoft, Polygon Labs, Shopify, Solana Foundation, Stripe, Visa

MPP (Machine Payments Protocol)

  • Specification site: mpp.dev
  • Co-developers: Stripe and Tempo (Stripe's L1 blockchain, incubated with Paradigm)
  • Stripe announcement: stripe.com/blog/machine-payments-protocol
  • Launch date: March 18, 2026 (mainnet)
  • Partners at launch: Stripe, Visa, Lightspark, Anthropic, OpenAI, Shopify, and 100-plus services

Adjacent protocols

  • A2A (Agent2Agent, Google): github.com/google-a2a/A2A, the protocol AP2 extends
  • MCP (Model Context Protocol, Anthropic): modelcontextprotocol.io, the tool and context layer agents discover services through
  • UCP (Universal Commerce Protocol, Google): announced as ACP's peer for Google shopping surfaces
  • TAP (Trusted Agent Protocol, Visa and Cloudflare): identity verification protocol launched October 14, 2025
  • ERC-8004: on-chain trust standard for multi-agent transactions

OpenAI Agents SDK

  • Python SDK: pip install openai-agents (latest release May 19, 2026)
  • Documentation: openai.github.io/openai-agents-python
  • JavaScript/TypeScript SDK: github.com/openai/openai-agents-js

Operational and contract-layer tools (used heavily in Concepts 16 and 18)

  • Inngest: inngest.com, the durable execution platform that runs the agent workflows in this course
  • FastAPI: fastapi.tiangolo.com, the async HTTP layer where webhook endpoints and seller-side x402 middleware live
  • Pydantic: pydantic.dev, the type contract at every boundary (tool returns, protocol payloads, FastAPI bodies, Inngest events)
  • OpenTelemetry: opentelemetry.io, distributed tracing with auto-instrumentation for httpx, openai-agents, FastAPI, and Inngest; one trace ID spans the whole transaction
  • httpx: python-httpx.org, the async HTTP client underneath x402-client, the ACP client, and the AP2 a2a-x402 extension
  • Azure Key Vault: where signing keys live in production, rotated per Concept 16's discipline
  • Stripe webhooks documentation: stripe.com/docs/webhooks, signature verification, event types, and retry semantics

You now have the whole framework: the four layers, the four protocols, the rules for composing them, five worked decisions, the production concerns that decide whether the system survives, and a one-page reference to keep. The protocols will keep moving. The four-layer discipline will not. Build from the layers, and you will still be right when the protocol names change.