Connector-Native Apps: A Remote MCP Server Whose Customer Is an AI

14 Concepts · ~90–120 min to read · a focused day to build (4–6 hr if you're a strong dev with the starter running) · from an empty folder to a live connector any AI can pick up and use, free

For thirty years we built software for people. Screens to look at, buttons to press, forms to fill in. The customer on the other side of the glass was always a human being.

That is no longer the only customer in the room. In this course you build a product whose user is an AI.

A person pastes your connector into their assistant once. One URL, one click. That is the last time a human touches it directly. From then on the AI is your user: it reads the names of your tools, decides on its own which to call, hands you the inputs, and speaks your results back to the person. You are not dressing a shop window for a shopper to browse. You are hanging a pegboard of labeled tools that a tireless worker walks up to and uses by itself, turn after turn, on its own initiative. Today a human is usually sitting in that chat. Increasingly, the thing that finds your connector, signs in, and calls it on a schedule will be another agent, with no human turn in between. You are building for that customer.

You will build one real product, end to end:

A remote MCP server. MCP is the USB port of AI: one standard plug, so any assistant can use your tools with no custom wiring. "Remote" means your toolbox sits at a public web address with that plug on it. Yours holds three groups of tools, reachable by any AI on the internet.
A two-table memory, so the AI remembers a person from one conversation to the next.
A real sign-in (OAuth), so your server knows whose data it is holding without ever asking the AI to vouch for it.
A session contract that hands the AI your app's rules first, then gates every real tool behind them.
The whole thing added with one pasted URL and one click, running on the person's own free model, so it costs you nothing to serve.

You ship all of this before you write a single agent loop. That is the point of putting it first: you learn to build the thing an agent calls before you build the agent itself.

One rule explains every hard part that follows, and it comes straight from who your customer is: you own the server, not the mind that calls it. The intelligence lives in the AI's host app. The loop that decides what to do next lives there too. Your server only ever answers when it is called, and the caller is a mind that reasons by guessing: fast, capable, and entirely able to be confused, talked into something, or simply wrong. So every difficult part of this course has the same shape. It is your server doing a job the AI cannot be trusted to do for itself.

That leaves four non-negotiables. The whole course is these four, built:

One gateway. The AI meets you at a single connector, with tools grouped by name behind it: one front door, one menu it reads. (A free account can add only one custom connector, so "one" is a hard limit, not a preference.)
Tools only. You speak to the AI through callable tools, functions it can invoke in the middle of its own reasoning, never through resources or prompts a human would have to pick by hand.
Prove, don't trust. Your customer is a mind that could hand you the wrong person's identity without meaning any harm. So identity comes from a verified sign-in, never from anything the AI tells you, the way a hotel desk hands over your mail on the passport you showed at check-in, not on someone's word about which room is theirs.
Fail closed. When your server is missing or broken, the AI does not go quiet. It improvises, inventing an answer and making up the person's saved data. Your server has to make it stop and say so instead, the way an ATM that can't reach the bank shows temporarily unavailable rather than guessing your balance and handing out cash.

Figure: the four invariants of a connector-native app (one gateway, tools only, prove identity, fail closed).

Two of these describe the shape of what you build (one gateway, tools only). The other two are jobs the server must do because the AI can't be trusted to (prove identity, fail closed). Read each Concept asking: which invariant is this?
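
As a first taste of "fail closed", here is a minimal sketch of a tool body that refuses instead of returning something the model could paper over. The names `StateUnavailable`, `load_state`, and `user_get_profile` are hypothetical and not taken from the starter; the in-memory `store` stands in for a database.

```python
from typing import Optional


class StateUnavailable(Exception):
    """Raised when the backing store cannot be reached (hypothetical name)."""


def load_state(sub: str, store: Optional[dict]) -> dict:
    # Stand-in for a database read; store=None simulates an outage.
    if store is None:
        raise StateUnavailable("state store unreachable")
    return store.get(sub, {})


def user_get_profile(sub: str, store: Optional[dict]) -> dict:
    """Return saved state, or an explicit refusal the model cannot mistake for data."""
    try:
        return {"ok": True, "state": load_state(sub, store)}
    except StateUnavailable:
        # Fail closed: no partial data, and a plain instruction not to improvise.
        return {
            "ok": False,
            "error": "temporarily_unavailable",
            "instruction": "Tell the user saved data is unavailable. Do not guess it.",
        }
```

The design choice is that the failure case has no `state` key at all, so there is nothing for the model to treat as an empty-but-valid answer.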

note

Prerequisites. This page assumes four things.

You can read typed Python, directly or by pasting a code block to your coding agent for a plain-English read-back. If neither is true yet, do Python in the AI Era first.
You've done the Agentic Coding Crash Course. You drive Claude Code or OpenCode in plan mode with a rules file. We build through that workbench here instead of re-explaining it.
You've used a connector from the outside, in the Skills & Connectors course. You flipped one on and watched your AI reach into your Drive. This course flips you to the inside: now you are the thing the AI reaches.
You do NOT need Build AI Agents first. What you build here is the server an agent calls, not the agent. That course comes later on the path, and this one is the reason you'll want it.

You don't need an API key of your own: the person brings the model. You'll open a few free accounts along the way. The first is a Neon database when you first store state (Concept 5), no card. The others are a sign-in service and a host when you deploy. Clerk, Auth0, and Stytch are free to start; some hosts (Azure among them) ask for a card to verify you without charging, so reach for a card-free host like Fly, Render, Railway, or Cloud Run if you'd rather not.

note

Where this sits. First build course in Mode 2. From here the Manufacturing path runs: this course → Claude Code & OpenCode Plugins (coming soon) → AI Identity (coming soon) → Build AI Agents. This one is the pre-loop app: tools, state, identity, and a real deploy, with the caller's model doing the thinking. Build AI Agents, later on the path, is where you own the loop.


Setup (a few minutes)

Download the starter (connector-native-apps.zip), unzip it, and cd into the folder.
Open Claude Code or OpenCode in that folder. It auto-loads AGENTS.md, which is at once your coding agent's brief (the four invariants as hard rules) and the worked example's starting point.
Keep secrets out of chat. You'll copy .env.example to .env and fill it in by hand. Never paste a key into a conversation.

What's in the box. Think flat-pack furniture: most panels arrive pre-assembled, a few are marked you fit this. The starter is a scaffold you direct your agent to fill in, not a finished app:

connector-native-apps/
  AGENTS.md                 the coding agent's brief (the four invariants as hard rules)
  pyproject.toml  .env.example  Dockerfile
  src/connector_app/
    server.py               the gateway: tool groups + begin_session + the session gate (TODO bodies)
    auth.py                 COMPLETE token check, the one file to read line by line
    db.py                   the two-table state store (complete)
    session.py              signed app-level session tokens that gate the real tools (complete)
    config_store.py         your app's rules + persona (you edit)
  mock_auth/server.py       a LOCAL dev sign-in service (Beginner track)
  seed/articles.json        a tiny catalog for the worked example
  tests/test_starter.py     three smoke tests: imports, failing-auth, cross-user isolation

The plumbing is done and reviewable: auth.py, db.py, session.py. The panels marked you fit this (the TODOs) are the domain.* tool bodies, your rules and persona in config_store.py, and the auth wiring for your SDK version.

Two tracks, your choice. The Beginner track uses the bundled mock_auth/ service, a local sign-in server that issues real tokens, so the sign-in half of the build needs no account at all (state still uses a free Neon database, set up in Concept 5). The Standard track uses a real sign-in service, hosted (Clerk, Auth0, or Stytch, all free to start) or self-hosted (Better Auth, the stack the AI Identity course is built on), and is what you deploy. Build the whole thing on the Beginner track, then swap to a real service by changing three values in .env.

From here, each Concept shows you the shape of one piece. You read it, direct your coding agent to write and run it, and check what comes back. You direct; the agent types; you verify. In a course about building for an AI, that loop isn't a convenience. It's Concept 1.

Part 1: The shape

These four concepts are the mental model the rest of the page builds on. No deploy yet.

Concept 1: You direct it — you don't type it

Most people picture "build a server" as opening an editor and writing routes. That's not this course. Like every Manufacturing-track course, your coding agent writes the code; your job is the spec going in and the verification coming out — you read it, run it, and check it, the same rhythm Python in the AI Era trained.

You'll lean on that hardest in one place: the sign-in code (Concept 8). An agent will happily write authorization code that looks right and is quietly wrong, and knowing what "right" looks like is the real skill this course builds. The keystrokes are the agent's job.

There's a twist worth holding onto, because this is the one course where you hold both shapes at once: you direct a coding agent — a loop-owning general agent — to build a connector-native app, which is loopless. Picture two vacuums. A robot vacuum owns the loop: it wakes on its own, roams, decides each move, and you can even set it to run at 2am. A hand vacuum runs only while you squeeze the trigger; let go and it dies. The coding agent is the robot vacuum; the connector you build is the hand vacuum, which does nothing until the user types. The thing building it owns a loop. The thing it builds does not.

Concept 2: The new app shape

You've used a connector from the outside, as a person reaching into your own apps. Now flip it: you are the server, and your caller is the AI. Claude dials into you from its cloud. Two facts follow, and they drive every later decision.

The chat app is the runtime (the engine that actually runs everything), and it lives in the cloud. When a user adds your connector, Claude reaches your server from Anthropic's cloud, not the user's laptop. So your server must be on the public internet over HTTPS (the secure web protocol). Claude is phoning you from far away, and it can only ring a published number: a public web address. A server on your laptop is a phone with no listing, perfectly real but unreachable from outside, which is why a real deploy is part of the course, not an afterthought.

The user brings the model. You don't pay for the intelligence; the user's free Claude tier supplies it. Your only costs are a small server and a database. That's the whole economic trick behind "free for anyone."

Those two facts — a cloud runtime you must be reachable from, and someone else's model doing the thinking — are why the rest of the course is about what your server must guarantee. The first of those guarantees is the narrowest: there's only one way your server is allowed to speak to that borrowed model at all. That's the next Concept.

Figure: a left-to-right flow. The User types in Claude; Claude (the model and the loop) runs in Anthropic's cloud; a dashed trust boundary marks where the connector URL crosses into your gateway over public HTTPS; your gateway turns the token into a sub and reads state from Postgres. Two annotations: the loop lives in Claude, not your server; and identity comes from the token's sub, never from the model.

Concept 3: Tools only — not resources, not prompts

An MCP server can offer a model three things: tools (functions the model calls, with inputs and outputs), resources (read-only data the user points at), and prompts (canned templates the user picks). All three are valid MCP surfaces. For this course's product shape, we intentionally expose only tools — that's a design choice for this kind of app, not a rule that resources and prompts are wrong in general.

Why tools fit this shape: the model has to decide on its own what to fetch or do next (search, pull a record, save a result). Picture a workshop. A tool is the cordless drill on the worker's belt: grabbed mid-job, no asking. A resource is a manual locked in a cabinet, useless until someone walks over and hands it across. A prompt is a form the worker must stop and pick off a shelf. Only the drill keeps work flowing with no human in the loop, which is exactly why this app is tools-only: only a tool can be called automatically inside the model's reasoning, while resources are passive (the user has to point at them) and prompts have to be picked by hand. Tools are also the one surface every chat app supports well, so building on tools keeps you portable.

MCP surface	Who triggers it	Can it be auto-called mid-reasoning?	Use it here?
Tool	The model, on its own	Yes	Yes — everything
Resource	The user points at it	No	No
Prompt	The user picks it	No	No
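
For a sense of what the model actually reads, here is a sketch of one tool entry as it might appear in a tools listing. The field names (`name`, `description`, `inputSchema`) follow the MCP tool definition; the specific tool, its wording, and its arguments are illustrative assumptions.

```python
import json

# One illustrative tool entry. The description is the only "UI" the model sees,
# so it states what the tool does and when to call it.
DOMAIN_GET_ITEM = {
    "name": "domain_get_item",
    "description": "Return one record by its exact id. Call begin_session first.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "session_token": {"type": "string"},
            "item_id": {"type": "string"},
        },
        "required": ["session_token", "item_id"],
    },
}

print(json.dumps(DOMAIN_GET_ITEM, indent=2))
```

Notice what the schema leaves out: there is no argument for who the user is, so the model has nowhere to put one.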

Concept 4: One gateway, three groups behind it

A connector-native app usually has three kinds of job, most cleanly kept as three concerns in your code:

domain.* — what your app knows and does (search records, fetch an item, take an action).
user.* — who this person is and the state you keep for them.
config.* — the app's own operating rules: how it should behave, its voice, its guardrails. (A teaching persona is one instance of config; a support assistant's tone-and-escalation policy is another. Most apps need some rules; few need a full character.)

The textbook way to ship three concerns is three servers. You can't (invariant 1): on the Claude Free plan a user may add exactly one custom connector. Ask a beginner to add three and you've quietly pushed your free product onto a paid plan.

So draw the line in a different place: keep all three concerns inside one server and expose them through one URL, a single gateway, with tools grouped by name. One front door, one menu, with three sections the way a diner's menu has breakfast, lunch, and dinner pages (domain, user, config). You don't send the customer to three separate restaurants. The names are yours; this is illustrative:

domain.search      domain.get_item      domain.do_action
user.get_profile   user.save_state
config.get_rules   config.get_persona

In the MCP Python framework (FastMCP), a tool is just a decorated function: a function is one labeled action that does one job, and decorating it only pins a name tag on it so the AI's menu can list it. Your coding agent writes these; you read them:

# server.py — the gateway skeleton (shape; the starter ships this filled in around your TODOs)
from fastmcp import FastMCP

mcp = FastMCP("my-connector-app")

@mcp.tool()
def domain_get_item(session_token: str, item_id: str) -> dict:
    """Return one record by id. Requires a valid session (see Concept 10)."""
    require_session(session_token)      # gating — Concept 10
    return fetch_item(item_id)

The server mechanics underneath — FastMCP, a read-only database role, and the Streamable HTTP transport that lets a host reach your server — are built step by step in Give Your AI Searchable Context. This course keeps its server code at shape level and points there for the full build.
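
If the "one gateway, three groups" idea is easier to see without a framework, here is a plain-Python sketch of it. This is not FastMCP: the `tool` decorator, the `TOOLS` registry, and the `menu` helper are hypothetical names used only to show one registry holding tools grouped by name prefix.

```python
from collections import defaultdict
from typing import Callable, Dict, List

TOOLS: Dict[str, Callable[..., dict]] = {}   # the single gateway's registry


def tool(name: str) -> Callable:
    """Register a function under a grouped name such as 'domain.get_item'."""
    def register(fn: Callable[..., dict]) -> Callable[..., dict]:
        TOOLS[name] = fn
        return fn
    return register


@tool("domain.get_item")
def get_item(item_id: str) -> dict:
    return {"id": item_id}


@tool("user.get_profile")
def get_profile() -> dict:
    return {"profile": "placeholder"}


@tool("config.get_rules")
def get_rules() -> dict:
    return {"rules": ["placeholder rule"]}


def menu() -> Dict[str, List[str]]:
    """Group registered tool names by prefix: one menu, three sections."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(TOOLS):
        group, _, action = name.partition(".")
        groups[group].append(action)
    return dict(groups)


print(menu())
```

Three concerns, one registry, one place a client has to look.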

Run it. Paste this to your coding agent:

let's scaffold Concept 4: a FastMCP server called my-connector-app exposing one domain.get_item tool and a health tool, and show me a local client seeing both.

Run it yourself in a terminal (raw commands).

uv sync                                    # install deps from the starter's pyproject.toml
uv run python -m connector_app.server      # starts the gateway on :8000 over HTTP

What you'll see — and what to verify

A running server and a client that lists two tools. There is no auth and no real data yet — that's correct for now. You've proven the one fact this Concept is about: a single server, tools grouped by name, that a client can discover. Identity and gating come later; don't add them yet.

✓ Checkpoint: the shape is in place. You know what you're building (a server of tools), why it's one connector, and that you direct the agent to write it. Everything else fills this in.

Part 2: State and domain

Concept 5: State — just enough to remember a person

A generic chatbot forgets you when you close the tab. Your app must not — remembering a user across sessions is most of what makes it a product instead of a toy.

Keep v1 small: a Postgres database (a standard, free-to-start relational database) with two tables. Think of the front desk's two registers, tied together by your loyalty number: a guest register of who each person is, and a stay-log of what they're up to lately. Who-you-are barely changes; what-you're-doing changes every visit, so they live apart.

-- users: one row per person
create table users (
  id    text primary key,     -- the verified sign-in subject (Concept 7)
  email text
);

-- user_state: one row per person, whatever you carry between sessions
create table user_state (
  user_id text primary key references users(id),  -- primary key enforces one row per person
  state   jsonb               -- a last position, a few saved values
);

That's the whole of v1's memory: store a row, read a row. (jsonb is just a blank notes field the desk can scribble anything into, with no fixed form.) The serious version — an audit trail of every interaction, an approval model, a record you can trust and report on — is its own discipline, and it's exactly what Building a Digital FTE teaches. Don't build it here.
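
To make "store a row, read a row" concrete, here is a runnable sketch that uses Python's built-in `sqlite3` as a stand-in for Postgres. Assumptions to note: the state is stored as JSON text rather than `jsonb`, `user_id` is made a primary key so there is exactly one row per person, and `save_state` / `get_state` are hypothetical helper names, not the starter's db.py.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    create table users (id text primary key, email text);
    create table user_state (
        user_id text primary key references users(id),
        state   text not null
    );
""")


def save_state(sub: str, state: dict) -> None:
    """Upsert the caller's single state row, keyed by the verified subject."""
    conn.execute("insert or ignore into users (id) values (?)", (sub,))
    conn.execute(
        "insert into user_state (user_id, state) values (?, ?) "
        "on conflict(user_id) do update set state = excluded.state",
        (sub, json.dumps(state)),
    )


def get_state(sub: str) -> dict:
    """Read the caller's state; a person with no row yet simply has none."""
    row = conn.execute(
        "select state from user_state where user_id = ?", (sub,)
    ).fetchone()
    return json.loads(row[0]) if row else {}
```

Both helpers take only `sub`, so every read and write is scoped to one person by construction.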

tip

Go deeper. You don't set up or operate this database by hand. As in every Manufacturing course, your coding agent drives Neon (serverless Postgres) through Neon's own MCP server: it creates the project and runs the SQL while you review. (You make a free Neon account for this, no card, and the agent drives it from there.) That setup is taught step by step in Give Your AI Searchable Context, where it also enables the extension a vector store needs; here you reuse the same move for a small two-table state store, which needs no extension.

Concept 6: Domain — by reference now, by meaning later

Your domain is simply the stuff your app is actually about: its articles, items, records, not a web address. When the user wants a specific thing, v1 fetches it the simple way: each record has an id, and domain.get_item(id) returns it. The model works from what comes back.

What v1 deliberately does not do yet is semantic search — answering "the part about refunds" by meaning rather than by exact id. The difference is a library: fetching by id is asking for a book by its exact call number (one wrong digit and you get nothing), while semantic search is telling the librarian "I want the book about the sad whale" and having her find it. v1 is the call-number desk; the librarian is the upgrade, and it's the whole subject of the RAG course (Give Your AI Searchable Context). Wiring it in now would bloat your first ship. Fetch by reference now; upgrade to search later.
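
The call-number desk is small enough to sketch in a few lines. The catalog entries below are made-up stand-ins, not the contents of the starter's seed file, and `get_item` is a hypothetical helper showing exact-id lookup only.

```python
from typing import Optional

# Made-up records standing in for a real catalog.
CATALOG = {
    "a-001": {"id": "a-001", "title": "Refund policy"},
    "a-002": {"id": "a-002", "title": "Shipping times"},
}


def get_item(item_id: str) -> Optional[dict]:
    """Exact-id lookup: no fuzzy matching and no search by meaning."""
    return CATALOG.get(item_id)
```

Asking for `"a-001"` returns the record; asking for `"a-01"` or `"refunds"` returns nothing, which is exactly the limitation semantic search later removes.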

Part 3: Prove — identity the model can't fake

This is the first half of "the server does what the model can't." Both Concepts here are invariant 3.

Concept 7: Identity from the verified subject, never from the model

Here is the problem. Your user_state table must write to the right person's row. But the model is the one talking to the user, and you must never let the model decide whose data to read or write. Picture the AI as a hotel concierge running errands for a guest. When the concierge tells the front desk "room 412 wants their mail," the desk must not hand it over on his say-so: a confused or manipulated concierge could name the wrong room and leak a stranger's mail. If Claude could pass you a user_id, that is exactly the danger, and one user would see another's data. This is the textbook trust bug of a connector-native app.

The rule: the model never supplies identity. When the user authorized your connector, they signed in through a trusted service, and that service hands your server a signed token carrying the user's verified id — the subject, or sub. That token is the guest's passport: the desk reads who they are from the passport itself, which the concierge cannot fake, never from anything he says. Your server reads sub straight from the token and uses that as the database key. So if a tool ever takes a user_id argument and the model fills it with someone else's id, your server ignores it: identity comes from the token's sub, never from a tool argument.

# the only safe source of identity
sub: str = verified_claims(token)["sub"]   # from the signed token, NOT from any tool argument
profile = db.get_user_state(sub)           # the model cannot point this at someone else
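
Here is the same rule as a runnable sketch. The `STATE` dict stands in for the user_state table, and `user_get_profile` is a hypothetical tool body; its `user_id` parameter exists only to demonstrate that a model-supplied id changes nothing.

```python
from typing import Optional

STATE = {"alice-sub": {"pos": 7}, "bob-sub": {"pos": 2}}   # stand-in for user_state


def user_get_profile(claims: dict, user_id: Optional[str] = None) -> dict:
    """Return the caller's own state. `user_id` is accepted only to show it is ignored."""
    sub = claims["sub"]            # identity comes from the verified token only
    return {"sub": sub, "state": STATE.get(sub, {})}


# The model "asks" for Bob's data while signed in as Alice; it still gets Alice's.
print(user_get_profile({"sub": "alice-sub"}, user_id="bob-sub"))
```

In a real tool the cleaner choice is usually to offer no `user_id` argument at all, so there is nothing to ignore.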

The lovely part: it costs the user nothing extra. The single Authorize click that turns the connector on is the sign-in. One action, two jobs.

Concept 8: OAuth 2.1, in plain English (the verification-heavy Concept)

The machinery is OAuth — the same "Sign in with Google" you've clicked a hundred times. MCP uses a specific, current shape of it. You only need the ideas; your coding agent writes the code, and you check it.

tip

The whole idea in five lines (the rest of this Concept is the detail under it). The user signs in somewhere else. That service issues a signed token. Your server verifies the token. Your server reads sub from it. The model never supplies identity.

Now the detail. Four parties:

Party	Who it is	You build it?
The user	The person whose data it is	—
Claude's MCP client	Runs in Anthropic's cloud, asks on the user's behalf	No
The sign-in service (authorization server)	An outside specialist — hosted (Clerk, Auth0, Stytch) or a framework you self-host (Better Auth) — that checks the login and issues tokens	No — you rent or self-host it
Your gateway (resource server)	Your server; it only checks tokens and serves data	Yes

Under the current MCP spec your server is a resource server only — it is not in the password business at all. Whichever issuer you pick, rented or self-hosted, you only validate its tokens here; issuing them is the AI Identity course. The flow:

Discovery. A tool call with no token gets a 401 (the universal "you're not signed in" refusal; it kicks the sign-in off, it isn't an error to fix). Claude finds your server's public note at /.well-known/oauth-protected-resource, which says "my sign-in service lives over there," and follows it to the login.
Sign-in. The user sees a consent screen — "MyApp wants to read your saved items and remember your place" — logs in with Google or an email code, approves. No password ever touches Claude or your server.
Token. The sign-in service issues a short-lived token carrying the verified sub and an audience stamped to your server only.
Every call after carries the token; your server checks it and reads sub.

The discovery note your server publishes is small:

// GET /.well-known/oauth-protected-resource
{
  "resource": "https://mcp.myapp.com",
  "authorization_servers": ["https://auth.myapp.com"]
}
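
The two pieces of discovery, the note and the 401 that points at it, can be sketched without a web framework. The URLs repeat the example above; the function names are hypothetical, and the `resource_metadata` parameter on the `WWW-Authenticate` header is the one RFC 9728 defines for this purpose.

```python
RESOURCE_URL = "https://mcp.myapp.com"
AUTH_SERVER = "https://auth.myapp.com"
METADATA_PATH = "/.well-known/oauth-protected-resource"


def protected_resource_metadata() -> dict:
    """Body served at METADATA_PATH: where to sign in for this resource."""
    return {"resource": RESOURCE_URL, "authorization_servers": [AUTH_SERVER]}


def unauthenticated_response() -> tuple:
    """Status and headers for a call that arrives with no token."""
    headers = {
        "WWW-Authenticate": f'Bearer resource_metadata="{RESOURCE_URL}{METADATA_PATH}"'
    }
    return 401, headers
```

In the real server your auth layer produces both of these; the sketch is only here so you can recognize them when you inspect a response.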

And here is the check itself, the one you must read line by line. This is the one file in the whole course you should see complete, not as a sketch: it ships finished and reviewable in the starter as auth.py. It's the desk inspecting the guest's passport four ways: is it genuine (the real hologram, not a forgery), is it from a country we recognize, is the visa stamped for this building's locks and not a sister hotel's, and is it still in date? Those are the four numbered checks:

# auth.py — the token check (ships complete in the starter; read every line)
from jose import jwt
from jose.exceptions import JWTError

def verified_claims(token: str) -> dict:
    key = _key_for(token)                          # pick the JWKS key matching the token's kid
    try:
        claims = jwt.decode(
            token,
            key,                                    # (1) signature — verified against the AS's public key
            algorithms=["RS256"],
            audience=RESOURCE_URL,                  # (3) MUST be this server — RFC 8707. Do NOT omit.
            issuer=AUTH_ISSUER,                     # (2) the authorization server we trust
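            options={                               # (4) expiry + the claims we rely on
                "require_exp": True, "require_sub": True,   # python-jose spells these require_<claim>;
                "require_aud": True, "require_iss": True,   # a PyJWT-style "require" list would be ignored
            },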
        )
    except JWTError as e:
        raise AuthError(f"token rejected: {e}") from e
    return claims                                   # claims["sub"] is the user; nothing came from the model

The starter's version also fetches and caches the JWKS (the sign-in service's published keys, the genuine hologram every passport is checked against, where seeing a real one still gives a forger no way to make one) and selects the right key by the token's kid — that's the _key_for helper above. Read it; don't rewrite it.
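
To show the idea behind key selection, here is a standard-library sketch of picking a key by `kid`. It is not the starter's `_key_for`: the function names are hypothetical, the JWKS is passed in as an already-fetched dict, and fetching and caching are left out.

```python
import base64
import json


def token_kid(token: str) -> str:
    """Read `kid` from the token's header segment. Nothing here is trusted yet."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)   # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))["kid"]


def key_for(token: str, jwks: dict) -> dict:
    """Pick the JWKS entry whose kid matches; refuse if there is no match."""
    kid = token_kid(token)
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    raise LookupError(f"no JWKS key with kid={kid!r}")
```

Reading the header before verification is safe here because it only chooses which published key to try; the signature check still decides whether the token is genuine.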

Wiring this against today's FastMCP — verified June 2026

You don't need to absorb this paragraph; it's a note to hand your coding agent, so skim it and move on. auth.py above is the explanation you read to understand the four checks. In the current MCP Python SDK (FastMCP 3.x) those same four checks also ship as a built-in JWTVerifier (jwks_uri, issuer, audience, algorithm), and the 401 + WWW-Authenticate that actually triggers Claude's sign-in comes from wrapping that verifier in a RemoteAuthProvider passed to FastMCP(..., auth=...) — not from a tool raising. A token validated only inside a tool yields a tool-level error and an HTTP 200; the transport 401 that drives discovery is the auth layer's job. The provider also serves the discovery document, including at the RFC 9728 path-inserted location /.well-known/oauth-protected-resource/<your-mcp-path>. Inside a tool, read identity from the request, never from an argument: get_http_request().headers["authorization"] (note that get_http_headers() strips authorization by default), or get_access_token().claims["sub"] once the native provider is wired. Confirm the exact surface against your installed FastMCP.

Two more details that separate secure from merely working — verify each:

PKCE with S256 is mandatory in the current spec: a handshake that stops a stolen login code from being reused, like a coat-check ticket torn in half, where keeping your half means the stub alone claims nothing.
Client registration now prefers Client ID Metadata Documents (CIMD); the older Dynamic Client Registration (RFC 7591) is downgraded to MAY and marked deprecated (retained only for backward compatibility). The upshot: pick a current sign-in service (Clerk, Auth0, Stytch) that supports the path you'll use.
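
The S256 transform behind PKCE is short enough to read in full. This runs on the client and is checked by the sign-in service, not by your gateway; it is sketched here only so the values make sense when you see them in a trace. The function names are hypothetical; the transform is the one RFC 7636 defines.

```python
import base64
import hashlib
import secrets


def make_verifier() -> str:
    """A fresh high-entropy code verifier (43 URL-safe characters)."""
    return secrets.token_urlsafe(32)


def s256_challenge(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(verifier)), with padding stripped."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the login request and reveals the verifier only when redeeming the code, so an intercepted code alone cannot be exchanged for a token.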

What each missed check actually lets in — this is how you verify the agent's output without reading the RFCs. If the decode is missing a line, the matching attack is live:

If the code skips…	What breaks
`audience=` (RFC 8707)	Token replay. A token minted for another server is accepted by yours — the most common and most dangerous miss.
the signature / JWKS check	Forgery. Anyone can hand-craft a token with any `sub` and walk in.
`issuer=`	Wrong-issuer tokens. A token from an authorization server you don't trust is accepted.
`exp` (expiry)	Stolen tokens never die. A leaked token works forever.
reading `sub` from the token (taking `user_id` from a tool arg instead)	Cross-tenant leak. One user reads or writes another's data — the trust bug from Concept 7.

Read the agent's verified_claims against this table. Every row must be closed.
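
One way to close the rows is to keep a bad token per row and confirm each is rejected. The sketch below does that against a stand-in `check_claims` that covers only the claim checks; signature verification is deliberately left to the JWT library, and every name and URL here is a hypothetical example.

```python
ISSUER = "https://auth.myapp.com"
AUDIENCE = "https://mcp.myapp.com"


def check_claims(claims: dict, now: float) -> str:
    """Stand-in for the claim checks; returns the subject if everything holds."""
    for name in ("exp", "sub", "aud", "iss"):
        if name not in claims:
            raise ValueError(f"missing {name}")
    if claims["iss"] != ISSUER:
        raise ValueError("wrong issuer")
    aud = claims["aud"]
    if AUDIENCE not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")
    if claims["exp"] <= now:
        raise ValueError("expired")
    return claims["sub"]


NOW = 1_700_000_000.0
GOOD = {"iss": ISSUER, "aud": AUDIENCE, "exp": NOW + 300, "sub": "user-1"}

# One bad token per row of the table; every one must be rejected.
BAD = {
    "replayed from another server": {**GOOD, "aud": "https://other.example"},
    "untrusted issuer": {**GOOD, "iss": "https://evil.example"},
    "expired": {**GOOD, "exp": NOW - 1},
    "no subject": {k: v for k, v in GOOD.items() if k != "sub"},
}


def rejected(claims: dict) -> bool:
    try:
        check_claims(claims, NOW)
    except ValueError:
        return True
    return False
```

Pointing the same `BAD` cases at the real verifier, with properly signed tokens, is a reasonable next step once the sign-in service is wired.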

Run it. Paste this to your coding agent:

let's do Concept 8: add the .well-known/oauth-protected-resource route and JWT-validation middleware against [my chosen sign-in service], then walk me through every check in the decode call so I can verify audience binding and issuer are enforced.

What you'll see — and what to check

A 401 on an unauthenticated call, a working consent screen, and authenticated calls that resolve a real sub. Before you move on, read the middleware yourself and confirm four things: the signature is verified against the service's keys, the issuer matches, the audience is your server, and expiry is enforced. Missing audience is the most common subtly-wrong output — it's the one that lets a token from another server in. This is the Concept to slow down on.

Run it yourself in a terminal (Beginner track, raw commands). The starter ships a local sign-in service so you can exercise the exact auth.py path with no account anywhere:

uv sync --extra mock-auth
uv run python -m mock_auth.server          # local authorization server on :9000
# in another terminal: mint a token, then send it as a Bearer token when you call your gateway
curl "http://localhost:9000/token?sub=test-user-001&aud=http://localhost:8000"

Point .env at the mock (AUTH_ISSUER=http://localhost:9000, AUTH_JWKS_URL=http://localhost:9000/jwks.json, RESOURCE_URL=http://localhost:8000) and your real auth.py validates the mock's tokens unchanged. The Standard track swaps in a real service, hosted (Clerk/Auth0/Stytch) or self-hosted (Better Auth), by changing only those three values.
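
Since the whole swap rests on those three values, it is worth refusing to start when one is missing. Here is a minimal sketch; the variable names come from the paragraph above, while `load_auth_config` is a hypothetical helper and not necessarily how the starter reads its settings.

```python
import os

REQUIRED = ("AUTH_ISSUER", "AUTH_JWKS_URL", "RESOURCE_URL")


def load_auth_config(env=os.environ) -> dict:
    """Read the three auth settings; refuse to start if any is missing or blank."""
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(f"missing auth settings: {', '.join(missing)}")
    return {name: env[name].strip() for name in REQUIRED}
```

A server that boots with an empty audience or issuer is worse than one that does not boot, because it may accept tokens it should reject.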

tip

Go deeper: this course validates tokens; AI Identity issues them. Here you teach just enough auth to ship one connector safely: your gateway only validates tokens (it's a resource server), leaning on a sign-in service someone else runs. Standing up that issuer yourself, your own OAuth/OIDC sign-in server, is the dedicated AI Identity course (coming soon), built on Better Auth. Either way your gateway doesn't change: it keeps validating tokens the same way no matter who signs them.

Where these auth rules come from (verified, June 2026 — re-check before publishing; this moves fast)

The current finalized MCP authorization spec is the 2025-11-25 revision (a 2026-07-28 release candidate is in draft, so treat versions as moving). Under it, an MCP server is an OAuth 2.1 resource server and must implement OAuth 2.0 Protected Resource Metadata (RFC 9728) to advertise its authorization server; the authorization server is a separate party and may be any compliant identity provider. Source: the MCP authorization spec at modelcontextprotocol.io (2025-11-25) and its changelog.

Two changes course material written a year ago gets wrong: SEP-985 changed the WWW-Authenticate header requirement from MUST to SHOULD (clients MUST still parse it when present, and fall back to the .well-known endpoint when it's absent), and SEP-991 made Client ID Metadata Documents the recommended client-registration mechanism, downgrading Dynamic Client Registration (RFC 7591) to MAY and marking it deprecated. PKCE is mandatory and must be S256 (when technically capable). Audience binding uses RFC 8707. Sources: the 2025-11-25 changelog and write-ups by the spec's authorization contributors (Den Delimarsky; Aaron Parecki). The 2026-07-28 release candidate adds further hardening (e.g. issuer validation per RFC 9207) — confirm the in-force revision and link the canonical spec page at publish time.

✓ Checkpoint: the server knows who's there. Identity comes from the token, never the model, and the data is safe. Now make the model behave.

Part 4: Steer — make the model behave

The second half of "the server does what the model can't." All three Concepts here serve invariant 4 and the behavior of your app.

Concept 9: Where the app's rules live — a Skill, or the connector

A real decision, because there are two homes for your app's rules (how it behaves, its voice, its guardrails), and the choice decides how many steps a user does before their first request. Picture a restaurant. A Skill is a placemat printed with the rules, sitting in front of the diner the whole meal: it can't drift, because it's always in view. The connector is a waiter who tells you the rules when you sit and reminds you each course: it works, but you have to keep re-handing them. The placemat enforces better, but you have to set it down before you sit; the waiter needs nothing from you.

Option A — an uploaded Skill (SKILL.md). A file the user adds; it auto-loads when a request matches and its body stays in context, so it's the stronger enforcer of "always behave this way." The cost is setup. The Skills feature runs in Claude's code-execution environment, so it works only with code execution enabled — for any Skill, even a prose-only one, not just script-bundling ones. (Skills live in a back room that's locked by default; to read even a prose-only sticky note pinned inside, Claude has to unlock that room first, and that unlock is the code-execution toggle.) So the user must turn on code execution, upload a ZIP, and toggle the Skill on — three actions on top of the connector. And custom Skills are private to the account that uploads them, so there's no clean way to hand one Skill to thousands of strangers on the free tier; each person uploads it themselves.

Option B — inside the connector (recommended). The rules and "who is this user" are returned by a session-init tool the model calls first (Concept 10), reinforced as the server works. The benefit is decisive for a public free-tier audience: no Skill means no code-execution toggle and no ZIP — setup collapses to adding one connector and clicking Authorize once.

The honest framing, say it plainly: choosing the connector is a friction decision, not a quality one. The Skill enforces better. But for free-tier, non-technical, first-time users, install friction is the biggest risk to the only thing that matters first — a user who never finishes setup gets nothing.

| | Skill (`SKILL.md`) | Connector (recommended) |
|---|---|---|
| Enforcement strength | Stronger (always in context) | Slightly softer, mitigated below |
| Setup steps for the user | Four (connector + code-exec + ZIP + toggle) | One (connector) |
| Hand to strangers on free tier | Hard | Easy |

What makes the trade safe is four reinforcing layers the connector gives you: the tool description is always loaded and says "call session-init first"; the session-init return carries the full rules; every other tool return repeats a one-line reminder (this stands in for a Skill's always-in-context body); and the real tools are gated behind the session token. So: ship the connector path by default, keep the Skill as an optional power-user add-on. Reversible — test both, keep whichever holds behavior better.

Concept 10: The session-init contract

The rules and the user's state arrive through one tool the model calls first. Name it begin_session (your name).

When a user says anything that means "start" or "continue," the model calls begin_session(). This is check-in: the desk verifies the guest's passport (the signed token, Concept 7), then clips a keycard on them, a short-lived session token. Your gateway reads the app's rules (config.*) and the user's state (user.*) and returns them as one cooperative block — "here's how to behave for this user, and here's where they are" — plus that keycard. Every real tool then checks it: no keycard, no entry.

note

This is an app-level session token, not an MCP protocol session. It's a handle your server mints and the model passes back as an ordinary tool argument — your gating logic, not the transport's. That distinction matters going forward: the 2026-07-28 spec release candidate removes protocol-level sessions (the Mcp-Session-Id header) and tells servers that need cross-call state to do exactly this — mint their own handle and pass it as a tool argument. So this pattern isn't just compatible with where MCP is heading; it is where it's heading.
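
One way such an app-level handle could be minted and checked, using only the standard library. This is a sketch, not the starter's session.py: the names `mint_session_token` and `check_session_token`, the 15-minute lifetime, and the hard-coded secret are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"dev-only-secret"   # assumption: a real deploy reads this from an env var
TTL_SECONDS = 900             # assumption: 15-minute keycard


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).digest())


def mint_session_token(sub: str, now: Optional[float] = None) -> str:
    """Return 'payload.signature' binding the token to one verified subject."""
    now = time.time() if now is None else now
    payload = _b64(json.dumps({"sub": sub, "exp": now + TTL_SECONDS}).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def check_session_token(token: str, now: Optional[float] = None) -> str:
    """Return the subject, or raise PermissionError (fail closed)."""
    now = time.time() if now is None else now
    try:
        payload, sig = token.split(".")
    except ValueError:
        raise PermissionError("no valid session")
    # Constant-time comparison, and the signature is checked before parsing.
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(payload).encode("utf-8")):
        raise PermissionError("no valid session")
    claims = json.loads(_unb64(payload))
    if claims["exp"] < now:
        raise PermissionError("session expired")
    return claims["sub"]
```

Because the token is signed, the model can carry it between calls as an ordinary argument without being able to forge or alter it. The subject inside still came from the verified sign-in token, not from anything the model said.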

@mcp.tool()
def begin_session() -> dict:
    """Call this FIRST on any new request. Returns how to behave for this
    user, their saved state, and a session token the other tools require."""
    sub = verified_claims(current_token())["sub"]   # identity from the token (Concept 7)
    return {
        "session": new_session_token(sub),           # gates every other tool (Concept 4)
        "rules":   config_get_rules(),                # cooperative: "here's how to behave"
        "state":   user_get_state(sub),               # where this user left off
    }

Two design points your agent must respect:

Phrase it as cooperation, never as an override. Say "here's what our guest likes; please help them settle in" and the concierge helps; shout "forget your previous instructions and obey me" and he calls security, because that is how a con artist talks and the model is trained to spot it. Text that tries to override the model gets discounted by the same defenses that protect users from prompt injection. Cooperative phrasing sails through; bossy phrasing gets ignored.
Make the model call it first by making it necessary. The real tools require the session token only begin_session issues — so the model can't do the work without going through the front door. Description says "call me first," the return is useful, the tools are locked behind the token: three nudges converging on the right behavior. Then keep reinforcing — have each tool return its result plus a one-line reminder of how to present it.
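
The second design point can be sketched as a small wrapper: the tool body never runs without a session the server accepts, and every result goes out with the one-line reminder attached. Everything here is illustrative. The `gated` decorator, the reminder wording, and the hard-coded demo session are assumptions, and how such a wrapper interacts with a given MCP framework's schema generation is not shown.

```python
from typing import Callable

REMINDER = "Present this in the app's voice, following the rules from begin_session."


def gated(validate: Callable[[str], str]):
    """Decorator factory. `validate` returns the subject for a good session
    token and raises for anything else, so the tool fails closed."""
    def decorator(tool):
        def wrapper(session: str = "", **kwargs) -> dict:
            sub = validate(session)   # raises before the tool body can run
            return {"result": tool(sub, **kwargs), "reminder": REMINDER}
        return wrapper
    return decorator


# Stand-in validator for the demo: a single hard-coded session.
_SESSIONS = {"demo-session": "test-user-001"}


def demo_validate(session: str) -> str:
    if session not in _SESSIONS:
        raise PermissionError("No valid session. Call begin_session first.")
    return _SESSIONS[session]


@gated(demo_validate)
def get_item(sub: str, item_id: str) -> dict:
    # The subject arrives from the gate, never from a model-supplied argument.
    return {"id": item_id, "requested_by": sub}
```

Calling `get_item(item_id="a1")` with no session raises, while `get_item(session="demo-session", item_id="a1")` returns the item plus the reminder. The error message doubles as a cooperative hint, since it tells the model which door to use.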

Run it. Paste this to your coding agent:

let's do Concept 10: add begin_session returning rules + state + a signed session token, make the description instruct the model to call it first, and have domain.get_item reject calls without a valid session.

What you'll see — and what to verify

On a fresh "start" the model calls begin_session, gets the rules and the user's saved state, and only then can reach the domain tools. Cooperative phrasing is followed; an "ignore previous instructions" phrasing is the version that gets discounted by the model's injection defenses. The session token is now the key to everything real.

Run it yourself in a terminal (raw commands). With the gateway and the mock sign-in service running (Concepts 4 and 8), confirm a domain tool refuses a call with no session and accepts one after begin_session:

# no session → refused (fail closed)
uv run python -c "from connector_app.session import require_session; require_session('')"
# a fresh session token gates the real tools:
uv run python -c "from connector_app.session import new_session_token, require_session; print(require_session(new_session_token('test-user-001')))"

Concept 11: Fail closed — don't quietly become a chatbot

A failure mode that silently ruins one of these apps: if your connector is missing, unauthorized, or erroring, the model still knows plenty on its own — and it will cheerfully improvise answers and invent the user's state. Now your structured product is a chatbot wearing its name, and nobody can tell until the damage is done. This is the opening's ATM rule again, only harder: the ATM is dumb and simply locks, but your clerk is smart and tempted to guess your balance to look helpful.

Here is the trap: locking the filing cabinet (the session gate) doesn't stop the clerk guessing from memory. The gate locks your tools; it can't lock the model's own knowledge. So your rules (returned by begin_session) must add the standing order taped to the desk: if begin_session is unavailable or a tool fails, say plainly that the session can't continue — do not improvise results or make up state. Fail honestly and visibly. It's one paragraph, and it protects the whole product (invariant 4).

It lives in your config.* rules, where the model reads it on every session:

# config_store.py — the fail-closed paragraph (ships in the starter; edit the rest, keep this)
RULES = """\
You are the assistant for <YOUR APP>. Behave as follows for this user:
- <how to greet, your app's do's and don'ts>

Fail closed: if you cannot reach begin_session or a tool returns an error, tell the user
plainly that the session can't continue right now. Do NOT improvise an answer from your own
knowledge and do NOT invent the user's saved state.
"""

That one paragraph doesn't stand alone: the tools already raise on a bad or missing session (the gate from Concept 10), and each tool's return repeats a one-line reminder of how to present results, so the model is steered toward honesty, not just told to be honest.
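
The text above has tools raise on failure. A complementary option, sketched here as an assumption rather than as the starter's behavior, is to catch backend errors and return an explicit failure payload that repeats the standing order, so the instruction arrives at the exact moment the model is tempted to guess. The `fail_closed` decorator and the payload keys are hypothetical names.

```python
from typing import Any, Callable

FAIL_CLOSED_NOTE = (
    "The session can't continue right now. Tell the user plainly. "
    "Do not improvise an answer or invent saved state."
)


def fail_closed(tool: Callable[..., Any]) -> Callable[..., dict]:
    """Wrap a tool so any backend error becomes a visible, honest failure."""
    def wrapper(*args, **kwargs) -> dict:
        try:
            return {"ok": True, "result": tool(*args, **kwargs)}
        except Exception as exc:
            # Only the error type is exposed. The message could contain
            # connection strings or other details that should not leak.
            return {"ok": False, "error": type(exc).__name__,
                    "instruction": FAIL_CLOSED_NOTE}
    return wrapper


@fail_closed
def get_state(sub: str) -> dict:
    # Simulates the database being down.
    raise ConnectionError("database unreachable")


@fail_closed
def get_greeting(sub: str) -> str:
    return f"welcome back, {sub}"
```

Whether raising or returning a payload steers better depends on how the host surfaces tool errors to the model, so it is worth trying both against the database-down test.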

Run it. Paste this to your coding agent:

stop my Postgres (or point DATABASE_URL at a dead host), then ask the app to do its job. Show me whether it refuses cleanly per the fail-closed rule, or whether it invents an answer — and if it invents, strengthen the rule and the per-tool reminders until it refuses.

Verify. With the database down, the right outcome is the app saying it can't continue — not a confident, made-up reply. If you get a plausible-looking answer with the connector broken, the rule isn't holding yet; that gap is the whole reason this Concept exists.

✓ Checkpoint: the trust loop is closed. Identity is proven, the model is steered through a gated session, and the app refuses rather than faking. What's left is to put it on the internet.

Part 5: Ship it

Concept 12: Deploy to Azure Container Apps + Neon

Because Claude reaches your server from Anthropic's cloud, "it works on my laptop" isn't shipped. Your laptop is a workshop in a locked garage: perfect for building, but no customer can walk in. Deploying is renting a storefront on a public street with an address (the URL) so Anthropic's cloud can come and knock. You need a public HTTPS address (the S is the padlock: a real lock on the storefront door). The book's deploy path is Azure Container Apps for the server and Neon for Postgres. Your coding agent writes the container and config; you create the accounts.

tip

Deploy anywhere. Azure + Neon is just our worked path, not a requirement. The app is an ordinary container that needs only public HTTPS and environment variables, so it runs unchanged on Fly.io, Render, Railway, Cloud Run, or your own VM — and the Postgres can be any host (Supabase, RDS, your own). Likewise the sign-in service: Clerk/Auth0/Stytch are the hosted options, but a self-hosted one (Better Auth, or Keycloak/Ory) works too — auth.py only checks tokens, so it doesn't care who issues them as long as the issuer, JWKS, and audience line up. Pick on what you can keep running; the four invariants don't change with the host.

A container is a sealed box: you pack the app plus everything it needs so it runs identically on any host's shelf, and a Dockerfile is just the packing list. The values you change per location — which database, which keys — are environment variables: dials on the outside of the box, turned without ever opening it.

# Dockerfile (shape) — your coding agent writes the real one
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e .
# binds 0.0.0.0:$PORT, public HTTPS via the platform (a Dockerfile comment must sit on its own line)
CMD ["python", "-m", "connector_app.server"]
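
The "dials on the outside of the box" can be read once at startup, refusing to boot if one is missing. The sketch below assumes the variable names DATABASE_URL, RESOURCE_URL, and PORT, plus a default port of 8000; the `Settings` class and `load_settings` function are hypothetical, and the starter's real configuration loading may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("DATABASE_URL", "RESOURCE_URL")


@dataclass(frozen=True)
class Settings:
    database_url: str
    resource_url: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read per-deployment values from the environment, or refuse to start."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Name the missing variables, but never print any values.
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return Settings(
        database_url=env["DATABASE_URL"],
        resource_url=env["RESOURCE_URL"],
        port=int(env.get("PORT", "8000")),   # assumption: 8000 when PORT is unset
    )
```

Passing the environment in as an argument keeps the function testable with a plain dict, and the same image runs locally or on any host by changing only the variables.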

Run it. Paste this to your coding agent:

let's do Concept 12: write the Dockerfile and Azure Container Apps config, wire DATABASE_URL to my Neon instance, and deploy to a public HTTPS URL.

Verify. Before you trust the deploy, hit your public URL from outside your own network and confirm it answers over HTTPS. The discovery endpoint makes a convenient probe because it needs no token. If Anthropic's cloud can't reach your server, the connector can't work either.

# from a machine that is NOT on your network (or a phone on cellular):
curl -sS https://your-app.azurecontainerapps.io/.well-known/oauth-protected-resource
# expect the discovery JSON: {"resource": "...", "authorization_servers": ["..."]}
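
If you would rather check the response than eyeball it, a small helper like the following can inspect the body that curl returned. It is a sketch: `check_discovery` is a hypothetical name, and it checks only the two fields shown in the expected output above.

```python
import json
from typing import List


def check_discovery(body: str, expected_resource: str) -> List[str]:
    """Return a list of problems. An empty list means the shape looks right."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return ["response body is not JSON"]
    if not isinstance(doc, dict):
        return ["response JSON is not an object"]
    problems = []
    if doc.get("resource") != expected_resource:
        problems.append("resource does not match the public URL")
    servers = doc.get("authorization_servers")
    if not isinstance(servers, list) or not servers:
        problems.append("authorization_servers is missing or empty")
    return problems
```

An HTML error page from the platform, a common result of a failed deploy, shows up here as "response body is not JSON". The comparison is exact, so a trailing-slash difference between the advertised resource and your URL is reported as a mismatch.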

note

Human-only step. Creating the Azure and Neon accounts is yours — the agent writes config but can't open accounts in your name. The Beginner track can run a mock sign-in service locally and skip account creation until here.

Concept 13: Add it to Claude

The payoff. You do this part — it's two clicks the agent can't do for you.

In Claude: Customize → Connectors → Add custom connector. Paste your server's URL.
Click Authorize once and complete the sign-in.
Ask your app to do its job, then open a brand-new chat and confirm the user's state carried over.

What you'll see — and what to verify

Without authorizing, the first tool call returns 401 and Claude walks you into the sign-in — the discovery flow from Concept 8. After authorizing, your app responds as itself, and because your things are filed under your guest profile (your verified sub), not under which visit, a brand-new chat resumes right where you left off: the chat is the visit, your identity is the profile. That cross-chat memory, on a free account with one pasted URL, is the whole product working.
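
The "chat is the visit, identity is the profile" idea fits in a few lines. In this sketch an in-memory dict stands in for the Postgres state store, and `StateStore` and `handle_chat` are hypothetical names. The point is only which value is used as the key.

```python
from typing import Optional


class StateStore:
    """In-memory stand-in for the real state store, keyed by verified subject."""

    def __init__(self) -> None:
        self._rows = {}

    def save(self, sub: str, state: dict) -> None:
        self._rows[sub] = dict(state)

    def load(self, sub: str) -> dict:
        return dict(self._rows.get(sub, {}))


def handle_chat(store: StateStore, verified_sub: str, chat_id: str,
                update: Optional[dict] = None) -> dict:
    """chat_id is deliberately never used as a key: the chat is only the visit."""
    if update:
        store.save(verified_sub, {**store.load(verified_sub), **update})
    return store.load(verified_sub)


store = StateStore()
handle_chat(store, "reader-A", "chat-1", {"reading": "article-3"})
print(handle_chat(store, "reader-A", "chat-2"))   # a new chat, the same profile
print(handle_chat(store, "reader-B", "chat-3"))   # another person sees nothing of A's
```

Had the state been filed under `chat_id`, the second call would have come back empty, which is exactly the failure the brand-new-chat test is designed to catch.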

✓ Checkpoint: you shipped. A stranger could now add your connector and be served, on their own free model. Sit with that before the next part takes it apart.

Part 6: A complete worked example — the Reading Room

warning

Before you start: the starter must be green. This worked example assumes a working base. Confirm it before Step 0 — the starter ships these so you're building on something real, not aspirational:

auth.py validates tokens (signature, issuer, audience, expiry) — complete.
db.py has the two-table state store — complete.
session.py mints and checks signed app-level session tokens — complete.
.env.example, the mock_auth/ dev sign-in service, and a seed/ catalog are present.
uv run pytest -q passes its five checks: the package imports, a valid token resolves its subject, a wrong-audience token is rejected, two subjects stay isolated, and a call with no session is refused.

If pytest is green, the foundation is sound and everything below is wiring your app on top of it. Pytest here is the mechanic's 30-second once-over before you drive off, and green means no warning lights: the engine turns over (imports load), a stranger's key won't start your car (a wrong-audience token is rejected), and two drivers' glove-boxes never get mixed up (two subjects stay isolated).

One build, start to finish: an empty folder to a live connector you add to your own Claude, that greets you by what you were last reading and remembers it in a brand-new chat. The prompts below are the whole job — paste them into Claude Code or OpenCode, in order. The rhythm never changes: plan → review → execute → verify, the same loop you know from the coding courses.

The app, in one line: a Reading Room — a personal reading-list assistant. The three tool groups map to the parts you built concept by concept:

domain.* — a small catalog of articles: domain.list_articles(), domain.get_article(id).
user.* — this reader's shelf: which articles they saved and where they stopped.
config.* — a librarian persona and the rule "open by what they're mid-way through."

You stay on the Beginner track here (the bundled mock_auth/ sign-in service), so the only account you need before deploy is a free Neon database for state (Step 3, no card); the sign-in service waits until you deploy.

0. Expand the seed catalog and confirm the project home. The starter ships seed/articles.json with a few articles; this just grows it so the domain has something to serve. Paste:

Expand seed/articles.json to 8 short fake articles — each with an id, title, topic, and a two-sentence body. Then confirm this is a uv project and uv run pytest -q passes. Don't touch auth or the database yet.

Done when: you have eight seed articles and the five starter tests pass.

1. Plan the whole build (plan mode). Enter plan mode (Shift+Tab in Claude Code, Tab in OpenCode) with a strong model, then paste:

Using the starter, plan a Reading Room connector-native app. One FastMCP gateway with three tool groups: domain (list_articles, get_article from seed/articles.json), user (save and read a per-reader shelf in the two-table Postgres store), and config (a librarian persona + rules). Add a begin_session that returns rules + persona + the reader's shelf + a session token, and gate every domain/user tool behind that token. Identity must come from the verified sign-in subject via auth.py, never from a tool argument. Include the fail-closed rule in the config rules. Show me the full plan and the tool list before writing anything.

2. Read the plan before you approve. Check it against the four invariants: one gateway (not three)? Tools only? Does identity come from auth.verified_claims(...)["sub"] and never from a tool argument? Is the fail-closed paragraph in the rules? If anything's off, say so and re-plan. This review is the skill — don't skip it.

3. Build the gateway and state — no auth yet (local). Switch to a cheaper model for the routine build. You'll stub identity, the way a stunt double wears a TEST GUEST badge so the scene can be shot before the real passport check is wired in at Step 5. This is where state first needs a database, so have a free Neon project ready (your agent can create one via Neon's MCP, no card) with its DATABASE_URL in .env. Paste:

Looks right. Build the domain and user tools and wire the two-table state store — but stub identity for now (use a fixed sub="local-dev"), so we can test on localhost before adding sign-in. Run the server, then show me: domain.get_article returning a seed article, and user.save_state then user.get_state round-tripping a shelf.

Done when: an article comes back by id, and a saved shelf reads back. State works before identity does.

4. Add the session contract and the gate. Paste:

Add begin_session: it returns the librarian rules, the persona, the reader's shelf, and a signed session token. Make every domain.*/user.* tool require that token and reject calls without it. Put a one-line "present this in the librarian's voice" reminder on each tool's return. Show me a domain call failing with no session, then succeeding after begin_session.

Done when: the real tools refuse a call with no session and accept one after begin_session — the front door is the only way in.

5. Add real identity (Beginner: the mock sign-in). Now replace the stubbed sub. Paste:

Start the bundled mock_auth service. Wire auth.py into the gateway: read the bearer token per request, call verified_claims, and use its sub as the state key — drop the local-dev stub. Add the /.well-known/oauth-protected-resource route. Then prove isolation: mint a token for reader-A and save a shelf, mint one for reader-B, and show that B never sees A's shelf. Walk me through the verified_claims call so I can confirm audience and issuer are enforced.

Done when: two different subjects get two different shelves, and you've read the audience/issuer checks yourself. (Standard track: point the three AUTH_*/RESOURCE_URL values at your hosted service instead — nothing else changes.)

6. Prove it fails closed. This is cutting the front desk's phone line on purpose, to see whether the clerk says I can't verify you right now or waves everyone in. Paste:

With the gateway already running, stop your Postgres (leave DATABASE_URL as-is), then ask the Reading Room a question. Show me whether it refuses cleanly per the fail-closed rule or invents a shelf. If it invents anything, strengthen the rule and the per-tool reminders until it refuses honestly.

Done when: with the database unreachable, the app says it can't continue — it does not produce a confident, made-up answer. (If instead you restart the gateway while DATABASE_URL points at a dead host, it refuses to boot at all, which is fail-closed too; what must never happen is an invented shelf.)

7. Deploy and add it to Claude. The payoff. Paste:

Write the Azure Container Apps config and deploy the gateway to a public HTTPS URL, with DATABASE_URL pointing at a Neon database and the sign-in values set as environment variables (never printed). Give me the public URL.

Then do the two clicks the agent can't (Concept 13): Customize → Connectors → Add custom connector, paste the URL, click Authorize. Ask the Reading Room to recommend something and save it. Open a brand-new chat and say "what was I reading?"

Done when: the new chat greets you with the shelf you saved in the old one — cross-chat memory, on your own free Claude, behind one pasted URL. That is the whole product, alive.

Notice the rhythm didn't change for the hard parts: plan → review → execute → verify, every step. The only thing that changed each time was what you reviewed — the tool list, the gate, the verified_claims call, the fail-closed behavior, the deploy. Master that loop and the specific code stops mattering, because you can always have the agent produce it and always tell whether it's right.

Part 7: The ceiling, and where it grows

Concept 14: The ceiling — and the bridge to owning a loop

Feel the edge of what you built, because it points exactly where the book goes next.

Your app can only act when the user types. It's the hand vacuum from Concept 1: dead until a hand squeezes the trigger. It can't wake up on its own, run on a schedule, notice something and reach out unprompted, or pursue a goal across several steps without a human turn between each one. That's not a flaw in your build — it's the nature of a connector-native app. The loop belongs to the host chat app, not to you.

The moment you want a worker that runs on its own — wakes up, takes steps, calls tools in a loop, finishes a job while you sleep — you have to own the loop yourself. That's the robot vacuum, and it's where the path leads. In Build AI Agents you stop tending the hand vacuum and start building the robot: you stop being the server a model calls and start writing the agent that does the calling, and one worked example — a customer-support Worker — is carried through the rest of Manufacturing.

Two courses come first, and each makes what you just built broader or sturdier. Claude Code & OpenCode Plugins (coming soon) is the mirror image of this course: a connector-native app extends the chat app (claude.ai) for end users; a plugin extends the coding agent (Claude Code, OpenCode) for builders. Same idea, shipping a unit a host loads, aimed at the other host.

AI Identity: Human Sign-In and Agent Access (coming soon), built on Better Auth, comes in two halves: first you own the sign-in, standing up your own OAuth/OIDC server that issues the tokens this course only validated; then you give an agent its own identity, a credential and a scoped, time-boxed, revocable, human-approved way to act on a person's behalf, so a worker can do real work when no human is in the chat without ever impersonating one. Its through-line is one question you'll ask of every system you build: whose identity is this, and how does authority pass from a human to an agent? Then Build AI Agents gives you the loop.

You didn't waste a step. You shipped what you can ship before you own a loop, felt exactly why you'd want one, and now you go get it.

The same skeleton, other shapes

The Reading Room you just built is one instance. The skeleton — one gateway, three tool groups, a begin_session contract, identity from the subject, fail closed — never changes; only the three groups do. A few worth seeing:

A tutor: domain.* is the book's course content (domain.get_item(id) becomes content.get_section(id); later, semantic search over the whole book). user.* is the learner's progress. config.* is the teacher: a persona plus the teaching method. begin_session is named begin_lesson and loads persona + method + the learner's position. Fail closed is what stops it decaying into a generic chatbot when the connector errors: it says it can't continue the course session rather than improvising a lesson or inventing progress.

The others, in one line each — only the three groups change:

A support assistant — domain: look up orders and policies; user: this customer's ticket history; config: tone and escalation rules.
An internal-docs aide — domain: search the team wiki; user: which team you're on; config: what's confidential and how to cite.
A booking helper — domain: availability and reservations; user: saved preferences; config: cancellation and pricing rules.

Pick whichever is closest to something you actually know — the build is identical to the Reading Room's.

The same app, deepened across Mode 2

You won't throw v1 away. Later courses upgrade this same app, which is how you end Manufacturing holding one real product you grew the whole way:

| You'll add | Which upgrades | In |
|---|---|---|
| Semantic search over your domain | `domain.get_item(id)` → `domain.search(query)` | RAG on Postgres + pgvector |
| A durable system-of-record (audit, approval, trustworthy state) | the bare two-table memory | Building a Digital FTE |
| A high-fidelity persona / richer config (no-fabrication guardrails) | the simple `config.*` rules | Identic AI |
| Your own token issuer, plus identity for agents (scoped, revocable, on-behalf-of) | the rented sign-in service | AI Identity (Better Auth) (coming soon) |
| Proof it actually does its job well | "it seems to work" | Eval-Driven Development |
| Production hardening (observability, a CI test gate) | the simple deploy | Deploy the Agent Harness |

Capstone

Ship a connector-native app of your own. Pick any small domain you know well. Direct your coding agent to build a gateway with the three tool groups, a two-table memory, a begin_session contract carrying a one-paragraph set of rules, the session gate with a fail-closed rule, and a real sign-in. Deploy it, add it to your own Claude, and have it serve one real request and remember it the next day.

1. Your Work

Describe your connector-native app: the domain and its three tool groups, how begin_session returns rules + state + a session token, how identity comes from the verified sub, your fail-closed rule, and the public URL you deployed. Paste a tool signature or two, or the begin_session return, if you have them.

2. Get Your Score

Discuss with an AI. Question your scores.
Come back when you have your BEST evaluation.

📚 Teaching Aid

Setup (a few minutes)

Part 1: The shape

Concept 1: You direct it — you don't type it

Concept 2: The new app shape

Concept 3: Tools only — not resources, not prompts

Concept 4: One gateway, three groups behind it

Part 2: State and domain

Concept 5: State — just enough to remember a person

Concept 6: Domain — by reference now, by meaning later

Part 3: Prove — identity the model can't fake

Concept 7: Identity from the verified subject, never from the model

Concept 8: OAuth 2.1, in plain English (the verification-heavy Concept)

Part 4: Steer — make the model behave

Concept 9: Where the app's rules live — a Skill, or the connector

Concept 10: The session-init contract

Concept 11: Fail closed — don't quietly become a chatbot

Part 5: Ship it

Concept 12: Deploy to Azure Container Apps + Neon

Concept 13: Add it to Claude

Part 6: A complete worked example — the Reading Room

Part 7: The ceiling, and where it grows

Concept 14: The ceiling — and the bridge to owning a loop

The same skeleton, other shapes

The same app, deepened across Mode 2

Capstone

Flashcards Study Aid