Spec and Build Your First Tool
James had been thinking about what he needed since the previous lesson. He opened a notebook (the paper kind) and wrote three words: register a learner.
"That is the first thing TutorClaw has to do," he said. "Before tracking progress, before assigning lessons, before anything. A learner walks in, we record their name, we give them an ID."
Emma leaned over and read the notebook. "Good. You know what you want. Describe it to Claude Code. What you need, not how to build it." She stood up. "I have a call. When I get back, show me the spec."
James watched her leave. He stared at his terminal. Three words in his notebook. How do you tell an AI what you need without telling it how to build it?
You are doing exactly what James is doing. You describe what you need; Claude Code builds it and verifies that it works.
In this lesson, you walk through the full cycle: describe a tool, review the spec Claude Code produces, steer the design if needed, and let it build and test. The mcp-builder skill you installed in Lesson 2 guides Claude Code through best practices so you can focus on what the tool should do.
Step 1: Describe What You Need
Open Claude Code in your tutorclaw-mcp project (the one you set up in Lesson 2). Send this message:
I want to build an MCP server called tutorclaw. The first tool should
register a new learner. It takes a learner's name and returns a welcome
message with a unique learner ID.
Spec this out before building.
Notice what this message contains and what it leaves out. It contains:
- What the server does (register a learner)
- What the tool accepts (a name) and returns (welcome message, learner ID)
- An explicit instruction to spec before building
It does not contain any Python, any file paths, any framework choices. Those are implementation decisions. Claude Code and the mcp-builder skill handle them. The transport and port are already configured in CLAUDE.md from Lesson 3 — you do not need to repeat them.
Step 2: Review the Spec
Claude Code responds with a spec before writing any code. The spec typically includes:
| Element | What to Look For |
|---|---|
| Tool name | A clear, lowercase name like register_learner |
| Tool description | The text that tells an agent when to call this tool |
| Input parameters | What the tool accepts and which fields are required |
| Output format | What the tool returns on success (and on error) |
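To make the table concrete, here is what the input and output of a register_learner spec might map to, sketched as plain Python shapes. The field names here are assumptions for illustration, not what Claude Code will necessarily produce:

```python
from dataclasses import dataclass

# Hypothetical shapes implied by a register_learner spec.
# Field names are illustrative; your spec may differ.

@dataclass
class RegisterLearnerInput:
    name: str  # required: the learner's display name

@dataclass
class RegisterLearnerOutput:
    learner_id: str  # unique ID assigned at registration
    message: str     # welcome message returned to the learner
```

When you review the spec, you are checking exactly these two shapes: does the tool take what it needs, and does it return everything the caller will want?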
Read the spec carefully. The most important element is not the parameter types or the return format. It is the tool description.
Why the Description Matters Most
When an agent has access to multiple tools, it reads each tool's description to decide which one to call. The description is a job posting. A vague posting attracts the wrong candidates. A specific posting attracts exactly the right one.
Compare these two descriptions for the same tool:
| Quality | Description | Agent Behavior |
|---|---|---|
| Vague | "Registers stuff" | Agent has no idea when to call this tool. It might call it for anything registration-related, or never call it at all. |
| Specific | "Register a new learner in the TutorClaw tutoring system. Call this tool when a learner is using TutorClaw for the first time and needs a unique learner ID. Do NOT call this tool if the learner already has an ID." | Agent knows exactly when to pick this tool and when not to. |
If Claude Code's spec has a vague description, that is your first steering point.
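One way a specific description ends up attached to a tool: several MCP frameworks, including FastMCP in the official Python SDK, read the tool description from the function's docstring. A minimal sketch, assuming Python and treating the function body as illustrative:

```python
import uuid

def register_learner(name: str) -> dict:
    """Register a new learner in the TutorClaw tutoring system.

    Call this tool when a learner is using TutorClaw for the first
    time and needs a unique learner ID. Do NOT call this tool if the
    learner already has an ID.
    """
    # Illustrative logic: assign a random unique ID and greet the learner.
    return {
        "learner_id": str(uuid.uuid4()),
        "message": f"Welcome to TutorClaw, {name}!",
    }
```

If this function were registered as an MCP tool, the docstring above is the text an agent reads when deciding whether to call it, which is why it deserves more scrutiny than the body.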
Step 3: Steer if Needed
The spec is a proposal, not a contract. You review it and push back on anything that does not match your requirements. Common steering moves:
Sharpen the description:
Make the tool description more specific. It should say exactly
when an agent should call this tool versus other tools.
Add parameters:
Add an optional email parameter. Not required for registration,
but useful if the learner provides one.
Adjust the output:
The return should include the learner ID, a timestamp of when
they registered, and the welcome message.
You might approve the spec on the first pass. You might steer three times. Both are normal. The point is that you see the plan before any code exists.
Step 4: Build and Verify
Once the spec looks right, tell Claude Code to implement it:
The spec looks good. Build this.
Claude Code does the rest. It writes the server code, writes the tests, and then verifies everything works:
- Runs the tests to check the tool logic is correct
- Starts the server to confirm it boots without errors
- Makes a real tool call against the running server
- Shuts the server down and reports the results
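The tests Claude Code writes in the first step might look something like this. This is a hypothetical pytest-style sketch, and the stand-in register_learner is included only to make the sketch self-contained; the real tests depend on the implementation Claude Code chose:

```python
import uuid

# Stand-in for the real implementation, included so the tests below run.
def register_learner(name: str) -> dict:
    return {
        "learner_id": str(uuid.uuid4()),
        "message": f"Welcome to TutorClaw, {name}!",
    }

def test_register_learner_returns_id_and_welcome():
    result = register_learner("Ada")
    assert result["learner_id"]        # a non-empty unique ID
    assert "Ada" in result["message"]  # the welcome message names the learner

def test_ids_are_unique_per_registration():
    ids = {register_learner("Ada")["learner_id"] for _ in range(5)}
    assert len(ids) == 5  # no two registrations share an ID
```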
You do not run any commands yourself. Claude Code handles the full build-and-verify cycle. When it finishes, it tells you whether everything passed and asks if you want it to start the server so you can explore with MCP Inspector.
MCP Inspector is a visual tool that lets you call your server's tools one at a time and see the inputs and outputs. It is a good way to poke at your server before connecting it to your agent. If Claude Code offers to start the server for you, say yes, then in a second terminal run npx @modelcontextprotocol/inspector and connect to the URL Claude Code shows you.
The Describe-Steer-Verify Cycle
Step back and notice the pattern. You did not write a server. You did not learn framework syntax. You did not configure anything. You:
- Described what you needed in plain language
- Steered the spec when the description was not specific enough
- Verified the result — Claude Code ran the tests, started the server, and made a real call
Claude Code handled the implementation. The mcp-builder skill ensured best practices. This is the same spec-driven development pattern from Chapter 16 applied to MCP server creation: you own the requirements, the agent owns the code.
The hardest part was not building. It was knowing what to ask for: getting the tool description right, choosing the right parameters, deciding what the output should look like. Those are design decisions that require understanding your domain. Claude Code cannot make them for you.
Try With AI
Exercise 1: Evaluate the Tool Description
Review the tool description Claude Code wrote for register_learner:
Read the tool description for register_learner in my MCP server.
Is it specific enough for an agent with ten different tools to know
exactly when to call this one? How would you improve it?
What you are learning: The tool description is the single most important line in your server. A good description makes the agent reliable. A bad one makes it guess.
Exercise 2: Spec a Second Tool
You need a second tool for TutorClaw: tracking a learner's progress. Describe it to Claude Code, but ask for the spec only:
I want to add a second tool: get_learner_progress. It takes a learner
ID and returns their current progress (lessons completed, current
lesson, total lessons). Spec this tool. Do not build it yet.
Compare the spec to the one for register_learner. Are the descriptions specific enough that an agent would never confuse the two?
What you are learning: When you have multiple tools, the descriptions must be distinct. Overlapping descriptions force the agent to guess which tool to call.
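For comparison, a get_learner_progress description that stays distinct from register_learner might read like the docstring below. This is a hypothetical sketch with hardcoded placeholder data, not what Claude Code will produce:

```python
def get_learner_progress(learner_id: str) -> dict:
    """Return a learner's current progress in the TutorClaw tutoring system.

    Call this tool when you already have a learner ID and need their
    progress (lessons completed, current lesson, total lessons). Do NOT
    call this tool to register a new learner; use register_learner for that.
    """
    # Illustrative placeholder data; a real tool would look this up.
    return {
        "learner_id": learner_id,
        "lessons_completed": 3,
        "current_lesson": "Spec and Build Your First Tool",
        "total_lessons": 10,
    }
```

Each docstring names the other tool's territory ("Do NOT call this tool to register..."), which is what keeps an agent with both tools from confusing the two.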
Exercise 3: Compare to From-Scratch
Ask Claude Code what the mcp-builder skill added:
If you had built this MCP server without the mcp-builder skill, what
would you have done differently? What best practices did the skill
add that a first-time MCP developer might miss?
What you are learning: Skills encode expertise. The mcp-builder skill carries patterns from well-tested MCP servers. Building without it works, but you miss the patterns you do not know you need until something breaks in production.
When Emma came back from her call, James had the results on screen. Tests passed, server booted, tool call returned a welcome message.
"All I did was describe what I wanted and push back on the description twice," James said.
"How many times did you rewrite the description before you sent it?"
"Three." James grinned. "Same thing used to happen with purchase orders at the warehouse. First draft was always too vague. You learn what details matter by getting it wrong once."
"That is the hard half. Knowing what to ask for." Emma looked at the spec. "Honestly, I still second-guess my own tool descriptions. Describing when an agent should pick one tool over another is harder than writing the server code." She shrugged. "That part is yours. You know the domain."
James looked at the results. "So it works. But nobody else is sending it requests."
"Lesson 5. Connect it to your agent on OpenClaw and test from WhatsApp."