Designing the Tool Surface

Emma dropped a blank document on James's screen. "Nine tools. Design all of them before you write a single line of code."

James scrolled through the empty page. "All nine? I barely got register_learner working in the last chapter."

"Exactly. You know how to describe a tool, steer a spec, and verify the result. That is the building skill. This is the planning skill." She tapped the screen. "Write a contract for every tool. Name, inputs, outputs, who can call it, what data it needs. Like writing job descriptions before you hire anyone."

James thought about that. In his warehouse days, he never hired someone without a job description. The description told you what the person did, what they needed access to, and who they reported to. A function was no different.

"Nine job descriptions," he said.

"Nine job descriptions. Then we build."

You have the same blank sheet and the same constraint: contracts first, code second.

In this lesson, you design all 9 TutorClaw tools as specifications. No code, no terminal, no Claude Code. You define what each tool does, what it accepts, what it returns, who can call it, and what data it depends on. By the end, you have the complete blueprint for Lessons 3 through 7.

The Tool Contract

Every tool gets a contract with five parts:

Part	What It Answers
Name	What is the tool called? Lowercase, underscores, verb-first.
Description	When should the agent call this tool? (The most important part, from Ch57 L4.)
Inputs	What parameters does the tool accept? Types and whether required.
Outputs	What does the tool return on success? On error?
Tier access	Who can call this tool: everyone, gated by tier, or free-tier only?
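The five parts can also be written down as data, which makes a contract easy to check before anyone builds from it. The sketch below is one possible shape, not TutorClaw's actual format: the `Param` and `ToolContract` classes, the `problems` method, and the tier spellings ("all", "gated", "free_only") are assumptions made for illustration. It checks lowercase-with-underscores naming only; "verb-first" still needs a human eye.

```python
import re
from dataclasses import dataclass

# Lowercase words joined by underscores, e.g. "get_learner_state".
NAME_PATTERN = re.compile(r"^[a-z]+(_[a-z]+)*$")
TIER_CATEGORIES = {"all", "gated", "free_only"}  # assumed spellings


@dataclass(frozen=True)
class Param:
    name: str
    type: str            # "string", "integer", "float", "boolean", "list"
    required: bool = True


@dataclass(frozen=True)
class ToolContract:
    name: str
    description: str
    inputs: tuple[Param, ...]
    outputs: tuple[Param, ...]
    tier_access: str

    def problems(self) -> list[str]:
        """Return a list of contract problems; an empty list means it passes."""
        found = []
        if not NAME_PATTERN.match(self.name):
            found.append("name must be lowercase words joined by underscores")
        if not self.description.strip():
            found.append("description must say when the agent should call the tool")
        if self.tier_access not in TIER_CATEGORIES:
            found.append(f"unknown tier access: {self.tier_access!r}")
        return found


register_learner = ToolContract(
    name="register_learner",
    description="Register a new learner by name. Call this when a user wants "
                "to sign up or create a new learning profile.",
    inputs=(Param("name", "string"),),
    outputs=(Param("learner_id", "string"), Param("welcome_message", "string")),
    tier_access="all",
)
print(register_learner.problems())  # prints []
```

A contract written this way is still a specification. It describes the tool; it does not implement it.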

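Each contract below also carries a "Depends on" row that records the resources the tool needs. Those rows feed the dependency graph later in this lesson.

You already know the first tool. Start there.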

The Three State Tools (You Know These)

These manage learner data. All three depend on a single resource: the JSON state file stored locally in the data/ directory.

Tool 1: register_learner

You built this in Chapter 57. Now write the full contract:

Field	Value
Name	`register_learner`
Description	Register a new learner by name. Call this when a user wants to sign up or create a new learning profile.
Inputs	`name` (string, required): the learner's display name
Outputs	`learner_id` (string): unique identifier; `welcome_message` (string): personalized greeting
Tier	All
Depends on	JSON state file

Tool 2: get_learner_state

Field	Value
Name	`get_learner_state`
Description	Retrieve the current learning state for a learner. Call this at the start of every tutoring conversation to know where the learner left off.
Inputs	`learner_id` (string, required): the learner's unique identifier
Outputs	`current_chapter` (integer): chapter in progress; `stage` (string): PRIMM stage (predict, run, investigate); `confidence` (float): 0.0 to 1.0; `exchanges_remaining` (integer): daily exchanges left
Tier	All
Depends on	JSON state file

Tool 3: update_progress

Field	Value
Name	`update_progress`
Description	Record a learning interaction and adjust the learner's confidence score. Call this after every exchange where the learner answers a question or completes an exercise.
Inputs	`learner_id` (string, required); `chapter` (integer, required): chapter of the interaction; `stage` (string, required): PRIMM stage completed; `correct` (boolean, required): whether the learner's response was correct
Outputs	`updated_confidence` (float): new confidence score; `next_stage` (string): recommended next PRIMM stage
Tier	All
Depends on	JSON state file
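An Outputs row is a promise about shape and range, so it can be written as a type plus a few checks. The sketch below does this for get_learner_state. The field names, the three PRIMM stages, and the 0.0 to 1.0 confidence range come from the contract; the `LearnerState` type, the `validate_learner_state` helper, and the rule that exchanges cannot be negative are assumptions added for illustration.

```python
from typing import TypedDict

PRIMM_STAGES = ("predict", "run", "investigate")


class LearnerState(TypedDict):
    current_chapter: int
    stage: str
    confidence: float
    exchanges_remaining: int


def validate_learner_state(state: dict) -> list[str]:
    """Return contract violations for a get_learner_state result."""
    missing = [key for key in LearnerState.__annotations__ if key not in state]
    if missing:
        return [f"missing field: {key}" for key in missing]

    errors = []
    if state["stage"] not in PRIMM_STAGES:
        errors.append(f"stage must be one of {PRIMM_STAGES}")
    if not 0.0 <= state["confidence"] <= 1.0:
        errors.append("confidence must be between 0.0 and 1.0")
    if state["exchanges_remaining"] < 0:  # assumed rule, not in the contract
        errors.append("exchanges_remaining cannot be negative")
    return errors


good = {"current_chapter": 3, "stage": "run",
        "confidence": 0.6, "exchanges_remaining": 12}
print(validate_learner_state(good))  # prints []
```

A check like this is useful later, when you verify that a built tool returns what its contract promised.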

The Six New Tools

These are the tools you have not seen before. Design each one carefully.

Tool 4: get_chapter_content

Field	Value
Name	`get_chapter_content`
Description	Fetch the content for a specific chapter. Call this when the learner asks to study a topic or the agent needs chapter material for a teaching interaction. Free-tier learners can access chapters 1 through 5 only.
Inputs	`learner_id` (string, required); `chapter_number` (integer, required): which chapter to fetch
Outputs	`title` (string): chapter title; `content` (string): full chapter text; `exercises_available` (boolean): whether exercises exist for this chapter
Tier	Gated
Depends on	JSON state file (for tier check), local content directory
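"Gated" is a small rule, and writing it out shows how little the contract leaves open. Here is a minimal sketch of the chapter gate, assuming tiers are stored as the strings "free" and "paid" and that any unrecognized tier is treated like free. The function name `can_access_chapter` is hypothetical.

```python
FREE_TIER_MAX_CHAPTER = 5  # from the contract: free tier gets chapters 1 through 5


def can_access_chapter(tier: str, chapter_number: int) -> bool:
    """Decide whether a learner on `tier` may fetch `chapter_number`."""
    if chapter_number < 1:
        return False  # chapters are numbered from 1
    if tier == "paid":
        return True
    # Assumption: anything that is not "paid" gets free-tier limits.
    return chapter_number <= FREE_TIER_MAX_CHAPTER


print(can_access_chapter("free", 5))  # prints True
print(can_access_chapter("free", 6))  # prints False
print(can_access_chapter("paid", 6))  # prints True
```

The contract does not say what the tool returns when the gate refuses. That is worth deciding in the spec before the build.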

Tool 5: generate_guidance

Field	Value
Name	`generate_guidance`
Description	Generate stage-appropriate pedagogical guidance using PRIMM-Lite methodology. Call this to get the next teaching prompt for the learner based on their current stage (predict, run, or investigate).
Inputs	`learner_id` (string, required); `chapter_number` (integer, required); `stage` (string, required): current PRIMM stage
Outputs	`guidance_text` (string): the teaching prompt; `expected_response_type` (string): what kind of answer to expect (prediction, observation, explanation)
Tier	All
Depends on	JSON state file

Tool 6: assess_response

Field	Value
Name	`assess_response`
Description	Evaluate the learner's answer to a teaching prompt and update their confidence score. Call this after the learner responds to a guidance prompt.
Inputs	`learner_id` (string, required); `chapter_number` (integer, required); `stage` (string, required); `learner_response` (string, required): the learner's answer
Outputs	`assessment` (string): feedback on the answer; `correct` (boolean): whether the response met expectations; `updated_confidence` (float): adjusted confidence score; `recommendation` (string): what to do next
Tier	All
Depends on	JSON state file

Tool 7: get_exercises

Field	Value
Name	`get_exercises`
Description	Return practice exercises matched to the learner's weak areas. Call this when the learner wants to practice or when their confidence in a topic drops below the threshold. Free-tier learners get exercises for chapters 1 through 5 only.
Inputs	`learner_id` (string, required); `chapter_number` (integer, optional): specific chapter, or omit to get exercises for weakest areas
Outputs	`exercises` (list): each with `id`, `prompt`, `difficulty`, `target_concept`; `weak_areas` (list): concepts the learner struggles with
Tier	Gated
Depends on	JSON state file (for tier check and weak areas), local content directory

Tool 8: submit_code

Field	Value
Name	`submit_code`
Description	Execute the learner's code in a sandboxed environment and return the output. Call this when the learner writes code and wants to test it. Free-tier learners have limited daily submissions.
Inputs	`learner_id` (string, required); `code` (string, required): the Python code to execute
Outputs	`stdout` (string): standard output; `stderr` (string): error output if any; `exit_code` (integer): 0 for success
Tier	Gated
Depends on	JSON state file (for tier check), mock sandbox
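The "mock sandbox" dependency can be very small. The sketch below assumes the mock never runs the learner's code and simply returns hardcoded output in the shape the contract promises. The name `mock_submit_code`, the fixed strings, and the empty-submission rule are illustrative assumptions, and the free-tier daily limit is left out.

```python
from typing import TypedDict


class SubmitCodeResult(TypedDict):
    stdout: str
    stderr: str
    exit_code: int


def mock_submit_code(learner_id: str, code: str) -> SubmitCodeResult:
    """Pretend to run `code`. Nothing is executed; the output is hardcoded."""
    if not code.strip():
        # Assumed behavior: an empty submission is reported as a failure.
        return {"stdout": "", "stderr": "no code submitted", "exit_code": 1}
    return {"stdout": "mock output\n", "stderr": "", "exit_code": 0}


result = mock_submit_code("learner-123", "print('hello')")
print(result["exit_code"])  # prints 0
```

Because the mock honors the same Outputs row, a real sandbox can replace it later without changing the contract.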

Tool 9: get_upgrade_url

Field	Value
Name	`get_upgrade_url`
Description	Generate a Stripe checkout link for upgrading from free to paid tier. Call this ONLY for free-tier learners who hit a paywall or ask about upgrading. Never call this for paid learners.
Inputs	`learner_id` (string, required)
Outputs	`checkout_url` (string): Stripe checkout link; `current_tier` (string): confirms learner is on free tier
Tier	Free only
Depends on	JSON state file, Stripe API

The Tier Access Matrix

Three categories. Every tool falls into exactly one:

Category	Meaning	Tools
All	Every learner can call these regardless of tier	register_learner, get_learner_state, update_progress, generate_guidance, assess_response
Gated	The tool checks the learner's tier before returning content. Free learners get limited access (chapters 1-5, limited exercises, limited code submissions).	get_chapter_content, get_exercises, submit_code
Free only	Only free-tier learners should call this tool. Paid learners have no reason to upgrade.	get_upgrade_url
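The matrix above can be written as a lookup table plus one rule. This is a sketch under assumptions: the category spellings and the `may_call` helper are hypothetical, and tiers are assumed to be the strings "free" and "paid".

```python
from collections import Counter

TIER_ACCESS = {
    "register_learner": "all",
    "get_learner_state": "all",
    "update_progress": "all",
    "generate_guidance": "all",
    "assess_response": "all",
    "get_chapter_content": "gated",
    "get_exercises": "gated",
    "submit_code": "gated",
    "get_upgrade_url": "free_only",
}


def may_call(tool: str, tier: str) -> bool:
    """Whether the agent should call `tool` for a learner on `tier`."""
    if TIER_ACCESS[tool] == "free_only":
        return tier == "free"
    # "all" and "gated" tools are callable on every tier.
    # Gated tools limit what they return; they do not refuse the call.
    return True


print(Counter(TIER_ACCESS.values()))
print(may_call("get_upgrade_url", "paid"))  # prints False
```

Note the difference the code makes visible: "gated" limits the result of a call, while "free only" limits who the call is for.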

Notice the logic. "All" tools are the core tutoring loop: register, check state, teach, assess, record progress. Every learner needs these regardless of whether they pay.

"Gated" tools deliver the content that makes paying worthwhile: full chapter access, targeted exercises, code execution. Free learners get a taste (chapters 1 through 5). Paid learners get everything.

"Free only" has exactly one tool. Once a learner has paid, showing them a checkout link is pointless and confusing.

The Dependency Graph

Every tool depends on at least one resource. There are exactly four resource types in TutorClaw:

Resource	What It Is	Where It Lives
JSON state file	Learner records, progress, tier, confidence, exchanges	`data/` directory
Local content directory	Chapter text and exercise files	`content/` directory
Mock sandbox	Code execution environment (mock for now, real later)	In-process
Stripe API	Payment processing (mock for now, real in Lesson 14)	External service
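Since the JSON state file carries so much, it helps to picture how it might be read and written. The sketch below is an assumption-heavy illustration, not TutorClaw's storage code: the file layout (a top-level "learners" object) and the helper names are hypothetical. It writes to a temporary file and then swaps it into place, so an interrupted write does not leave a half-written state file behind.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Read the state file, or return an empty state if it does not exist yet."""
    if not path.exists():
        return {"learners": {}}  # assumed layout
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def save_state(path: Path, state: dict) -> None:
    """Write the state to a temp file in the same directory, then swap it in."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp_name, path)  # replaces the old file in one step


# Usage, with a throwaway directory standing in for data/.
with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "data" / "state.json"  # hypothetical file name
    state = load_state(state_path)
    state["learners"]["learner-123"] = {"tier": "free", "confidence": 0.5}
    save_state(state_path, state)
    print(load_state(state_path)["learners"]["learner-123"]["tier"])  # prints free
```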

The dependency map:

Tool	JSON State	Local Content	Mock Sandbox	Stripe API
register_learner	writes	-	-	-
get_learner_state	reads	-	-	-
update_progress	reads + writes	-	-	-
get_chapter_content	reads (tier)	reads	-	-
generate_guidance	reads	-	-	-
assess_response	reads + writes	-	-	-
get_exercises	reads (tier + weak areas)	reads	-	-
submit_code	reads (tier)	-	executes	-
get_upgrade_url	reads (tier)	-	-	calls

Two patterns to notice. First: every tool depends on the JSON state file. It is the backbone of the entire system. If the state file is corrupted, every tool breaks. Second: only two tools touch external resources beyond JSON and content files. submit_code needs a sandbox; get_upgrade_url needs Stripe. Everything else is local file reads and writes.
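Both patterns can be read straight off the dependency map once it is written as data. The sketch below copies the map into a dictionary; the resource keys and the helper names `tools_using` and `most_shared_resource` are assumptions made for illustration.

```python
from collections import Counter

DEPENDENCIES = {
    "register_learner": {"json_state"},
    "get_learner_state": {"json_state"},
    "update_progress": {"json_state"},
    "get_chapter_content": {"json_state", "local_content"},
    "generate_guidance": {"json_state"},
    "assess_response": {"json_state"},
    "get_exercises": {"json_state", "local_content"},
    "submit_code": {"json_state", "mock_sandbox"},
    "get_upgrade_url": {"json_state", "stripe_api"},
}


def tools_using(resource: str) -> list[str]:
    """Every tool that depends on `resource`, in alphabetical order."""
    return sorted(tool for tool, deps in DEPENDENCIES.items() if resource in deps)


def most_shared_resource() -> tuple[str, int]:
    """The resource with the most dependent tools, and how many depend on it."""
    counts = Counter(res for deps in DEPENDENCIES.values() for res in deps)
    return counts.most_common(1)[0]


print(most_shared_resource())       # prints ('json_state', 9)
print(tools_using("mock_sandbox"))  # prints ['submit_code']
print(tools_using("stripe_api"))    # prints ['get_upgrade_url']
```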

A Tutoring Session Through the Dependency Lens

Trace a single session to see how the tools chain together:

1. Learner sends "Teach me about variables" via WhatsApp
2. Agent calls get_learner_state (reads JSON) to check where the learner left off
3. Agent calls get_chapter_content (reads JSON for tier, reads content directory) to fetch Chapter 1
4. Agent calls generate_guidance (reads JSON) to get the PRIMM "predict" prompt
5. Agent presents the prompt. Learner responds with a prediction.
6. Agent calls assess_response (reads + writes JSON) to evaluate the prediction
7. Agent calls update_progress (reads + writes JSON) to record the interaction

Seven steps, five tool calls. All five calls hit the JSON state file. One (get_chapter_content) also hits the content directory. Zero hit Stripe or the sandbox. This is the typical pattern: most of a tutoring session lives inside state and content. Payment and code execution are occasional events, not every-turn events.
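A trace like this can be tallied rather than counted by eye. The sketch below lists only the steps where the agent calls a tool, looks up each tool's resources as given in the dependency map, and counts hits per resource. The resource keys and the `resource_hits` helper are illustrative assumptions.

```python
from collections import Counter

# Resources per tool, copied from the dependency map in this lesson.
RESOURCES = {
    "get_learner_state": ["json_state"],
    "get_chapter_content": ["json_state", "local_content"],
    "generate_guidance": ["json_state"],
    "assess_response": ["json_state"],
    "update_progress": ["json_state"],
}

# The tool calls in the order the agent makes them in the trace.
SESSION_CALLS = [
    "get_learner_state",
    "get_chapter_content",
    "generate_guidance",
    "assess_response",
    "update_progress",
]


def resource_hits(calls: list[str]) -> Counter:
    """Count how many tool calls in `calls` touch each resource."""
    hits = Counter()
    for tool in calls:
        hits.update(RESOURCES[tool])
    return hits


hits = resource_hits(SESSION_CALLS)
print(len(SESSION_CALLS), "tool calls")
for resource, count in hits.items():
    print(resource, count)
```

Running the same tally over a session that includes submit_code or get_upgrade_url shows how rarely the sandbox and Stripe appear compared with the state file.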

Try With AI

Exercise 1: Write a Tool Contract from Scratch

Pick a tool that does NOT exist in TutorClaw yet: a get_learning_path tool that recommends the next three chapters based on the learner's confidence scores across all completed chapters.

I want to add a tool called get_learning_path to TutorClaw. It takes
a learner_id and returns the next three recommended chapters based
on the learner's confidence scores. Write the full tool contract:
name, description, inputs, outputs, tier access, and dependencies.

What you are learning: Writing a tool contract forces you to think through the agent's decision: when would the agent call this tool instead of another? The description must make that unambiguous.

Exercise 2: Find the Bottleneck

Look at the dependency graph above and identify the single resource that, if it became slow or unavailable, would affect the most tools.

Looking at TutorClaw's dependency graph, which resource is the
biggest bottleneck? If I wanted to make TutorClaw more resilient,
which dependency should I address first and why?

What you are learning: Dependency mapping reveals risk. The JSON state file is a single point of failure for all 9 tools. This is fine for a local prototype (the whole point of this chapter), but it is the first thing you would change when scaling to production.

Exercise 3: Redesign the Tier Matrix

TutorClaw currently has two tiers (free and paid). Imagine a third tier, "student," that gets chapters 1 through 10 and limited code submissions but no exercises.

TutorClaw currently has free and paid tiers. Design a third tier
called "student" that gets chapters 1-10, limited code submissions
(10 per day), but no exercises. Which tools need their contracts
updated? Write the updated tier access matrix.

What you are learning: Tier design is product design, not engineering. Changing a tier means updating tool contracts and gating logic, not rewriting code. The spec absorbs the change; Claude Code implements it.

James finished the last contract and leaned back. "Nine tools. Five fields each. That took longer than I expected."

Emma scanned the page. She paused at submit_code. "You listed 'mock sandbox' as the dependency. Walk me through why."

"Because we are not actually executing learner code in production yet. The mock returns hardcoded output so we can test the flow. Real sandboxing is a scaling problem for later, not a product problem for now."

Emma nodded slowly. "When I designed my first tool server, I spent two weeks building a real sandbox before I had a single tool working end to end." She paused. "I should have mocked it first and shipped."

James grinned. "Like shipping crates from the warehouse. You do not build custom crates for every product. You use standard boxes, ship the product, and upgrade the packaging when volume justifies it."

"Standard boxes." Emma wrote the phrase in the margin of the spec. "I am stealing that analogy for the next chapter." She closed the document. "Your nine job descriptions are done. Lesson 3: you hand the first three to Claude Code and watch it build them."
