
Build the State Tools

James looked at his tutorclaw-mcp project from Chapter 57. One tool: register_learner. It worked. He had tested it, connected it to OpenClaw, verified it from WhatsApp. But something nagged him.

He restarted the server. Then he sent a message from WhatsApp: "Show me my learning progress."

Nothing. The learner he had registered was gone.

"That is the problem with in-memory storage," Emma said, looking at his screen. "Every restart wipes the slate. Your learner registered five minutes ago and the server has already forgotten them."

"So I need a database."

"You need files on disk. JSON files. Simple, local, no database server to install." She pointed at the spec they had designed in Lesson 2. "Three state tools: register_learner upgraded to persist data, get_learner_state to read it back, update_progress to change it. Describe each one to Claude Code. One tool per prompt. You have the specs from yesterday."


You are doing exactly what James is doing. Three tools, JSON persistence, data that survives restarts.

Tool 1: Upgrade register_learner

Open Claude Code in your tutorclaw-mcp project. Send this prompt:

I want to upgrade register_learner to persist data in a JSON file
at data/learners.json instead of in-memory. Each learner gets a
unique ID, a name, a tier (default "free"), and a mock API key.
The data must survive server restarts.

Use MOCK_LEARNER_ID = "learner-001" and MOCK_API_KEY = "test-key-001"
as the default values for now. Real auth comes later.

Spec this out before building.

Review the spec Claude Code produces. Check the same elements you checked in Chapter 57:

| Element | What to Look For |
| --- | --- |
| Tool description | Specific enough that an agent knows when to call this instead of get_learner_state or update_progress |
| Input parameters | Name (required), anything else optional |
| Output format | Returns the learner record including the generated ID |
| Persistence | Reads from and writes to data/learners.json |
If the tool description is vague, steer it. The agent will have nine tools eventually. A description like "handles learner data" is useless when three different tools all handle learner data. A description like "Register a new learner by name. Call this when a user wants to sign up or create a new learning profile for the first time" gives the agent a clear selection signal.

Once the spec looks right:

The spec looks good. Build this.

Claude Code creates the data/ directory, sets up the JSON file handling, and updates your server with the persistent version of register_learner. Run the tests:

uv run pytest

If tests fail, paste the output back to Claude Code and ask it to fix them. That is the cycle.
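What Claude Code generates will differ in detail, but the persistence pattern it should land on looks roughly like this sketch. The helper names load_learners and save_learners are illustrative, not the generated code; the file path and mock constants follow the prompt.

```python
import json
from pathlib import Path

DATA_FILE = Path("data/learners.json")
MOCK_API_KEY = "test-key-001"  # mock value from the prompt; real auth comes later

def load_learners() -> dict:
    """Read all learner records from disk; empty dict if the file is absent."""
    if DATA_FILE.exists():
        return json.loads(DATA_FILE.read_text())
    return {}

def save_learners(learners: dict) -> None:
    """Write the full learner map back to data/learners.json."""
    DATA_FILE.parent.mkdir(parents=True, exist_ok=True)
    DATA_FILE.write_text(json.dumps(learners, indent=2))

def register_learner(name: str, tier: str = "free") -> dict:
    """Create a learner record and persist it so it survives restarts."""
    learners = load_learners()
    learner_id = f"learner-{len(learners) + 1:03d}"  # first learner: "learner-001"
    record = {"id": learner_id, "name": name, "tier": tier, "api_key": MOCK_API_KEY}
    learners[learner_id] = record
    save_learners(learners)
    return record
```

The key difference from Chapter 57 is that every registration ends in a disk write, so nothing depends on the server process staying alive.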

Tool 2: get_learner_state

Now add the second tool. Send this prompt to Claude Code:

Add a get_learner_state tool that reads from data/learner_state.json.
It takes a learner_id and returns their current state:
- chapter (default 1)
- stage (default "predict")
- confidence score (default 0.5)
- tier (read from learner record)
- exchanges remaining (default 50 for free tier, unlimited for paid)
- weak areas (default empty list)

Use MOCK_LEARNER_ID = "learner-001" for now.
Spec this before building.

Review the spec. Pay special attention to the tool description. This tool reads state. It does not change state. The description must make that distinction clear, because the agent will also have update_progress, which does change state. If the descriptions overlap, the agent will pick the wrong one.

Good distinction:

  • get_learner_state: "Retrieve the current learning state for a learner. Call this when you need to check where a learner is in the curriculum, what their confidence level is, or how many exchanges they have remaining. This tool reads state but does not modify it."
  • update_progress: "Record a learning interaction and update the learner's state. Call this after a learner completes a stage, answers a question, or finishes an exercise."

If Claude Code's description does not make the read-versus-write distinction, steer it:

Make the tool description clearly state that this tool reads state
but does not modify it. The agent needs to distinguish this from
update_progress, which writes state.

Approve the spec, let Claude Code build, run tests:

uv run pytest
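Again, the generated implementation will vary. A read-only sketch of the shape this tool should take, with the defaults from the prompt (the tier lookup from the learner record is omitted here for brevity):

```python
import json
from pathlib import Path

STATE_FILE = Path("data/learner_state.json")

# Defaults from the prompt: chapter 1, predict stage, 0.5 confidence,
# 50 exchanges for the free tier, no weak areas yet.
DEFAULT_STATE = {
    "chapter": 1,
    "stage": "predict",
    "confidence": 0.5,
    "exchanges_remaining": 50,
    "weak_areas": [],
}

def get_learner_state(learner_id: str) -> dict:
    """Read-only lookup: return stored state, or the documented defaults.

    This tool reads state; it never writes. That is the contract the
    tool description must spell out so the agent can tell it apart
    from update_progress.
    """
    states = {}
    if STATE_FILE.exists():
        states = json.loads(STATE_FILE.read_text())
    return states.get(learner_id, {"learner_id": learner_id, **DEFAULT_STATE})
```

Notice there is no write path anywhere in the function: the read-versus-write distinction is enforced in code, not just in the description.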

Tool 3: update_progress

The third state tool. Send this prompt:

Add an update_progress tool that takes learner_id, chapter, stage,
and confidence_delta. It updates the learner's state in the
data/learner_state.json file and returns the new state.

The confidence_delta is added to the current confidence score
(clamped between 0.0 and 1.0). If the learner has no existing
state, create a default state first, then apply the update.

Use MOCK_LEARNER_ID = "learner-001" for now.
Spec this before building.

Review the spec. This tool writes state. The description should say so. The output should return the updated state so the agent can confirm what changed.

Approve, build, test:

uv run pytest
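A minimal sketch of the write path, assuming a flat JSON map keyed by learner_id. The clamping line is the part worth checking in whatever Claude Code produces:

```python
import json
from pathlib import Path

STATE_FILE = Path("data/learner_state.json")

def update_progress(learner_id: str, chapter: int, stage: str,
                    confidence_delta: float) -> dict:
    """Write path: apply the update, persist it, and return the new state."""
    states = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    # If the learner has no existing state, create a default state first.
    state = states.get(learner_id,
                       {"chapter": 1, "stage": "predict", "confidence": 0.5})
    state["chapter"] = chapter
    state["stage"] = stage
    # Clamp confidence to [0.0, 1.0] per the spec.
    state["confidence"] = max(0.0, min(1.0, state["confidence"] + confidence_delta))
    states[learner_id] = state
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps(states, indent=2))
    return state
```

Returning the updated state (rather than a bare "ok") is what lets the agent confirm what changed.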

The Restart Test

This is the verification that matters most. Everything before this was unit testing: does each tool work in isolation? The restart test answers a bigger question: does your data survive?

Step 1: Register a learner

Call register_learner through Claude Code or your test suite. Confirm you get a learner record back with an ID.

Step 2: Stop the server

Kill the running server process. The in-memory version from Chapter 57 would lose everything here. Your JSON version should not.

Step 3: Start the server again

uv run tutorclaw

Step 4: Call get_learner_state

Query the learner you registered before the restart. If the data comes back with the correct name, tier, and default state values, the persistence layer works.

If the data is missing, something went wrong with the JSON file write. Ask Claude Code:

I registered a learner, restarted the server, and called
get_learner_state. The learner data is missing. Check the
JSON file handling in register_learner. Is it writing to
data/learners.json on every registration?

This is the normal cycle: test, find the gap, describe the fix, verify again.

Connect to OpenClaw and Test from WhatsApp

Do not wait until all nine tools are built to connect. Connect now with three tools.

Start the server, then connect to OpenClaw the same way you did in Chapter 57:

openclaw mcp set tutorclaw --url http://localhost:8000/mcp --transport streamable-http
openclaw gateway restart

Open the dashboard. Navigate to Agents > Tools. You should see three tools: register_learner, get_learner_state, update_progress.

Now pick up your phone. Send a WhatsApp message:

Register a new learner named James

The agent calls register_learner and returns a learner ID. Then send:

Show me my learning progress

The agent calls get_learner_state and returns chapter 1, predict stage, 0.5 confidence.

Three tools, working from your phone. As you add more tools in the next three lessons, they appear automatically when you restart the server. You do not need to reconnect; openclaw gateway restart picks up new tools from the running server.

What You Built

Three tools now handle the state layer of TutorClaw:

| Tool | Purpose | Data File |
| --- | --- | --- |
| register_learner | Create a new learner record with ID, name, tier, API key | data/learners.json |
| get_learner_state | Read current chapter, stage, confidence, exchanges, weak areas | data/learner_state.json |
| update_progress | Write updated chapter, stage, confidence after a learning interaction | data/learner_state.json |

All three use JSON files on disk. No database server. No cloud storage. The data survives restarts because it lives in files, not in memory.

The mock auth pattern (MOCK_LEARNER_ID and MOCK_API_KEY) lets you test without building a real authentication system. Every tool call uses the same hardcoded learner. Real auth with unique API keys comes later in the chapter. This is a common development technique: mock the parts you are not building yet so you can focus on the parts you are.

Try With AI

Exercise 1: Audit the Tool Descriptions

Ask Claude Code to review all three state tool descriptions together:

List the tool descriptions for register_learner, get_learner_state,
and update_progress side by side. Could an agent with nine tools
tell them apart without ambiguity? Are there any overlapping phrases
that might cause the agent to pick the wrong tool?

What you are learning: When multiple tools operate on the same data (learner state), the descriptions must carve out distinct responsibilities. Overlap in descriptions means confusion in tool selection.

Exercise 2: Test the Edge Case

What happens if you call get_learner_state for a learner that does not exist?

Call get_learner_state with a learner_id that was never registered.
What does it return? Is the error message clear enough for an agent
to understand what went wrong and what to do next?

What you are learning: Tools need to fail clearly. An agent that gets a vague error cannot recover. An agent that gets "Learner not found. Register this learner first using register_learner" knows exactly what to do.
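A sketch of what that actionable failure might look like. The exact wording and return shape are assumptions; the point is that the error names the recovery tool:

```python
import json
from pathlib import Path

STATE_FILE = Path("data/learner_state.json")

def get_learner_state(learner_id: str) -> dict:
    """Return stored state, or an actionable error for unknown learners."""
    states = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    if learner_id not in states:
        # The error names the missing learner AND the tool that fixes it,
        # so the agent knows exactly what to call next.
        return {
            "error": (f"Learner '{learner_id}' not found. "
                      "Register this learner first using register_learner.")
        }
    return states[learner_id]
```

Compare this with a bare "KeyError" or an empty response: the agent reading this error can recover without human help.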

Exercise 3: Evaluate the Storage Choice

Ask Claude Code to think about the tradeoffs of JSON file storage:

We are using JSON files for learner state. What are the limitations
of this approach? At what point would we need to switch to a
database? What specific problems would we hit first?

What you are learning: JSON files work for a single-user local product. They break under concurrent writes, large datasets, or multi-server deployments. Knowing the limits of your storage choice is a product design skill, not just a technical one.


James registered a learner, stopped the server, started it again, and called get_learner_state. The data was there. Name, tier, default chapter, default stage. All of it.

"JSON files on disk," he said. "That is all it took."

"For now," Emma said. She paused, then added: "Honestly, I am not sure exactly when JSON files stop being enough. I have seen projects run on flat files longer than anyone expected, and I have seen them break earlier than planned. It depends on how many learners hit the server at once, how often you write, how big the files get." She shrugged. "The right answer is: ship the product first. Scale the storage later. You will know when JSON is not enough because the tests will start failing under load."

"So we have three tools. State is handled."

"State is handled. But your tutor has nothing to teach yet." She opened the spec from Lesson 2. "Content tools are next. Chapter text, exercises, teaching material. The tutor needs a curriculum before it can tutor."