From Problem Statement to Specification

If you're new to programming

A specification is a precise description of what you want to build, written in Python (function signatures with types and docstrings). In this chapter, you drive the full TDG cycle independently. If a step feels unclear, revisit the chapter that taught it: Ch 46 for TDG basics, Ch 49 for function signatures, Ch 52 for test suites.

If you've coded before

This chapter removes all scaffolding. You translate English requirements into typed Python stubs, write comprehensive test suites, and prompt AI to implement. The method is identical to TDD, except AI writes the implementation instead of you.

Emma slides her laptop toward James. On the screen is a single sentence:

SmartNotes needs search. Users want to find notes by keyword and optionally filter by tag.

"Specify it," she says. "Don't write code. Don't open Claude Code. Write the function signature, the types, and the docstring. That's your spec."

James hesitates. In every TDG exercise since Chapter 46, Emma gave him the function name, told him the parameters, showed him the return type. Now there is nothing but a problem statement and a blinking cursor.

"Where do I start?" he asks.

"With three questions," Emma says. "What goes in? What comes out? What types? Answer those and you have a function signature. Add the edge cases and you have a docstring. That's the whole spec."

Reading New Code? Use PRIMM-AI+

When you encounter new Python syntax in this chapter, use the PRIMM-AI+ method from Chapter 42: Predict what the code does before running it [AI-FREE]. Rate your confidence (1-5). Run it to check your prediction. Investigate any surprises.


Three Questions That Build a Signature

You are doing exactly what James is doing. You have a problem statement and no guidance. Start with these three questions:

Question 1: What goes in?

Read the problem statement again: "find notes by keyword and optionally filter by tag." Three inputs emerge:

Input                   Why                                       Type
A collection of notes   You need something to search through      list[Note]
A keyword               The search term                           str
A tag (optional)        Narrows results, but not always needed    str | None

Question 2: What comes out?

"Return matching notes." Not strings. Not dictionaries. Notes. The return type is list[Note].

Question 3: What name fits?

The function searches notes. search_notes says exactly what it does. Not find, not query, not get_results. Name it after the action and the thing it acts on.

Those three answers produce a function signature:

def search_notes(
    notes: list[Note],
    keyword: str,
    tag: str | None = None,
) -> list[Note]:
    ...

Output:

# No output -- this is a stub. The ... (ellipsis) means
# "not implemented yet." It passes pyright but does nothing.

Run uv run pyright smartnotes_search.py on a file containing this stub. It passes with zero errors. The types are valid, the signature is complete, and no implementation exists yet. That is the point: you specified the interface before writing a single line of logic.


The Docstring IS the Specification

A function signature tells you what goes in and out. A docstring tells you how it should behave. This is where most of the design decisions live.

Here is a bad docstring:

def search_notes(
    notes: list[Note],
    keyword: str,
    tag: str | None = None,
) -> list[Note]:
    """Search notes."""
    ...

"Search notes" restates the function name. It tells the AI nothing about case sensitivity, ordering, empty inputs, or what "match" means. If you hand this to Claude Code and say "implement this," it will make those decisions for you. Some will be wrong.

Here is a good docstring:

def search_notes(
    notes: list[Note],
    keyword: str,
    tag: str | None = None,
) -> list[Note]:
    """Search notes by keyword with optional tag filter.

    - Matches keyword against title and body (case-insensitive)
    - If tag is provided, only returns notes that have that tag
    - Title matches come before body-only matches
    - Empty keyword returns all notes (filtered by tag if provided)
    - Empty notes list returns an empty list
    """
    ...

Every bullet is a design decision. "Case-insensitive" is not obvious from the problem statement. You decided that. "Title matches come before body-only matches" is a sorting rule that the original requirement did not specify. You decided that too. The docstring makes these decisions explicit so the AI (or another developer) does not have to guess.

Docstring element                          Design decision it captures
"case-insensitive"                         How matching works
"title and body"                           Where to search
"Title matches come before body-only"      Sort order
"Empty keyword returns all notes"          Edge case: what does "" mean?
"Empty notes list returns an empty list"   Edge case: no data

Choosing the Right Data Model

The Note dataclass already exists from Chapter 50:

from dataclasses import dataclass, field

@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)

When should you use @dataclass versus other options?

Option              Use when                                                SmartNotes decision
@dataclass          You need a simple container with typed fields           Already using it for Note
Pydantic BaseModel  You need validation (e.g., "word_count must be >= 0")   Not needed here
dict                You are passing throwaway data between two lines        Avoid for domain objects

The search function takes list[Note] and returns list[Note]. No new data model needed. Reuse what exists.
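To make the validation row concrete: a plain dataclass accepts any value that matches the type, even a nonsensical one. If SmartNotes ever needed the "word_count must be >= 0" rule, the dataclass-native option is a __post_init__ check (Pydantic would express the same rule declaratively). A sketch, with ValidatedNote as a hypothetical name:

```python
from dataclasses import dataclass


@dataclass
class PlainNote:
    title: str
    word_count: int


# A plain dataclass accepts an invalid value without complaint
broken = PlainNote(title="Oops", word_count=-5)


@dataclass
class ValidatedNote:
    title: str
    word_count: int

    def __post_init__(self) -> None:
        # Runs after the generated __init__ assigns the fields
        if self.word_count < 0:
            raise ValueError("word_count must be >= 0")
```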


The Complete Stub File

Here is the full smartnotes_search.py that you would write before touching Claude Code:

"""SmartNotes search feature -- specification only (no implementation)."""

from dataclasses import dataclass, field


@dataclass
class Note:
title: str
body: str
word_count: int
author: str = "Anonymous"
is_draft: bool = True
tags: list[str] = field(default_factory=list)


def search_notes(
notes: list[Note],
keyword: str,
tag: str | None = None,
) -> list[Note]:
"""Search notes by keyword with optional tag filter.

- Matches keyword against title and body (case-insensitive)
- If tag is provided, only returns notes that have that tag
- Title matches come before body-only matches
- Empty keyword returns all notes (filtered by tag if provided)
- Empty notes list returns an empty list
"""
...

Output:

$ uv run pyright smartnotes_search.py
0 errors, 0 warnings, 0 informations

The file type-checks cleanly. Every parameter has a type. The return type is explicit. The docstring captures five design decisions. And the body is a single ... because you are not writing logic yet.


PRIMM-AI+ Practice: Predict the Signature

Predict [AI-FREE]

Press Shift+Tab to enter Plan Mode.

Here is a different problem statement:

Calculate the total word count across all notes, excluding drafts.

Before writing anything, predict:

  1. The function name
  2. The parameter list with types
  3. The return type
  4. Two edge cases the docstring should mention

Write your prediction on paper or in Plan Mode. Rate your confidence from 1 to 5.

Reference signature

def total_word_count(notes: list[Note]) -> int:
    """Return the sum of word_count for all non-draft notes.

    - Skips notes where is_draft is True
    - Returns 0 if notes is empty or all notes are drafts
    """
    ...

Key decisions: The function takes list[Note] (not individual notes). The return type is int (not float, because word counts are whole numbers). "Excluding drafts" means filtering on is_draft, not deleting notes. The edge cases handle empty input and all-drafts input.
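For reference, the body the AI would eventually write is typically a one-liner. A sketch of one possibility, showing why both edge cases fall out of sum() for free:

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    is_draft: bool = True


def total_word_count(notes: list[Note]) -> int:
    """Return the sum of word_count for all non-draft notes."""
    # sum() of an empty generator is 0, so an empty list and an
    # all-drafts list both return 0 without special-case code
    return sum(n.word_count for n in notes if not n.is_draft)
```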

Run

Press Shift+Tab to exit Plan Mode.

Create a file called smartnotes_wordcount.py with the reference signature above. Run uv run pyright smartnotes_wordcount.py and confirm it passes with zero errors.

Investigate

In Claude Code, ask:

/investigate @smartnotes_wordcount.py
Why does total_word_count return int instead of float?
Word counts are always whole numbers, but is there a case
where float would be safer?

Compare the AI's answer to your own reasoning. Notice whether it raises a point you had not considered.

Modify

Change the problem: "Calculate the total word count across all notes, excluding drafts, grouped by author."

This changes the return type. Instead of a single int, you now need a mapping from author names to counts. Update the signature:

  • What should the return type be? (dict[str, int])
  • What edge case does grouping add? (author with zero non-draft notes)

Write the updated stub. Run uv run pyright to confirm it passes.
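One possible updated stub, in the same style as the reference signature above (the name word_count_by_author is a suggestion, not the only valid choice):

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True


def word_count_by_author(notes: list[Note]) -> dict[str, int]:
    """Return total word_count of non-draft notes, grouped by author.

    - Skips notes where is_draft is True
    - Authors with only draft notes do not appear in the result
    - Empty notes list returns an empty dict
    """
    ...  # stub only -- no implementation yet
```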

Make [Mastery Gate]

Write stubs for TWO new SmartNotes functions from these problem statements alone, with no guidance:

Problem 1: "Find all tags used across all notes, sorted alphabetically, with no duplicates."

Problem 2: "Return the note with the highest word count. If there is a tie, return the one whose title comes first alphabetically. If the list is empty, return None."

For each stub:

  • Name the function
  • Choose the parameter types
  • Choose the return type (hint: Problem 2 needs Note | None)
  • Write a docstring covering at least two edge cases
  • Run uv run pyright to verify the types pass

Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Review My Specification

Write your stub for Problem 1 or Problem 2 from the Mastery Gate above. Then paste it into Claude Code:

Review this function specification. Does the docstring cover
all edge cases? Are the types correct? Is there anything a
developer would need to ask me before implementing this?

[paste your stub here]

Read the AI's feedback. If it identifies a missing edge case, add it to your docstring. If it suggests a type change, evaluate whether the change is justified.

What you're learning: Specification review is a skill. The AI acts as a reviewer who catches gaps in your design. You are practicing the same review loop that professional teams use before implementation begins.

Prompt 2: Compare Two Specifications

Here are two specifications for the same problem: "Find the
note with the highest word count." Which specification is
better and why?

# Spec A
def find_max(notes):
    """Find max note."""
    ...

# Spec B
def longest_note(notes: list[Note]) -> Note | None:
    """Return the note with the highest word_count.

    - Returns None if notes is empty
    - Ties broken by title (alphabetical, ascending)
    """
    ...

The AI should clearly prefer Spec B. Read its reasoning and check whether it matches your own assessment. If it raises a point you missed, note it.

What you're learning: You are calibrating your sense of specification quality by evaluating two extremes. This builds judgment for when your own specs fall somewhere in between.

Prompt 3: From Vague to Precise

Give Claude Code a deliberately vague problem statement:

I need a function that does something with note statistics.
Ask me clarifying questions before writing the specification.

Answer the AI's questions one at a time. After it has enough information, ask it to produce a stub. Compare its stub to what you would have written. Are the types the same? Is the docstring more or less precise than yours?

What you're learning: You are experiencing the specification process from the other side. When you give vague instructions, the AI must ask the same three questions (What goes in? What comes out? What types?) that you learned to ask yourself. Watching it ask reinforces the habit.


James stares at the finished stub file. No implementation, no tests, no generated code. Just a function signature and a docstring.

"In the warehouse," he says, "the best purchase orders were one page. Item, quantity, delivery date, acceptance criteria. The supplier figured out how to source it. This feels the same: the function signature is my purchase order. I'm telling the AI exactly what I want delivered."

Emma nods. "And like a good purchase order, the spec protects you. If the supplier delivers the wrong thing, you point at the spec: 'I asked for case-insensitive matching. You gave me case-sensitive. Fix it.' Without the spec, you're arguing opinions."

She pauses. "I learned that the hard way. Last year I skipped the spec and let Claude Code generate from a vague prompt. I said, 'search notes by keyword.' It built a function that searched file names instead of note contents. Three hours wasted rewriting because I didn't specify the inputs and outputs first."

"So the spec saves time," James says.

"The spec saves arguments," Emma corrects. "You have the specification. Now you need to define 'correct' before AI writes anything. That means writing the tests. Not after the code, not alongside the code. Before."