Writing Real Specifications

Emma sets a timer. "You have everything you need: typed signatures, defaults, docstrings. Now put them together. Write three function stubs with complete specifications. Write the tests. Prompt AI for the bodies. Verify. I will be back in fifteen minutes."

She leaves. James opens a new file.


The TDG Cycle: From Stub to Passing Tests

The TDG (Test-Driven Generation) cycle has five steps:

  1. Write the stub (signature + docstring + ...)
  2. Write the tests (assert statements that define correct behavior)
  3. Prompt AI (paste the stubs and tests, ask AI to implement)
  4. Read AI's code (predict behavior before running)
  5. Verify (run pytest and pyright)

You completed a single TDG cycle in Chapter 46. Now you do it with richer signatures and multiple functions.


Step 1: Write the Stubs

Here are three function stubs for SmartNotes text processing:

def word_count(text: str) -> int:
    """Count the number of words in text."""
    ...

def format_title(title: str) -> str:
    """Strip whitespace and capitalize each word."""
    ...

def note_preview(body: str, max_length: int = 50) -> str:
    """Return the first max_length characters of body, with '...' if truncated."""
    ...

Each stub has three things: typed parameters, a return type, and a what-style docstring. Together, these form a specification precise enough for AI to generate the body.


Step 2: Write the Tests

Tests define what "correct" means. Write them before AI generates anything:

def test_word_count() -> None:
    assert word_count("hello world") == 2
    assert word_count("") == 0
    assert word_count("one") == 1

def test_format_title() -> None:
    assert format_title(" my first note ") == "My First Note"

def test_note_preview() -> None:
    assert note_preview("Short text", 50) == "Short text"
    assert note_preview("A" * 100, 50) == "A" * 50 + "..."

Output (from pytest before AI implements):

FAILED test_word_count - assert None == 2
FAILED test_format_title - assert None == 'My First Note'
FAILED test_note_preview - assert None == 'Short text'

All tests fail because the stubs have no body (just ...), so every call returns None and the first assertion in each test fails. That is expected. The tests are waiting for AI to fill in the implementations.
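To see why a stub fails its test, it can help to call one directly. The sketch below reuses the word_count stub from Step 1; the variable name result is only for illustration.

```python
def word_count(text: str) -> int:
    """Count the number of words in text."""
    ...


# The ellipsis is an expression that does nothing, so the call
# falls off the end of the function and returns None.
result = word_count("hello world")
print(result)       # None
print(result == 2)  # False, so the assertion in the test fails
```

The stub is callable; it just returns the wrong value, which is why the failure shows up as a failed assertion.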


Step 3: Prompt AI

Open Claude Code and paste your stubs and tests:

Here are three function stubs with their tests.
Implement the body of each function. Keep bodies straight-line
(no if-statements, no loops). Use only built-in string methods.

[paste stubs and tests here]

AI generates the bodies. Here is what a typical response looks like:

def word_count(text: str) -> int:
    """Count the number of words in text."""
    return len(text.split())

def format_title(title: str) -> str:
    """Strip whitespace and capitalize each word."""
    return title.strip().title()

def note_preview(body: str, max_length: int = 50) -> str:
    """Return the first max_length characters of body, with '...' if truncated."""
    return body[:max_length] + "..." * (len(body) > max_length)

Output (from calling each function with the first input from its test):

2
My First Note
Short text


Step 4: Read AI's Code

Before running the tests, read each implementation and predict:

  • word_count: .split() splits on whitespace, len() counts the pieces. For "", .split() returns [], so len is 0. Matches the test.
  • format_title: .strip() removes whitespace, .title() capitalizes each word. For " my first note ", that produces "My First Note". Matches the test.
  • note_preview: body[:max_length] takes the first N characters. "..." * (len(body) > max_length) adds "..." only when the body is longer than max_length (because True equals 1 and False equals 0 in Python, so multiplying a string by True gives the string and by False gives ""). Matches the tests.

SmartNotes Connection

These three functions are building blocks for the SmartNotes app. word_count measures note length. format_title cleans up user input. note_preview creates summaries for a notes list view. Each function does one thing.
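The note_preview prediction in Step 4 depends on True and False acting as 1 and 0. A small sketch you can run to confirm that before trusting the one-liner (the variable names are illustrative, not part of SmartNotes):

```python
# bool is a subclass of int, so True behaves as 1 and False as 0.
print(True == 1, False == 0)  # True True

dots_when_true = "..." * True
dots_when_false = "..." * False
print(repr(dots_when_true))   # '...'
print(repr(dots_when_false))  # ''

# The same expression AI used, written out for a 60-character body.
body = "A" * 60
preview = body[:50] + "..." * (len(body) > 50)
print(preview == "A" * 50 + "...")  # True
```

This trick keeps the body straight-line, at the cost of being less obvious to read than an if-statement.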


Step 5: Verify

Run pytest and pyright:

uv run pytest test_notes.py -v
uv run pyright notes.py

Output (from pytest):

test_notes.py::test_word_count PASSED
test_notes.py::test_format_title PASSED
test_notes.py::test_note_preview PASSED

3 passed

Output (from pyright):

0 errors, 0 warnings, 0 informations

All tests pass. All types check. The specification produced a correct implementation.


James Struggles

Emma comes back. James has two functions passing and is stuck on note_preview.

"My test says note_preview('A' * 50, 50) should return fifty A's followed by '...', but AI's version returns the fifty A's without the dots."

Emma looks at his test. "Your test expects dots for text that is exactly 50 characters long. But your docstring says 'if truncated.' Is 50 characters of a 100-character string truncated?"

"Yes, it's cut short."

"What about 50 characters of a 50-character string?"

James pauses. "That is the full text. No truncation."

"So your test is correct for 100 characters, but you need another test for the 50-character edge case. Write it."

He adds: assert note_preview("A" * 50, 50) == "A" * 50 (no dots). The test passes. The specification was right; he just needed to test the boundary.
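One way to pin down a boundary like this is to test one character under the limit, exactly at it, and one over. A sketch, assuming the note_preview implementation from Step 3; the test name test_note_preview_boundary is made up for this example:

```python
def note_preview(body: str, max_length: int = 50) -> str:
    """Return the first max_length characters of body, with '...' if truncated."""
    return body[:max_length] + "..." * (len(body) > max_length)


def test_note_preview_boundary() -> None:
    # One under the limit: full text, no dots.
    assert note_preview("A" * 49, 50) == "A" * 49
    # Exactly at the limit: still the full text, so no dots.
    assert note_preview("A" * 50, 50) == "A" * 50
    # One over the limit: cut to 50 characters, dots added.
    assert note_preview("A" * 51, 50) == "A" * 50 + "..."


test_note_preview_boundary()
```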


PRIMM-AI+ Practice: The Full Cycle

Predict [AI-FREE]

Look at this stub and its tests. Predict what AI will generate as the body. Write your prediction and a confidence score from 1 to 5 before checking.

def tag_string(tags: list[str]) -> str:
    """Join a list of tags into a comma-separated string."""
    ...

def test_tag_string() -> None:
    assert tag_string(["python", "ai"]) == "python, ai"
    assert tag_string(["solo"]) == "solo"
    assert tag_string([]) == ""

Check your prediction

AI will most likely generate: return ", ".join(tags)

The .join() method connects list items with the separator ", ". For ["python", "ai"], it produces "python, ai". For ["solo"], it produces "solo" (no separator needed). For [], it produces "" (joining nothing gives an empty string).

If you predicted .join(), your mental model of string methods from Chapter 48 is working. If you predicted a loop, remember that built-in methods like .join() handle iteration internally, so the body stays straight-line.
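For reference, here is the predicted body with the three test inputs run through it. This is a sketch of the likely answer; AI may word its version differently.

```python
def tag_string(tags: list[str]) -> str:
    """Join a list of tags into a comma-separated string."""
    return ", ".join(tags)


print(tag_string(["python", "ai"]))  # python, ai
print(tag_string(["solo"]))          # solo
print(repr(tag_string([])))          # ''
```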

Run

Create a file with the tag_string stub and tests. Prompt AI to implement. Run pytest.

Investigate

If AI generated a different implementation than you predicted, write one sentence explaining why AI's version also satisfies the tests. If it matches, write one sentence explaining why .join() is the natural choice.

Modify

Change the separator from ", " to " | " in the docstring. Re-prompt AI. Does it generate a different body? Update the tests to match the new separator.

Make [Mastery Gate]

Complete a full TDG cycle for these two functions without looking at any examples:

  1. Write the stubs (signature + docstring + ...)
  2. Write at least 2 test assertions per function
  3. Prompt AI for implementations
  4. Verify with pytest and pyright

Functions:

  • char_count(text: str) -> int (count characters, not words)
  • first_word(text: str) -> str (return the first word of the text)

All tests must pass and pyright must report zero errors.


SmartNotes TDG Challenge

Emma sets the real challenge before she leaves for the day. "Write SmartNotes function stubs using dict[str, str]. Write the tests. Prompt AI. Verify everything."

Here are the four stubs:

def create_note(title: str, body: str, author: str = "Anonymous") -> dict[str, str]:
    """Create a note as a dictionary with keys 'title', 'body', and 'author'."""
    ...

def note_word_count(note: dict[str, str]) -> int:
    """Count the total words in the note's body."""
    ...

def format_note_header(note: dict[str, str]) -> str:
    """Format the note's title and author as 'Title by Author'."""
    ...

def merge_metadata(base: dict[str, str], override: dict[str, str]) -> dict[str, str]:
    """Merge two metadata dicts. Values in override replace values in base."""
    ...

Write tests for each:

def test_create_note() -> None:
    note: dict[str, str] = create_note("Test", "Hello world")
    assert note["title"] == "Test"
    assert note["body"] == "Hello world"
    assert note["author"] == "Anonymous"

def test_create_note_with_author() -> None:
    note: dict[str, str] = create_note("Test", "Hello", author="James")
    assert note["author"] == "James"

def test_note_word_count() -> None:
    note: dict[str, str] = {"title": "Test", "body": "hello world", "author": "A"}
    assert note_word_count(note) == 2

def test_format_note_header() -> None:
    note: dict[str, str] = {"title": "My Note", "body": "...", "author": "James"}
    assert format_note_header(note) == "My Note by James"

def test_merge_metadata() -> None:
    base: dict[str, str] = {"author": "James", "topic": "Python"}
    override: dict[str, str] = {"topic": "Types", "status": "draft"}
    result: dict[str, str] = merge_metadata(base, override)
    assert result == {"author": "James", "topic": "Types", "status": "draft"}

Prompt AI to implement all four functions. Run pytest and pyright. If all tests pass, you have completed the Phase 2 TDG cycle.

Phase 2 deliverable

These SmartNotes functions use dict[str, str] for note data. This works, but notice the fragility: if you write note["titel"] (typo), Python crashes at runtime with a KeyError. Pyright cannot check dictionary key names. There is no way to enforce that every note has "title," "body," and "author." This limitation is deliberate. Phase 3 introduces dataclasses, which solve all three problems.
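The typo problem is easy to reproduce. In this sketch the note contents are made up, and the try/except is there only so the script keeps running and can print the error:

```python
note: dict[str, str] = {"title": "My Note", "body": "Hello world", "author": "James"}

print(note["title"])  # My Note

try:
    print(note["titel"])  # typo in the key name
except KeyError as error:
    # Pyright accepts the line above because "titel" is a valid str key.
    print(f"KeyError: {error}")  # KeyError: 'titel'
```

Nothing flags the misspelled key until the line actually runs.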


Ch 49 Syntax Card: Functions as Contracts

# Ch 49 Syntax Card: Functions as Contracts

# Indentation: 4 spaces inside the function
def greet(name: str) -> str:
    return f"Hello, {name}!"  # inside

# Multiple parameters
def add(a: int, b: int) -> int:
    return a + b

# return vs print
def calc(x: int) -> int:  # return: gives value back
    return x * 2
def show(x: int) -> None:  # print: displays, returns None
    print(x * 2)

# Default values
def read_time(words: int, wpm: int = 250) -> float:
    return words / wpm

# Keyword arguments
read_time(1500)  # positional
read_time(1500, wpm=300)  # keyword

# Docstring
def func(x: int) -> str:
    """One-line description of what this does."""
    ...

# Stub (for TDG)
def my_function(x: str) -> int: ...

Keep this card open while writing function stubs. It covers every pattern from this chapter.


Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Generate From Your SmartNotes Stubs

Here are my SmartNotes function stubs and tests. Implement all
four function bodies. Keep all bodies straight-line (no if, for,
or while). Use only built-in operations.

[paste your stubs and tests from the SmartNotes TDG Challenge]

Read AI's implementations carefully before running tests. For merge_metadata, AI will likely use {**base, **override} (dictionary unpacking). This is straight-line code: it creates a new dictionary combining both inputs, with override values replacing base values for duplicate keys.

What you're learning: You are reading AI-generated code that uses patterns you have not explicitly learned yet (like {**base, **override}). The TDG cycle lets you verify correctness through tests even when the implementation uses unfamiliar syntax.
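If {**base, **override} is new to you, a short experiment shows what it does. This sketch reuses the dictionaries from test_merge_metadata; the variable name merged is only for illustration.

```python
base: dict[str, str] = {"author": "James", "topic": "Python"}
override: dict[str, str] = {"topic": "Types", "status": "draft"}

# Unpack base first, then override. When a key appears in both,
# the later value (from override) wins.
merged: dict[str, str] = {**base, **override}

print(merged)  # {'author': 'James', 'topic': 'Types', 'status': 'draft'}
print(base)    # {'author': 'James', 'topic': 'Python'}  (unchanged)
```

The unpacking builds a new dictionary, so neither input is modified.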

Prompt 2: Evaluate Your Specification Quality

Look at my four function stubs (signature + docstring) for
SmartNotes. Rate each specification on a scale of 1-5 for
clarity. For any rated below 4, suggest a more precise
docstring that would reduce ambiguity.

Read AI's ratings. If it rates any below 4, compare its suggested docstring to yours. Does the suggested version eliminate ambiguity you did not notice? Update your stubs if the improvements make sense.

What you're learning: You are using AI as a specification reviewer, similar to a colleague reviewing your function contracts. The goal is precision: a specification so clear that there is only one reasonable implementation.


Key Takeaways

  1. The TDG cycle is: stub, test, prompt, read, verify. Write the specification (stub + docstring) and the acceptance criteria (tests) first. AI generates the implementation. You verify.

  2. Tests define correctness. Your assertions are the authority on what "correct" means. If AI's implementation passes your tests, it meets your specification as far as your tests cover it. If it fails, the tests, the specification, or the implementation need adjustment.

  3. dict[str, str] works but is fragile. Typos in dictionary keys crash at runtime, and pyright cannot catch them. This fragility motivates dataclasses in Phase 3.


Looking Ahead

You have completed Phase 2. You can write typed variables, string expressions, collection annotations, and function signatures with defaults and docstrings. You can execute the full TDG cycle: specify, test, generate, verify. In Phase 3, you replace the fragile dict[str, str] with dataclasses that give you named fields, autocomplete, and static checking by pyright before the code runs. The specification gets even more powerful.