SmartNotes Control Flow TDG

James has all the pieces now: branches from Lesson 1, for loops from Lessons 2 and 3, while/break from Lesson 4, nesting from Lesson 5, and a systematic testing strategy from Lesson 6. Emma pulls up a timer on her phone.

"Fifteen minutes," she says. "Write the stubs and tests for four SmartNotes functions. I'll be back."

James stares at the blank file. He has never built a test suite from scratch for multiple functions at once. But he knows the pattern: stub first, tests second, AI third.

The Four Function Stubs

Create a file called smartnotes_flow.py. These four functions cover every control flow pattern from this chapter. Each stub has a docstring that specifies the exact behavior your tests will verify:

import math  # needed for math.ceil in reading_time_report

def categorize_note(word_count: int) -> str:
    """Return 'short', 'medium', or 'long' based on word count.

    short: 200 or fewer
    medium: 201-1000
    long: above 1000
    """
    ...


def count_tags(notes_tags: list[list[str]]) -> dict[str, int]:
    """Count how many times each tag appears across all notes.

    Each inner list contains tags for one note.
    Return a dictionary mapping each tag to its total count.
    """
    ...


def filter_notes_by_tag(
    notes: list[dict[str, str]],
    tag: str,
) -> list[dict[str, str]]:
    """Return only notes whose 'tags' field contains the given tag.

    Each note is a dict with keys like 'title', 'body', 'tags'.
    The 'tags' field is a comma-separated string of tag names.
    """
    ...


def reading_time_report(notes: list[dict[str, str]]) -> str:
    """Generate a formatted report of title and reading time.

    Reading time = word count of 'body' divided by 200, rounded up.
    Each line: 'Title: Xmin'
    Return all lines joined with newlines.
    """
    ...

Notice that every note is still a dict[str, str]. The tags field is a comma-separated string, not a list. The body field holds the text as a plain string. You will feel the friction of checking string keys and splitting comma-separated values by hand.
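That friction can be seen in a minimal sketch. The sample note and the misspelled key below are made up for illustration; they are not part of smartnotes_flow.py.

```python
# Illustrative note in the same dict[str, str] shape the stubs expect.
note: dict[str, str] = {
    "title": "Note A",
    "body": "hello world",
    "tags": "python,testing",
}

# The tags field is one string, so it has to be split by hand.
tags: list[str] = note["tags"].split(",")
print(tags)        # ['python', 'testing']

# Word count also means splitting a plain string.
word_count: int = len(note["body"].split())
print(word_count)  # 2

# A misspelled key is only discovered when the line runs.
try:
    note["tag"]
except KeyError as error:
    print("KeyError:", error)  # KeyError: 'tag'
```

Nothing in the type annotation dict[str, str] tells you which keys exist or that tags needs splitting; that knowledge lives only in the docstrings and in your head.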

Why dict[str, str] still?

These functions would be cleaner with a proper data structure that enforces field names at the type level. You will build exactly that in Chapter 51, where dataclasses replace these fragile dictionaries. For now, the pain is the point: it shows you why better tools exist.

The Test Suite

Create a file called test_smartnotes_flow.py. Write tests that cover every branch, every loop edge case, and every boundary value. Here is a complete example:

from smartnotes_flow import (
    categorize_note,
    count_tags,
    filter_notes_by_tag,
    reading_time_report,
)


# --- categorize_note tests ---

def test_categorize_short_zero() -> None:
    assert categorize_note(0) == "short"


def test_categorize_short_boundary() -> None:
    assert categorize_note(200) == "short"


def test_categorize_medium_lower() -> None:
    assert categorize_note(201) == "medium"


def test_categorize_medium_upper() -> None:
    assert categorize_note(1000) == "medium"


def test_categorize_long_boundary() -> None:
    assert categorize_note(1001) == "long"


def test_categorize_long_high() -> None:
    assert categorize_note(5000) == "long"


# --- count_tags tests ---

def test_count_tags_empty() -> None:
    assert count_tags([]) == {}


def test_count_tags_single_note() -> None:
    result: dict[str, int] = count_tags([["python", "testing"]])
    assert result == {"python": 1, "testing": 1}


def test_count_tags_overlapping() -> None:
    result: dict[str, int] = count_tags([
        ["python", "testing"],
        ["python", "loops"],
        ["testing", "loops", "python"],
    ])
    assert result["python"] == 3
    assert result["testing"] == 2
    assert result["loops"] == 2


# --- filter_notes_by_tag tests ---

def test_filter_no_match() -> None:
    notes: list[dict[str, str]] = [
        {"title": "Note A", "body": "hello", "tags": "python,testing"},
    ]
    assert filter_notes_by_tag(notes, "finance") == []


def test_filter_one_match() -> None:
    notes: list[dict[str, str]] = [
        {"title": "Note A", "body": "hello", "tags": "python,testing"},
        {"title": "Note B", "body": "world", "tags": "finance"},
    ]
    result: list[dict[str, str]] = filter_notes_by_tag(notes, "finance")
    assert len(result) == 1
    assert result[0]["title"] == "Note B"


def test_filter_multiple_matches() -> None:
    notes: list[dict[str, str]] = [
        {"title": "Note A", "body": "hello", "tags": "python,testing"},
        {"title": "Note B", "body": "world", "tags": "python,finance"},
        {"title": "Note C", "body": "foo", "tags": "testing"},
    ]
    result: list[dict[str, str]] = filter_notes_by_tag(notes, "python")
    assert len(result) == 2


def test_filter_empty_list() -> None:
    assert filter_notes_by_tag([], "python") == []


# --- reading_time_report tests ---

def test_report_empty() -> None:
    assert reading_time_report([]) == ""


def test_report_single_note() -> None:
    notes: list[dict[str, str]] = [
        {"title": "Quick Thought", "body": " ".join(["word"] * 150)},
    ]
    result: str = reading_time_report(notes)
    assert result == "Quick Thought: 1min"


def test_report_multiple_notes() -> None:
    notes: list[dict[str, str]] = [
        {"title": "Short", "body": " ".join(["word"] * 100)},
        {"title": "Long", "body": " ".join(["word"] * 800)},
    ]
    result: str = reading_time_report(notes)
    lines: list[str] = result.split("\n")
    assert len(lines) == 2
    assert "Short: 1min" in lines[0]
    assert "Long: 4min" in lines[1]

That is 16 tests across four functions. Each test targets a specific path: a branch boundary, an empty collection, overlapping data, or a formatting rule. When Emma comes back, James has most of them written, and none of them pass yet. Take test_report_single_note: James worked out math.ceil(150 / 200) in his head and got 1, but the stub body is only ..., so the function returns None and the assertion fails. That is expected. The stubs are placeholders; the AI fills them in next.
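To see why a stub cannot satisfy a test, here is a small sketch of what Python does with a body that contains only the ... placeholder. The stub is copied from above with a shortened docstring.

```python
def categorize_note(word_count: int) -> str:
    """Stub: the body is only the Ellipsis placeholder."""
    ...


# The ... expression is evaluated and discarded, so the function
# falls off the end and implicitly returns None.
result = categorize_note(0)
print(result)             # None
print(result == "short")  # False, so the test's assertion would fail
```

The annotation says -> str, but Python does not check annotations at runtime, so the stub runs without error and simply returns None.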

The TDG Workflow

Follow these five steps every time you use Test-Driven Generation:

Step 1: Write stubs. Define the function signature, type annotations, and docstring. The body is just the ... placeholder.

Step 2: Write tests. Cover every branch, every loop edge case, and every boundary. Your tests are the specification.

Step 3: Prompt AI. Give the AI your stubs and tests. Ask it to write implementations that pass all tests. In Claude Code, type /tdg to start a TDG cycle, or use this prompt:

Here are my function stubs and test file. Write implementations
for all four functions in smartnotes_flow.py so that every test
in test_smartnotes_flow.py passes. Use only if/elif/else,
for loops, and while loops. No list comprehensions.
Do not modify the test file.

Step 4: Run pytest. Execute uv run pytest test_smartnotes_flow.py -v and check the results. Every test should pass.

Step 5: Fix and iterate. If tests fail, read the failure message. Decide whether the bug is in the AI's implementation or in your test. Fix the right one and run again.
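As a rough idea of what Step 3 might produce, here is one possible shape for two of the four functions, written under the prompt's constraints (if/elif/else and for loops only, no comprehensions). This is a sketch, not the output of any particular AI run; your generated code may look different and still pass.

```python
def categorize_note(word_count: int) -> str:
    """Return 'short', 'medium', or 'long' based on word count."""
    if word_count <= 200:
        return "short"
    elif word_count <= 1000:
        return "medium"
    else:
        return "long"


def count_tags(notes_tags: list[list[str]]) -> dict[str, int]:
    """Count how many times each tag appears across all notes."""
    counts: dict[str, int] = {}
    for tags in notes_tags:          # outer loop: one list per note
        for tag in tags:             # inner loop: each tag in that note
            if tag in counts:
                counts[tag] += 1
            else:
                counts[tag] = 1      # first sighting, avoids a KeyError
    return counts


print(categorize_note(201))                                # medium
print(count_tags([["python", "testing"], ["python"]]))     # {'python': 2, 'testing': 1}
```

Whatever the AI returns, the check is the same: run pytest in Step 4 and let the tests decide.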

What to Watch For

When the AI generates implementations, check these common issues:

Off-by-one in categorize_note. Does it use <= or < at the boundaries? Your boundary tests at 200, 201, 1000, and 1001 will catch this.
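Missing tags in filter_notes_by_tag. The AI might forget to split the comma-separated tags string, or it might match partial tag names (e.g., "test" matching "testing"). The tests above catch an implementation that compares the whole tags string to the tag, but none of them filters on a partial name, so a plain substring check would still pass. To close that gap, add a test that filters for "test" against a note tagged "testing" and expects an empty list.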
Rounding in reading_time_report. The docstring says "rounded up." If the AI uses int() instead of math.ceil(), your 150-word note test will expose the difference.
Key errors from dict[str, str]. If the AI assumes a key exists that your test data does not include, you will get a KeyError at runtime. This is the fragility of unstructured dictionaries.
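The partial-match pitfall from the list above can be sketched with two small helpers. Both function names are hypothetical and exist only to contrast the two approaches.

```python
def matches_substring(tags_field: str, tag: str) -> bool:
    # Pitfall: treats the whole comma-separated field as one string,
    # so any substring counts as a match.
    return tag in tags_field


def matches_exact(tags_field: str, tag: str) -> bool:
    # Splits first, then compares whole tag names one by one.
    for candidate in tags_field.split(","):
        if candidate == tag:
            return True
    return False


print(matches_substring("python,testing", "test"))  # True (false positive)
print(matches_exact("python,testing", "test"))      # False
print(matches_exact("python,testing", "testing"))   # True
```

Both helpers agree on full tag names such as "python" or "testing", which is why only a test using a partial name can tell them apart.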

Tracing a boundary bug

When Emma returns, James has three functions passing. The fourth, reading_time_report, fails on the single-note test. The AI wrote int(word_count / 200) instead of math.ceil(word_count / 200). James spots it because his test expects "Quick Thought: 1min" for 150 words: int(150 / 200) returns 0, but math.ceil(150 / 200) returns 1. The boundary test caught the bug before it reached real users.
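The arithmetic behind this bug is easy to check directly. A minimal sketch; the constant name and the list of word counts are chosen here for illustration.

```python
import math

WORDS_PER_MINUTE: int = 200  # divisor from the reading_time_report docstring

for word_count in [0, 150, 200, 201, 800]:
    truncated: int = int(word_count / WORDS_PER_MINUTE)         # rounds toward zero
    rounded_up: int = math.ceil(word_count / WORDS_PER_MINUTE)  # rounds up
    print(f"{word_count} words: int -> {truncated}, ceil -> {rounded_up}")

# 0 words: int -> 0, ceil -> 0
# 150 words: int -> 0, ceil -> 1
# 200 words: int -> 1, ceil -> 1
# 201 words: int -> 1, ceil -> 2
# 800 words: int -> 4, ceil -> 4
```

The two versions agree whenever the word count is an exact multiple of 200, so a test with 800 words alone would not have caught the bug. The 150-word test is what separates them.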

Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Generate Implementations

Copy your smartnotes_flow.py stubs and test_smartnotes_flow.py into the conversation, then send:

Write implementations for all four functions so every test passes.
Use only if/elif/else, for loops, and while loops.
No list comprehensions. Keep all type annotations.

Run uv run pytest test_smartnotes_flow.py -v. If any test fails, paste the failure output back and ask the AI to fix only the failing function.

Prompt 2: Review for Dict Fragility

After all tests pass, send:

Look at filter_notes_by_tag and reading_time_report.
What happens if a note dict is missing the 'tags' or 'body' key?
How would a dataclass prevent this problem?

What you are learning: You are seeing why dict[str, str] is fragile. The AI will explain that a dataclass requires its fields at construction time, so a note with a missing field cannot be created in the first place. That is exactly what Chapter 51 teaches.
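As a preview of the kind of answer to expect, here is a minimal sketch of a dataclass version of a note. The class name Note and its field layout are assumptions made for this example; Chapter 51 may design it differently.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    # Hypothetical layout: title and body are required, tags is optional.
    title: str
    body: str
    tags: list[str] = field(default_factory=list)


note = Note(title="Note A", body="hello", tags=["python", "testing"])
print(note.tags)  # ['python', 'testing']

# Leaving out a required field fails immediately, at construction time.
try:
    Note(title="Note B")  # body is missing
except TypeError as error:
    print("TypeError:", error)
```

Note that a dataclass checks which fields are present, not the types of their values at runtime; the annotations are there for you and for a type checker.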

Prompt 3: Write Tests for Edge Cases the AI Might Miss

Look at the four functions in smartnotes_flow.py. For each one,
suggest one additional edge case test that is NOT already in my
test file. Focus on inputs that could cause unexpected behavior:
empty strings, very large numbers, tags with extra whitespace,
or notes with no body text.

Review the AI's suggestions before adding them. For each suggested test, decide whether it tests a genuinely different path or just duplicates existing coverage. Add only the tests that reveal new behavior, then run uv run pytest to confirm everything passes.
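Two of the edge cases named in the prompt, notes with no body text and tags with extra whitespace, come down to how str.split behaves. The sketch below uses a hypothetical helper, reading_minutes, to show it; it is not one of the four SmartNotes functions.

```python
import math


def reading_minutes(body: str) -> int:
    # Hypothetical helper: split() with no argument drops empty pieces,
    # so an empty body counts as zero words.
    return math.ceil(len(body.split()) / 200)


print(reading_minutes(""))              # 0
print(len("".split(" ")))               # 1, because "".split(" ") is ['']
print(" python, testing ".split(","))   # [' python', ' testing ']
```

An implementation that uses split(" ") would count one word in an empty body, and one that splits tags without stripping would never match "python" against " python". Whether those cases deserve tests depends on what you decide the docstrings should promise.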

What you are learning: You are building the habit of looking beyond the obvious test cases. The AI is good at generating edge cases, but you decide which ones are worth keeping.

"Stubs, tests, prompt, pytest, fix," James says. "Five steps. The tests are the spec. The AI writes code to pass them, not the other way around." He looks at his screen. "The boundary tests at 200 and 201 caught an off-by-one in the AI's first attempt. And the single-note test for count_tags caught a KeyError I never would have thought to check for."

"What about the dict fragility?" Emma asks.

James grimaces. "That was the worst part. I had to remember that tags is a comma-separated string, not a list. I had to hope the body key exists. One misspelled key and the whole function crashes at runtime with no warning." He pauses. "In a warehouse, every bin has a label with a barcode. You scan it, the system confirms the contents. These dicts are like bins with handwritten sticky notes. The label might say 'tags' or 'tag' or 'Tags,' and you don't find out which one until someone reaches in and comes up empty."

Emma nods. "That frustration is the point. You just described exactly why Chapter 51 exists. Dataclasses replace those sticky notes with declared field names, type annotations, and defaults. If a required field is missing, Python tells you at construction time, not when your function blows up in production."

"Good. Because I'm tired of hoping my keys are spelled right."
