Testing Every Path
James finishes writing tests for his categorize_note() function and leans back. Three tests, one for each branch. He feels done.
Emma glances at his screen. "What happens when word_count is exactly 1000?"
James pauses. The condition is word_count > 1000, so 1000 would fall into the elif word_count > 200 branch and return "medium". He checks his tests: he used 1500 for "long", 500 for "medium", and 100 for "short". None of them sit right on the boundary.
"Those middle-of-the-road values prove each branch works," Emma says. "But the bugs hide at the edges. What about 1001? What about 200? What about 0? You need to test where one branch hands off to the next."
You have been writing tests since Chapter 49. This lesson does not introduce new testing tools. Instead, it teaches you how to think about which tests to write. The goal is a simple discipline: list every branch, then test the boundaries between them.
Branch Listing: Your First Step
Before writing any tests, list the branches in the function. Here is categorize_note() from Lesson 1:
```python
def categorize_note(word_count: int) -> str:
    """Categorize a note by its word count."""
    if word_count > 1000:
        return "long"
    elif word_count > 200:
        return "medium"
    else:
        return "short"
```
Three branches, three possible return values. Write them out:
| Branch | Condition | Returns |
|---|---|---|
| 1 | word_count > 1000 | "long" |
| 2 | word_count > 200 (and not > 1000) | "medium" |
| 3 | everything else | "short" |
This table is your test plan. Every row needs at least one test. James already had that part covered. The next step is what he was missing.
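For comparison, James's original three tests from the opening scene, one middle-of-the-road value per branch, might have looked like this (the function is repeated so the example runs on its own):

```python
# Function under test, same as above.
def categorize_note(word_count: int) -> str:
    """Categorize a note by its word count."""
    if word_count > 1000:
        return "long"
    elif word_count > 200:
        return "medium"
    else:
        return "short"

def test_long_branch() -> None:
    """Branch 1: a comfortably long note."""
    assert categorize_note(1500) == "long"

def test_medium_branch() -> None:
    """Branch 2: a mid-range note."""
    assert categorize_note(500) == "medium"

def test_short_branch() -> None:
    """Branch 3: a short note."""
    assert categorize_note(100) == "short"
```

All three pass, and every row of the table is covered, which is exactly the point: covering each row is necessary but not sufficient.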
Boundary Testing
A boundary value is an input that sits right at the edge where one branch ends and another begins. For categorize_note(), the boundaries are at 200 and 1000:
```python
def test_categorize_boundary_1001() -> None:
    """Just above the 'long' threshold."""
    assert categorize_note(1001) == "long"

def test_categorize_boundary_1000() -> None:
    """Exactly at 1000: not long, should be medium."""
    assert categorize_note(1000) == "medium"

def test_categorize_boundary_201() -> None:
    """Just above the 'medium' threshold."""
    assert categorize_note(201) == "medium"

def test_categorize_boundary_200() -> None:
    """Exactly at 200: not medium, should be short."""
    assert categorize_note(200) == "short"

def test_categorize_boundary_zero() -> None:
    """Edge case: zero words."""
    assert categorize_note(0) == "short"
```
Each test targets a spot where the answer could go either way if the comparison operator were wrong. If someone wrote >= 1000 instead of > 1000, the test for 1000 would catch it immediately.
Emma's rule of thumb: for every > or < in a condition, test the value itself and the value plus or minus one.
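To see the rule pay off, consider a hypothetical buggy variant that uses `>=` where the original uses `>`. Mid-range values cannot tell the two versions apart; only the boundary input at 1000 exposes the difference:

```python
def categorize_note_buggy(word_count: int) -> str:
    """Hypothetical buggy version: >= 1000 instead of > 1000."""
    if word_count >= 1000:  # bug: off by one at the boundary
        return "long"
    elif word_count > 200:
        return "medium"
    else:
        return "short"

# Mid-range values behave identically in both versions:
assert categorize_note_buggy(1500) == "long"
assert categorize_note_buggy(500) == "medium"

# Only the boundary input reveals the bug: the correct function
# returns "medium" for 1000, but the buggy one returns "long".
assert categorize_note_buggy(1000) == "long"
```

A test suite that only checks 1500, 500, and 100 passes against both versions, which is exactly why James's original three tests were not enough.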
Loop Boundary Testing
Branches are not the only place bugs hide. Loops have their own boundaries: what happens with an empty list, a single item, and many items?
```python
def count_long_notes(notes: list[dict[str, int]]) -> int:
    """Count how many notes have more than 1000 words."""
    count: int = 0
    for note in notes:
        if note["word_count"] > 1000:
            count += 1
    return count
```
The test plan for this function has two dimensions: loop boundaries and branch boundaries.
```python
def test_count_empty_list() -> None:
    """No notes at all."""
    assert count_long_notes([]) == 0

def test_count_single_long() -> None:
    """One note that qualifies."""
    assert count_long_notes([{"word_count": 1500}]) == 1

def test_count_single_short() -> None:
    """One note that does not qualify."""
    assert count_long_notes([{"word_count": 300}]) == 0

def test_count_mixed() -> None:
    """Several notes, some long, some not."""
    notes: list[dict[str, int]] = [
        {"word_count": 1500},
        {"word_count": 300},
        {"word_count": 2000},
        {"word_count": 50},
    ]
    assert count_long_notes(notes) == 2
```
Empty list, single item, many items. These three cases expose off-by-one errors in loop logic and catch functions that accidentally skip the first or last element.
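To see what the single-item case catches, imagine a hypothetical buggy variant whose loop accidentally skips the first note. With a long mixed list the miscount might slip past a casual check, but the single-item test fails unmistakably:

```python
def count_long_notes_buggy(notes: list[dict[str, int]]) -> int:
    """Hypothetical buggy version: the slice skips the first note."""
    count: int = 0
    for note in notes[1:]:  # bug: should iterate over all of notes
        if note["word_count"] > 1000:
            count += 1
    return count

# The single-item boundary test exposes the bug immediately:
# one qualifying note, yet the buggy version counts zero.
assert count_long_notes_buggy([{"word_count": 1500}]) == 0
```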
Testing Nested Logic
When a function combines loops and branches, you need tests that exercise both dimensions. Here is a function that searches notes by keyword:
```python
def search_notes(
    notes: list[str], query: str
) -> list[str]:
    """Return notes that contain the query (case-insensitive)."""
    if not query:
        return []
    results: list[str] = []
    for note in notes:
        if query.lower() in note.lower():
            results.append(note)
    return results
```
This function has a guard branch (if not query) and a loop with an inner branch (if query.lower() in note.lower()). The test plan covers both:
| Test case | What it exercises |
|---|---|
| Empty query | Guard branch returns early |
| Empty notes list | Loop runs zero times |
| No matches | Loop runs, inner branch never true |
| One match | Loop runs, inner branch true once |
| Multiple matches | Loop runs, inner branch true multiple times |
| Case mismatch | Case-insensitive logic works |
Six test cases. Each one targets a specific combination of branch and loop behavior.
PRIMM-AI+ Practice: Listing Test Cases
Predict [AI-FREE]
Press Shift+Tab to enter Plan Mode before predicting.
Look at this function. Without running it, list the minimum set of test cases you would need. Write your list and a confidence score from 1 to 5 before checking.
```python
def tag_status(tags: list[str]) -> str:
    """Report the tagging status of a note."""
    if len(tags) == 0:
        return "untagged"
    elif len(tags) > 5:
        return "over-tagged"
    else:
        return "normal"
```
Check your list
Minimum test cases (branches):

1. Empty list `[]` returns `"untagged"` (branch 1)
2. List with 6+ items returns `"over-tagged"` (branch 2)
3. List with 1-5 items returns `"normal"` (branch 3)

Boundary tests:

4. Exactly 5 tags returns `"normal"` (boundary of `> 5`)
5. Exactly 6 tags returns `"over-tagged"` (just above boundary)
6. Exactly 1 tag returns `"normal"` (just above empty)
If you listed at least the three branch tests plus the boundary at 5/6, your coverage instincts are strong.
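Expressed as pytest functions, the six cases above might look like this (a sketch; test names are illustrative, and the function is repeated so the example is self-contained):

```python
def tag_status(tags: list[str]) -> str:
    """Function under test, same as above."""
    if len(tags) == 0:
        return "untagged"
    elif len(tags) > 5:
        return "over-tagged"
    else:
        return "normal"

def test_untagged() -> None:
    assert tag_status([]) == "untagged"

def test_over_tagged() -> None:
    assert tag_status(["a", "b", "c", "d", "e", "f"]) == "over-tagged"

def test_normal_midrange() -> None:
    assert tag_status(["a", "b", "c"]) == "normal"

def test_boundary_five() -> None:
    """Exactly 5 tags: > 5 is False, so still normal."""
    assert tag_status(["a", "b", "c", "d", "e"]) == "normal"

def test_boundary_six() -> None:
    """Just above the threshold."""
    assert tag_status(["a", "b", "c", "d", "e", "f"]) == "over-tagged"

def test_boundary_one() -> None:
    """Just above empty."""
    assert tag_status(["a"]) == "normal"
```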
Run
Press Shift+Tab to exit Plan Mode.
Create a file called test_tag_status.py with the function and your tests. Run uv run pytest test_tag_status.py -v to verify all tests pass.
Investigate
Change the condition from len(tags) > 5 to len(tags) >= 5. Which boundary test now fails? Trace through the logic to explain why.
If you want to go deeper, run /investigate @test_tag_status.py in Claude Code and ask how boundary values expose off-by-one errors.
Modify
Add a fourth branch: if len(tags) > 10, return "excessive". Update your test suite to cover the new branch and its boundaries. Run the tests to verify.
Make [Mastery Gate]
Write a function rate_note_quality(word_count: int, has_title: bool) -> str that returns:
"incomplete"ifhas_titleisFalse"draft"if word count is 100 or fewer"ready"otherwise
Then write 6 or more tests covering every branch plus boundary values at 100 and 101. Run uv run pytest to verify all tests pass.
Coverage Exercise
For the search_notes function shown earlier, write the full test suite. Your tests should cover all six cases from the table:
```python
def test_search_empty_query() -> None:
    """Empty query returns empty list."""
    assert search_notes(["My note"], "") == []

def test_search_empty_notes() -> None:
    """No notes to search."""
    assert search_notes([], "hello") == []

def test_search_no_match() -> None:
    """Query not found in any note."""
    assert search_notes(["Buy groceries", "Call dentist"], "python") == []

def test_search_one_match() -> None:
    """Query found in exactly one note."""
    assert search_notes(["Learn Python", "Buy groceries"], "python") == [
        "Learn Python"
    ]

def test_search_multiple_matches() -> None:
    """Query found in several notes."""
    notes: list[str] = ["Python basics", "Advanced Python", "Cooking"]
    assert search_notes(notes, "python") == ["Python basics", "Advanced Python"]

def test_search_case_insensitive() -> None:
    """Matching ignores case."""
    assert search_notes(["PYTHON notes"], "python") == ["PYTHON notes"]
```
Run uv run pytest to confirm all six pass.
Try With AI
If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.
Prompt 1: Generate a Function, Then List Its Test Cases
```
Write a Python function called filter_by_priority that takes a
list of dictionaries (each with "title" and "priority" keys)
and a minimum priority integer, and returns only the items
whose priority meets or exceeds the minimum. Use type annotations.
```
Before writing any tests, list every branch and loop boundary yourself. Then ask the AI to generate a test suite and compare its list to yours. Did it miss any boundary values?
What you're learning: You are practicing the "list branches first, then test" discipline by comparing your manual analysis to AI output.
Prompt 2: Find Missing Tests
```
Here is a function and its test suite. Are there any missing
test cases? Think about boundary values and edge cases.

def classify_age(age: int) -> str:
    if age >= 65:
        return "senior"
    elif age >= 18:
        return "adult"
    else:
        return "minor"

def test_senior():
    assert classify_age(70) == "senior"

def test_adult():
    assert classify_age(30) == "adult"

def test_minor():
    assert classify_age(10) == "minor"
```
Read the AI's response. It should suggest boundary tests at 65, 64, 18, 17, and possibly 0 or negative ages. If it does not mention boundary values, ask it specifically about the threshold between "adult" and "senior."
What you're learning: You are using the AI as a second pair of eyes to catch gaps in test coverage.
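For reference, the boundary tests a thorough review should converge on look roughly like this (test names are illustrative):

```python
def classify_age(age: int) -> str:
    """Function under test, same as in the prompt."""
    if age >= 65:
        return "senior"
    elif age >= 18:
        return "adult"
    else:
        return "minor"

def test_boundary_65() -> None:
    """Exactly at the senior threshold: >= 65 includes 65."""
    assert classify_age(65) == "senior"

def test_boundary_64() -> None:
    """Just below the senior threshold."""
    assert classify_age(64) == "adult"

def test_boundary_18() -> None:
    """Exactly at the adult threshold: >= 18 includes 18."""
    assert classify_age(18) == "adult"

def test_boundary_17() -> None:
    """Just below the adult threshold."""
    assert classify_age(17) == "minor"

def test_boundary_zero() -> None:
    """Edge case: age zero."""
    assert classify_age(0) == "minor"
```

Note the asymmetry with categorize_note: because these conditions use `>=` rather than `>`, the threshold values 65 and 18 fall inside the upper branch, not below it.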
Prompt 3: Identify Untested Paths
```
Here is a function. List every execution path through it, then
tell me which paths are not covered by the tests below.

def process_order(quantity: int, is_member: bool) -> str:
    if quantity <= 0:
        return "invalid"
    if is_member:
        if quantity >= 10:
            return "bulk member discount"
        return "member price"
    return "standard price"

def test_standard():
    assert process_order(5, False) == "standard price"

def test_member():
    assert process_order(3, True) == "member price"
```
Read the AI's response. It should identify at least two missing paths: the invalid quantity case and the bulk member discount case. Write the missing tests yourself before checking whether the AI's suggested tests match yours.
What you're learning: You are practicing the "list paths first, then check coverage" discipline on nested logic, which is where untested paths hide most often.
"List the branches first," James says. "Draw the table. Then test the boundaries -- the exact value where one branch hands off to the next. 1000 and 1001. 200 and 201. That's where the off-by-one errors hide." He taps the desk. "It's like quality control checkpoints at a distribution center. You don't just spot-check random boxes. You check the transitions: where one conveyor feeds into the next, where the sort changes from domestic to international. That's where packages get misrouted."
"And for loops?" Emma asks.
"Empty, single, multiple. Three tests minimum. They catch the bugs that only show up at the extremes."
Emma is quiet for a moment. "I'm not sure I always follow my own rule on the empty-list case. When I'm prototyping fast, I skip it, tell myself I'll add it later." She shrugs. "Later usually means after the bug report."
James laughs. "So coverage is a thinking discipline. List paths, target boundaries, test the edges. Don't wait for a tool to tell you what you missed."
"Exactly. And now that you know how to think about test cases, Lesson 7 flips the order. You write the tests first, then let the AI generate the implementation. Your tests become the specification."