
Testing Every Path

James finishes writing tests for his categorize_note() function and leans back. Three tests, one for each branch. He feels done.

Emma glances at his screen. "What happens when word_count is exactly 1000?"

James pauses. The condition is word_count > 1000, so 1000 would fall into the elif word_count > 200 branch and return "medium". He checks his tests: he used 1500 for "long", 500 for "medium", and 100 for "short". None of them sit right on the boundary.

"Those middle-of-the-road values prove each branch works," Emma says. "But the bugs hide at the edges. What about 1001? What about 200? What about 0? You need to test where one branch hands off to the next."

If you're new to testing

You have been writing tests since Chapter 49. This lesson does not introduce new testing tools. Instead, it teaches you how to think about which tests to write. The goal is a simple discipline: list every branch, then test the boundaries between them.


Branch Listing: Your First Step

Before writing any tests, list the branches in the function. Here is categorize_note() from Lesson 1:

def categorize_note(word_count: int) -> str:
    """Categorize a note by its word count."""
    if word_count > 1000:
        return "long"
    elif word_count > 200:
        return "medium"
    else:
        return "short"

Three branches, three possible return values. Write them out:

Branch | Condition | Returns
------ | --------- | -------
1 | word_count > 1000 | "long"
2 | word_count > 200 (and not > 1000) | "medium"
3 | everything else | "short"

This table is your test plan. Every row needs at least one test. James already had that part covered. The next step is what he was missing.


Boundary Testing

A boundary value is an input that sits right at the edge where one branch ends and another begins. For categorize_note(), the boundaries are at 200 and 1000:

def test_categorize_boundary_1001() -> None:
    """Just above the 'long' threshold."""
    assert categorize_note(1001) == "long"

def test_categorize_boundary_1000() -> None:
    """Exactly at 1000: not long, should be medium."""
    assert categorize_note(1000) == "medium"

def test_categorize_boundary_201() -> None:
    """Just above the 'medium' threshold."""
    assert categorize_note(201) == "medium"

def test_categorize_boundary_200() -> None:
    """Exactly at 200: not medium, should be short."""
    assert categorize_note(200) == "short"

def test_categorize_boundary_zero() -> None:
    """Edge case: zero words."""
    assert categorize_note(0) == "short"

Each test targets a spot where the answer could go either way if the comparison operator were wrong. If someone wrote >= 1000 instead of > 1000, the test for 1000 would catch it immediately.

Emma's rule of thumb: for every > or < in a condition, test the value itself and the value plus or minus one.
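Emma's rule can be applied mechanically as a table of boundary cases. Here is one compact sketch (categorize_note is repeated so the snippet runs on its own; pytest also offers a parametrize feature for this pattern, not covered in this lesson):

```python
def categorize_note(word_count: int) -> str:
    """Repeated from above so this snippet runs on its own."""
    if word_count > 1000:
        return "long"
    elif word_count > 200:
        return "medium"
    else:
        return "short"

# Emma's rule applied mechanically: for each `>` threshold,
# test the value itself and the value one above it.
BOUNDARY_CASES: list[tuple[int, str]] = [
    (1001, "long"),    # just above the > 1000 threshold
    (1000, "medium"),  # exactly at the threshold
    (201, "medium"),   # just above the > 200 threshold
    (200, "short"),    # exactly at the threshold
]

def test_categorize_boundaries() -> None:
    for word_count, expected in BOUNDARY_CASES:
        assert categorize_note(word_count) == expected
```

Adding a new threshold later means adding two rows to the table, not two new test functions.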


Loop Boundary Testing

Branches are not the only place bugs hide. Loops have their own boundaries: what happens with an empty list, a single item, and many items?

def count_long_notes(notes: list[dict[str, int]]) -> int:
    """Count how many notes have more than 1000 words."""
    count: int = 0
    for note in notes:
        if note["word_count"] > 1000:
            count += 1
    return count

The test plan for this function has two dimensions: loop boundaries and branch boundaries.

def test_count_empty_list() -> None:
    """No notes at all."""
    assert count_long_notes([]) == 0

def test_count_single_long() -> None:
    """One note that qualifies."""
    assert count_long_notes([{"word_count": 1500}]) == 1

def test_count_single_short() -> None:
    """One note that does not qualify."""
    assert count_long_notes([{"word_count": 300}]) == 0

def test_count_mixed() -> None:
    """Several notes, some long, some not."""
    notes: list[dict[str, int]] = [
        {"word_count": 1500},
        {"word_count": 300},
        {"word_count": 2000},
        {"word_count": 50},
    ]
    assert count_long_notes(notes) == 2

Empty list, single item, many items. These three cases expose off-by-one errors in loop logic and catch functions that accidentally skip the first or last element.
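To see why the single-item case earns its keep, consider a hypothetical buggy variant (the slice is a deliberate bug, introduced here for illustration):

```python
def count_long_notes_buggy(notes: list[dict[str, int]]) -> int:
    """Like count_long_notes, but with a deliberate off-by-one bug."""
    count: int = 0
    for note in notes[1:]:  # bug: the slice skips the first note
        if note["word_count"] > 1000:
            count += 1
    return count
```

Against this version, the single-item test fails immediately: the only note is skipped, so the function returns 0 instead of 1. A many-item test might pass by luck if the first note happened to be short, which is exactly why the single-item case belongs in the plan.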


Testing Nested Logic

When a function combines loops and branches, you need tests that exercise both dimensions. Here is a function that searches notes by keyword:

def search_notes(
    notes: list[str], query: str
) -> list[str]:
    """Return notes that contain the query (case-insensitive)."""
    if not query:
        return []
    results: list[str] = []
    for note in notes:
        if query.lower() in note.lower():
            results.append(note)
    return results

This function has a guard branch (if not query) and a loop with an inner branch (if query.lower() in note.lower()). The test plan covers both:

Test case | What it exercises
--------- | -----------------
Empty query | Guard branch returns early
Empty notes list | Loop runs zero times
No matches | Loop runs, inner branch never true
One match | Loop runs, inner branch true once
Multiple matches | Loop runs, inner branch true multiple times
Case mismatch | Case-insensitive logic works

Six test cases. Each one targets a specific combination of branch and loop behavior.


PRIMM-AI+ Practice: Listing Test Cases

Predict [AI-FREE]

Look at this function. Without running it, list the minimum set of test cases you would need. Write your list and a confidence score from 1 to 5 before checking.

def tag_status(tags: list[str]) -> str:
    """Report the tagging status of a note."""
    if len(tags) == 0:
        return "untagged"
    elif len(tags) > 5:
        return "over-tagged"
    else:
        return "normal"
Check your list

Minimum test cases (branches):

  1. Empty list [] returns "untagged" (branch 1)
  2. List with 6+ items returns "over-tagged" (branch 2)
  3. List with 1-5 items returns "normal" (branch 3)

Boundary tests:

  4. Exactly 5 tags returns "normal" (boundary of > 5)
  5. Exactly 6 tags returns "over-tagged" (just above boundary)
  6. Exactly 1 tag returns "normal" (just above empty)

If you listed at least the three branch tests plus the boundary at 5/6, your coverage instincts are strong.
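If you want a starting point for the Run step, one way to express those six cases as tests looks like this (tag_status is repeated so the file runs on its own; the test names are one possible choice, not the only one):

```python
def tag_status(tags: list[str]) -> str:
    """Report the tagging status of a note."""
    if len(tags) == 0:
        return "untagged"
    elif len(tags) > 5:
        return "over-tagged"
    else:
        return "normal"

def test_untagged() -> None:
    """Branch 1: empty list."""
    assert tag_status([]) == "untagged"

def test_over_tagged() -> None:
    """Branch 2: more than 5 tags."""
    assert tag_status(["a", "b", "c", "d", "e", "f", "g"]) == "over-tagged"

def test_normal() -> None:
    """Branch 3: a typical in-range count."""
    assert tag_status(["a", "b", "c"]) == "normal"

def test_boundary_five() -> None:
    """Exactly 5 tags: not over-tagged."""
    assert tag_status(["a", "b", "c", "d", "e"]) == "normal"

def test_boundary_six() -> None:
    """Just above the > 5 threshold."""
    assert tag_status(["a"] * 6) == "over-tagged"

def test_boundary_one() -> None:
    """Just above empty."""
    assert tag_status(["a"]) == "normal"
```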

Run

Create a file called test_tag_status.py with the function and your tests. Run uv run pytest test_tag_status.py -v to verify all tests pass.

Investigate

Change the condition from len(tags) > 5 to len(tags) >= 5. Which boundary test now fails? Trace through the logic to explain why.

Modify

Add a fourth branch: if len(tags) > 10, return "excessive". Update your test suite to cover the new branch and its boundaries. Run the tests to verify.

Make [Mastery Gate]

Write a function rate_note_quality(word_count: int, has_title: bool) -> str that returns:

  • "incomplete" if has_title is False
  • "draft" if word count is 100 or fewer
  • "ready" otherwise

Then write 6 or more tests covering every branch plus boundary values at 100 and 101. Run uv run pytest to verify all tests pass.


Coverage Exercise

For the search_notes function shown earlier, write the full test suite. Your tests should cover all six cases from the table:

def test_search_empty_query() -> None:
    """Empty query returns empty list."""
    assert search_notes(["My note"], "") == []

def test_search_empty_notes() -> None:
    """No notes to search."""
    assert search_notes([], "hello") == []

def test_search_no_match() -> None:
    """Query not found in any note."""
    assert search_notes(["Buy groceries", "Call dentist"], "python") == []

def test_search_one_match() -> None:
    """Query found in exactly one note."""
    assert search_notes(["Learn Python", "Buy groceries"], "python") == [
        "Learn Python"
    ]

def test_search_multiple_matches() -> None:
    """Query found in several notes."""
    notes: list[str] = ["Python basics", "Advanced Python", "Cooking"]
    assert search_notes(notes, "python") == ["Python basics", "Advanced Python"]

def test_search_case_insensitive() -> None:
    """Matching ignores case."""
    assert search_notes(["PYTHON notes"], "python") == ["PYTHON notes"]

Run uv run pytest to confirm all six pass.


Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Generate a Function, Then List Its Test Cases

Write a Python function called filter_by_priority that takes a
list of dictionaries (each with "title" and "priority" keys)
and a minimum priority integer, and returns only the items
whose priority meets or exceeds the minimum. Use type annotations.

Before writing any tests, list every branch and loop boundary yourself. Then ask the AI to generate a test suite and compare its list to yours. Did it miss any boundary values?

What you're learning: You are practicing the "list branches first, then test" discipline by comparing your manual analysis to AI output.

Prompt 2: Find Missing Tests

Here is a function and its test suite. Are there any missing
test cases? Think about boundary values and edge cases.

def classify_age(age: int) -> str:
    if age >= 65:
        return "senior"
    elif age >= 18:
        return "adult"
    else:
        return "minor"

def test_senior():
    assert classify_age(70) == "senior"

def test_adult():
    assert classify_age(30) == "adult"

def test_minor():
    assert classify_age(10) == "minor"

Read the AI's response. It should suggest boundary tests at 65, 64, 18, 17, and possibly 0 or negative ages. If it does not mention boundary values, ask it specifically about the threshold between "adult" and "senior."

What you're learning: You are using the AI as a second pair of eyes to catch gaps in test coverage.
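If you want to check the AI's answer against a concrete suite, the missing boundary tests look roughly like this (classify_age is repeated so the file runs on its own):

```python
def classify_age(age: int) -> str:
    """Repeated from the prompt so this snippet runs on its own."""
    if age >= 65:
        return "senior"
    elif age >= 18:
        return "adult"
    else:
        return "minor"

def test_boundary_senior() -> None:
    """Exactly at the >= 65 threshold."""
    assert classify_age(65) == "senior"

def test_just_under_senior() -> None:
    """One below the senior threshold."""
    assert classify_age(64) == "adult"

def test_boundary_adult() -> None:
    """Exactly at the >= 18 threshold."""
    assert classify_age(18) == "adult"

def test_just_under_adult() -> None:
    """One below the adult threshold."""
    assert classify_age(17) == "minor"

def test_zero_age() -> None:
    """Edge case: age zero."""
    assert classify_age(0) == "minor"
```

Note that because these conditions use >=, the threshold value itself belongs to the upper branch, the opposite of categorize_note, where > pushed the threshold into the lower branch.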


Key Takeaways

  1. List branches before writing tests. Draw a table of every branch, its condition, and its return value. Every row needs at least one test.

  2. Boundary values catch the real bugs. For every > or < comparison, test the threshold value itself and the values just above and below it. Off-by-one errors live at these edges.

  3. Loop boundaries matter too. Always test with an empty collection, a single item, and multiple items. These three cases catch most loop-related bugs.

  4. Nested logic needs combined tests. When a function has both branches and loops, your test suite must cover both dimensions: what the loop iterates over and which branch runs inside it.

  5. Coverage is a thinking discipline, not a tool. Before reaching for any automated tool, the habit of manually listing paths and boundaries will make you a stronger tester.


Looking Ahead

You now know how to think systematically about which tests to write. In Lesson 7, you will use test-driven growth (TDG) to write tests before writing the function, letting your test cases guide the implementation rather than following it.