SmartNotes Bug Hunt
This lesson is a practice challenge, not a teaching lesson. You already learned all the tools in Lessons 1-4: reading tracebacks, print debugging, recognizing AI failure patterns, and the five-step debugging loop. Now you apply them all to five real bugs. If you get stuck on any bug, go back to the specific lesson that covers that skill.
Five bugs, one per Error Taxonomy category. The challenge is diagnosis speed and systematic process, not just finding the fix. Track which step of the debugging loop reveals each bug.
In Lesson 4, James walked through the debugging loop on Bug #5 with Emma guiding each step. Now Emma hands him a SmartNotes statistics module and a test file.
"Five bugs. Five categories. Use the loop." She sets a timer on her phone. "I'll be at my desk."
If you want guided help on any bug, type /debug in Claude Code. It walks you through reproduce, isolate, identify, fix, verify. But try without it first.
Setup
Your chapter directory already contains two files:
- smartnotes_buggy.py -- a SmartNotes statistics module with five planted bugs
- test_smartnotes_buggy.py -- tests that expose all five bugs
Open both files in your editor. Read smartnotes_buggy.py to see what each function is supposed to do, but do not look at the bug comments yet. The point is to find the bugs through the loop, not by reading the answers.
Run all tests:
uv run pytest test_smartnotes_buggy.py -v
You should see multiple failures. Count them. Each failure points to a different bug. Your job: fix all five, one at a time, following the loop for each one.
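If five simultaneous failures feel noisy, pytest's standard -x flag stops the run at the first failure, which pairs well with fixing one bug at a time:

uv run pytest test_smartnotes_buggy.py -v -x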
Keep a debugging journal as you work. For each bug, write down:
- What the test failure said
- Which Error Taxonomy category it belongs to
- Which debugging tool you used (traceback, print, spec reading)
- What you changed
- How many tests pass after the fix
Bug #1: The Crash
Run the tests. One of the first failures you see should look like this:
FAILED test_smartnotes_buggy.py::TestReadingTimeSeconds::test_short_note
TypeError: unsupported operand type(s) for /: 'str' and 'int'
The traceback points to a specific line in reading_time_seconds. The function signature says it returns int, and pyright is happy with the types. But something crashes at runtime.
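To reproduce only this failure instead of rerunning the whole suite, pytest accepts the test's node ID straight from the FAILED line (standard pytest usage, same uv invocation as before):

uv run pytest test_smartnotes_buggy.py::TestReadingTimeSeconds::test_short_note -v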
Your turn. Apply the loop:
- Reproduce: Run the specific failing test
- Isolate: The traceback already points to the exact line
- Identify: What operation is failing? What types are involved?
- Fix: Change the minimum amount of code
- Verify: Run the full suite. How many tests pass now?
Hint: classification
What does the Error Taxonomy call a bug where the code crashes because of a type mismatch at runtime? The traceback literally says TypeError.
Hint: finding the fix
Look at the line the traceback points to. One of the values is being wrapped in a function call that changes its type. What happens if you remove that wrapper?
The fix
The line str(words) / words_per_minute converts words (an int) to a string, then tries to divide a string by an integer. Remove the str() wrapper:
minutes: float = words / words_per_minute
Category: Type Error. The types were correct at the annotation level but wrong at runtime because of the str() call.
Run the full suite again. Note how many tests pass now compared to before.
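For context, here is a minimal sketch of what the corrected function might look like. Only the fixed line comes from the module; the signature, the 250 words-per-minute default, and the rest of the body are assumptions.

```python
# Sketch only: everything except the fixed line is assumed,
# not copied from smartnotes_buggy.py.
def reading_time_seconds(words: int, words_per_minute: int = 250) -> int:
    """Estimate reading time in whole seconds at a given reading speed."""
    minutes: float = words / words_per_minute  # fixed line: str() wrapper removed
    return round(minutes * 60)
```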
Bug #2: The Silent Wrong Answer
This one does not crash. The test failure looks different:
FAILED test_smartnotes_buggy.py::TestReadingTimeMinutes::test_short_note_has_fractional_time
AssertionError: assert 0 == 0.04 ± 1.0e-02
The function returns 0 when the test expects 0.04. No traceback, no crash. The function runs and returns a number. The number is just wrong.
Your turn. The loop again:
- Reproduce: Run the failing test
- Isolate: Call reading_time_minutes with a note of 10 words. What do you get?
- Identify: Print the intermediate calculation. What does 10 // 250 produce versus 10 / 250?
- Fix: One character change
- Verify: Full suite
Hint: classification
The code runs without errors. The types are correct. But the result is mathematically wrong. What does the Error Taxonomy call a bug where the logic produces the wrong answer?
Hint: the operator
Python has two division operators. One returns a float (/), the other returns an integer by discarding the remainder (//). Which one is used here?
The fix
Change // to / in the return statement:
return note.word_count / words_per_minute
Category: Logic Error. The code is syntactically correct and type-safe, but // (floor division) discards the fractional part, returning 0 instead of 0.04.
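To see the difference in isolation, here is a quick check using the numbers from the isolate step (a 10-word note at 250 words per minute):

```python
# Floor division discards the fractional part; true division keeps it.
print(10 // 250)  # 0
print(10 / 250)   # 0.04
```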
Bug #3: The Spec Mismatch
This failure is subtle. Look at the test output:
FAILED test_smartnotes_buggy.py::TestFilterNotesByAllTags::test_partial_match_excluded
AssertionError: assert 1 == 0
The test says: "A note with only some of the required tags should be excluded." But the function includes it anyway. The function does not crash, the types are correct, and the logic is internally consistent. The problem is that the code does something different from what the docstring promises.
Your turn:
- Reproduce: Run the failing test
- Isolate: Create two notes. One has tags ["quick"], the other has ["quick", "python"]. Filter for ["quick", "python"]. Which notes come back?
- Identify: Read the docstring. Read the code. What does the docstring say? What does the code actually do? Specifically, look at the boolean operation in the if statement.
- Fix: One word change
- Verify: Full suite
Hint: classification
The code works correctly for what it does. But what it does is not what the specification (docstring) says it should do. What does the Error Taxonomy call this?
Hint: any vs all
The function is called filter_notes_by_all_tags. The docstring says "ALL of the given tags." But the if statement uses a Python built-in that checks whether at least one condition is true.
The fix
Change any() to all():
if all(tag in note.tags for tag in required_tags):
Category: Specification Error. The code is internally correct (it filters by ANY tag), but the specification says it should filter by ALL tags. The bug is the gap between intent and implementation.
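Here is a small standalone demonstration of the gap, mirroring the two notes from the isolate step; the variable names are illustrative, not taken from the module:

```python
# A note that carries only one of the two required tags.
note_tags = ["quick"]
required_tags = ["quick", "python"]

# any() is True when at least one required tag is present -> the note is wrongly included.
print(any(tag in note_tags for tag in required_tags))  # True

# all() is True only when every required tag is present -> the note is correctly excluded.
print(all(tag in note_tags for tag in required_tags))  # False
```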
Bug #4: The Edge Case Crash
FAILED test_smartnotes_buggy.py::TestAverageWordCount::test_empty_list_returns_zero
ZeroDivisionError: division by zero
The traceback points to average_word_count. The function works for normal inputs but crashes when given an empty list.
Your turn:
- Reproduce: Run the specific test
- Isolate: Call average_word_count([]). Confirm the crash.
- Identify: What does len([]) return? What happens when you divide by that number?
- Fix: Add protection for the edge case. What should the function return for an empty list? (Read the docstring.)
- Verify: Full suite
Hint: classification
The function works for normal data but fails on a boundary case. What does the Error Taxonomy call a bug that only appears with unusual or empty input?
The fix
Add an early return before the division:
def average_word_count(notes: list[Note]) -> float:
    if not notes:
        return 0.0
    total: int = 0
    for note in notes:
        total += note.word_count
    return total / len(notes)
Category: Data/Edge-Case Error. The function never considered the empty-list case. The docstring says "Returns 0.0 if the list is empty," but the implementation does not honor that contract.
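A quick sanity check on the boundary case; no Note instances are needed, because the early return fires before any division happens. This assumes the fixed function is importable from your copy of smartnotes_buggy.py in the chapter directory.

```python
# Assumes the fixed function is importable from smartnotes_buggy.py.
from smartnotes_buggy import average_word_count

# Edge case: an empty list now returns 0.0 instead of raising ZeroDivisionError.
assert average_word_count([]) == 0.0
```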
Bug #5: The Ordering Problem
FAILED test_smartnotes_buggy.py::TestTagCoverageReport::test_partial_coverage
AssertionError: assert '83.3%' in 'Tag coverage: 0/6 (0.0%)'
The function returns 0.0% when it should return 83.3%. You walked through this one with Emma in Lesson 4. Now do it yourself without the guided walkthrough.
Your turn:
- Reproduce: Run the specific test
- Isolate: Try one note, one tag, one known tag. Still 0.0%?
- Identify: Add print statements before and after the percentage calculation. When is found_tags populated relative to when percentage is computed?
- Fix: Move the lines into the correct order
- Verify: Full suite
Hint: classification
You saw this category in Lesson 4. The individual lines of code are correct, but they run in the wrong sequence.
The fix
Move the percentage calculation and total_tags assignment after the loop:
for note in notes:
    for tag in note.tags:
        if tag in all_tags:
            found_tags.add(tag)

total_tags: int = len(all_tags)
percentage: float = (len(found_tags) / total_tags) * 100 if total_tags > 0 else 0.0
Category: Orchestration Error. The percentage was calculated before the loop that populates found_tags, so it was always based on an empty set.
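If you want to see the effect of the ordering in isolation, here is a self-contained sketch. The tag names are made up, but the 5-of-6 split reproduces the 0.0% versus 83.3% numbers from the failing test:

```python
# Illustrative tags only; 5 of the 6 known tags appear on some note.
all_tags = {"python", "quick", "idea", "todo", "work", "draft"}
found_tags: set[str] = set()

# Wrong order: percentage computed while found_tags is still empty.
too_early = (len(found_tags) / len(all_tags)) * 100
print(f"{too_early:.1f}%")  # 0.0%

found_tags.update({"python", "quick", "idea", "todo", "work"})

# Right order: percentage computed after the data has been collected.
after_loop = (len(found_tags) / len(all_tags)) * 100
print(f"{after_loop:.1f}%")  # 83.3%
```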
All Tests Pass
Run the full suite one final time:
uv run pytest test_smartnotes_buggy.py -v
Every test should pass. If any still fail, go back to the failing test and apply the loop again. Do not guess. Follow the steps.
Check your debugging journal. You should have five entries, each with a category, a tool, and a fix. Here is what the completed journal should look like:
| Bug | Category | Tool Used | Fix |
|---|---|---|---|
| #1 | Type Error | Traceback reading | Removed str() wrapper |
| #2 | Logic Error | Print debugging | Changed // to / |
| #3 | Specification Error | Spec reading (docstring vs code) | Changed any() to all() |
| #4 | Data/Edge-Case Error | Traceback reading | Added early return for empty list |
| #5 | Orchestration Error | Print debugging | Moved calculation after loop |
Try With AI
If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.
Prompt 1: Review Your Debugging Journal
Copy your five journal entries into Claude Code and send:
Here are my classifications for five bugs I just fixed:
Bug #1: [your category]
Bug #2: [your category]
Bug #3: [your category]
Bug #4: [your category]
Bug #5: [your category]
Check my classifications against the Error Taxonomy from
Chapter 43. Did I get any wrong? For each one, explain why
the category fits.
What you're learning: You are validating your own diagnostic thinking. If the AI corrects a classification, that correction reinforces the taxonomy better than reading a definition ever could.
Prompt 2: Map Bugs to Patterns
For each of the five bugs I fixed:
1. Type Error (str wrapping an int before division)
2. Logic Error (floor division instead of true division)
3. Specification Error (any instead of all)
4. Data/Edge-Case Error (no empty list check)
5. Orchestration Error (calculation before data collection)
Which of these would pyright catch? Which would only tests
catch? Which would neither catch (requiring human review)?
What you're learning: You are mapping the boundaries of each debugging tool. Static analysis catches some bugs, tests catch others, and human review catches the rest. Knowing which tool fits which category makes your debugging faster.
Prompt 3: Generate a New Bug Hunt
Write a new SmartNotes module called smartnotes_analytics.py
with three functions. Each function should have one bug from
a DIFFERENT Error Taxonomy category. Include a test file that
exposes all three bugs. Do not tell me which category each
bug belongs to. I want to classify them myself.
What you're learning: You are extending your practice beyond the five known bugs. When you classify bugs the AI planted without hints, you prove to yourself that the taxonomy is internalized, not just memorized.
James leans back from his screen. Five bugs fixed. Five entries in his journal. He looks at the table: Type Error, Logic Error, Specification Error, Data/Edge-Case Error, Orchestration Error. "Five bugs. Five categories. The loop worked every time."
He thinks about it in warehouse terms. "It's like a quality inspection checklist. Five defect types: damaged packaging, wrong item, missing label, empty bin, items loaded in the wrong sequence. You don't randomly poke around hoping to spot problems. You run the same checklist on every shipment, every time. The systematic sweep catches things that gut instinct misses."
Emma comes back and looks at his test output. All green.
"You found all five," she says. "More importantly, you followed the same process for each one. That consistency is what makes debugging a skill instead of luck."
She stops the timer. "You can debug. You can read tracebacks, add print statements, recognize patterns, and follow the loop. Now the question is: can you build? Not following step-by-step instructions, but driving the whole cycle yourself. Stubs, tests, prompt, validate, iterate." She pauses. "Chapter 57: TDG Mastery."