
The Debugging Loop

In Lesson 3, Emma told James he had three debugging tools: traceback reading, print debugging, and pattern recognition. She also told him what comes next: "Reproduce, isolate, identify, fix, verify -- the complete debugging loop."

James thinks about this. "At the distribution center, we have something similar. When a customer reports a damaged shipment, the warehouse team follows an RCA process: Root Cause Analysis. Document the symptom and reproduce it if you can. Narrow down where the damage happened. Find the root cause. Fix it. Then verify the fix didn't create a new problem. Same five steps every time, whether it's a crushed pallet or a mislabeled box."

"That's the debugging loop," Emma says. "You have three tools, but you've been using them ad hoc -- grab whichever tool seems right in the moment. Professionals follow a loop. The same loop every time, regardless of the bug."

If you're new to programming

This loop might feel like overkill for simple bugs. But building the habit now means you won't skip steps when bugs get complex. The loop works the same whether the bug is one wrong operator or a chain of three interacting functions.


The Five Steps

Here is the full loop. Every debugging session follows this sequence:

| Step | Name | What you do | Why it matters |
| --- | --- | --- | --- |
| 1 | Reproduce | Write a test that fails because of the bug | If you can't reproduce it, you can't prove you fixed it |
| 2 | Isolate | Simplify the input until the bug is clear | Strip away complexity to find the minimum case that fails |
| 3 | Identify | Name the error using the Error Taxonomy (use /bug to classify, or /debug to walk the full loop) | Use traceback reading (Lesson 1), print debugging (Lesson 2), or pattern recognition (Lesson 3) |
| 4 | Fix | Change the minimum amount of code | Don't rewrite the function; fix the specific line |
| 5 | Verify | Run the FULL test suite | Your fix might break something else |

The loop is sequential. Step 2 depends on Step 1, Step 4 depends on Step 3. Skipping a step is how "quick fixes" turn into new bugs.


Walking Through the Loop: Bug #5

Bug #5 in smartnotes_buggy.py is an orchestration error: the tag_coverage_report function calculates a percentage before the loop that collects the data. The test expects "5/6" and "83.3%", but the function returns "0/6 (0.0%)".

Let's walk through all five steps.

Step 1: Reproduce

The test already exists in test_smartnotes_buggy.py. Run it:

uv run pytest test_smartnotes_buggy.py::TestTagCoverageReport::test_partial_coverage -v

Output:

FAILED test_smartnotes_buggy.py::TestTagCoverageReport::test_partial_coverage
AssertionError: assert '83.3%' in 'Tag coverage: 0/6 (0.0%)'

The test fails. The function returns 0.0% when it should return 83.3%. You now have a reproducible failure. If you skip this step and jump to "I think I know what's wrong," you risk fixing a bug that isn't the real problem.

If you've coded before

In professional debugging, "reproduce" often means writing a NEW test that captures the exact failure. Here the test already exists. In Lesson 5 and beyond, you will write your own reproduction tests from scratch.
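The point of a reproduction test is that the bug becomes a failing assertion you can rerun at will. Here is a sketch of what such a test might look like for Bug #5; the real one lives inside a TestTagCoverageReport class in test_smartnotes_buggy.py, and its sample data may differ:

def test_partial_coverage() -> None:
    # Two notes covering 5 of the 6 known tags -- expect "5/6" and "83.3%"
    notes = [
        Note(title="A", body="...", word_count=3, tags=["python", "testing", "debugging"]),
        Note(title="B", body="...", word_count=2, tags=["uv", "pytest"]),
    ]
    all_tags = ["python", "testing", "debugging", "uv", "pytest", "types"]
    result = tag_coverage_report(notes, all_tags)
    assert "5/6" in result
    assert "83.3%" in result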

Step 2: Isolate

The test uses two sample notes with five tags across six known tags. That's a lot of data. Simplify:

from smartnotes_buggy import Note, tag_coverage_report

# Minimal case: 1 note, 1 tag, 1 known tag
note = Note(title="Test", body="hello", word_count=1, tags=["python"])
result = tag_coverage_report([note], ["python"])
print(result)

Output:

Tag coverage: 0/1 (0.0%)

Even with one note and one tag, the percentage is 0.0%. The bug is not about the number of notes or tags. Something is wrong with how the function counts found tags.

Step 3: Identify

The percentage is 0.0% even though the note has the tag "python" and "python" is in all_tags. Use print debugging to see what's happening:

def tag_coverage_report(notes: list[Note], all_tags: list[str]) -> str:
    if not all_tags:
        return "No tags defined."

    found_tags: set[str] = set()
    total_tags: int = 0

    print(f"BEFORE percentage: found_tags={found_tags}, total_tags={total_tags}")
    percentage: float = (len(found_tags) / total_tags) * 100 if total_tags > 0 else 0.0
    print(f"AFTER percentage: {percentage}")

    for note in notes:
        for tag in note.tags:
            if tag in all_tags:
                found_tags.add(tag)

    total_tags = len(all_tags)
    print(f"AFTER loop: found_tags={found_tags}, total_tags={total_tags}")

    return f"Tag coverage: {len(found_tags)}/{total_tags} ({percentage:.1f}%)"

Output:

BEFORE percentage: found_tags=set(), total_tags=0
AFTER percentage: 0.0
AFTER loop: found_tags={'python'}, total_tags=1

The print statements reveal the problem. When the percentage is calculated, found_tags is empty and total_tags is 0. Both are set to their correct values after the percentage line runs. The code runs in the wrong order.

Using the Error Taxonomy from Chapter 43: this is an Orchestration Error. The code is correct in pieces, but the pieces run in the wrong sequence.

Step 4: Fix

The minimal fix: move the percentage calculation after the loop and after total_tags is assigned. Do not rewrite the function. Do not rename variables. Do not refactor the loop. Move two lines:

def tag_coverage_report(notes: list[Note], all_tags: list[str]) -> str:
    if not all_tags:
        return "No tags defined."

    found_tags: set[str] = set()

    for note in notes:
        for tag in note.tags:
            if tag in all_tags:
                found_tags.add(tag)

    total_tags: int = len(all_tags)
    percentage: float = (len(found_tags) / total_tags) * 100 if total_tags > 0 else 0.0

    return f"Tag coverage: {len(found_tags)}/{total_tags} ({percentage:.1f}%)"

The fix changes the order of lines, not the lines themselves. That's the minimal fix principle: change as little as possible so you can be confident you addressed the bug and nothing else.

Step 5: Verify

Run the full test suite, not just the test that was failing:

uv run pytest test_smartnotes_buggy.py -v

Why the full suite? Your fix moved lines around. What if moving total_tags broke the test_no_tags_defined case? What if changing the order affected the return string in a way another test depends on?

If you only run the one failing test and it passes, you might celebrate while a different test now fails. The full suite catches that.
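For example, the guard path is easy to break with a careless reorder. The guard test in the suite probably looks something like this sketch (the real test may differ):

def test_no_tags_defined() -> None:
    # The early-return guard: no known tags means nothing to report
    note = Note(title="Test", body="hello", word_count=1, tags=["python"])
    assert tag_coverage_report([note], []) == "No tags defined."

A fix that accidentally moved the if not all_tags guard below the percentage calculation would fail this test immediately, even while test_partial_coverage passes.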

The "Just One Test" Trap

Emma says: "I skip the Reproduce step more often than I should. When I'm confident I know the bug, I jump straight to Fix. About 30% of the time, my 'fix' doesn't actually address the real problem because I didn't verify my reproduction first." If even experienced developers skip steps and get burned, the loop exists for a reason.


Why Each Step Exists

When a step gets skipped, something specific goes wrong:

| Skipped step | What happens |
| --- | --- |
| Reproduce | You fix a symptom, not the cause. The bug comes back in a different form. |
| Isolate | You drown in complexity. A 200-line function has many suspects; a 5-line case has one. |
| Identify | You apply the wrong category of fix. Type errors need type fixes, not logic fixes. |
| Fix (minimal) | You introduce new bugs. Rewriting a function changes behavior you didn't intend to change. |
| Verify (full) | Your fix breaks something else. You find out from a user, not a test. |

PRIMM-AI+ Practice: The Loop on Bug #5

Predict [AI-FREE]

Press Shift+Tab to enter Plan Mode.

Here is the test output for Bug #5:

FAILED test_smartnotes_buggy.py::TestTagCoverageReport::test_partial_coverage
AssertionError: assert '83.3%' in 'Tag coverage: 0/6 (0.0%)'

And here is the function:

def tag_coverage_report(notes: list[Note], all_tags: list[str]) -> str:
    if not all_tags:
        return "No tags defined."

    found_tags: set[str] = set()
    total_tags: int = 0

    percentage: float = (len(found_tags) / total_tags) * 100 if total_tags > 0 else 0.0

    for note in notes:
        for tag in note.tags:
            if tag in all_tags:
                found_tags.add(tag)

    total_tags = len(all_tags)

    return f"Tag coverage: {len(found_tags)}/{total_tags} ({percentage:.1f}%)"

Which step of the debugging loop will reveal the problem? Write your answer: "Step ___ will reveal it because ___."

Check your prediction

Step 3 (Identify) reveals it. Print debugging shows that found_tags is empty and total_tags is 0 when the percentage is calculated, because the calculation runs before the loop. Step 2 (Isolate) helps confirm it by showing the bug persists even with one note and one tag.

Run

Press Shift+Tab to exit Plan Mode.

Walk through the full loop on Bug #5. Open smartnotes_buggy.py and test_smartnotes_buggy.py in your editor.

  1. Reproduce: Run uv run pytest test_smartnotes_buggy.py::TestTagCoverageReport::test_partial_coverage -v
  2. Isolate: Try calling tag_coverage_report with one note and one tag
  3. Identify: Add print statements before and after the percentage line
  4. Fix: Move the percentage calculation after the loop
  5. Verify: Run uv run pytest test_smartnotes_buggy.py -v (ALL tests, not just one)

Investigate

In Claude Code, type:

/investigate @smartnotes_buggy.py

The tag_coverage_report function calculates percentage before
the loop that populates found_tags. Why does Python allow this
without raising an error? Why doesn't pyright catch it?

The AI's answer should explain that the code is syntactically valid and type-correct: found_tags and total_tags are initialized (to an empty set and 0) before the percentage line, so there is no NameError, and the total_tags > 0 guard prevents a ZeroDivisionError. pyright checks types, not execution order, so a value computed too early looks fine to it. This is why orchestration errors are invisible to static analysis tools.
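To see the pattern in isolation, here is a minimal standalone sketch of the same "computed too early" mistake. It runs without errors and passes type checking, yet prints the wrong answer:

count: int = 0
doubled: int = count * 2  # computed too early: count is still 0 here

count = 5  # the real value arrives after the calculation

print(doubled)  # prints 0, not 10 -- valid syntax, valid types, wrong order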

Modify

Predict: after you move the percentage calculation below the loop, will test_no_tags_defined still pass? Write your prediction before running. Then run all tests and check.

Make [Mastery Gate]

Apply the full debugging loop to a NEW bug. In Claude Code, type:

Write a SmartNotes function called most_prolific_author that
takes a list of Notes and returns the author with the highest
total word count. Include an orchestration error where a
variable is used before it is fully computed. Include a test
that fails because of the bug.
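The AI's output will vary. For orientation only, here is one hypothetical shape such a bug could take; this sketch assumes Note has an author field, and you should debug the version the AI actually gives you, not this one:

def most_prolific_author(notes: list[Note]) -> str:
    totals: dict[str, int] = {}

    # Orchestration error: the winner is chosen before totals is populated
    winner: str = max(totals, key=lambda a: totals[a]) if totals else ""

    for note in notes:
        totals[note.author] = totals.get(note.author, 0) + note.word_count

    return winner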

Document each step of the loop:

  1. Reproduce: Run the failing test. Copy the error output.
  2. Isolate: Simplify to the minimum input that still fails.
  3. Identify: Name the error category (it should be Orchestration Error). Show your print debugging output.
  4. Fix: Change the minimum amount of code. Do not rewrite the function.
  5. Verify: Run ALL tests, not just the failing one.

All five steps documented = mastery.


Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Walk Me Through Your Loop

I just learned the five-step debugging loop: reproduce, isolate,
identify, fix, verify. Here is my understanding of each step:

1. Reproduce: run the code and see the error
2. Isolate: comment out code until the error stops
3. Identify: read the error message
4. Fix: rewrite the function
5. Verify: run the test that was failing

What am I getting wrong?

Read the AI's corrections carefully. It should push back on at least three of your descriptions: reproduce means writing a test (not just running code), isolate means simplifying input (not commenting out code), fix means minimal change (not rewriting), and verify means running all tests (not just one).

What you're learning: You are testing your own mental model against the AI's knowledge, catching misconceptions before they become habits.

Prompt 2: Generate a Bug for Practice

Write a SmartNotes function called find_notes_by_date_range
that takes a list of Notes (each with a created_date string
in "YYYY-MM-DD" format) and a start/end date, returning notes
within the range. Include a logic error where the boundary
comparison is wrong. Include a failing test.

Apply the full five-step loop. Don't ask the AI to fix the bug. Fix it yourself, then compare your fix to what the AI would suggest.
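If you get stuck at the Identify step, here is one classic shape a boundary bug takes (hypothetical; the AI's version may differ). Because "YYYY-MM-DD" strings compare lexicographically in date order, the bug usually hides in the comparison operators:

# Hypothetical buggy comparison: strict '<' on the end date silently
# excludes notes created exactly on end_date
if start_date <= note.created_date < end_date:
    results.append(note)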

What you're learning: You are practicing the loop on unfamiliar code, building the habit of following the process even when the bug seems obvious.

Prompt 3: When to Skip Steps

Is it ever OK to skip a step in the debugging loop? For
example, if I can see the bug immediately, can I skip
Reproduce and Isolate and go straight to Fix?

Read the AI's answer. It should explain that skipping Reproduce is the most common mistake: without a failing test, you can't prove your fix works. Skipping Isolate costs little on simple bugs, but isolation becomes critical once a function grows past roughly 20 lines.

What you're learning: You are developing judgment about when the loop is non-negotiable (complex bugs, unfamiliar code) versus when experienced developers take shortcuts (single-line bugs they've seen before).


James nods slowly. "Document the symptom and reproduce it, narrow down, find the root cause, fix, verify the fix didn't create new problems. Same five steps whether it's a crushed pallet or a wrong operator. In the warehouse, we call it RCA. Here, it's the debugging loop. Same discipline."

Emma pauses. "I'll be honest: I skip the Reproduce step more often than I should. When I'm confident I know the bug, I jump straight to Fix. About 30% of the time, my 'fix' doesn't actually address the real problem because I didn't verify my reproduction first. The loop exists because human intuition is unreliable, including mine."

"You have the tools and the process," she continues. "Lesson 5 puts it all together. Five bugs in one SmartNotes module. Each one from a different category. No hints. Just the loop."