Multi-Round Iteration
James finishes his Level 2 re-prompt from the previous lesson. He runs uv run pytest and watches the output scroll by. The two tests that were failing now pass. He grins.
Then he reads the rest of the output. A test that passed before, test_search_no_matches, is now failing.
"It broke something," he says, staring at the screen.
Emma nods. "That happens. The AI changed the matching logic to handle case insensitivity, but it also changed how the function handles the 'no match' path. One fix, one new bug." She pulls up a chair. "This is why iteration is usually two or three rounds, not one. The skill is not getting it right on the first re-prompt. The skill is tracking what changed and converging."
When you fix one problem and accidentally create another, that is called a regression. It is not a sign that something is broken beyond repair. It means the fix was too broad or touched code it should not have. Regressions happen to professional developers every day. The difference between a beginner and a professional is that the professional expects them and has a process for catching them: tests.
You already know that patches can introduce regressions. In AI-assisted development, the pattern is the same, but the "developer" making the change is the AI, and your tests are the code review. This lesson formalizes the tracking process so you can iterate efficiently rather than ping-ponging between fixes.
Why Fixes Introduce New Bugs
In Chapter 46 Lesson 4, a note mentioned that fixes can introduce new failures: "Sometimes the AI's fix breaks something that was already working." That chapter deferred the topic because you were learning single-round re-prompting. Now you are ready.
Here is what typically happens. The AI receives your re-prompt asking it to make the search case-insensitive. To do this, it rewrites the matching logic. The original code might have been:
```python
def search_notes(notes: list[Note], term: str) -> list[Note]:
    """Return notes whose title or body contains the search term."""
    results: list[Note] = []
    for note in notes:
        if term in note.title or term in note.body:
            results.append(note)
    return results
```
The AI's fix for case insensitivity might produce:
```python
def search_notes(notes: list[Note], term: str) -> list[Note]:
    """Return notes whose title or body contains the search term."""
    results: list[Note] = []
    lower_term: str = term.lower()
    for note in notes:
        if lower_term in note.title.lower() and lower_term in note.body.lower():
            results.append(note)
    return results
```
Spot the bug? The AI changed `or` to `and`. Now a note must contain the term in both title AND body to match, instead of either one. The case-insensitivity fix is correct, but the logical operator changed as a side effect.
This is a classic regression: the AI rewrote more than it needed to, and the extra change introduced a new error.
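A regression like this is exactly what a body-only test catches. Here is a minimal sketch, assuming a simple `Note` dataclass with `title` and `body` fields (the real SmartNotes `Note` may differ):

```python
from dataclasses import dataclass

@dataclass
class Note:
    title: str
    body: str

def search_notes_round2(notes: list[Note], term: str) -> list[Note]:
    """The Round 2 rewrite: case-insensitive, but with the `and` regression."""
    lower_term = term.lower()
    return [
        note for note in notes
        if lower_term in note.title.lower() and lower_term in note.body.lower()
    ]

# "herbs" appears only in the body, so the buggy `and` condition excludes it.
notes = [Note(title="Cooking Tips", body="Use fresh herbs.")]
results = search_notes_round2(notes, "herbs")
print(len(results))  # 0, but a correct implementation would return 1
```

A test that asserts `len(results) == 1` for this input fails immediately, which is how a test like `test_search_in_body` flags the regression before it reaches anyone else.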
Tracking Errors Across Rounds
The key to multi-round iteration is tracking. Without a record, you lose context about what was passing, what broke, and what you already tried. Here is a simple table format:
| Test Name | Round 1 | Round 2 | Round 3 |
|---|---|---|---|
| test_search_exact_title_match | PASS | PASS | PASS |
| test_search_case_insensitive | FAIL (Omission) | PASS | PASS |
| test_search_in_body | PASS | FAIL (Logic) | PASS |
| test_search_no_matches | PASS | PASS | PASS |
| test_search_empty_list | PASS | PASS | PASS |
| test_search_empty_term | FAIL (Misinterpretation) | PASS | PASS |
Each cell records PASS or FAIL plus the Error Taxonomy category from Chapter 43. The table shows you the trajectory: Round 1 had two failures. Round 2 fixed both but introduced a regression on test_search_in_body. Round 3 fixed the regression without breaking anything else.
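A table like this can even be scanned mechanically for regressions. Here is a small sketch, using a hypothetical results dictionary copied by hand from pytest output:

```python
# Hypothetical per-test history: one PASS/FAIL entry per round, oldest first.
rounds: dict[str, list[str]] = {
    "test_search_case_insensitive": ["FAIL", "PASS", "PASS"],
    "test_search_in_body":          ["PASS", "FAIL", "PASS"],
    "test_search_no_matches":       ["PASS", "PASS", "PASS"],
}

def find_regressions(history: dict[str, list[str]], round_index: int) -> list[str]:
    """Tests that passed in the previous round but fail in this one (0-based)."""
    return [
        name for name, results in history.items()
        if results[round_index - 1] == "PASS" and results[round_index] == "FAIL"
    ]

print(find_regressions(rounds, 1))  # ['test_search_in_body']
print(find_regressions(rounds, 2))  # []
```

The point is not the script but the definition it encodes: a regression is any cell that flips from PASS to FAIL between adjacent rounds.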
The Three-Round Walkthrough
Let us trace the full search_notes iteration.
Round 1: The Initial Prompt
You wrote a Level 1 prompt (from Lesson 1). The AI produced a case-sensitive search that did not handle empty terms. Result: 4 of 6 tests pass.
Failing tests:
- test_search_case_insensitive: Omission (the AI never lowercased anything)
- test_search_empty_term: Misinterpretation (the AI returned `[]` for empty terms)
Round 2: The Fix That Breaks Something
Your re-prompt from Lesson 1 asked for case-insensitive matching and empty-term handling. The AI rewrote the function. It added .lower() calls and an early return for empty terms. But in the rewrite, it changed or to and in the matching condition.
New result: 5 of 6 pass.
Failing test:
- test_search_in_body: Logic error (the `and` vs `or` change means notes that match only in the body are excluded)
Previously failing tests: Both now pass.
This is progress. You went from 4/6 to 5/6. But the new failure is a regression: something that worked before broke.
Round 3: The Targeted Fix
Your re-prompt for Round 3 should be surgical:
```
The search_notes function has a logic error in the matching condition.
The current code uses `and` to combine the title and body checks,
but it should use `or`. A note should match if the term appears
in the title OR the body, not both.

Failing test:
FAILED test_search_in_body - assert 0 == 1

The note has title="Cooking Tips" and body="Use fresh herbs."
Searching for "herbs" should return this note because "herbs"
appears in the body. The current code requires it to appear
in both title and body.

Change `and` to `or` on the matching line. Do not change
anything else.
```
Notice the last line: "Do not change anything else." When you identify a one-word fix, telling the AI to limit its changes reduces the risk of another regression.
Result: 6 of 6 pass. Iteration complete.
SmartNotes search_notes Across Three Rounds
Here is the function at each round, so you can see how it evolved:
Round 1 output (case-sensitive, no empty-term handling):
```python
def search_notes(notes: list[Note], term: str) -> list[Note]:
    """Return notes whose title or body contains the search term."""
    results: list[Note] = []
    for note in notes:
        if term in note.title or term in note.body:
            results.append(note)
    return results
```
Round 2 output (case-insensitive, but with the `and` regression):
```python
def search_notes(notes: list[Note], term: str) -> list[Note]:
    """Return notes whose title or body contains the search term."""
    if not term:
        return list(notes)
    results: list[Note] = []
    lower_term: str = term.lower()
    for note in notes:
        if lower_term in note.title.lower() and lower_term in note.body.lower():
            results.append(note)
    return results
```
Round 3 output (all tests pass):
```python
def search_notes(notes: list[Note], term: str) -> list[Note]:
    """Return notes whose title or body contains the search term."""
    if not term:
        return list(notes)
    results: list[Note] = []
    lower_term: str = term.lower()
    for note in notes:
        if lower_term in note.title.lower() or lower_term in note.body.lower():
            results.append(note)
    return results
```
The only difference between Round 2 and Round 3 is one word: `and` became `or`. But finding that one word required reading the code, understanding the logic, and writing a precise re-prompt.
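You can sanity-check the converged function by hand before trusting the green test run. A quick sketch, again assuming a minimal `Note` dataclass with `title` and `body` fields:

```python
from dataclasses import dataclass

@dataclass
class Note:
    title: str
    body: str

def search_notes(notes: list[Note], term: str) -> list[Note]:
    """Round 3 version: case-insensitive, matches title OR body."""
    if not term:
        return list(notes)
    results: list[Note] = []
    lower_term: str = term.lower()
    for note in notes:
        if lower_term in note.title.lower() or lower_term in note.body.lower():
            results.append(note)
    return results

notes = [
    Note(title="Cooking Tips", body="Use fresh herbs."),
    Note(title="Garden Plan", body="Plant tomatoes in spring."),
]
print(len(search_notes(notes, "HERBS")))  # 1: case-insensitive body match
print(len(search_notes(notes, "")))       # 2: empty term returns every note
```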
When to Stop Iterating
Three rounds is typical for a well-specified function. If you are on round 4 or 5 and still failing tests, one of these is happening:
- Your tests conflict. Two tests expect contradictory behavior. Review your test logic.
- Your prompt is missing critical context. The AI cannot converge because it does not understand a constraint you have not stated.
- The function is too complex for one prompt. Break it into smaller functions and iterate on each one separately.
If you are not making progress (the number of passing tests is not increasing), stop re-prompting. Read the code, understand the problem, and either fix it manually or start over with a better prompt. Lesson 3 covers this judgment call.
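The first failure mode, conflicting tests, is easy to create by accident. A hypothetical illustration (not from the SmartNotes suite) with a stub implementation standing in for `search_notes`:

```python
def search_stub(notes: list[str], term: str) -> list[str]:
    """Stand-in implementation: an empty term returns every note."""
    if not term:
        return list(notes)
    return [note for note in notes if term in note]

notes = ["alpha", "beta"]

# Test A expects an empty term to return everything:
test_a_passes = search_stub(notes, "") == notes

# Test B expects an empty term to return nothing:
test_b_passes = search_stub(notes, "") == []

# No implementation can satisfy both, so iteration can never converge.
print(test_a_passes, test_b_passes)  # True False
```

If two of your tests disagree like this, no amount of re-prompting will reach a full pass; fix the tests first.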
PRIMM-AI+ Practice: Regression Tracking
Predict [AI-FREE]
Look at the Round 1 output of search_notes above. Suppose you re-prompt the AI to add case-insensitive matching. Before running anything, predict:
- Which tests will now pass that were failing before? (Confidence: 1-5)
- Will any currently passing tests break? If so, which ones? (Confidence: 1-5)
- How many total rounds will you need to get to 6/6? (Confidence: 1-5)
Write your predictions down.
Check your predictions
Prediction 1: test_search_case_insensitive should now pass, since you explicitly asked for case-insensitive matching.
Prediction 2: There is a real risk of regression. The AI must rewrite the matching line to add .lower() calls. If it rewrites too aggressively (changing the logical operator, altering the loop structure, or modifying the return type), a previously passing test could break. The most likely regressions are on test_search_in_body or test_search_exact_title_match, since those depend on the matching condition.
Prediction 3: Two or three rounds is typical. One round for the initial generation, one for the case-sensitivity fix, and possibly one more if a regression appears.
Run
Using the Round 1 output in your own SmartNotes project, write a re-prompt asking for case-insensitive matching and empty-term handling. Run uv run pytest. Record the results in a tracking table like the one shown earlier.
Investigate
For any new failures (tests that passed in Round 1 but fail in Round 2), classify them:
- Regression from rewrite: the AI changed code it should not have touched
- Incomplete fix: the AI fixed part of the problem but missed an edge case
- New misinterpretation: the AI understood your re-prompt differently than you intended
Write one sentence explaining which category applies.
Modify
Write a Round 3 re-prompt that targets only the regression. Include the instruction "Do not change anything else" if the fix is a single-line change. Run pytest. Did you converge to 6/6?
Make [Mastery Gate]
Without guidance, run a complete multi-round iteration on this function:
```python
def count_notes_by_tag(notes: list[Note]) -> dict[str, int]:
    """Count how many notes contain each tag. Case-insensitive tag matching."""
    ...
```
Write four tests first (at least one for case-insensitive tags, one for duplicate tags across notes, one for an empty list, one for a note with no tags). Then prompt, iterate, and track your rounds in a table. You should converge in two or three rounds.
Try With AI
If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.
Prompt 1: Diagnose a Regression
```
I asked you to make search_notes case-insensitive, and now
test_search_in_body is failing. Here is the current code:

def search_notes(notes: list[Note], term: str) -> list[Note]:
    if not term:
        return list(notes)
    results: list[Note] = []
    lower_term: str = term.lower()
    for note in notes:
        if lower_term in note.title.lower() and lower_term in note.body.lower():
            results.append(note)
    return results

The test expects that searching for "herbs" returns a note with
body="Use fresh herbs." but the function returns an empty list.
What changed that caused this regression?
```
Read the AI's diagnosis. Did it identify the `and` vs `or` issue? Compare its explanation to the one in this lesson.
Prompt 2: Evaluate a Re-prompt
```
Which of these two re-prompts is more likely to fix the regression
without breaking other tests?

Re-prompt A: "Fix the search function, test_search_in_body is failing."

Re-prompt B: "On line 7 of search_notes, change 'and' to 'or' in the
matching condition. The note should match if the term appears in the
title OR the body, not both. Do not change any other lines."
```
Explain your reasoning.
What you're learning: You are building judgment about re-prompt specificity. A surgical re-prompt that names the exact line and exact change is safer than a vague one that gives the AI freedom to rewrite more than necessary.
Key Takeaways
- Fixes can introduce regressions. When the AI rewrites a function to fix one bug, it may change code that was already correct. This is normal, not catastrophic.
- Track errors across rounds. A simple table (test name, Round 1 result, Round 2 result, Round 3 result) with Error Taxonomy categories prevents you from losing context about what was working.
- Surgical re-prompts reduce regression risk. When you know the fix is a one-line change, tell the AI exactly what to change and add "Do not change anything else." This limits the scope of the rewrite.
- Two to three rounds is typical. If you are past round 3 and still not converging, the problem is likely in your tests or your prompt, not in the AI's ability to fix code.
Looking Ahead
You can now manage multi-round iteration. But how do you know what the AI actually changed between rounds without reading the entire function line by line? Lesson 3 introduces git diff output, which shows you exactly which lines were added, removed, or changed. It also introduces the 30% heuristic for deciding when to fix manually instead of re-prompting.