
SmartNotes Search Feature Capstone

If you're new to programming

This is a timed challenge with no step-by-step instructions. You drive the entire TDG cycle yourself: write the stub, write the tests, prompt AI, verify, and debug. If you get stuck, the <details> blocks at each step have hints. You can also reference search-feature-spec.md in the chapter folder for the expected function signature.

If you've coded before

25 minutes. Problem statement to working, tested code. No scaffolding. The mastery gate is not just "tests pass" but also: can you explain the generated code and predict its behavior for an untested input?

In Lesson 3, James drove the TDG cycle with Emma nearby. She asked questions, nudged him toward the verification stack when he hesitated, and caught him before he modified his own tests. Now she pulls up a blank whiteboard and writes a single paragraph:

SmartNotes needs a search feature. Users want to find notes by keyword and optionally filter by tag. The search should be case-insensitive and match against both the title and body of each note. Title matches should appear before body-only matches.

She sets a timer on her phone. "You have everything you need. Stub, tests, prompt, verify, debug. Twenty-five minutes." She closes her laptop and walks out.


The Problem

Read the problem statement above one more time. That is your entire specification input. No hints about parameter types. No list of edge cases. No suggested function name. You extract all of that yourself.

Your deliverables:

File                       Purpose
smartnotes_search.py       Function stub with types and docstring
test_smartnotes_search.py  8+ tests covering happy paths and edge cases
tdg_journal.md             Debugging journal documenting your cycle

Start the timer.


Step 1: Specify (5 minutes)

Open smartnotes_search.py. Write the function signature.

Three questions to answer before you type anything:

  1. What goes in? A list of notes, a keyword, and optionally a tag.
  2. What comes out? A list of matching notes, sorted by relevance.
  3. What types? Notes are Note dataclasses (from Chapter 51). Keyword is str. Tag is str | None. Return type is list[Note].

Write the stub. The body is a bare ... (Python's Ellipsis). The docstring specifies every behavior the problem statement describes, plus any edge cases you can think of.

Hint: what edge cases should the docstring mention?
  • What happens when the keyword is an empty string?
  • What happens when no notes match?
  • What happens when the notes list itself is empty?
  • What does "title matches before body-only matches" mean for a note where the keyword appears in both?

Your docstring should answer all four.
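If you want to check your work against one possible shape, here is a sketch. It assumes the search_notes name from the spec file and the Note fields used in the Step 2 fixture (title, body, word_count, tags); in your project, Note comes from Chapter 51 rather than being redefined here.

from dataclasses import dataclass, field


@dataclass
class Note:
    # Assumed fields; in SmartNotes you import the Chapter 51 Note instead.
    title: str
    body: str
    word_count: int
    tags: list[str] = field(default_factory=list)


def search_notes(notes: list[Note], keyword: str, tag: str | None = None) -> list[Note]:
    """Return notes matching keyword, title matches before body-only matches.

    Matching is case-insensitive against title and body. If tag is given,
    only notes carrying that tag are searched. An empty keyword matches
    every note; an empty notes list or a keyword with no matches returns
    an empty list. A note matching in both title and body counts as a
    title match.
    """
    ...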

Run uv run pyright smartnotes_search.py. It should pass with no errors. If pyright flags a type error, fix it now. The stub is your specification; it must be type-correct before you write a single test.


Step 2: Test (7 minutes)

Open test_smartnotes_search.py. Write at least 8 tests.

Think about what "correct" means for this function. Each test proves one specific behavior:

  • Does keyword matching work in the title?
  • Does keyword matching work in the body?
  • Is the matching truly case-insensitive?
  • Does the tag filter narrow results?
  • Does keyword + tag filtering work together?
  • Are title matches sorted before body-only matches?
  • Does an empty keyword return all notes?
  • Does an empty list return an empty list?
  • Does a keyword with no matches return an empty list?
Hint: creating test data

Build a fixture or a helper function that creates 3-4 sample Note objects with known titles, bodies, and tags. Reuse them across tests so you are not constructing notes in every test function.

import pytest

# Note is the Chapter 51 dataclass; adjust the import to wherever it
# lives in your project.
from smartnotes_search import Note


@pytest.fixture
def sample_notes() -> list[Note]:
    return [
        Note(title="Python Tips", body="Learn basics of coding", word_count=50, tags=["beginner", "python"]),
        Note(title="Debugging Guide", body="How to fix Python errors", word_count=120, tags=["python", "advanced"]),
        Note(title="Cooking Pasta", body="Boil water and add salt", word_count=30, tags=["cooking"]),
    ]
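Two example tests built on that fixture; a sketch assuming the function is named search_notes, as in the spec file. Adapt the names to your own stub.

from smartnotes_search import search_notes


def test_keyword_matches_title(sample_notes: list[Note]) -> None:
    results = search_notes(sample_notes, "tips")
    assert [n.title for n in results] == ["Python Tips"]


def test_title_match_sorts_before_body_only_match(sample_notes: list[Note]) -> None:
    # "python" is in the title of "Python Tips" but only in the body of
    # "Debugging Guide", so the title match must come first.
    results = search_notes(sample_notes, "python")
    assert [n.title for n in results] == ["Python Tips", "Debugging Guide"]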

Run uv run pytest test_smartnotes_search.py -v. Every test should FAIL (RED). If any test passes, either your stub body is no longer a bare ..., your test is not actually calling the function, or its assertion is loose enough to pass against the stub's None return. Fix it.

Commit both files to git before you proceed. The tests are the contract. Protect them.

git add smartnotes_search.py test_smartnotes_search.py
git commit -m "Add search_notes stub and test suite (RED)"

Step 3: Generate (3 minutes)

Open Claude Code and prompt:

Implement search_notes in smartnotes_search.py so that every test
in test_smartnotes_search.py passes. Do not modify the test file.

That is the entire prompt. The stub has the types. The docstring has the behavior. The tests have the acceptance criteria. The AI has everything it needs.

Hint: if you are unsure what to include in the prompt

The prompt above is sufficient. Your stub and tests are already in the project. Claude Code can read them. You do not need to paste code into the prompt. If you are working outside Claude Code, paste the stub file and the test file into the conversation.


Step 4: Verify (3 minutes)

Run the verification stack in order:

uv run ruff check smartnotes_search.py
uv run pyright smartnotes_search.py
uv run pytest test_smartnotes_search.py -v

Three possible outcomes:

Outcome         What it means                           What to do
All GREEN       AI's implementation passes every test   Move to Step 6 (Read)
Some RED        One or more tests fail                  Move to Step 5 (Debug)
Pyright errors  AI broke the types                      Re-prompt: "Fix the type errors. Keep the same logic."

Step 5: Debug (5 minutes)

If tests failed, work the debugging loop you have already practiced:

  1. Read the failure message. Which test failed? What was expected vs. actual?
  2. Decide: is the bug in the implementation or the test? If your test expects the wrong thing, the test is wrong. If the implementation returns the wrong thing for a correct test, the implementation is wrong.
  3. Re-prompt or fix manually. Paste the failure output into Claude Code and ask it to fix the specific function. Or, if the fix is obvious, edit the implementation yourself.
  4. Run the stack again. Repeat until GREEN.
Hint: common failures for search functions
  • Ordering wrong: The AI may return all matches without sorting title-matches first. Check whether it separates title matches from body-only matches before combining.
  • Case sensitivity: The AI may forget .lower() on one side of the comparison.
  • Tag filter ignored: The AI may apply keyword matching but skip the tag check when tag is not None.
  • Empty keyword: The AI may return an empty list instead of all notes when keyword is "".
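For comparison after your own cycle finishes, here is one shape a correct implementation often takes. It reuses the Step 1 sketch's signature and is one valid design among several, not the canonical answer.

def search_notes(notes: list[Note], keyword: str, tag: str | None = None) -> list[Note]:
    # Apply the optional tag filter first so every later check sees only
    # candidate notes.
    candidates = [n for n in notes if tag is None or tag in n.tags]

    # An empty keyword matches everything (after tag filtering).
    if not keyword:
        return candidates

    kw = keyword.lower()  # lowercase once, and on both sides of every comparison
    title_matches = [n for n in candidates if kw in n.title.lower()]
    body_only = [n for n in candidates if kw not in n.title.lower() and kw in n.body.lower()]

    # Title matches first, body-only matches after.
    return title_matches + body_only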

Document each failure and fix in tdg_journal.md. This is your debugging journal:

# TDG Cycle Journal: search_notes

## Iteration 1
- Prompt: "Implement search_notes..."
- Result: 6/9 tests passed
- Failures:
  - test_title_before_body: returned body matches first
  - test_empty_keyword: returned [] instead of all notes
  - test_case_insensitive: missed .lower() on body
- Fix: Re-prompted with failure output

## Iteration 2
- Result: 9/9 tests passed
- Fix applied: AI separated title matches and body matches into two lists, then concatenated

Step 6: Read (2 minutes)

All tests are GREEN. Now apply PRIMM: predict before you verify.

Pick an input that is NOT in your test suite. For example: a keyword that appears as a substring of a word (searching for "cook" in a note titled "Cooking Pasta"). Predict what search_notes returns for that input. Then run it and compare.
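One way to run that check, reusing the fixture data from Step 2 (the names assume the Step 1 sketch):

# Prediction first: does substring matching make "cook" hit the title
# "Cooking Pasta"? Write your answer down, then run and compare.
from smartnotes_search import Note, search_notes

notes = [Note(title="Cooking Pasta", body="Boil water and add salt", word_count=30, tags=["cooking"])]
print([n.title for n in search_notes(notes, "cook")])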

Check for hardcoded values. Does the implementation use the keyword parameter in comparisons, or does it check for specific strings like "Python" or "pasta"? A hardcoded implementation passes your tests but fails on any real input.
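A hardcoded implementation looks something like this hypothetical fragment: the keyword parameter is dead weight, and the literals come straight from the test data.

def search_notes(notes: list[Note], keyword: str, tag: str | None = None) -> list[Note]:
    # Red flag: keyword is never used, so the tests pass but any other
    # real input fails.
    return [n for n in notes if "python" in n.title.lower() or "pasta" in n.body.lower()]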


Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Review My Search Implementation

After completing the capstone, paste your generated implementation into Claude Code:

Here is my search_notes implementation:

[paste generated code]

Evaluate the algorithm quality. Is it using substring matching
correctly? Is the sorting approach efficient? What would break
if the notes list had 10,000 entries?

What you're learning: You are moving beyond "does it pass tests" to "is it well-engineered." The AI acts as a code reviewer who can identify performance and correctness concerns that your test suite does not cover.

Prompt 2: What Edge Case Did I Miss?

Here is my test suite for search_notes:

[paste test file]

Suggest 3 test cases I did not write. For each one, explain
what behavior it would catch that my existing tests miss.

Run the suggested tests against your implementation. If any fail, you found a gap in your specification.

What you're learning: No test suite is complete. The AI can suggest edge cases you overlooked because it has seen thousands of search implementations. The goal is not perfection; it is expanding your instinct for what "correct" means.

Prompt 3: Rate My TDG Cycle

I just completed a TDG cycle on a search function. Here is what
I did:

1. Wrote a function stub from a problem statement
2. Wrote [N] tests
3. Prompted AI to implement
4. [describe: passed first try / needed N re-prompts]
5. Reviewed with PRIMM: predicted output for [describe input]
6. Checked for hardcoded values: [found / not found]

Rate my cycle. What did I do well? What should I improve
next time?

What you're learning: You are evaluating your own process, not just the code. The AI rates your TDG workflow, not the implementation. This builds metacognitive awareness: understanding how you work, not just what you produce.


James hears the door open. Emma is back. She glances at his screen: a green bar, a test file with nine tests, and a journal file with two iterations documented.

"Walk me through it," she says.

"The problem statement said case-insensitive matching, title before body, optional tag filter. I started with the stub: search_notes takes a list of Note, a keyword string, and an optional tag. Returns list[Note]." He points to the docstring. "I wrote five edge cases into the docstring before I wrote a single test."

"And the tests?"

"Nine. Three for keyword matching, one for case sensitivity, two for the tag filter, one for ordering, one for empty keyword, one for empty list. The first AI implementation failed on ordering and empty keyword. I re-prompted with the failure output and it fixed both."

Emma pulls up the journal. "Two iterations. Not bad. What did you check after the tests passed?"

"Searched for 'cook' in a note titled 'Cooking Pasta.' The AI used substring matching with .lower(), so it caught partial matches. No hardcoded strings."

She nods. "You drove the whole cycle. Specification, tests, generation, verification, debugging, review." She pauses. "Take a step back. What do you actually have now that you did not have at the start of Chapter 56?"

James thinks. "I can debug and I can build. The debugging loop from Chapter 56 catches what the tests miss. The tests catch what my eyes miss. The types catch what the tests cannot reach."

"What is that, structurally?"

"A pyramid." He sketches on the whiteboard. "Types at the base: they reject entire categories of errors at rest. Tests in the middle: they verify behavior for specific inputs. Human review at the top: I check for things neither tool can see, like whether the algorithm generalizes beyond the test data."

"The verification pyramid," Emma says. "Types, tests, human judgment. You just drove the whole thing independently. No scaffolding, no prompts from me, no step-by-step instructions."

James looks at the drawing. "In the warehouse, every shipment went through three checkpoints before it reached the floor. Barcode scan, weight check, visual inspection. If any checkpoint failed, the shipment went back. Same idea: if types fail, fix the specification. If tests fail, fix the implementation. If the code looks hardcoded, fix the algorithm."

"Final inspection before shipping," Emma says. "That is Phase 4."

She erases the whiteboard and writes a new line:

You have been specifying functions. But SmartNotes is growing. A list of functions is not an architecture. You need objects that model your domain: Notes with behavior, Collections that manage them. That is Phase 5: the Python object model.

James stares at the whiteboard. "So I stop writing standalone functions and start building structures?"

"Exactly. The dataclass you have been using for Note is the simplest form of it. Phase 5 gives it behavior: methods that belong to the note, not floating beside it. Your TDG cycle does not change. Your building blocks do."