Coverage: Proving What's Tested

James runs uv run pytest and all 24 tests pass. Green checkmarks fill his terminal. He leans back. "Done," he says.

Emma does not look convinced. "How do you know those 24 tests cover everything?" She points at categorize.py. "You have a function in there with four branches. How many of those branches does a test actually execute?"

James opens the file. He counts the branches, counts the tests, and realizes he has no idea which lines of code actually ran during the test suite. Tests passing tells him the tested code works. It says nothing about the untested code.

"There is a tool that answers that question," Emma says. "It watches which lines run during your tests and reports which ones never did."

If you're new to programming

Test coverage measures which lines of your code actually execute when you run your tests. If you have 100 lines of code and your tests execute 75 of them, you have 75% coverage. The other 25 lines have never been tested. Coverage does not tell you whether your tests are good; it tells you where tests are missing entirely.

If you have testing experience from another language

pytest-cov wraps Coverage.py, which instruments Python bytecode. It is comparable to Istanbul/nyc for JavaScript, JaCoCo for Java, or SimpleCov for Ruby. Line coverage is the default; branch coverage is available with --cov-branch.


Installing pytest-cov

You install pytest-cov as a development dependency, just like you installed pytest in Chapter 44:

uv add --dev pytest-cov

This is the same uv add --dev command you used before. The --dev flag means pytest-cov is a development tool, not part of your application. It does not ship to users; it helps you while building.

After installation, verify it works:

uv run pytest --co

The --co flag lists all collected tests without running them. If pytest-cov installed correctly, this command runs without errors.


Running Your First Coverage Report

Add the --cov flag to your pytest command, pointing it at your source directory:

uv run pytest --cov=smartnotes

This tells pytest: "Run all tests, and while running them, track which lines in the smartnotes/ package actually execute." After the tests finish, pytest-cov prints a coverage report:

Name                       Stmts   Miss  Cover   Missing
----------------------------------------------------------
smartnotes/__init__.py         0      0   100%
smartnotes/models.py           8      0   100%
smartnotes/categorize.py      12      3    75%   15-17
----------------------------------------------------------
TOTAL                         20      3    85%

Every column tells you something specific:

Column    What it means
Stmts     Total lines of executable code in the file (excludes comments and blank lines)
Miss      Lines that no test ever executed
Cover     Percentage of lines that were executed: (Stmts - Miss) / Stmts
Missing   The exact line numbers that were never run

Reading this report: smartnotes/models.py has 8 statements and all were executed (100% coverage). smartnotes/categorize.py has 12 statements, 3 were never executed (75% coverage), and the untested lines are 15, 16, and 17.
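The Cover column is plain arithmetic. A quick sanity check of categorize.py's 75%, using the Stmts and Miss numbers from the report above:

```python
# Numbers from the categorize.py row of the report above.
stmts = 12  # total executable statements
miss = 3    # statements no test executed

# Cover = (Stmts - Miss) / Stmts, expressed as a percentage
cover = (stmts - miss) / stmts * 100
print(f"{cover:.0f}%")  # 75%
```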


What the Missing Column Tells You

The Missing column is the most actionable part of the report. It tells you exactly where to look:

smartnotes/categorize.py         12      3    75%   15-17

Open categorize.py and go to lines 15-17. Those lines contain code that no test has ever run. Maybe it is a branch in an if/elif/else chain that no test input triggers. Maybe it is a helper function that nothing calls during tests. Either way, those lines are unverified.

This is the power of coverage: it turns "I think my tests are complete" into "I know lines 15-17 have never been tested."


Improving Coverage: Write the Missing Tests

Suppose lines 15-17 of categorize.py contain the "long" branch:

# categorize.py, lines 14-17
elif word_count > 1000:
    category = "long"
    return category
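The excerpt shows only the "long" branch. For context, here is one plausible full version of categorize_by_count with the four branches Emma mentioned; the 1000-word threshold comes from the excerpt, but the other boundaries and the ValueError are illustrative assumptions, not the book's actual values:

```python
def categorize_by_count(word_count: int) -> str:
    """Categorize a note by word count.

    Only the `> 1000` threshold appears in the lesson; the other
    branch boundaries here are illustrative assumptions.
    """
    if word_count < 0:
        raise ValueError(f"word_count must be non-negative, got {word_count}")
    elif word_count <= 100:
        category = "short"
    elif word_count <= 1000:
        category = "medium"
    else:  # word_count > 1000 -- the untested lines from the report
        category = "long"
    return category
```

With a sketch like this in front of you, the coverage question becomes concrete: which of these four paths does each test actually take?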

No test passes a word count above 1000. The fix is to write a test that does:

def test_categorize_by_count_above_thousand_returns_long() -> None:
    # Arrange
    word_count: int = 1500

    # Act
    result: str = categorize_by_count(word_count)

    # Assert
    assert result == "long"

Run coverage again:

uv run pytest --cov=smartnotes
Name                       Stmts   Miss  Cover   Missing
----------------------------------------------------------
smartnotes/__init__.py         0      0   100%
smartnotes/models.py           8      0   100%
smartnotes/categorize.py      12      0   100%
----------------------------------------------------------
TOTAL                         20      0   100%

Lines 15-17 now show as covered. The Missing column is empty. Every line of code in the project has been executed by at least one test.


Coverage Is Necessary but Not Sufficient

100% coverage does not mean your code is bug-free. It means every line ran during tests. A line can run and still produce wrong results if your assertions are weak. Consider:

def test_categorize_runs_without_crashing() -> None:
    categorize_by_count(1500)  # No assert!

This test executes the "long" branch, so coverage counts those lines as covered. But it does not check the return value. The function could return "wrong" and this test would still pass.

Coverage tells you where tests are missing. It does not tell you whether existing tests are strong. Think of coverage as a floor, not a ceiling: 0% coverage means you have proven nothing; 100% coverage means every line has run at least once; strong assertions make the tests meaningful.


SmartNotes Capstone: Tests as Specification

This is the culmination of Chapter 52. You will write a comprehensive test suite for SmartNotes that serves as a specification for the project.

Here is the challenge: build a test suite of 15-20 tests that uses every technique from this chapter.

Your Test Suite Should Include

Fixtures (from Lesson 2):

  • A conftest.py with at least two fixtures (e.g., sample_note, long_note)
  • At least one test file that uses fixtures from conftest.py

Parametrize (from Lesson 3):

  • At least one parametrized test with 4+ cases
  • Custom IDs for readability

Exception tests (from Lesson 4):

  • At least two pytest.raises tests for built-in exceptions
  • At least one test using match= to verify the error message

Edge cases (from Lesson 4):

  • Tests for zero, empty string, negative numbers, and boundary values

Coverage target:

  • Run uv run pytest --cov=smartnotes and aim for 90%+ coverage
  • Identify any remaining untested lines and decide whether they need tests
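Several of these techniques can be sketched in one file. The function body, thresholds, and error message below are illustrative assumptions standing in for your real smartnotes code, not the book's exact suite:

```python
import pytest


def categorize_by_count(word_count: int) -> str:
    # Stand-in for smartnotes.categorize.categorize_by_count;
    # the threshold and error message are assumptions.
    if word_count < 0:
        raise ValueError(f"word_count must be non-negative, got {word_count}")
    return "long" if word_count > 1000 else "short"


# Parametrize with 4 cases and custom IDs (Lesson 3 technique)
@pytest.mark.parametrize(
    "word_count, expected",
    [(0, "short"), (500, "short"), (1000, "short"), (1500, "long")],
    ids=["zero", "typical", "boundary", "above-threshold"],
)
def test_categorize_by_count_returns_category(word_count: int, expected: str) -> None:
    assert categorize_by_count(word_count) == expected


# Exception test with match= (Lesson 4 technique)
def test_categorize_by_count_negative_raises_value_error() -> None:
    with pytest.raises(ValueError, match="non-negative"):
        categorize_by_count(-5)
```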

Example Structure

tests/
├── conftest.py          # Shared fixtures
├── test_models.py       # Note dataclass tests
├── test_categorize.py   # categorize_by_count tests (parametrized)
└── test_edge_cases.py   # Edge case and exception tests

Write the tests. Run coverage. Read the report. If coverage is below 90%, check the Missing column and write the tests that fill the gaps.
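One possible shape for tests/conftest.py is sketched below. The Note fields used here (title, word_count) are assumptions; match them to your actual dataclass from Lesson 2, and import Note from smartnotes.models instead of defining it inline:

```python
from dataclasses import dataclass

import pytest


@dataclass
class Note:
    # Inline stand-in for smartnotes.models.Note so the sketch runs
    # on its own; the fields are assumptions.
    title: str
    word_count: int


@pytest.fixture
def sample_note() -> Note:
    """A short, typical note for everyday tests."""
    return Note(title="Groceries", word_count=12)


@pytest.fixture
def long_note() -> Note:
    """A note above the 1000-word threshold, for the 'long' branch."""
    return Note(title="Research dump", word_count=1500)
```

Any test file under tests/ can then request sample_note or long_note by parameter name, without importing conftest.py.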


Tests as Specification: The Big Idea

Look at your finished test suite. Each test function describes one behavior that your code must exhibit:

  • test_categorize_by_count_zero_returns_short specifies that zero words means "short"
  • test_int_rejects_non_numeric_raises_value_error specifies that non-numeric strings are invalid
  • test_note_stores_title specifies that the Note dataclass preserves the title field

Your test suite IS your specification. If someone new joins the project and asks "what does this code do?", the test names answer the question. If a test is missing, the behavior is unspecified. If it is not tested, it is not specified.

This is the core thesis of this section of the book. Tests are not just verification tools. They are executable documentation that defines what your code promises to do.


PRIMM-AI+ Practice: Coverage Analysis

Predict [AI-FREE]

Look at this code and test file. Without running coverage, predict which lines will be covered and which will be missing. Write your predictions and a confidence score from 1 to 5 before checking.

smartnotes/priority.py:

def prioritize(word_count: int, days_old: int) -> str:     # line 1
    """Assign priority based on length and age."""         # line 2
    if word_count > 1000:                                  # line 3
        return "review"                                    # line 4
    elif days_old > 30:                                    # line 5
        return "archive"                                   # line 6
    elif days_old > 7:                                     # line 7
        return "follow-up"                                 # line 8
    else:                                                  # line 9
        return "active"                                    # line 10

tests/test_priority.py:

from smartnotes.priority import prioritize


def test_long_note_gets_review() -> None:
    result: str = prioritize(1500, 5)
    assert result == "review"


def test_old_note_gets_archive() -> None:
    result: str = prioritize(100, 45)
    assert result == "archive"

Questions:

  1. Which lines will show as covered?
  2. Which lines will show in the Missing column?
  3. What is the approximate coverage percentage?
Check your predictions

Covered lines: 1, 2, 3, 4, 5, 6. The first test enters the word_count > 1000 branch (lines 3-4). The second test fails line 3's condition and enters the days_old > 30 branch (lines 5-6).

Missing lines: 7, 8, 9, 10. No test triggers the days_old > 7 branch or the else branch.

Coverage: 6 out of 10 executable lines = 60%. (Line 2 is a docstring; coverage tools may or may not count it depending on configuration. The approximation is close enough.)

To reach 100%, add tests for prioritize(50, 14) (follow-up) and prioritize(50, 3) (active).
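Written out, those two tests might look like this; prioritize is reproduced from above so the sketch runs on its own:

```python
def prioritize(word_count: int, days_old: int) -> str:
    """Assign priority based on length and age."""
    if word_count > 1000:
        return "review"
    elif days_old > 30:
        return "archive"
    elif days_old > 7:
        return "follow-up"
    else:
        return "active"


def test_recent_note_gets_follow_up() -> None:
    # days_old 14 fails the > 30 check but passes > 7
    result: str = prioritize(50, 14)
    assert result == "follow-up"


def test_fresh_note_gets_active() -> None:
    # days_old 3 fails every condition, so the else branch runs
    result: str = prioritize(50, 3)
    assert result == "active"
```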

Run

Create smartnotes/priority.py with the function above. Write four tests in test_priority.py that cover all four branches. Run uv run pytest --cov=smartnotes and verify the Missing column is empty for priority.py.

Investigate

Remove one of your four tests. Run coverage again. Note which lines appear in the Missing column. Does the Missing column match the branch you stopped testing? Add the test back.

Modify

Add a fifth branch to prioritize: if word_count == 0, return "empty". This new branch should appear in the Missing column when you run coverage (because no test covers it yet). Write the test, run coverage again, and confirm the line is now covered.
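A sketch of that modification and its covering test. Checking word_count == 0 first is one reasonable placement for the new branch; you could also order the conditions differently:

```python
def prioritize(word_count: int, days_old: int) -> str:
    """Assign priority based on length and age."""
    if word_count == 0:          # new fifth branch
        return "empty"
    elif word_count > 1000:
        return "review"
    elif days_old > 30:
        return "archive"
    elif days_old > 7:
        return "follow-up"
    else:
        return "active"


def test_zero_word_note_gets_empty() -> None:
    result: str = prioritize(0, 10)
    assert result == "empty"
```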

Make [Mastery Gate]

Without looking at any examples, create a complete test suite for a module with two functions:

Module (smartnotes/analyzer.py):

def count_words(text: str) -> int:
    """Count words in text. Empty string returns 0."""
    if text == "":
        return 0
    words: list[str] = text.split()
    return len(words)


def classify_length(word_count: int) -> str:
    """Classify text length."""
    if word_count > 500:
        return "long"
    elif word_count > 100:
        return "medium"
    else:
        return "short"

Your test suite must:

  1. Use at least one fixture in conftest.py
  2. Include at least one parametrized test with custom IDs
  3. Test the edge case of count_words("") returning 0
  4. Achieve 100% coverage (verify with uv run pytest --cov=smartnotes)
  5. Have every test name follow the test_<what>_<condition>_<expected> convention

Run coverage and confirm 100% with an empty Missing column.


Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Interpret a Coverage Report

Here is my coverage report:

Name                       Stmts   Miss  Cover   Missing
----------------------------------------------------------
smartnotes/models.py           8      0   100%
smartnotes/categorize.py      12      3    75%   15-17
smartnotes/utils.py           20      8    60%   5-7, 12-16
----------------------------------------------------------
TOTAL                         40     11    72%

Explain what this report tells me. Which file needs the most
attention? What should I do about lines 5-7 and 12-16 in utils.py?

Read the AI's interpretation. It should identify utils.py as the file with the lowest coverage and suggest looking at those specific lines to understand what code is untested. Compare its advice to what you learned in this lesson.

What you're learning: You are evaluating whether the AI's coverage analysis matches the report-reading skills you just developed.

Prompt 2: Generate Tests for Missing Lines

My coverage report shows these lines are untested in
smartnotes/categorize.py (lines 15-17):

elif word_count > 1000:
    category = "long"
    return category

Write a test that covers these lines. Use the AAA pattern,
a descriptive test name, and type annotations on all variables.

Review the AI's test. Does it pass a word count above 1000? Does it assert the return value is "long"? Does the name follow the convention? If the test is correct, add it to your test file and run coverage to confirm those lines are now covered.

What you're learning: You are using the AI to generate targeted tests based on coverage data, then verifying the result with the coverage tool.


Key Takeaways

  1. pytest-cov measures which lines of code your tests execute. Install it with uv add --dev pytest-cov and run with uv run pytest --cov=smartnotes.

  2. The Missing column tells you exactly which lines are untested. Open the file, go to those line numbers, and write a test that exercises that code.

  3. Coverage is necessary but not sufficient. 100% coverage means every line ran; it does not mean every line was checked for correctness. Combine coverage with strong assertions.

  4. Tests are specifications. Each test function describes one behavior your code must exhibit. If a behavior is not tested, it is not specified.

  5. The coverage workflow is: run, read, write, repeat. Run coverage, read the Missing column, write a test for the untested lines, run coverage again to confirm.


Looking Ahead

Phase 3b Preview

You have now built a solid foundation in testing. Your test suite serves as a specification for SmartNotes. But some capabilities are still missing from your Python toolkit: iterating on AI output when the first attempt is wrong (Ch 53), raising your own custom exceptions to handle errors gracefully (Ch 54), and validating external data at runtime with Pydantic (Ch 55). These topics are covered in the next chapters, where each new concept will immediately connect to the testing skills you have built here.

You completed Chapter 52. You can structure tests with AAA, share setup with fixtures, collapse duplication with parametrize, verify errors with pytest.raises, and measure completeness with coverage. Your tests are no longer just checks; they are a specification document that defines what SmartNotes promises to do.