Testing Dataclass-Based Code

James looks at his SmartNotes project. Every function passes notes around as dict[str, str]. Every function is one typo away from a KeyError. He knows the fix now: replace the dicts with a Note dataclass.

"Where do I start?" he asks.

Emma pulls up the test file. "Start with the data model. Define the Note class. Then change one function at a time. After each change, run the tests. If they pass, move on. If they fail, you know exactly which function broke."

James nods. "Tests as my safety net."

"Tests as your specification," Emma corrects. "The tests describe what each function should do. The dataclass describes what a note looks like. Together, they leave almost nothing to guesswork."

If you're new to programming

This lesson ties together everything from Chapter 51. You will define a Note dataclass, rewrite SmartNotes functions to use it, and write tests that create Note instances. By the end, your SmartNotes project will be structured, type-checked, and tested.

The Note Dataclass

Here is the complete Note dataclass for SmartNotes. It replaces every dict[str, str] in your project:

from dataclasses import dataclass, field

@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)

Compare this to the dict approach from Lesson 1:

Dict approach	Dataclass approach
`note["title"]`	`note.title`
`note["body"]`	`note.body`
No defined shape	Shape defined by class fields
Pyright: 0 warnings on typos	Pyright: error on every typo
No autocomplete on keys	Full autocomplete on fields

Every field has a name and a type. Three fields have defaults. The tags field uses field(default_factory=list) from Lesson 4. This class is the single source of truth for what a note contains.
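As a minimal sketch of how those defaults behave, assuming the Note class exactly as defined above (the variable names first and second are illustrative only), the following shows that only the three fields without defaults are required and that each instance gets its own tags list:

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


# Only title, body, and word_count must be supplied.
first: Note = Note(title="Groceries", body="Milk and eggs", word_count=3)
second: Note = Note(title="Ideas", body="Build a robot", word_count=3)

# default_factory=list calls list() once per instance,
# so appending to one note's tags does not touch the other's.
first.tags.append("shopping")

print(first.tags)    # ['shopping']
print(second.tags)   # []
print(first.author)  # Anonymous
```

If the two notes shared one list, second.tags would also contain "shopping". The factory avoids that.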

Transforming SmartNotes Functions

create_note

Before (dict-based):

def create_note(title: str, body: str, author: str = "Anonymous") -> dict[str, str]:
    return {
        "title": title,
        "body": body,
        "author": author,
        "word_count": str(len(body.split())),
    }

After (dataclass-based):

def create_note(title: str, body: str, author: str = "Anonymous") -> Note:
    """Create a new Note with an automatically calculated word count."""
    word_count: int = len(body.split())
    return Note(
        title=title,
        body=body,
        word_count=word_count,
        author=author,
    )

Notice two improvements. The return type changed from dict[str, str] to Note. The word_count is now an int instead of a string. With dicts, everything had to be str because the type was dict[str, str]. With a dataclass, each field has its own type.
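A short side-by-side sketch can make the type difference concrete. It assumes the Note class from this lesson; the name create_note_dict is hypothetical and only stands in for the "before" version so both can live in one file:

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


def create_note_dict(title: str, body: str, author: str = "Anonymous") -> dict[str, str]:
    """The dict-based 'before' version (renamed here for comparison)."""
    return {
        "title": title,
        "body": body,
        "author": author,
        "word_count": str(len(body.split())),
    }


def create_note(title: str, body: str, author: str = "Anonymous") -> Note:
    """The dataclass-based 'after' version."""
    return Note(title=title, body=body, word_count=len(body.split()), author=author)


old: dict[str, str] = create_note_dict("Test", "Hello world")
new: Note = create_note("Test", "Hello world")

print(type(old["word_count"]).__name__)  # str
print(type(new.word_count).__name__)     # int

# An int can be used in arithmetic directly, with no int(...) conversion.
print(new.word_count + 1)  # 3

# Edge case: an empty body splits into an empty list, so the count is 0.
print(create_note("Empty", "").word_count)  # 0
```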

note_word_count

Before:

def note_word_count(note: dict[str, str]) -> int:
    return int(note["word_count"])

After:

def note_word_count(note: Note) -> int:
    """Return the word count of a note."""
    return note.word_count

The function is simpler because note.word_count is already an int. No string conversion needed.

format_note_header

Before:

def format_note_header(note: dict[str, str]) -> str:
    return f"{note['title']} by {note['author']}"

After:

def format_note_header(note: Note) -> str:
    """Format a note's title and author as a header string."""
    return f"{note.title} by {note.author}"

Same logic, but note.title and note.author are checked by pyright and autocompleted by your editor.

merge_notes

Before:

def merge_notes(base: dict[str, str], override: dict[str, str]) -> dict[str, str]:
    merged: dict[str, str] = {}
    for key in base:
        merged[key] = base[key]
    for key in override:
        merged[key] = override[key]
    return merged

After:

def merge_notes(base: Note, override: Note) -> Note:
    """Merge two notes, using override's title and body with combined tags."""
    combined_tags: list[str] = base.tags + override.tags
    merged_word_count: int = len(override.body.split())
    return Note(
        title=override.title,
        body=override.body,
        word_count=merged_word_count,
        author=base.author,
        tags=combined_tags,
    )

The dataclass version is explicit about what "merge" means: take the override's title and body, keep the base's author, and combine both tag lists. The dict version blindly copied keys, which could introduce unexpected fields.
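To see what that explicit merge produces, here is a small sketch using the Note class and merge_notes from this lesson (the sample notes are made up for illustration). It also shows two consequences of the code as written: the input notes are not modified, and is_draft falls back to its default because merge_notes does not pass it along.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


def merge_notes(base: Note, override: Note) -> Note:
    """Merge two notes, using override's title and body with combined tags."""
    combined_tags: list[str] = base.tags + override.tags
    merged_word_count: int = len(override.body.split())
    return Note(
        title=override.title,
        body=override.body,
        word_count=merged_word_count,
        author=base.author,
        tags=combined_tags,
    )


base: Note = Note(
    title="Old", body="Original content", word_count=2,
    author="James", is_draft=False, tags=["python"],
)
override: Note = Note(
    title="New", body="Updated content here", word_count=3,
    author="Emma", tags=["ai"],
)
merged: Note = merge_notes(base, override)

print(merged.title)     # New
print(merged.author)    # James
print(merged.tags)      # ['python', 'ai']
print(merged.is_draft)  # True (the default, not base's False)

# base.tags + override.tags builds a new list, so the inputs are untouched.
print(base.tags)        # ['python']
```

Whether is_draft should be carried over from base or override is a design decision this version leaves to the default.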

Writing Tests with Dataclass Instances

Tests become clearer when they create and compare dataclass instances. Here is the full test file:

from dataclasses import dataclass, field

@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


def create_note(title: str, body: str, author: str = "Anonymous") -> Note:
    """Create a new Note with an automatically calculated word count."""
    word_count: int = len(body.split())
    return Note(
        title=title,
        body=body,
        word_count=word_count,
        author=author,
    )


def note_word_count(note: Note) -> int:
    """Return the word count of a note."""
    return note.word_count


def format_note_header(note: Note) -> str:
    """Format a note's title and author as a header string."""
    return f"{note.title} by {note.author}"


def merge_notes(base: Note, override: Note) -> Note:
    """Merge two notes, using override's title and body with combined tags."""
    combined_tags: list[str] = base.tags + override.tags
    merged_word_count: int = len(override.body.split())
    return Note(
        title=override.title,
        body=override.body,
        word_count=merged_word_count,
        author=base.author,
        tags=combined_tags,
    )


# --- Tests ---

def test_create_note_defaults() -> None:
    note: Note = create_note(title="Test", body="Hello world")
    assert note.title == "Test"
    assert note.body == "Hello world"
    assert note.word_count == 2
    assert note.author == "Anonymous"
    assert note.is_draft is True
    assert note.tags == []


def test_create_note_custom_author() -> None:
    note: Note = create_note(title="Meeting", body="Discussed items", author="James")
    assert note.author == "James"


def test_note_word_count() -> None:
    note: Note = create_note(title="Long Note", body="one two three four five")
    assert note_word_count(note) == 5


def test_format_note_header() -> None:
    note: Note = create_note(title="Weekly Review", body="Summary of the week")
    result: str = format_note_header(note)
    assert result == "Weekly Review by Anonymous"


def test_format_note_header_custom_author() -> None:
    note: Note = create_note(title="Plan", body="Ship it", author="Emma")
    result: str = format_note_header(note)
    assert result == "Plan by Emma"


def test_merge_notes_uses_override_content() -> None:
    base: Note = create_note(title="Old", body="Original content", author="James")
    override: Note = create_note(title="New", body="Updated content", author="Emma")
    result: Note = merge_notes(base, override)
    assert result.title == "New"
    assert result.body == "Updated content"


def test_merge_notes_keeps_base_author() -> None:
    base: Note = create_note(title="Old", body="Original", author="James")
    override: Note = create_note(title="New", body="Updated")
    result: Note = merge_notes(base, override)
    assert result.author == "James"


def test_merge_notes_combines_tags() -> None:
    base: Note = Note(
        title="A", body="First", word_count=1, tags=["python"]
    )
    override: Note = Note(
        title="B", body="Second", word_count=1, tags=["ai"]
    )
    result: Note = merge_notes(base, override)
    assert result.tags == ["python", "ai"]


def test_note_equality() -> None:
    note_a: Note = Note(title="Same", body="Content", word_count=1)
    note_b: Note = Note(title="Same", body="Content", word_count=1)
    assert note_a == note_b


def test_note_inequality() -> None:
    note_a: Note = Note(title="First", body="Content", word_count=1)
    note_b: Note = Note(title="Second", body="Content", word_count=1)
    assert note_a != note_b

Save this as test_smartnotes_dataclass.py and run:

uv run pytest test_smartnotes_dataclass.py -v

All 10 tests should pass. Each test builds Note instances (through create_note or the Note constructor), calls a function where one is under test, and asserts on the result. No dictionary keys to misspell. No KeyError surprises.
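Because @dataclass generates an __eq__ method that compares instances field by field, a test can also check a whole note in one assertion instead of one assertion per field. The sketch below assumes the Note class and create_note from this lesson; the test name is a hypothetical addition, not part of the file above.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


def create_note(title: str, body: str, author: str = "Anonymous") -> Note:
    """Create a new Note with an automatically calculated word count."""
    return Note(title=title, body=body, word_count=len(body.split()), author=author)


def test_create_note_matches_expected() -> None:
    # Build the note we expect, then compare every field at once.
    expected: Note = Note(title="Test", body="Hello world", word_count=2)
    actual: Note = create_note(title="Test", body="Hello world")
    assert actual == expected      # same field values
    assert actual is not expected  # but two separate objects
```

The trade-off: a single comparison is shorter, while per-field assertions give a more targeted failure message when only one field is wrong.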

What Dataclasses Do Not Catch

A gap you should know about

Try creating a note with the wrong types:

bad_note: Note = Note(title=999, body=True, word_count="five")

This runs without error. Python creates the instance and stores 999 as the title, True as the body, and "five" as the word count. The type annotations in a dataclass are hints for pyright and your editor, but Python does not enforce them at runtime.

Pyright will flag this code with type errors, which is good. But if the data comes from an external source (a JSON file, an API response, user input), pyright cannot check it, because the actual values only exist once the program is running.
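A minimal sketch of that gap, assuming the Note class from this lesson and a made-up JSON string standing in for a file or API response:

```python
import json
from dataclasses import dataclass, field


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


# Illustrative external data: word_count arrives as text, not a number.
raw: str = '{"title": "Imported", "body": "From a file", "word_count": "three"}'

# json.loads returns loosely typed data, so a static checker has
# nothing concrete to compare against the field annotations.
data = json.loads(raw)
note: Note = Note(**data)

# Python stored the string without complaint.
print(type(note.word_count).__name__)  # str
print(note.word_count)                 # three
```

The mistake only surfaces later, when some function treats note.word_count as a number.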

Phase 3b introduces Pydantic, which validates types at runtime. Pydantic raises an error if you pass 999 where a str is expected. For now, dataclasses give you editor support and static checking. Pydantic adds the runtime safety layer on top.

PRIMM-AI+ Practice: Full Transformation

Predict [AI-FREE]

Press Shift+Tab to enter Plan Mode before predicting.

Read the test_merge_notes_combines_tags test above. Without running it, predict what result.tags will contain and what result.author will be. Rate your confidence from 1 to 5.

Check your prediction

result.tags is ["python", "ai"] because merge_notes concatenates base.tags + override.tags.
result.author is "Anonymous" because the base note was created with Note(title="A", body="First", word_count=1, tags=["python"]), which uses the default author.

Run

Press Shift+Tab to exit Plan Mode.

Save the complete test file above as test_smartnotes_dataclass.py. Run uv run pytest test_smartnotes_dataclass.py -v and confirm all 10 tests pass.

Investigate

Introduce a typo in one of the functions. Change note.title to note.titel in format_note_header. Run uv run pyright test_smartnotes_dataclass.py first. Does pyright catch it? Then run uv run pytest. What error do you get? Compare this to Lesson 1, where pyright was silent about the same kind of typo in dict-based code.

If you want to go deeper, run /investigate @test_smartnotes_dataclass.py in Claude Code and ask how dataclass attribute checking differs from dict key checking.

Modify

Add a new function called is_long_note(note: Note) -> bool that returns True if the word count is greater than 100 and False otherwise. Write two tests: one for a long note and one for a short note. Run pytest to verify both pass.

Make [Mastery Gate]

Without looking at any examples, write a complete file that:

Defines the Note dataclass with all six fields
Writes a function summarize_note(note: Note) -> str that returns "{title} ({word_count} words)"
Writes two tests for summarize_note: one with the default author and one with a custom author (the author does not appear in the summary, but the function should still work correctly regardless of author)

Run uv run pytest to verify both tests pass. Run uv run pyright to verify zero type errors.

Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Review the Transformation

Here is my Note dataclass and create_note function:

@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)

def create_note(title: str, body: str, author: str = "Anonymous") -> Note:
    word_count: int = len(body.split())
    return Note(title=title, body=body, word_count=word_count, author=author)

Is there anything I should improve? Are there any edge cases
I am not handling?

Read the AI's suggestions. Does it mention empty strings, very long bodies, or other edge cases? Evaluate each suggestion: is it worth adding now, or is it premature complexity?

What you're learning: You are using the AI as a code reviewer, then applying your own judgment about which suggestions to accept. This is the core skill of working with AI assistants.

Prompt 2: Generate Additional Tests

Given the Note dataclass and create_note function above,
generate three additional test functions that cover edge
cases I might have missed. Each test should use type
annotations and clear assert statements.

Review the AI's tests. Do they test meaningful scenarios? Do the assertions check the right values? Run them alongside your existing tests to verify they pass.

What you're learning: You are evaluating AI-generated tests for quality and completeness, a skill you will use in every project.

Prompt 3: Compare Dict-Based Tests vs Dataclass-Based Tests

Show me the same test written two ways: once using dict[str, str]
and once using a Note dataclass. The test should verify that a
function returns the correct note title and word count. Explain
which version is safer and why.

Read both versions side by side. Count the ways the dict-based test could fail silently (misspelled key, wrong type, missing field). Then count the same risks in the dataclass version. The difference is the entire motivation for this chapter.

What you're learning: You are seeing the concrete, side-by-side payoff of switching from dicts to dataclasses, which reinforces why the transformation effort is worth it.

James runs uv run pytest -v and watches ten green checkmarks scroll past. "This is like completing a warehouse inventory audit. Every item has a barcode, every shelf is labeled, and the system rejects anything that does not scan."

Emma nods, then catches herself. "One thing I should flag: dataclasses do not validate types at runtime. If you pass title=999, Python accepts it. Pyright catches the mistake statically, but if the data comes from a JSON file or an API, you are on your own." She frowns. "I honestly am not sure where the line is between 'good enough' and 'needs runtime validation.' That is a judgment call we will revisit in Phase 3b with Pydantic."

"For now, pyright plus tests cover what I need?"

"For SmartNotes, yes. And speaking of tests, yours are solid but repetitive. You wrote the same setup code in ten functions. Chapter 52 teaches you fixtures to write that setup once, parametrize to collapse duplicate test functions into one, and coverage to prove you are not missing anything."
