
Test Structure: Arrange, Act, Assert

James scrolls through his SmartNotes test file. Fifteen test functions, and every single one looks different. Some cram everything into one line. Others sprawl across eight lines with no clear separation between setup, action, and checking. He can run them all and they pass, but when one fails, he stares at the function trying to figure out which part broke.

"You've been doing something right without knowing it," Emma says. She highlights three tests James wrote in Chapter 51. "Look at this one. You create a Note. You call a function. You check the result. Three phases, every time. That pattern has a name."

If you're new to programming

A test checks that your code does what you expect. You have been writing tests since Chapter 46. This lesson does not introduce tests; it names a structure you have already been using. Giving the structure a name makes it easier to spot when a test is missing a piece.

If you have testing experience from another language

The Arrange-Act-Assert (AAA) pattern is sometimes called Given-When-Then in BDD frameworks. Python's pytest uses plain functions rather than class hierarchies. If you are coming from JUnit or NUnit, the pattern is identical; the ceremony is lighter.


The Three Phases

Every well-structured test has three phases:

  1. Arrange: Set up the data and objects the test needs.
  2. Act: Call the function or method you are testing.
  3. Assert: Check that the result matches your expectation.

Here is a test you might have written in Chapter 51, with the phases labeled:

from smartnotes.models import Note


def test_note_stores_title() -> None:
    # Arrange
    note: Note = Note(title="Meeting Notes", body="Discuss roadmap", word_count=2)

    # Act
    result: str = note.title

    # Assert
    assert result == "Meeting Notes"

The comments are optional. What matters is that you can point at any line and say which phase it belongs to. If you cannot, the test needs restructuring.


Why Separation Matters

Consider this flat test with no separation:

def test_note_stuff() -> None:
    assert Note(title="A", body="B", word_count=1).title == "A"

It works. It passes. But when it fails, pytest tells you the assertion on line 2 failed. You cannot tell whether the problem is in the Note constructor (Arrange), the attribute access (Act), or the comparison (Assert). Splitting the phases gives you a clear diagnostic target.

Compare:

def test_note_stores_title() -> None:
    # Arrange
    note: Note = Note(title="A", body="B", word_count=1)

    # Act
    result: str = note.title

    # Assert
    assert result == "A"

If this fails on the assert, you know the Note was constructed without error (Arrange succeeded), the attribute was accessed without error (Act succeeded), and the comparison found a mismatch (Assert reports the difference). Three phases, three diagnostic checkpoints.


Test Naming Convention

A good test name tells you what failed without reading the code. The convention:

test_<what>_<condition>_<expected>

Some examples of the convention, and what each name tells you:

  1. test_categorize_note_high_word_count_returns_long: tests categorize_note when the word count is high; expects "long".
  2. test_note_empty_body_stores_empty_string: tests Note with an empty body; expects an empty string stored.
  3. test_count_words_single_word_returns_one: tests count_words with one word; expects 1.

When pytest reports a failure, you see the function name in the output. A name like test_stuff_works tells you nothing. A name like test_categorize_note_zero_words_returns_short tells you exactly what broke and under what condition.
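Here is the first of those names written out as a complete AAA test. The categorize_note function is a hypothetical stand-in here; the real SmartNotes function may use different thresholds:

```python
def categorize_note(word_count: int) -> str:
    # Hypothetical stand-in: thresholds are assumptions for this sketch.
    if word_count > 1000:
        return "long"
    if word_count > 100:
        return "medium"
    return "short"


def test_categorize_note_high_word_count_returns_long() -> None:
    # Arrange
    word_count: int = 1500

    # Act
    result: str = categorize_note(word_count)

    # Assert
    assert result == "long"
```

Read the test name back as a sentence: testing categorize_note, when the word count is high, it returns "long". The name alone restates the specification.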


Test Discovery: How pytest Finds Your Tests

pytest does not need a list of test functions. It discovers them automatically using two rules:

  1. File names: pytest looks for files named test_*.py or *_test.py.
  2. Function names: Inside those files, pytest runs every function whose name starts with test_.

smartnotes/
├── models.py
└── categorize.py
tests/
├── test_models.py       # pytest finds this file
├── test_categorize.py   # pytest finds this file
└── helpers.py           # pytest ignores this (no test_ prefix)

Inside test_models.py:

def test_note_stores_title() -> None:   # pytest runs this
    ...


def test_note_stores_body() -> None:    # pytest runs this
    ...


def helper_create_note() -> Note:       # pytest ignores this (no test_ prefix)
    ...

You do not register tests anywhere. You do not import them into a runner. Name the file and the function correctly, and pytest handles the rest.
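The two default rules can be sketched as plain predicate functions. This is a rough model of pytest's defaults, not its actual implementation; the real patterns are configurable through the python_files and python_functions settings:

```python
import fnmatch


def is_test_file(filename: str) -> bool:
    # Default pytest file patterns: test_*.py or *_test.py.
    return fnmatch.fnmatch(filename, "test_*.py") or fnmatch.fnmatch(
        filename, "*_test.py"
    )


def is_test_function(name: str) -> bool:
    # Inside a collected file, functions whose names start with test_ are run.
    return name.startswith("test_")
```

Applied to the tree above: is_test_file("test_models.py") is True, is_test_file("helpers.py") is False, and is_test_function("helper_create_note") is False.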


Assertion Introspection: Smart Diffs

When a plain assert fails, pytest does not just say "assertion failed." It shows you the actual values on both sides:

def test_categorize_note_medium() -> None:
    # Arrange
    word_count: int = 500

    # Act
    result: str = categorize_by_count(word_count)

    # Assert
    assert result == "medium"

If categorize_by_count(500) returns "long" instead of "medium", pytest shows:

AssertionError: assert 'long' == 'medium'

For longer strings or lists, pytest highlights exactly which characters or elements differ. This is called assertion introspection: pytest inspects the assert statement at runtime and extracts the left and right values to show you a meaningful diff. You do not need special assertion methods like assertEqual. A plain assert gives you full diagnostic information.
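To see introspection on a sequence, try a test like this sketch (categorize_many is a hypothetical helper, not part of SmartNotes). If you change the expected list to a wrong value and run pytest, the output marks exactly which element differs:

```python
def categorize_many(word_counts: list[int]) -> list[str]:
    # Hypothetical helper: "long" above 1000 words, otherwise "short".
    return ["long" if count > 1000 else "short" for count in word_counts]


def test_categorize_many_mixed_counts_returns_labels() -> None:
    # Arrange
    word_counts: list[int] = [50, 1500, 200]

    # Act
    result: list[str] = categorize_many(word_counts)

    # Assert -- on failure, pytest points at the first differing element.
    assert result == ["short", "long", "short"]
```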


Test Isolation: Each Test Runs Independently

Every test function runs in its own world. If test_a creates a Note and modifies it, test_b starts fresh. No test should depend on another test running first, and as long as each test builds its own data, no test can corrupt another's.

def test_first() -> None:
    note: Note = Note(title="Alpha", body="First", word_count=1)
    assert note.title == "Alpha"


def test_second() -> None:
    # This test does NOT see the note from test_first.
    # It starts with a clean slate.
    note: Note = Note(title="Beta", body="Second", word_count=1)
    assert note.title == "Beta"

This means you can run tests in any order and get the same results. You can run a single test with uv run pytest test_models.py::test_second and it will pass regardless of whether test_first ran before it. Isolation is what makes large test suites reliable.


Refactoring a Flat Test into AAA

Here is a test from a Chapter 51 TDG cycle, written quickly:

def test_cat() -> None:
    assert categorize_by_count(1500) == "long"

This works, but it violates two principles: the name is vague, and the three phases are compressed into one line. Refactored:

def test_categorize_by_count_above_thousand_returns_long() -> None:
    # Arrange
    word_count: int = 1500

    # Act
    result: str = categorize_by_count(word_count)

    # Assert
    assert result == "long"

The refactored version is longer but tells you exactly what is being tested, under what conditions, and what the expected outcome is. When this test fails six months from now, the name alone tells you where to look.


PRIMM-AI+ Practice: Structuring Tests

Predict [AI-FREE]

Look at this test without running it. Identify which lines belong to Arrange, Act, and Assert. Write your answers and a confidence score from 1 to 5 before checking.

def test_count_tags_multiple_tags_returns_correct_count() -> None:
    note: Note = Note(title="Ideas", body="Some text", word_count=2)
    tags: list[str] = ["python", "testing", "pytest"]
    result: int = len(tags)
    assert result == 3
    assert note.title == "Ideas"

Questions:

  1. Which lines are Arrange?
  2. Which line is Act?
  3. Which lines are Assert?
  4. Is there a problem with this test's structure?
Check your predictions

Lines 2-3 are Arrange: creating the Note and the tags list.

Line 4 is Act: calling len(tags).

Lines 5-6 are Assert: checking the count and the title.

The structural problem: The second assert (note.title == "Ideas") tests something unrelated to the function being tested (len). This test is checking two different things. A well-structured test checks one behavior. Split this into two tests: one for tag counting, one for title storage.
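The split might look like this sketch, with a minimal dataclass standing in for smartnotes.models.Note:

```python
from dataclasses import dataclass


@dataclass
class Note:
    # Minimal stand-in for smartnotes.models.Note (an assumption).
    title: str
    body: str
    word_count: int


def test_count_tags_multiple_tags_returns_correct_count() -> None:
    # Arrange
    tags: list[str] = ["python", "testing", "pytest"]

    # Act
    result: int = len(tags)

    # Assert
    assert result == 3


def test_note_stores_title() -> None:
    # Arrange
    note: Note = Note(title="Ideas", body="Some text", word_count=2)

    # Act
    result: str = note.title

    # Assert
    assert result == "Ideas"
```

Each test now checks exactly one behavior, so a failure in either one points at a single cause.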

Run

Create a file called test_aaa_practice.py. Write these three tests using the AAA pattern:

  1. test_note_stores_body: Create a Note, access .body, assert it matches.
  2. test_categorize_by_count_low_returns_short: Arrange a word count of 50, call categorize_by_count, assert "short".
  3. test_note_word_count_matches_input: Create a Note with word_count=42, access .word_count, assert it equals 42.

Run uv run pytest test_aaa_practice.py -v and confirm all three pass. The -v flag shows each test name, so you can verify your naming convention.

Investigate

Look at the verbose pytest output. For each test, confirm you can identify which phase failed if you changed the expected value. Try changing one assert to a wrong value and observe the assertion introspection output.

Modify

Rename one of your tests to test_thing(). Run pytest again. Notice how the output becomes less useful when the name is vague. Rename it back to the descriptive version.

Make [Mastery Gate]

Without looking at any examples, write a test file called test_aaa_mastery.py with four tests:

  1. A test for Note title storage (AAA structure, descriptive name)
  2. A test for Note body storage (AAA structure, descriptive name)
  3. A test for categorize_by_count returning "medium" (AAA structure, descriptive name)
  4. A test for categorize_by_count returning "long" (AAA structure, descriptive name)

All four must have clear Arrange, Act, Assert separation. All four must follow the test_<what>_<condition>_<expected> naming convention. Run uv run pytest test_aaa_mastery.py -v and confirm all pass.


Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Identify the Pattern

Here is one of my test functions:

def test_cat() -> None:
    assert categorize_by_count(1500) == "long"

Refactor it into the Arrange-Act-Assert pattern with a
descriptive test name following the convention
test_<what>_<condition>_<expected>. Keep type annotations
on all variables.

Compare the AI's refactored version to the pattern taught in this lesson. Does it separate the three phases? Does the name tell you what the test checks?

What you're learning: You are evaluating whether AI-generated test code follows a structural pattern you now understand.

Prompt 2: Generate and Review

Write 3 test functions for a function called count_words(text: str) -> int
that counts words in a string. Use the AAA pattern. Name each test
using the test_<what>_<condition>_<expected> convention. Type-annotate
all variables.

Read each test the AI produces. Check: are the three phases clearly separated? Are the names descriptive? Is every variable type-annotated? If anything is off, tell the AI what to fix.

What you're learning: You are applying the AAA pattern as a review checklist for AI-generated tests.


Key Takeaways

  1. Arrange-Act-Assert gives every test a clear structure. Arrange sets up data, Act calls the function, Assert checks the result. When a test fails, you know which phase to investigate.

  2. Descriptive names tell you what failed without reading code. The convention test_<what>_<condition>_<expected> turns the test name into a sentence that describes the specification.

  3. pytest discovers tests automatically. Name your files test_*.py and your functions test_*, and pytest finds them. No registration, no configuration.

  4. Assertion introspection shows you the actual vs. expected values. A plain assert in pytest gives you a meaningful diff, not just "assertion failed."

  5. Test isolation means every test runs independently. No test depends on another test running first. You can run any test alone and get the same result.


Looking Ahead

You can now structure individual tests clearly. But you are still duplicating setup code across every test. In Lesson 2, you will learn @pytest.fixture, which lets you write setup code once and share it across every test that needs it.