Chapter 55: Validation with Pydantic
James opens the validate_note_data function from Chapter 54. Thirty-two lines of isinstance checks, raise TypeError, raise ValueError. Six fields, twelve tests. He scrolls through it again and shakes his head. "If I add one more field, I need another five lines of validation and two more tests."
Emma pulls up a new file. "Watch this." She types a class with six fields:
from pydantic import BaseModel, Field

class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str]
    is_draft: bool
"That replaces your entire validation function," she says.
James counts the lines. Six field declarations. No isinstance. No if checks. No raise. He looks at the 32-line function, then back at the six-line model. "Where did all the validation go?"
"Into the type annotations," Emma says. "Pydantic reads them and generates the validation for you. Same rules, less code, better error messages."
She pauses. "I'm still not sure when to use dataclasses vs Pydantic for internal data. We'll figure that out together in this chapter. But one thing is clear: when data arrives from outside your program, Pydantic is the right tool."
Why Pydantic Matters for Specification
In Chapter 54, you learned to validate data manually. Every field needed an isinstance check, a bounds check, and a raise statement. The validation was correct but verbose. Pydantic replaces that verbosity with declarations: you describe what valid data looks like, and Pydantic enforces the rules at runtime.
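To make the contrast concrete, here is a minimal sketch that checks a single field both ways. It assumes Pydantic v2; validate_title_manual, TitleOnly, manual_rejects, and pydantic_rejects are hypothetical names invented for this illustration, not code from Chapter 54.

```python
from pydantic import BaseModel, Field, ValidationError


def validate_title_manual(data: dict) -> str:
    """Hand-written check in the Chapter 54 style (hypothetical helper)."""
    title = data.get("title")
    if not isinstance(title, str):
        raise TypeError("title must be a str")
    if not 1 <= len(title) <= 200:
        raise ValueError("title must be 1 to 200 characters")
    return title


class TitleOnly(BaseModel):
    """Declarative version of the same rule (hypothetical model)."""
    title: str = Field(min_length=1, max_length=200)


def manual_rejects(data: dict) -> bool:
    """True if the hand-written check raises for this input."""
    try:
        validate_title_manual(data)
    except (TypeError, ValueError):
        return True
    return False


def pydantic_rejects(data: dict) -> bool:
    """True if the Pydantic model raises ValidationError for this input."""
    try:
        TitleOnly(**data)
    except ValidationError:
        return True
    return False


# Compare both approaches on a valid title, an empty title, and a missing title.
for candidate in ({"title": "Groceries"}, {"title": ""}, {}):
    print(candidate, manual_rejects(candidate), pydantic_rejects(candidate))
```

The two approaches agree on these inputs, but the rule itself shrinks from several statements to one annotation. One caveat worth knowing early: in its default mode Pydantic converts some compatible values (for example, the string "42" for an int field) instead of rejecting them, so it is not always a line-for-line match for strict isinstance checks.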
This connects directly to the specification mindset from Chapter 49. A function signature declares WHAT goes in and WHAT comes out. A Pydantic model declares what valid INPUT looks like. Together, they form a complete contract: the model validates the data, and the function processes it.
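One way to picture that contract is a function whose parameter type is the model itself. The sketch below assumes Pydantic v2; summarize is a hypothetical function used only to illustrate the idea.

```python
from pydantic import BaseModel, Field


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str]
    is_draft: bool


def summarize(note: NoteCreate) -> str:
    """Process a note that has already passed validation (hypothetical function)."""
    # No isinstance checks here: a NoteCreate instance was validated when it was created.
    return f"{note.title} by {note.author} ({note.word_count} words)"


note = NoteCreate(
    title="Standup",
    body="Ship the parser.",
    word_count=3,
    author="James",
    tags=["work"],
    is_draft=False,
)
print(summarize(note))
```

The signature tells the reader that summarize only ever receives validated data, so the function body can stay focused on its own job.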
Testing changes too. Instead of writing twelve tests for twelve isinstance checks, you write tests that verify Pydantic catches invalid data. The tests focus on WHAT should be rejected, not HOW the rejection happens.
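A sketch of what such tests might look like follows. It assumes Pydantic v2; rejected_fields and the VALID dictionary are hypothetical helpers for this example, and the test functions use plain assert so they run under pytest or on their own.

```python
from pydantic import BaseModel, Field, ValidationError


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str]
    is_draft: bool


# A known-good payload; each test changes exactly one field (hypothetical data).
VALID = {
    "title": "Groceries",
    "body": "Milk, eggs.",
    "word_count": 2,
    "author": "James",
    "tags": ["home"],
    "is_draft": True,
}


def rejected_fields(data: dict) -> set[str]:
    """Return the names of the fields Pydantic rejects (hypothetical helper)."""
    try:
        NoteCreate(**data)
    except ValidationError as exc:
        return {str(err["loc"][0]) for err in exc.errors()}
    return set()


def test_valid_note_is_accepted():
    assert rejected_fields(VALID) == set()


def test_empty_title_is_rejected():
    assert rejected_fields({**VALID, "title": ""}) == {"title"}


def test_negative_word_count_is_rejected():
    assert rejected_fields({**VALID, "word_count": -1}) == {"word_count"}
```

Each test names an input that must be rejected and the field responsible. None of them depends on how Pydantic performs the check internally.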
What You Will Learn
By the end of this chapter, you will be able to:
- Explain why dataclasses trust input blindly and why that matters for external data
- Define Pydantic BaseModel classes that validate types automatically on instantiation
- Add Field constraints (min_length, max_length, gt, ge) to enforce value rules
- Serialize models with model_dump() and parse JSON with model_validate_json()
- Apply the boundary pattern: Pydantic at the edges, dataclasses inside
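As a preview of the first objective, the sketch below shows the trust gap using only the standard library. The Note class here is a simplified, hypothetical stand-in for the dataclass from Chapter 51.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """Simplified stand-in for the Chapter 51 dataclass (hypothetical fields)."""
    title: str
    word_count: int
    tags: list[str] = field(default_factory=list)


# The annotations say str and int, but a dataclass never checks them at runtime.
suspicious = Note(title=12345, word_count="lots")

print(type(suspicious.title).__name__)       # the title is stored as given
print(type(suspicious.word_count).__name__)  # so is the word count
```

No exception is raised. The wrong types sit inside the object until some later line of code trips over them, which is why unchecked external data is risky.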
Chapter Lessons
| Lesson | Title | What You Do | Duration |
|---|---|---|---|
| 1 | Why Dataclasses Trust Blindly | Pass wrong types to a dataclass, see that Python does not complain, understand the trust gap | 20 min |
| 2 | BaseModel: Runtime Validation from Types | Install Pydantic, define a BaseModel, see automatic validation, compare to manual code | 25 min |
| 3 | Field Constraints and Error Messages | Add min_length, max_length, gt, ge constraints, catch ValidationError, read structured errors | 20 min |
| 4 | Serialization and the Boundary Pattern | Convert models to dicts and JSON, parse JSON directly, apply the boundary pattern | 20 min |
| 5 | SmartNotes Boundary TDG | Complete a TDG with NoteCreate at the boundary, @dataclass Note inside, full test suite | 15 min |
| 6 | Chapter 55 Quiz | 50 scenario-based questions covering all Pydantic concepts | 25 min |
PRIMM-AI+ in This Chapter
Every lesson includes a PRIMM-AI+ Practice section following the five-stage cycle from Chapter 42. This is Phase 3: you are now WRITING validation models, building on the error handling (Chapter 54) and dataclass patterns (Chapter 51) you already own.
| Stage | What You Do | What It Builds |
|---|---|---|
| Predict [AI-FREE] | Predict whether Pydantic accepts or rejects given input, with a confidence score (1-5) | Calibrates your validation intuition |
| Run | Execute the code or run pytest, compare to your prediction | Creates the feedback loop |
| Investigate | Inspect the ValidationError output and trace which field and constraint triggered it | Makes your validation reasoning visible |
| Modify | Change a constraint or input value and predict the new result | Tests whether your understanding transfers |
| Make [Mastery Gate] | Write a Pydantic model from scratch with constraints and tests for every rule | Proves you can specify validation independently |
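For the Investigate stage, the errors() method on ValidationError is a convenient starting point: it returns one dictionary per failed rule. The sketch below assumes Pydantic v2; failed_rules is a hypothetical helper written for this illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []
    is_draft: bool = False


def failed_rules(**data) -> list[tuple[str, str]]:
    """Return (field name, error type) pairs for invalid input (hypothetical helper)."""
    try:
        NoteCreate(**data)
    except ValidationError as exc:
        # Each entry carries "loc" (where), "type" (which rule), and "msg" (description).
        return [(str(err["loc"][0]), err["type"]) for err in exc.errors()]
    return []


report = failed_rules(title="", body="x", word_count=-1, author="J")
for field_name, rule in report:
    print(f"{field_name}: {rule}")
```

Before running it, predict which fields appear in the report. Then compare the printed error types with the constraints declared on the model.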
Syntax Card: Chapter 55
Reference this card while working through the lessons. Every construct shown here appears in at least one lesson.
# -- BaseModel (replaces manual validation) --------------------
from pydantic import BaseModel, Field, ValidationError

class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []  # Safe default (no default_factory needed)
    is_draft: bool = False

# -- Instantiation (validates automatically) -------------------
note = NoteCreate(
    title="My Note",
    body="Content here.",
    word_count=42,
    author="James",
)
# tags defaults to [], is_draft defaults to False

# -- Catching validation errors --------------------------------
try:
    bad = NoteCreate(title="", body="x", word_count=-1, author="J")
except ValidationError as e:
    print(e)  # Structured, multi-field error report

# -- Serialization ---------------------------------------------
note.model_dump()       # Returns a dict
note.model_dump_json()  # Returns a JSON string

# -- Parsing JSON directly -------------------------------------
json_string: str = '{"title": "Hi", "body": "x", "word_count": 1, "author": "J"}'
parsed = NoteCreate.model_validate_json(json_string)

# -- Boundary pattern ------------------------------------------
# External Data -> [NoteCreate (BaseModel)] -> [Note (@dataclass)] -> App Logic
Prerequisites
Before starting this chapter, you should be able to:
- Define @dataclass classes with typed fields and defaults (Chapter 51)
- Write try/except blocks and catch specific exception types (Chapter 54 Lesson 1)
- Write manual validation functions with isinstance checks (Chapter 54 Lesson 5)
- Complete full TDG (Type-Driven Generation) cycles: stub, test, generate (Chapter 46)
- Load JSON files using with open and json.load (Chapter 54 Lesson 4)
The SmartNotes Connection
At the end of this chapter, you will build the boundary layer for SmartNotes. External data (JSON files, user input) enters through a NoteCreate Pydantic model that validates every field. Once validated, the data converts to your familiar @dataclass Note for use inside the application. This separation keeps validation at the edges and clean data structures at the core. You will write tests for valid input, invalid types, constraint violations, missing fields, and malformed JSON.
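A rough sketch of that boundary layer is shown below, assuming Pydantic v2. The Note fields and the load_note function are hypothetical: the real SmartNotes dataclass and the function you build in Lesson 5 may differ.

```python
from dataclasses import dataclass, field
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class NoteCreate(BaseModel):
    """Boundary model: validates every field of incoming data."""
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []
    is_draft: bool = False


@dataclass
class Note:
    """Internal structure (hypothetical fields mirroring NoteCreate)."""
    title: str
    body: str
    word_count: int
    author: str
    tags: list[str] = field(default_factory=list)
    is_draft: bool = False


def load_note(raw_json: str) -> Optional[Note]:
    """Validate external JSON, then hand clean data to the core (hypothetical function)."""
    try:
        checked = NoteCreate.model_validate_json(raw_json)
    except ValidationError:
        # Covers wrong types, broken constraints, missing fields, and malformed JSON.
        # Returning None keeps the sketch short; real code may prefer to report the error.
        return None
    return Note(**checked.model_dump())


good = load_note('{"title": "Hi", "body": "x", "word_count": 1, "author": "J"}')
missing = load_note('{"title": "Hi", "body": "x", "word_count": 1}')
broken = load_note('{"title": "Hi", ')
print(good, missing, broken)
```

Only load_note touches Pydantic. Everything past it works with a plain Note, which is the separation the boundary pattern is after.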