Chapter 55: Validation with Pydantic

James opens the validate_note_data function from Chapter 54. Thirty-two lines of isinstance checks, raise TypeError, raise ValueError. Six fields, twelve tests. He scrolls through it again and shakes his head. "If I add one more field, I need another five lines of validation and two more tests."

Emma pulls up a new file. "Watch this." She types six lines:

from pydantic import BaseModel, Field


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str]
    is_draft: bool

"That replaces your entire validation function," she says.

James counts the lines. Six field declarations. No isinstance. No if checks. No raise. He looks at the 32-line function, then back at the six-line model. "Where did all the validation go?"

"Into the type annotations," Emma says. "Pydantic reads them and generates the validation for you. Same rules, less code, better error messages."

She pauses. "I'm still not sure when to use dataclasses vs Pydantic for internal data. We'll figure that out together in this chapter. But one thing is clear: when data arrives from outside your program, Pydantic is the right tool."

Why Pydantic Matters for Specification

In Chapter 54, you learned to validate data manually. Every field needed an isinstance check, a bounds check, and a raise statement. The validation was correct but verbose. Pydantic replaces that verbosity with declarations: you describe what valid data looks like, and Pydantic enforces the rules at runtime.
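To make the contrast concrete, here is a minimal sketch for a single field. The names validate_title_manually, TitleOnly, and title_is_valid are hypothetical and exist only for this illustration; the code assumes Pydantic v2 is installed.

```python
from pydantic import BaseModel, Field, ValidationError


def validate_title_manually(data: dict) -> str:
    """Chapter 54 style: explicit type check, bounds check, raise."""
    title = data.get("title")
    if not isinstance(title, str):
        raise TypeError("title must be a str")
    if not 1 <= len(title) <= 200:
        raise ValueError("title must be 1 to 200 characters long")
    return title


class TitleOnly(BaseModel):
    """Declarative style: the same rule stated as a type plus constraints."""
    title: str = Field(min_length=1, max_length=200)


def title_is_valid(data: dict) -> bool:
    """Return True when Pydantic accepts the data, False when it rejects it."""
    try:
        TitleOnly(**data)
    except ValidationError:
        return False
    return True


print(title_is_valid({"title": "Shopping list"}))
print(title_is_valid({"title": ""}))
print(title_is_valid({"title": 42}))
```

Both versions express the same rule. The manual function spells out how to check it, while the model only states what a valid title is.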

This connects directly to the specification mindset from Chapter 49. A function signature declares WHAT goes in and WHAT comes out. A Pydantic model declares what valid INPUT looks like. Together, they form a complete contract: the model validates the data, and the function processes it.
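One way to picture that contract is sketched below. The function summarize_note is a hypothetical processing step invented for this example; the model uses the field definitions from this chapter's Syntax Card, and Pydantic v2 is assumed.

```python
from pydantic import BaseModel, Field


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []
    is_draft: bool = False


def summarize_note(note: NoteCreate) -> str:
    """Hypothetical processing step: it relies on the model, so it re-checks nothing."""
    status = "draft" if note.is_draft else "published"
    return f"{note.title} by {note.author} ({note.word_count} words, {status})"


# The model validates on instantiation; the function only processes.
note = NoteCreate(title="My Note", body="Content here.", word_count=42, author="James")
print(summarize_note(note))
```

Because the signature asks for a NoteCreate, any value that reaches summarize_note has already passed validation.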

Testing changes too. Instead of writing twelve tests for twelve isinstance checks, you write tests that verify Pydantic catches invalid data. The tests focus on WHAT should be rejected, not HOW the rejection happens.
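A rough sketch of what such tests might look like with pytest follows. The test names, the VALID dictionary, and the choice of invalid inputs are assumptions made for illustration, not the chapter's actual test suite.

```python
import pytest
from pydantic import BaseModel, Field, ValidationError


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []
    is_draft: bool = False


# A known-good payload; each test overrides one field at a time.
VALID = {"title": "My Note", "body": "Content here.", "word_count": 42, "author": "James"}


def test_valid_note_is_accepted() -> None:
    note = NoteCreate(**VALID)
    assert note.tags == []
    assert note.is_draft is False


@pytest.mark.parametrize(
    "override",
    [
        {"title": ""},           # violates min_length=1
        {"title": "x" * 201},    # violates max_length=200
        {"body": ""},            # violates min_length=1
        {"word_count": -1},      # violates ge=0
        {"word_count": "many"},  # cannot be read as an integer
    ],
)
def test_invalid_note_is_rejected(override: dict) -> None:
    with pytest.raises(ValidationError):
        NoteCreate(**{**VALID, **override})


def test_missing_field_is_rejected() -> None:
    with pytest.raises(ValidationError):
        NoteCreate(title="Only a title")
```

Each test names an input that must be rejected and leaves the mechanics of the rejection to Pydantic.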

What You Will Learn

By the end of this chapter, you will be able to:

  • Explain why dataclasses trust input blindly and why that matters for external data
  • Define Pydantic BaseModel classes that validate types automatically on instantiation
  • Add Field constraints (min_length, max_length, gt, ge) to enforce value rules
  • Serialize models with model_dump() and parse JSON with model_validate_json()
  • Apply the boundary pattern: Pydantic at the edges, dataclasses inside
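The first objective can be previewed in a few lines using only the standard library. This is a minimal sketch; the two-field Note below is a hypothetical stand-in, not the Note class from Chapter 51.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    word_count: int


# Both arguments have the wrong type, yet no exception is raised:
# dataclass annotations are hints for tools, not runtime checks.
note = Note(title=12345, word_count="lots")

print(type(note.title).__name__)       # int
print(type(note.word_count).__name__)  # str
```

The wrong values sit inside the object until some later line of code trips over them, which is the trust gap Lesson 1 explores.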

Chapter Lessons

| Lesson | Title | What You Do | Duration |
| --- | --- | --- | --- |
| 1 | Why Dataclasses Trust Blindly | Pass wrong types to a dataclass, see that Python does not complain, understand the trust gap | 20 min |
| 2 | BaseModel: Runtime Validation from Types | Install Pydantic, define a BaseModel, see automatic validation, compare to manual code | 25 min |
| 3 | Field Constraints and Error Messages | Add min_length, max_length, gt, ge constraints, catch ValidationError, read structured errors | 20 min |
| 4 | Serialization and the Boundary Pattern | Convert models to dicts and JSON, parse JSON directly, apply the boundary pattern | 20 min |
| 5 | SmartNotes Boundary TDG | Complete a TDG with NoteCreate at the boundary, @dataclass Note inside, full test suite | 15 min |
| 6 | Chapter 55 Quiz | 50 scenario-based questions covering all Pydantic concepts | 25 min |

PRIMM-AI+ in This Chapter

Every lesson includes a PRIMM-AI+ Practice section following the five-stage cycle from Chapter 42. This is Phase 3: you are now WRITING validation models, building on the error handling (Chapter 54) and dataclass patterns (Chapter 51) you already own.

| Stage | What You Do | What It Builds |
| --- | --- | --- |
| Predict [AI-FREE] | Predict whether Pydantic accepts or rejects given input, with a confidence score (1-5) | Calibrates your validation intuition |
| Run | Execute the code or run pytest, compare to your prediction | Creates the feedback loop |
| Investigate | Inspect the ValidationError output and trace which field and constraint triggered it | Makes your validation reasoning visible |
| Modify | Change a constraint or input value and predict the new result | Tests whether your understanding transfers |
| Make [Mastery Gate] | Write a Pydantic model from scratch with constraints and tests for every rule | Proves you can specify validation independently |
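For the Investigate stage, one possible way to trace a failure back to its field and constraint is the errors() method on ValidationError, sketched here under the assumption of Pydantic v2. The variable name failures is just for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []
    is_draft: bool = False


failures: list[tuple[tuple, str]] = []

try:
    # Two rules are broken at once: empty title and negative word_count.
    NoteCreate(title="", body="x", word_count=-1, author="J")
except ValidationError as e:
    # Each entry is a dict; "loc" names the field, "type" names the broken rule.
    failures = [(err["loc"], err["type"]) for err in e.errors()]

for loc, kind in failures:
    print(loc, kind)
```

Pydantic reports every failing field in one exception, so a single run shows all the problems rather than only the first.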

Syntax Card: Chapter 55

Reference this card while working through the lessons. Every construct shown here appears in at least one lesson.

# -- BaseModel (replaces manual validation) --------------------
from pydantic import BaseModel, Field, ValidationError

class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []  # Safe default (no default_factory needed)
    is_draft: bool = False

# -- Instantiation (validates automatically) -------------------
note = NoteCreate(
    title="My Note",
    body="Content here.",
    word_count=42,
    author="James",
)
# tags defaults to [], is_draft defaults to False

# -- Catching validation errors --------------------------------
try:
    bad = NoteCreate(title="", body="x", word_count=-1, author="J")
except ValidationError as e:
    print(e)  # Structured, multi-field error report

# -- Serialization ---------------------------------------------
note.model_dump()       # Returns a dict
note.model_dump_json()  # Returns a JSON string

# -- Parsing JSON directly -------------------------------------
json_string: str = '{"title": "Hi", "body": "x", "word_count": 1, "author": "J"}'
parsed = NoteCreate.model_validate_json(json_string)

# -- Boundary pattern ------------------------------------------
# External Data -> [NoteCreate (BaseModel)] -> [Note (@dataclass)] -> App Logic
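The serialization and parsing entries on the card can be combined into a round trip. The sketch below assumes Pydantic v2 and reuses the card's NoteCreate; the variable names as_dict, as_json, and restored are illustrative.

```python
import json

from pydantic import BaseModel, Field


class NoteCreate(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []
    is_draft: bool = False


note = NoteCreate(title="My Note", body="Content here.", word_count=42, author="James")

as_dict = note.model_dump()       # plain dict, defaults included
as_json = note.model_dump_json()  # the same data as JSON text

# Parsing the JSON text validates it again and rebuilds an equal model.
restored = NoteCreate.model_validate_json(as_json)

print(as_dict["tags"], as_dict["is_draft"])
print(restored == note)
```

Parsing goes through the same validation as direct instantiation, so JSON that breaks a constraint is rejected in the same way.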

Prerequisites

Before starting this chapter, you should be able to:

  • Define @dataclass classes with typed fields and defaults (Chapter 51)
  • Write try/except blocks and catch specific exception types (Chapter 54 Lesson 1)
  • Write manual validation functions with isinstance checks (Chapter 54 Lesson 5)
  • Complete full TDG (Type-Driven Generation) cycles: stub, test, generate (Chapter 46)
  • Load JSON files using a with open(...) block and json.load (Chapter 54 Lesson 4)

The SmartNotes Connection

At the end of this chapter, you will build the boundary layer for SmartNotes. External data (JSON files, user input) enters through a NoteCreate Pydantic model that validates every field. Once validated, the data converts to your familiar @dataclass Note for use inside the application. This separation keeps validation at the edges and clean data structures at the core. You will write tests for valid input, invalid types, constraint violations, missing fields, and malformed JSON.
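A minimal sketch of that boundary layer might look like the following. The Note fields are assumed to mirror NoteCreate, and load_note is a hypothetical helper name; the SmartNotes code you build in Lesson 5 may differ. Pydantic v2 is assumed.

```python
from dataclasses import dataclass, field

from pydantic import BaseModel, Field, ValidationError


class NoteCreate(BaseModel):
    """Boundary model: validates everything that arrives from outside."""
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    word_count: int = Field(ge=0)
    author: str
    tags: list[str] = []
    is_draft: bool = False


@dataclass
class Note:
    """Internal data structure (fields assumed to mirror NoteCreate)."""
    title: str
    body: str
    word_count: int
    author: str
    tags: list[str] = field(default_factory=list)
    is_draft: bool = False


def load_note(raw_json: str) -> Note:
    """Validate at the edge, then hand a plain dataclass to the core."""
    validated = NoteCreate.model_validate_json(raw_json)
    return Note(**validated.model_dump())


note = load_note('{"title": "Hi", "body": "x", "word_count": 1, "author": "J"}')
print(note)

try:
    load_note('{"title": "Hi", "body": "x"')  # malformed: missing closing brace
except ValidationError:
    print("rejected at the boundary")
```

Only load_note touches Pydantic. Everything past it works with a Note that is already known to be valid.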