Updated Feb 26, 2026

Introduction to Pydantic and Data Validation

The Validation Problem

[Figure: Pydantic validation flow, from raw input data through the BaseModel validation layer (type checking and constraints) to validated output or a ValidationError.]

Imagine you're building an AI agent that accepts user data. A user registers with their name, email, and age. But what happens if someone submits:

{
    "name": "Alice",
    "email": "not-an-email",
    "age": "twenty-five"
}

Your code might crash, silently store invalid data, or, worse, send bad data to your AI, which then generates incorrect responses. The problem: Python's type hints only document what SHOULD be there; they don't enforce what IS actually there at runtime.

This is where Pydantic enters the game. Pydantic is a library that validates data at runtime—it checks that your data actually matches your requirements before your code uses it. Type hints say "this SHOULD be an int"; Pydantic makes it "this MUST be an int or validation fails."
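
The difference can be seen in a few lines. The sketch below is illustrative only: `describe_age` and `Person` are hypothetical names, not part of any library. The hinted function accepts a string where an int is expected, while the Pydantic model rejects the same value.

```python
from pydantic import BaseModel, ValidationError


def describe_age(age: int) -> str:
    # The hint is documentation only; Python does not check it at runtime.
    return f"age={age!r}"


class Person(BaseModel):  # hypothetical model for illustration
    age: int


# The plain function accepts the wrong type without complaint.
hinted_result = describe_age("twenty-five")
print(hinted_result)

# The Pydantic model rejects the same value at construction time.
try:
    Person(age="twenty-five")
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```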

Why This Matters for AI-Native Development

When Claude Code generates JSON for you, you need to validate it's correct BEFORE using it. When you build APIs with FastAPI, Pydantic automatically validates every request. When you load configuration files, Pydantic ensures they're valid. In production systems, validation is not optional—it's your safety net.


Section 1: Your First Pydantic Model

Installing Pydantic

Like any Python library, Pydantic needs to be installed first. You've already learned this pattern in Chapter 17 with uv:

uv add pydantic

This installs the latest release, which is Pydantic V2 (the modern version). Pydantic V1 is no longer actively developed, so use V2 for new code.

Creating Your First Model: A Book

Let's start simple. Imagine you're building a library application that stores books. Each book has:

  • title (text, required)
  • author (text, required)
  • year (whole number, required, between 1000 and 2100)
  • price (decimal number, required, must be >= 0)
  • isbn (text, optional)

With Pydantic, you describe this structure in code:

from pydantic import BaseModel, Field

class Book(BaseModel):
    title: str
    author: str
    year: int = Field(ge=1000, le=2100)  # Whole number between 1000 and 2100
    price: float = Field(ge=0)  # Must be >= 0
    isbn: str | None = None  # Optional field with default value

That's it. You've created a Pydantic model. Now let's use it.

Validation Happens Automatically

Creating a valid book works exactly as you'd expect:

# Valid data - no errors
book: Book = Book(
    title="Python Guide",
    author="Jane Doe",
    year=2024,
    price=29.99
)

print(book)
# Output: title='Python Guide' author='Jane Doe' year=2024 price=29.99 isbn=None

print(repr(book))
# Output: Book(title='Python Guide', author='Jane Doe', year=2024, price=29.99, isbn=None)

# Access fields like normal attributes
print(book.title)  # Output: Python Guide
print(book.price)  # Output: 29.99

But try passing invalid data:

from pydantic import ValidationError

try:
    bad_book = Book(
        title="Test Book",
        author="Author",
        year="not a year",  # ERROR: cannot be parsed as an int
        price=-10  # ERROR: must be >= 0
    )
except ValidationError as e:
    print(e)

Output (abridged; Pydantic also prints a documentation link under each error):

2 validation errors for Book
year
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not a year', input_type=str]
price
  Input should be greater than or equal to 0 [type=greater_than_equal, input_value=-10, input_type=int]

Pydantic caught BOTH errors at once. This is powerful—you don't have to debug one error, fix it, then discover another. You see everything that's wrong.
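
Not every string is rejected by an int field. In its default (lax) mode, Pydantic coerces values that can be converted cleanly, so "25" becomes 25. The sketch below contrasts that with strict mode, which turns coercion off; `LaxItem` and `StrictItem` are hypothetical names used only for this illustration.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LaxItem(BaseModel):  # default behaviour: coerce when possible
    quantity: int


class StrictItem(BaseModel):  # strict mode: no coercion
    model_config = ConfigDict(strict=True)
    quantity: int


# The numeric string is converted to a real int.
lax = LaxItem(quantity="25")
print(lax.quantity, type(lax.quantity).__name__)

# The same input is rejected when strict mode is enabled.
try:
    StrictItem(quantity="25")
    strict_rejected = False
except ValidationError:
    strict_rejected = True
print(strict_rejected)
```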

💬 AI Colearning Prompt

"What happens when you pass a string to an int field in Pydantic? Explain the validation error and what type coercion means."


Section 2: Understanding Validation Errors

Reading ValidationError Messages

Pydantic's error messages are designed to help you. Let's break down what you're seeing:

from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int
    email: str

try:
    user = User(
        name="Bob",
        age="thirty",  # Error: cannot be parsed as an int
        email="bob@example"  # No error: a plain str field accepts any text
    )
except ValidationError as e:
    # Print full error details
    print(e)

    # Or access error details programmatically
    for error in e.errors():
        print(f"Field: {error['loc']}")  # Which field? (a tuple, e.g. ('age',))
        print(f"Problem: {error['msg']}")  # What's wrong?
        print(f"Type: {error['type']}")  # What type of error?

This gives you:

  • loc (location): Which field has the problem?
  • msg (message): What's wrong in plain English?
  • type (type of error): Was it a type mismatch? A constraint violation? A format issue?
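
These three keys are enough to build user-facing feedback. As a rough sketch, the hypothetical helper `errors_by_field` below flattens a ValidationError into a dictionary keyed by field path; the `Signup` model is also made up for the example.

```python
from pydantic import BaseModel, ValidationError


class Signup(BaseModel):  # hypothetical model for illustration
    name: str
    age: int


def errors_by_field(exc: ValidationError) -> dict[str, str]:
    """Map each dotted field path to its human-readable message."""
    return {
        ".".join(str(part) for part in err["loc"]): err["msg"]
        for err in exc.errors()
    }


summary: dict[str, str] = {}
try:
    Signup(name="Bob", age="thirty")
except ValidationError as exc:
    summary = errors_by_field(exc)

print(summary)
```

Joining `loc` with dots mirrors how Pydantic prints nested locations, which keeps the keys readable for nested models as well.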

🎓 Expert Insight

In AI-native development, type hints document intent but Pydantic enforces it. When AI agents generate JSON or APIs send data, runtime validation catches mismatches before they corrupt your system. This isn't defensive programming—it's professional practice.

Multiple Errors at Once

One of Pydantic's superpowers is reporting ALL validation problems simultaneously. This saves debugging time:

try:
    bad_user: User = User(
        name=123,  # Error: not a string
        age="not a number",  # Error: cannot be parsed as an int
        email="missing-at-sign"  # No error: a plain str has no format check
    )
except ValidationError as e:
    # Shows both errors at once
    print(f"Found {len(e.errors())} validation errors")
    for error in e.errors():
        print(f"  - {error['loc'][0]}: {error['msg']}")
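
A plain `str` field accepts any text, so catching a malformed email needs an explicit constraint. One option is a `pattern` on `Field`, sketched below. The regular expression is an assumption chosen for brevity (something@something.something) and is far looser than real email rules; `Contact` is a hypothetical model. Pydantic also provides an `EmailStr` type, which requires the optional email-validator package.

```python
from pydantic import BaseModel, Field, ValidationError

# Assumption: a deliberately simple pattern, not a full email grammar.
EMAIL_PATTERN = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"


class Contact(BaseModel):  # hypothetical model for illustration
    email: str = Field(pattern=EMAIL_PATTERN)


# A well-formed address passes.
ok = Contact(email="bob@example.com")
print(ok.email)

# An address without "@" now fails validation.
try:
    Contact(email="missing-at-sign")
    email_rejected = False
except ValidationError:
    email_rejected = True
print(email_rejected)
```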

Section 3: Nested Models

Real Data Is Complex

So far we've created flat models with simple fields. But real data is hierarchical. A Book might have an Author, and an Author has multiple attributes:

from pydantic import BaseModel

class Author(BaseModel):
    name: str
    bio: str

class Book(BaseModel):
    title: str
    authors: list[Author]  # List of Author objects!
    publication_date: str

Notice authors: list[Author]—this is a list of Author models. Pydantic validates each Author in the list.

Using Nested Models

Creating a book with authors:

# Method 1: Create Author objects first
author1: Author = Author(name="Alice Smith", bio="Python expert")
author2: Author = Author(name="Bob Johnson", bio="Data scientist")

book: Book = Book(
    title="Advanced Python",
    authors=[author1, author2],
    publication_date="2024-01-15"
)

# Method 2: Pass dictionaries - Pydantic converts them
book2: Book = Book(
    title="Web Development",
    authors=[
        {"name": "Charlie Brown", "bio": "Full-stack developer"},
        {"name": "Diana Prince", "bio": "Frontend specialist"}
    ],
    publication_date="2024-03-20"
)

# Serialize back to dictionary for APIs or storage
print(book.model_dump())
# Output: {
#     'title': 'Advanced Python',
#     'authors': [
#         {'name': 'Alice Smith', 'bio': 'Python expert'},
#         {'name': 'Bob Johnson', 'bio': 'Data scientist'}
#     ],
#     'publication_date': '2024-01-15'
# }
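
Serialization also works in the other direction. As a small sketch, assuming simplified `Author` and `Book` models like the ones in this section, a model can be dumped to a JSON string with `model_dump_json` and rebuilt, with full validation, using `model_validate_json`.

```python
from pydantic import BaseModel


class Author(BaseModel):
    name: str
    bio: str


class Book(BaseModel):  # simplified version for this illustration
    title: str
    authors: list[Author]


book = Book(
    title="Advanced Python",
    authors=[{"name": "Alice Smith", "bio": "Python expert"}],
)

# Dump to a JSON string, then validate that string back into a model.
as_json = book.model_dump_json()
restored = Book.model_validate_json(as_json)

# Models of the same class with equal field values compare equal.
print(restored == book)
```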

Validation happens at all levels. If an Author's name is missing, Pydantic catches it:

try:
    bad_book: Book = Book(
        title="Test",
        authors=[
            {"name": "Valid Author", "bio": "Good"},
            {"bio": "Missing name!"}  # ERROR: name is required
        ],
        publication_date="2024-01-01"
    )
except ValidationError as e:
    print(e)
    # Shows error in nested structure:
    # authors.1.name: Field required
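
The dotted path in that message comes from the error's `loc` tuple, which records each step into the nested structure: field name, list index, then inner field name. A minimal sketch, using trimmed-down copies of the models so the snippet runs on its own:

```python
from pydantic import BaseModel, ValidationError


class Author(BaseModel):
    name: str
    bio: str


class Book(BaseModel):  # trimmed-down version for this illustration
    title: str
    authors: list[Author]


nested_loc = None
try:
    Book(
        title="Test",
        authors=[
            {"name": "Valid Author", "bio": "Good"},
            {"bio": "Missing name!"},  # second author lacks a name
        ],
    )
except ValidationError as exc:
    # loc walks from the outer field down to the failing inner field.
    nested_loc = exc.errors()[0]["loc"]

print(nested_loc)
# Output: ('authors', 1, 'name')
```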

🤝 Practice Exercise

Ask your AI: "Create an Author model with name and bio fields. Then create a Book model that contains a single author field (not a list—just one Author). Generate code that creates a Book with a nested Author and demonstrates the validation error when author data is missing."

Expected Outcome: You'll see working nested model structure and understand how Pydantic validates nested fields, catching missing required fields at any level of nesting.


Section 4: Common Mistakes

Mistake 1: Forgetting BaseModel

Pydantic models must inherit from BaseModel:

# WRONG - just a regular class, no validation
class Book:  # Missing: BaseModel
    title: str
    author: str

# book = Book(title="Test", author="Author")
# Raises TypeError: a plain class has no generated __init__,
# so it cannot accept these keyword arguments, let alone validate them.

# CORRECT - inherits from BaseModel
from pydantic import BaseModel

class Book(BaseModel):  # Inherits validation
    title: str
    author: str

book: Book = Book(title="Test", author="Author")
# Now validation works

Mistake 2: Not Handling ValidationError

If you don't catch ValidationError, your program crashes:

# WRONG - will crash if data is invalid
book: Book = Book(title="Test", author=123)  # Raises an uncaught ValidationError

# CORRECT - handle the error gracefully
try:
    book: Book = Book(title="Test", author=123)
except ValidationError as e:
    print(f"Invalid data: {e}")
    # Program continues, user sees helpful message

Mistake 3: Mixing Up Type Hints

Type hints must be precise. list is different from list[str]:

# Ambiguous - what's in the list?
tags: list  # Could contain anything

# Precise - list of strings
tags: list[str]  # Validates each item is a string

class Post(BaseModel):
    title: str
    tags: list[str]  # Pydantic validates each tag

# Valid
post: Post = Post(title="AI", tags=["python", "pydantic"])

# Invalid - number in a list that should contain strings
try:
    post: Post = Post(title="AI", tags=["python", 123])  # ERROR
except ValidationError as e:
    print(e)  # tags.1: Input should be a valid string
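
To see the difference in practice, the sketch below runs the same input through a bare `list` field and a `list[str]` field. `LoosePost` and `TypedPost` are hypothetical names for this comparison only.

```python
from pydantic import BaseModel, ValidationError


class LoosePost(BaseModel):
    tags: list  # items are not checked


class TypedPost(BaseModel):
    tags: list[str]  # every item must be a string


# The bare list accepts mixed contents unchanged.
loose = LoosePost(tags=["python", 123])
print(loose.tags)

# The parameterized list rejects the integer item.
try:
    TypedPost(tags=["python", 123])
    typed_rejected = False
except ValidationError:
    typed_rejected = True
print(typed_rejected)
```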

Try With AI

Apply Pydantic data validation through AI collaboration that builds type-safe application skills.

🔍 Explore Validation Pain:

"Compare manual validation for user registration (username 3-20 chars, email with @, age 13-120) versus Pydantic BaseModel with Field() constraints. Show why runtime validation matters beyond type hints."

🎯 Practice Field Constraints:

"Build a User model with Pydantic validating: username (pattern r'^[a-z0-9_]+$'), email (@field_validator for domain check), age (ge=13, le=120), optional bio (max 200 chars). Handle ValidationError."

🧪 Test Edge Cases:

"Test Pydantic model with: '25' (string as int), 'test@localhost' (no domain dot), 120.5 (float as int), 201-char bio. Show how Pydantic coerces types and where custom validators are needed."

🚀 Apply Production Patterns:

"Create a complete user validation system with Pydantic showing: all errors at once (not first-fail), clear error messages, type coercion (str → int), custom validators, and explain when to use Field() vs @field_validator."