Why Dataclasses Trust Blindly
James looks at his Note dataclass from Chapter 51. Six fields, all typed. title: str, body: str, word_count: int. He has been writing these type annotations for weeks now. They feel solid. Reliable. He trusts them.
Emma pulls up the REPL. "Try something for me. Create a Note where title is the number 999."
James frowns. "That would be wrong. Title is supposed to be a string."
"Try it."
Type annotations in Python are labels that describe what kind of data a variable should hold. But labels are not locks. Python reads the annotations but does not enforce them when the program runs. This lesson shows you exactly what that means.
In Java or C#, assigning an int to a String field causes a compile-time error. Python's type annotations are advisory, not enforced. Static checkers like pyright catch mismatches in your source code, but they cannot inspect data that arrives at runtime from files, APIs, or user input.
The Experiment: Note(title=999)
Remember the Note dataclass from Chapter 51, Lesson 5? Here it is:
from dataclasses import dataclass
@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str
    tags: list[str]
    is_draft: bool
Now create a Note with obviously wrong data:
broken_note = Note(
    title=999,
    body=42,
    word_count="many",
    author=True,
    tags="not-a-list",
    is_draft="yes",
)
print(broken_note)
Output:
Note(title=999, body=42, word_count='many', author=True, tags='not-a-list', is_draft='yes')
No error. No warning. No crash. Python created the object, stored every wrong value, and printed it without complaint. The type annotations say title: str, but Python did not check. It trusted you.
Why Does Python Allow This?
Python's type annotations are metadata. They exist for two audiences:
- Humans reading the code, who see title: str and know what to expect
- Static type checkers like pyright, which analyze the code before it runs
Neither audience is Python itself. At runtime, Python's @dataclass decorator generates an __init__ method that assigns whatever values you pass. It does not insert isinstance checks. It does not validate. It trusts the caller completely.
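To make that concrete, here is a hand-written class that is roughly equivalent to what @dataclass generates for Note (a sketch of the behavior, not the exact generated code): plain assignments, no isinstance checks anywhere.

```python
# Roughly what @dataclass generates for Note's __init__: every value
# is stored as-is, with no type checking of any kind.
class NoteEquivalent:
    def __init__(self, title, body, word_count, author, tags, is_draft):
        self.title = title          # stored as-is, whatever the type
        self.body = body
        self.word_count = word_count
        self.author = author
        self.tags = tags
        self.is_draft = is_draft

n = NoteEquivalent(title=999, body=42, word_count="many",
                   author=True, tags="not-a-list", is_draft="yes")
print(n.title)  # 999 -- accepted without complaint
```

The annotations on the real Note dataclass never appear in this generated logic; they live in metadata (Note.__annotations__) that runtime assignment simply ignores.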
Pyright Catches It (But Only in Your Code)
Open your terminal and run pyright on a file containing the broken Note:
uv run pyright broken_note.py
Pyright flags every field:
error: Argument of type "int" cannot be assigned to parameter "title" of type "str"
error: Argument of type "int" cannot be assigned to parameter "body" of type "str"
error: Argument of type "str" cannot be assigned to parameter "word_count" of type "int"
Pyright catches the problem because it can see the literal values 999, 42, "many" in your source code. It knows their types and compares them against the annotations.
But here is the gap: pyright analyzes source code. It cannot analyze data that arrives at runtime.
The Gap: External Data
When data comes from a JSON file, a web request, or user input, pyright cannot see it. Consider this code:
import json
with open("notes.json") as f:
    raw_data = json.load(f)  # returns parsed JSON (list of dicts)

for item in raw_data:
    note = Note(
        title=item["title"],            # Could be anything
        body=item["body"],              # Could be anything
        word_count=item["word_count"],  # Could be "many"
        author=item["author"],
        tags=item["tags"],
        is_draft=item["is_draft"],
    )
Every value comes from item, a dictionary loaded from JSON. Pyright sees item["title"] and knows only that its type is Any, because json.load is annotated to return Any. It cannot know whether the JSON file actually contains a string, an integer, or null. The data is invisible until the program runs.
This is the trust gap. Pyright protects you from mistakes in your own code. It cannot protect you from mistakes in external data. For that, you need runtime validation.
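To see the gap end to end without needing a notes.json file on disk, here is a self-contained version that simulates the external boundary with json.loads on an inline string (the Note definition is repeated so the snippet runs on its own):

```python
import json
from dataclasses import dataclass

@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str
    tags: list[str]
    is_draft: bool

# Simulated external data: the kind of malformed payload a file or API
# might deliver. Pyright never sees these values; they exist only at runtime.
raw = json.loads(
    '{"title": 999, "body": 42, "word_count": "many", '
    '"author": true, "tags": "not-a-list", "is_draft": "yes"}'
)

note = Note(**raw)      # no error -- the dataclass shelves it all
print(note.word_count)  # many (a str, despite the int annotation)
```

Pyright reports nothing here, because every argument flows through Any. The wrong types pass silently into the object.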
What You Had Before: Manual Validation
In Chapter 54, Lesson 5, you wrote validate_note_data: 30+ lines of isinstance checks, one per field, each with its own raise TypeError or raise ValueError. That function fills the gap. It checks types at runtime, before the data reaches the dataclass constructor.
The manual approach works. But it is verbose, repetitive, and fragile. Every new field requires another block of checks. Every renamed field requires updates in multiple places.
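As a reminder of the shape of that manual approach, here is a sketch covering just two fields (an illustration of the pattern, not the full Chapter 54 function):

```python
def validate_note_data(data: dict) -> None:
    """Manual runtime validation: one isinstance block per field."""
    if not isinstance(data.get("title"), str):
        raise TypeError(
            f"title must be str, got {type(data.get('title')).__name__}"
        )
    wc = data.get("word_count")
    # bool is a subclass of int, so reject it explicitly
    if not isinstance(wc, int) or isinstance(wc, bool):
        raise TypeError(f"word_count must be int, got {type(wc).__name__}")
    # ...and a similar block for each remaining field

try:
    validate_note_data({"title": 999, "word_count": "many"})
except TypeError as e:
    print(e)  # title must be str, got int
```

Two fields, ten lines. Multiply by six fields and add value checks, and the 30-line count from Chapter 54 follows quickly.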
Chapter 55 introduces a tool that fills the same gap with far less code.
PRIMM-AI+ Practice: Trust Gap
Predict [AI-FREE]
Press Shift+Tab to enter Plan Mode before predicting.
Look at this code without running it. Predict what happens. Write your prediction and a confidence score from 1 to 5 before checking.
from dataclasses import dataclass
@dataclass
class Config:
    name: str
    version: int
    debug: bool
config = Config(name=42, version="three", debug=0)
print(config.name)
print(type(config.name))
Check your prediction
Output:
42
<class 'int'>
No error. Python prints 42 and confirms the type is int, even though the annotation says str. The @dataclass decorator does not validate types. The annotation is a label, not a constraint.
If you predicted an error, your intuition is reasonable but incorrect for Python dataclasses. That intuition will serve you well in Lesson 2, where Pydantic models DO enforce types.
Run
Press Shift+Tab to exit Plan Mode.
Create a file called trust_gap.py with the code above. Run uv run python trust_gap.py. Compare the output to your prediction. Then run uv run pyright trust_gap.py and observe the errors pyright reports.
Investigate
In Claude Code, type /investigate @trust_gap.py and ask why the dataclass accepted wrong types. Then add this line after the print statements:
print(config.name.upper())
Predict what happens. Then run it. The int type has no .upper() method, so Python raises an AttributeError. This is the real danger: the wrong type sits quietly until code tries to use it. The error appears far from where the bad data entered.
Modify
Change debug=0 to debug=1. Does config.debug behave like True in an if statement? Run this:
if config.debug:
    print("Debug mode on")
The integer 1 is truthy, so the if block runs. But type(config.debug) is int, not bool. Predict what isinstance(config.debug, bool) returns (remember: bool is a subclass of int in Python).
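If you want to check the subclass relationship in isolation before predicting, these three lines show it directly:

```python
# bool is a subclass of int, but a plain int is not a bool
print(issubclass(bool, int))  # True
print(isinstance(True, int))  # True: every bool is also an int
print(isinstance(1, bool))    # False: the subclass check only goes one way
```

This asymmetry is why isinstance checks on int fields often need a separate guard against bool, as you saw in the manual validation from Chapter 54.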
Make [Mastery Gate]
Without looking at any examples, create a @dataclass called Product with fields name: str, price: float, and in_stock: bool. Create an instance with wrong types for all three fields. Write three assert statements that prove Python accepted the wrong types:
assert type(product.name) is not str
assert type(product.price) is not float
assert type(product.in_stock) is not bool
Run the file. All three asserts should pass, confirming that dataclasses do not enforce types at runtime.
Try With AI
If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.
Prompt 1: Explore the Trust Gap
I created a Python dataclass with title: str and then
passed title=999. Python accepted it without any error.
Why does Python allow this? When would this cause a
real problem in a program?
Read the AI's response carefully. It should explain that type annotations are not enforced at runtime. Compare its explanation to what you observed in this lesson.
What you're learning: You are confirming your understanding of the trust gap by asking the AI to explain it independently.
Prompt 2: Static vs Runtime
What is the difference between static type checking
(like pyright) and runtime type checking in Python?
Give me one example where static checking catches a
bug and one example where only runtime checking can
catch it.
Review the AI's examples. The static example should involve literal values in source code. The runtime example should involve data from an external source (file, API, user input). This maps to the two scenarios in this lesson.
What you're learning: You are building a mental model for when each kind of checking applies.
Prompt 3: Identify Fields in Your Domain
I work in [your field, e.g., logistics, healthcare,
education, finance]. List five data fields I might
receive from an external source (file, form, or API).
For each field, tell me: what type it should be, what
could go wrong if the type is wrong, and whether a
dataclass would catch the problem at runtime.
Replace [your field] with your actual profession or area of interest. Read the AI's response and check whether it correctly identifies that dataclasses would NOT catch any of these problems. This connects the trust gap to data you actually work with.
What you're learning: You are applying the trust gap concept to your own domain, identifying where runtime validation would protect your real-world data.
James frowned at the screen. "So dataclasses are like a warehouse that accepts every delivery without inspection. The truck shows up, the dock worker reads the label, puts it on the shelf. Nobody opens the box to check if the label matches what's inside."
"That's it," Emma said. "The label says 'title: str' but the box contains an integer. The dataclass just shelves it."
"And pyright is the purchasing department reviewing the purchase orders before they ship. If the PO says 'send us strings' and the vendor writes 'sending integers,' pyright catches it. But if the vendor sends an unlabeled box with no PO..."
"Pyright has nothing to compare against. That's the trust gap. Your own code, pyright can check. External data, it can't."
James drummed his fingers on the desk. "How common is this in practice? Like, do real applications actually get integer titles from JSON files?"
Emma tilted her head. "I'm not sure how to quantify it. I've seen it happen with API responses, user-submitted forms, config files someone edited by hand. The exact frequency varies. But the pattern is always the same: data crosses a boundary, types stop being guaranteed, and something downstream breaks in a confusing way."
"So we need a dataclass that inspects the delivery. Opens the box, checks the contents, rejects the shipment if it doesn't match."
"That's exactly what Pydantic's BaseModel does. Same field declarations you already know, but it validates types the moment you create an instance. If the title is an integer, it rejects it on the spot."