The class Keyword and @dataclass
James stares at the three problems from Lesson 1: typos crash at runtime, pyright cannot warn about bad keys, and the editor offers no autocomplete for dictionary fields. He needs a way to define the shape of a note so that tools can help him.
"You already know the building blocks," Emma says. "Lesson 2 taught you what @ does. Now you are going to use it on something new."
She types two words on the screen: class Note.
James tilts his head. "I have seen class in code before. Chapter 43 had examples with classes, and type() in Chapter 47 told me that 42 is an int and "hello" is a str. Are those classes?"
"Exactly," Emma says. "You have been using classes since your first line of Python. Now you are going to define your own."
A class is a blueprint that describes a kind of data. An instance is a specific piece of data built from that blueprint. You already use classes every day: int is a class, and 42 is an instance of it. This lesson teaches you to create your own classes using @dataclass, which handles the tedious setup automatically.
Python's @dataclass is similar to Kotlin's data class, Java's record, or C#'s record struct. It auto-generates __init__, __repr__, and __eq__. The class keyword here defines a data-holding type, not a full OOP class with inheritance and polymorphism. Those topics come later.
Classes and Instances: A Mental Model
You have been working with classes and instances since Chapter 47, Lesson 1, when you used type() to inspect values. Here is the pattern:
| Class | Instance | How you create it |
|---|---|---|
| int | 42 | int(42) or just 42 |
| str | "hello" | str("hello") or just "hello" |
| list | [1, 2, 3] | list([1, 2, 3]) or just [1, 2, 3] |
| Note | note = Note(...) | You will write this today |
A class defines the structure. An instance is a concrete value that follows that structure. When you write x: int = 42, the class is int and the instance is 42. When you write name: str = "James", the class is str and the instance is "James".
The only difference with Note is that you define the class yourself instead of using one built into Python.
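You can check this mental model by asking Python directly. The sketch below uses the built-in type() and isinstance() functions on the values from the table; the variable names are illustrative only.

```python
# Every value is an instance of some class; type() reports which one.
count: int = 42
greeting: str = "hello"
numbers: list[int] = [1, 2, 3]

print(type(count))     # <class 'int'>
print(type(greeting))  # <class 'str'>
print(type(numbers))   # <class 'list'>

# isinstance() answers "was this value built from that blueprint?"
print(isinstance(count, int))     # True
print(isinstance(greeting, int))  # False
```

The same two functions work on instances of classes you define yourself.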
The class Keyword
Here is the simplest possible class definition:
from dataclasses import dataclass

@dataclass
class Note:
    title: str
    body: str
Let's break this down line by line:
- from dataclasses import dataclass: Imports the dataclass decorator from Python's standard library. This is a one-time setup line at the top of your file.
- @dataclass: The decorator from Lesson 2. It passes the Note class through the dataclass() function, which adds useful methods automatically.
- class Note:: The class keyword creates a new type called Note. The colon starts an indented block, just like def and if.
- title: str and body: str: These are fields. Each field has a name and a type annotation. They describe the data that every Note instance must contain.
Creating Instances
Once the class exists, you create instances by calling it like a function:
from dataclasses import dataclass

@dataclass
class Note:
    title: str
    body: str

my_note: Note = Note(title="Meeting Notes", body="Discussed timeline.")
print(my_note)
Output:
Note(title='Meeting Notes', body='Discussed timeline.')
The line Note(title="Meeting Notes", body="Discussed timeline.") creates a new instance of Note. You pass values for each field as keyword arguments. The result is stored in my_note, which has type Note.
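To see that every field really is required, here is a minimal sketch that assumes the Note class from above. It uses try/except only to keep the script running after the error; the variable names complete and created are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


# Both fields supplied: the instance is created normally.
complete: Note = Note(title="Meeting Notes", body="Discussed timeline.")
print(complete)

# One field missing: Python refuses to build the instance.
try:
    Note(title="Meeting Notes")  # body is left out on purpose
    created = True
except TypeError as error:
    created = False
    print(f"TypeError: {error}")
```

pyright also reports the incomplete call, so you would normally see this problem before running the file.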
Attribute Access with Dot Notation
Instead of note["title"] (the dict approach from Lesson 1), you use dot notation:
print(my_note.title) # Meeting Notes
print(my_note.body) # Discussed timeline.
This is where the three problems from Lesson 1 disappear:
- Typo protection: If you type my_note.titel, pyright flags it immediately as an error. The class defines exactly which attributes exist.
- Autocomplete: When you type my_note. in your editor, it suggests title and body. No guessing.
- No silent corruption: pyright stops you from accidentally adding a misspelled field. my_note.staus = "draft" produces a pyright error because staus is not a defined field, even though Python itself would accept the assignment at runtime.
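The bullets above describe what pyright reports before the program runs. As a complement, the sketch below shows what Python itself does at runtime with the same two typos, assuming the Note class from this lesson. The flag read_failed is illustrative.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


my_note: Note = Note(title="Meeting Notes", body="Discussed timeline.")

# Reading a misspelled attribute fails loudly at runtime.
try:
    print(my_note.titel)  # typo on purpose; pyright flags this line
    read_failed = False
except AttributeError as error:
    read_failed = True
    print(f"AttributeError: {error}")

# Writing to a misspelled attribute is NOT stopped at runtime.
# pyright is what catches this line before the program runs.
my_note.staus = "draft"  # typo on purpose
print(hasattr(my_note, "staus"))  # True: the stray attribute now exists
print(my_note)  # the repr shows only the declared fields, title and body
```

This is why the lesson runs pyright before running the file: the read typo would eventually crash, but the write typo would pass unnoticed without the type checker.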
What @dataclass Generates
Remember from Lesson 2: a decorator passes your code through a function that modifies it. The @dataclass decorator reads your field definitions and automatically generates three special methods.
Python uses names with double underscores (like __init__, __repr__, __eq__) for special methods. These are called dunder methods (short for "double underscore"). They are not meant to be called directly. Instead, Python calls them behind the scenes when you create an instance, print it, or compare two instances. The @dataclass decorator generates these methods for you, so you rarely need to write them by hand.
Here is what @dataclass generates for the Note class above:
# @dataclass generates something equivalent to this:
def __init__(self, title: str, body: str) -> None:
    self.title = title
    self.body = body

def __repr__(self) -> str:
    return f"Note(title={self.title!r}, body={self.body!r})"

def __eq__(self, other: object) -> bool:
    if not isinstance(other, Note):
        return NotImplemented
    return self.title == other.title and self.body == other.body
- __init__: Called when you write Note(title="...", body="..."). It stores each argument on the instance.
- __repr__: Called when you print() an instance or call repr() on it. Because Note defines no separate __str__, print() falls back to __repr__, which produces the readable output you saw above.
- __eq__: Called when you compare two instances with ==. Two notes are equal if all their fields match.
Without @dataclass, you would need to write all three methods by hand. For a class with six fields, that is roughly 20 lines of repetitive code. The decorator replaces all of it with a single line.
When Python calls a method on an instance, it passes the instance as the first argument, named self. When you write my_note.title, Python looks up the title attribute that __init__ stored on self. You do not call __init__ directly; Python calls it for you when you create an instance with Note(...). The @dataclass decorator generates these methods, so you rarely need to think about self at this stage.
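The sketch below makes the "behind the scenes" calls visible, assuming the Note class from this lesson. The direct dunder calls appear here for illustration only; in everyday code you would use the shorter spelling on the left of each comparison.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


first: Note = Note(title="Plan", body="Ship feature by Friday.")
second: Note = Note(title="Plan", body="Ship feature by Friday.")

# repr(first) is the everyday spelling; Python runs first.__repr__() for you.
print(repr(first) == first.__repr__())  # True

# first == second is the everyday spelling of first.__eq__(second).
print((first == second) == first.__eq__(second))  # True

# Calling the method through the class makes self visible:
# the instance is passed in as the first argument.
print(Note.__repr__(first))  # Note(title='Plan', body='Ship feature by Friday.')
```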
The @dataclass Equivalence
Applying what you learned in Lesson 2, the shorthand:
@dataclass
class Note:
    title: str
    body: str
Is equivalent to:
class Note:
    title: str
    body: str

Note = dataclass(Note)
The dataclass() function receives the Note class, reads its field annotations, generates __init__, __repr__, and __eq__, attaches them to the class, and returns the modified class. The @ shorthand does this in one line instead of two.
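To watch the two-line form do its work, here is a small sketch. The class name PlainNote and the flag plain_accepts_fields are hypothetical, chosen so the example does not clash with Note. The plain class cannot accept field values until it has been passed through dataclass().

```python
from dataclasses import dataclass


class PlainNote:
    title: str
    body: str


# Without the decorator there is no generated __init__, so this fails.
try:
    PlainNote(title="Plan", body="Ship it.")
    plain_accepts_fields = True
except TypeError as error:
    plain_accepts_fields = False
    print(f"TypeError: {error}")

# Passing the class through dataclass() by hand attaches the methods.
PlainNote = dataclass(PlainNote)
note = PlainNote(title="Plan", body="Ship it.")
print(note)  # PlainNote(title='Plan', body='Ship it.')
```

In real code you would write @dataclass above the class; the manual call is shown only to connect the syntax back to Lesson 2.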
Comparing Instances
Because @dataclass generates __eq__, you can compare instances directly:
note_a: Note = Note(title="Plan", body="Ship feature by Friday.")
note_b: Note = Note(title="Plan", body="Ship feature by Friday.")
note_c: Note = Note(title="Plan", body="Different body text.")
print(note_a == note_b) # True (all fields match)
print(note_a == note_c) # False (body differs)
This is essential for testing. You can write assert result == expected where both sides are dataclass instances, and the comparison checks every field automatically.
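A minimal sketch of that testing pattern follows. The helper make_note is hypothetical, invented here only so there is a result to compare against an expected value.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


def make_note(raw_title: str, raw_body: str) -> Note:
    """Hypothetical helper: trims whitespace before building a Note."""
    return Note(title=raw_title.strip(), body=raw_body.strip())


result: Note = make_note("  Plan ", " Ship feature by Friday. ")
expected: Note = Note(title="Plan", body="Ship feature by Friday.")

# One assert compares every field at once.
assert result == expected

# Equal does not mean "the same object": these are two separate instances.
print(result is expected)  # False
```

If any field differed, the assert would fail and the two printed representations would show exactly which field was wrong.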
PRIMM-AI+ Practice: Building a Dataclass
Predict [AI-FREE]
Read this code without running it. Write down what print(book) will output. Rate your confidence from 1 to 5.
from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    pages: int

book: Book = Book(title="Python Basics", author="Emma", pages=320)
print(book)
Check your prediction
Output:
Book(title='Python Basics', author='Emma', pages=320)
The @dataclass decorator generates a __repr__ method that prints the class name followed by each field and its value. The format is always ClassName(field1=value1, field2=value2, ...).
Run
Create a file called book_model.py with the code above. Run uv run python book_model.py and compare the output to your prediction.
Investigate
Try accessing an attribute that does not exist:
print(book.publisher)
Run uv run pyright book_model.py before running the file. Does pyright catch the error? Compare this to Lesson 1, where pyright reported zero warnings for a misspelled dictionary key.
Modify
Add a genre field of type str to the Book class. Create a new instance with all four fields. Print it and verify the output includes the new field.
Make [Mastery Gate]
Without looking at any examples, define a dataclass called Task with three fields: description (str), assignee (str), and priority (int). Create two instances with the same field values and one instance with different values. Write three assert statements:
- The two identical instances are equal (==)
- The identical instance is not equal to the different instance (!=)
- Accessing .description on the first instance returns the expected string
Run your file with uv run python to verify that all assertions pass.
Try With AI
If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.
Prompt 1: Explain the Generated Methods
I defined this dataclass:
@dataclass
class Note:
    title: str
    body: str
What methods does @dataclass generate? Show me what the
equivalent hand-written class would look like without
using @dataclass.
Compare the AI's response to the equivalence shown in this lesson. Does it generate the same three methods? Are there any differences in the implementation?
What you're learning: You are verifying that your understanding of what @dataclass generates matches reality. If the AI shows additional methods, note them but do not worry about memorizing them yet.
Prompt 2: Convert a Dict to a Dataclass
Here is a function that uses a dictionary:
def create_note(title: str, body: str) -> dict[str, str]:
    return {"title": title, "body": body, "status": "draft"}
Rewrite this using a @dataclass instead of a dict. Keep
all type annotations.
Review the AI's output. Does the dataclass have the same fields as the dictionary keys? Does it use the @dataclass decorator correctly? Try running the code to verify it works.
What you're learning: You are seeing the direct translation from dict-based code (Lesson 1's problem) to dataclass-based code (Lesson 3's solution). This previews the full SmartNotes transformation in Lesson 5.
Key Takeaways
- A class is a blueprint; an instance is a concrete value. int is a class, 42 is an instance. Note is a class, Note(title="Plan", body="...") is an instance.
- @dataclass generates __init__, __repr__, and __eq__ automatically. You define the fields; the decorator writes the repetitive code.
- Dot notation replaces bracket notation. note.title instead of note["title"]. This gives you pyright checking, autocomplete, and protection against silent corruption.
- Dunder methods are special methods Python calls behind the scenes. Names like __init__ and __repr__ follow the double-underscore convention. You rarely call them directly.
- self is how an instance refers to itself inside methods. @dataclass generates methods that use self to store and access fields. You do not need to write these methods yourself.
- Two dataclass instances with the same field values are equal. The generated __eq__ method compares every field, which makes assertions in tests clean and readable.
Looking Ahead
You can now define data structures that pyright checks and your editor autocompletes. In Lesson 4, you will learn how to set default values on fields, prevent a sneaky shared-list bug, and make instances immutable with frozen=True.