The class Keyword and @dataclass

James stares at the three problems from Lesson 1: typos crash at runtime, pyright cannot warn about bad keys, and the editor offers no autocomplete for dictionary fields. He needs a way to define the shape of a note so that tools can help him.

"You already know the building blocks," Emma says. "Lesson 2 taught you what @ does. Now you are going to use it on something new."

She types two words on the screen: class Note.

James tilts his head. "I have seen class in code before. Chapter 43 had examples with classes, and type() in Chapter 47 told me that 42 is an int and "hello" is a str. Are those classes?"

"Exactly," Emma says. "You have been using classes since your first line of Python. Now you are going to define your own."

If you're new to programming

A class is a blueprint that describes a kind of data. An instance is a specific piece of data built from that blueprint. You already use classes every day: int is a class, and 42 is an instance of it. This lesson teaches you to create your own classes using @dataclass, which handles the tedious setup automatically.

If you know classes from another language

Python's @dataclass is similar to Kotlin's data class, Java's record, or C#'s record struct. It auto-generates __init__, __repr__, and __eq__. In this lesson, the class keyword is used only to define a data-holding type; inheritance and polymorphism come later.


Classes and Instances: A Mental Model

You have been working with classes and instances since Chapter 47, Lesson 1, when you used type() to inspect values. Here is the pattern:

| Class | Instance | How you create it |
| --- | --- | --- |
| int | 42 | int(42) or just 42 |
| str | "hello" | str("hello") or just "hello" |
| list | [1, 2, 3] | list([1, 2, 3]) or just [1, 2, 3] |
| Note | note = Note(...) | You will write this today |

A class defines the structure. An instance is a concrete value that follows that structure. When you write x: int = 42, the class is int and the instance is 42. When you write name: str = "James", the class is str and the instance is "James".

The only difference with Note is that you define the class yourself instead of using one built into Python.
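To see the class/instance relationship for the built-in types in the table, here is a small sketch. The variable names are chosen for illustration only; type() reports which class a value was built from, and isinstance() checks a value against a class.

```python
number = 42
greeting = "hello"
items = [1, 2, 3]

# type() returns the class that each instance was built from
print(type(number))    # <class 'int'>
print(type(greeting))  # <class 'str'>
print(type(items))     # <class 'list'>

# isinstance() asks: is this value an instance of that class?
print(isinstance(number, int))    # True
print(isinstance(greeting, int))  # False
```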


The class Keyword

Here is the simplest possible class definition:

from dataclasses import dataclass

@dataclass
class Note:
    title: str
    body: str

Let's break this down line by line:

  • from dataclasses import dataclass: Imports the dataclass decorator from Python's standard library. This is a one-time setup line at the top of your file.
  • @dataclass: The decorator from Lesson 2. It passes the Note class through the dataclass() function, which adds useful methods automatically.
  • class Note:: The class keyword creates a new type called Note. The colon starts an indented block, just like def and if.
  • title: str and body: str: These are fields. Each field has a name and a type annotation. They describe the data that every Note instance must contain.
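If you want to confirm which fields the decorator picked up, the standard library's dataclasses.fields() function lists them. This is only an inspection sketch; field_summary is a name made up for this example, and you do not need fields() to use a dataclass.

```python
from dataclasses import dataclass, fields


@dataclass
class Note:
    title: str
    body: str


# fields() returns one Field object per annotated field, in definition order
field_summary = [(field.name, field.type.__name__) for field in fields(Note)]
print(field_summary)  # [('title', 'str'), ('body', 'str')]
```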

Creating Instances

Once the class exists, you create instances by calling it like a function:

from dataclasses import dataclass

@dataclass
class Note:
    title: str
    body: str

my_note: Note = Note(title="Meeting Notes", body="Discussed timeline.")
print(my_note)

Output:

Note(title='Meeting Notes', body='Discussed timeline.')

The line Note(title="Meeting Notes", body="Discussed timeline.") creates a new instance of Note. You pass values for each field as keyword arguments. The result is stored in my_note, which has type Note.
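The sketch below shows two related behaviors of the generated constructor, under the assumption that Note is defined as above. Positional arguments also work and are matched to fields in definition order, and leaving out a field raises a TypeError. The names by_keyword, by_position, and error_name are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


# Keyword arguments name each field explicitly
by_keyword = Note(title="Meeting Notes", body="Discussed timeline.")

# Positional arguments are matched to fields in the order they are defined
by_position = Note("Meeting Notes", "Discussed timeline.")

print(by_keyword)   # Note(title='Meeting Notes', body='Discussed timeline.')
print(by_position)  # Note(title='Meeting Notes', body='Discussed timeline.')

# Every field without a default is required; omitting one raises TypeError.
# (pyright would also flag this call before the program runs.)
error_name = ""
try:
    Note(title="Only a title")
except TypeError as error:
    error_name = type(error).__name__
    print(error_name, "-", error)
```

Keyword arguments are usually easier to read, because the call site shows which value goes into which field.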


Attribute Access with Dot Notation

Instead of note["title"] (the dict approach from Lesson 1), you use dot notation:

print(my_note.title)  # Meeting Notes
print(my_note.body)   # Discussed timeline.

This is where the three problems from Lesson 1 disappear:

  1. Typo protection: If you type my_note.titel, pyright flags it immediately as an error. The class defines exactly which attributes exist.
  2. Autocomplete: When you type my_note. in your editor, it suggests title and body. No guessing.
  3. No silent corruption: A misspelled assignment such as my_note.staus = "draft" produces a pyright error because staus is not a defined field. Python itself would still accept the assignment at runtime, so this protection comes from running pyright.
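Here is a rough sketch of what happens at runtime if these typos slip past pyright, assuming the same Note class. The misspellings titel and staus are deliberate, and read_failed is a name introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


my_note = Note(title="Meeting Notes", body="Discussed timeline.")

# Reading a misspelled attribute: pyright reports it before the program runs,
# and Python raises AttributeError if the line is executed anyway.
read_failed = False
try:
    print(my_note.titel)
except AttributeError:
    read_failed = True
print(read_failed)  # True

# Assigning to a misspelled attribute: pyright reports it, but plain Python
# accepts the assignment, so the type checker is what catches the typo.
my_note.staus = "draft"
print(my_note)  # Note(title='Meeting Notes', body='Discussed timeline.')
```

The stray staus attribute does not appear in the printed output because the generated __repr__ only includes declared fields.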

What @dataclass Generates

Remember from Lesson 2: a decorator passes your code through a function that modifies it. The @dataclass decorator reads your field definitions and automatically generates three special methods.

Dunder methods

Python uses names with double underscores (like __init__, __repr__, __eq__) for special methods. These are called dunder methods (short for "double underscore"). They are not meant to be called directly. Instead, Python calls them behind the scenes when you create an instance, print it, or compare two instances. The @dataclass decorator generates these methods for you, so you rarely need to write them by hand.

Here is what @dataclass generates for the Note class above:

# @dataclass generates something equivalent to this:

def __init__(self, title: str, body: str) -> None:
    self.title = title
    self.body = body

def __repr__(self) -> str:
    return f"Note(title={self.title!r}, body={self.body!r})"

def __eq__(self, other: object) -> bool:
    if not isinstance(other, Note):
        return NotImplemented
    return self.title == other.title and self.body == other.body

  • __init__: Called when you write Note(title="...", body="..."). It stores each argument on the instance.
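  • __repr__: Called when you print() an instance or inspect it in the interactive prompt. (print() looks for __str__ first and falls back to __repr__ when the class does not define one.) It produces the readable output you saw above.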
  • __eq__: Called when you compare two instances with ==. Two notes are equal if all their fields match.

Without @dataclass, you would need to write all three methods by hand. For a class with six fields, that is roughly 20 lines of repetitive code. With the decorator, you write none of it.
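To make the comparison concrete, here is a sketch that places a hand-written class next to the dataclass version. ManualNote is a hypothetical name used only for this comparison, and its methods approximate what the decorator generates rather than reproducing it exactly.

```python
from dataclasses import dataclass


class ManualNote:
    """Hand-written version (hypothetical name, for comparison only)."""

    def __init__(self, title: str, body: str) -> None:
        self.title = title
        self.body = body

    def __repr__(self) -> str:
        return f"ManualNote(title={self.title!r}, body={self.body!r})"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ManualNote):
            return NotImplemented
        return self.title == other.title and self.body == other.body


@dataclass
class Note:
    title: str
    body: str


manual = ManualNote(title="Plan", body="Ship it.")
generated = Note(title="Plan", body="Ship it.")

print(manual)     # ManualNote(title='Plan', body='Ship it.')
print(generated)  # Note(title='Plan', body='Ship it.')
print(manual == ManualNote(title="Plan", body="Ship it."))  # True
print(generated == Note(title="Plan", body="Ship it."))     # True
```

Both classes behave the same way from the outside; the dataclass version simply needs far less code.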

About self

When Python calls a method on an instance, it passes the instance as the first argument, named self. When you write my_note.title, Python looks up the title attribute that __init__ stored on self. You do not call __init__ directly; Python calls it for you when you create an instance with Note(...). The @dataclass decorator generates these methods, so you rarely need to think about self at this stage.
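As an illustration only (you would not normally write code like this), the sketch below calls the generated __repr__ through the class and passes the instance by hand. The result matches what repr() produces, which shows that self is simply the instance. The names via_builtin and via_class are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


my_note = Note(title="Plan", body="Ship it.")

# The usual way: Python finds __repr__ and passes my_note as self for you
via_builtin = repr(my_note)

# The explicit way: call the method on the class and pass the instance yourself
via_class = Note.__repr__(my_note)

print(via_builtin == via_class)  # True
```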


The @dataclass Equivalence

Applying what you learned in Lesson 2, the shorthand:

@dataclass
class Note:
    title: str
    body: str

is equivalent to:

class Note:
    title: str
    body: str

Note = dataclass(Note)

The dataclass() function receives the Note class, reads its field annotations, generates __init__, __repr__, and __eq__, attaches them to the class, and returns the modified class. The @ shorthand does this in one line instead of two.
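You can observe the two-step form working with the standard library's is_dataclass() helper. This is a sketch for exploration; the variables before and after are illustrative, and a type checker may complain about reassigning the name Note, which is one more reason to prefer the @ shorthand in real code.

```python
from dataclasses import dataclass, is_dataclass


class Note:
    title: str
    body: str


# At this point Note is a plain class with annotations and no generated methods
before = is_dataclass(Note)
print(before)  # False

# Pass the class through dataclass() by hand, exactly as the @ shorthand would
Note = dataclass(Note)

after = is_dataclass(Note)
print(after)  # True

note = Note(title="Plan", body="Ship it.")
print(note)  # Note(title='Plan', body='Ship it.')
```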


Comparing Instances

Because @dataclass generates __eq__, you can compare instances directly:

note_a: Note = Note(title="Plan", body="Ship feature by Friday.")
note_b: Note = Note(title="Plan", body="Ship feature by Friday.")
note_c: Note = Note(title="Plan", body="Different body text.")

print(note_a == note_b) # True (all fields match)
print(note_a == note_c) # False (body differs)

This is essential for testing. You can write assert result == expected where both sides are dataclass instances, and the comparison checks every field automatically.
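A minimal sketch of that testing pattern follows. The function make_draft is hypothetical and stands in for whatever code you want to test. The last line shows that == compares field values, while is checks whether two names refer to the very same object.

```python
from dataclasses import dataclass


@dataclass
class Note:
    title: str
    body: str


def make_draft(title: str) -> Note:
    """Hypothetical function under test: builds a note with an empty body."""
    return Note(title=title, body="")


result = make_draft("Plan")
expected = Note(title="Plan", body="")

# Passes silently because every field matches
assert result == expected

# Equal values, but two separate objects in memory
print(result is expected)  # False
```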


PRIMM-AI+ Practice: Building a Dataclass

Predict [AI-FREE]

Read this code without running it. Write down what print(book) will output. Rate your confidence from 1 to 5.

from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    pages: int

book: Book = Book(title="Python Basics", author="Emma", pages=320)
print(book)

Check your prediction

Output:

Book(title='Python Basics', author='Emma', pages=320)

The @dataclass decorator generates a __repr__ method that prints the class name followed by each field and its value. The format is always ClassName(field1=value1, field2=value2, ...).

Run

Create a file called book_model.py with the code above. Run uv run python book_model.py and compare the output to your prediction.

Investigate

Try accessing an attribute that does not exist:

print(book.publisher)

Run uv run pyright book_model.py before running the file. Does pyright catch the error? Compare this to Lesson 1, where pyright reported zero warnings for a misspelled dictionary key.

Modify

Add a genre field of type str to the Book class. Create a new instance with all four fields. Print it and verify the output includes the new field.

Make [Mastery Gate]

Without looking at any examples, define a dataclass called Task with three fields: description (str), assignee (str), and priority (int). Create two instances with the same field values and one instance with different values. Write three assert statements:

  1. The two identical instances are equal (==)
  2. One of the identical instances is not equal to the different instance (!=)
  3. Accessing .description on the first instance returns the expected string

Run your file with uv run python to verify that all assertions pass.


Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Explain the Generated Methods

I defined this dataclass:

@dataclass
class Note:
    title: str
    body: str

What methods does @dataclass generate? Show me what the
equivalent hand-written class would look like without
using @dataclass.

Compare the AI's response to the equivalence shown in this lesson. Does it generate the same three methods? Are there any differences in the implementation?

What you're learning: You are verifying that your understanding of what @dataclass generates matches reality. If the AI shows additional methods, note them but do not worry about memorizing them yet.

Prompt 2: Convert a Dict to a Dataclass

Here is a function that uses a dictionary:

def create_note(title: str, body: str) -> dict[str, str]:
    return {"title": title, "body": body, "status": "draft"}

Rewrite this using a @dataclass instead of a dict. Keep
all type annotations.

Review the AI's output. Does the dataclass have the same fields as the dictionary keys? Does it use the @dataclass decorator correctly? Try running the code to verify it works.

What you're learning: You are seeing the direct translation from dict-based code (Lesson 1's problem) to dataclass-based code (Lesson 3's solution). This previews the full SmartNotes transformation in Lesson 5.


Key Takeaways

  1. A class is a blueprint; an instance is a concrete value. int is a class, 42 is an instance. Note is a class, Note(title="Plan", body="...") is an instance.

  2. @dataclass generates __init__, __repr__, and __eq__ automatically. You define the fields; the decorator writes the repetitive code.

  3. Dot notation replaces bracket notation. note.title instead of note["title"]. This gives you pyright checking, autocomplete, and protection against silent corruption.

  4. Dunder methods are special methods Python calls behind the scenes. Names like __init__ and __repr__ follow the double-underscore convention. You rarely call them directly.

  5. self is how an instance refers to itself inside methods. @dataclass generates methods that use self to store and access fields. You do not need to write these methods yourself.

  6. Two dataclass instances with the same field values are equal. The generated __eq__ method compares every field, which makes assertions in tests clean and readable.


Looking Ahead

You can now define data structures that pyright checks and your editor autocompletes. In Lesson 4, you will learn how to set default values on fields, prevent a sneaky shared-list bug, and make instances immutable with frozen=True.