Structured Data: JSON
JSON (JavaScript Object Notation) is a text format for storing structured data. Unlike plain text, JSON preserves the structure of your data: field names, values, lists, and nested objects. Python's json module converts between Python objects and JSON strings. This lesson teaches you to save your entire SmartNotes notebook as a single JSON file and load it back.
Python's json module handles serialization via dumps()/loads() and file I/O via dump()/load(). Dataclasses need manual dict conversion (dataclasses.asdict()). This lesson builds save/load functions for SmartNotes with typed deserialization.
James looks at the Markdown files from Lesson 1. Three notes, three files, three sets of metadata parsed by line position.
"What if I add a field to Note?" he asks. "Say I add a created_date. The Markdown reader breaks because it expects the body on line 7, but now there is an extra line."
"That is the problem with unstructured formats," Emma says. "Markdown is great for humans. It is fragile for machines. JSON solves this: every value has a name. If you add a field, existing readers still find the fields they know."
What JSON Looks Like
JSON stores data as key-value pairs, similar to a Python dictionary:
{
"title": "Python Tips",
"body": "Learn the basics of Python programming.",
"word_count": 6,
"author": "James",
"is_draft": true,
"tags": ["beginner", "python"]
}
| Python type | JSON type | Example |
|---|---|---|
str | string | "hello" |
int, float | number | 42, 3.14 |
bool | boolean | true, false |
None | null | null |
list | array | [1, 2, 3] |
dict | object | {"key": "value"} |
Notice: JSON uses true/false (lowercase), not Python's True/False. JSON uses null, not None. The json module handles these conversions automatically.
Converting Python to JSON: json.dumps()
json.dumps() converts a Python dictionary to a JSON string:
import json
data = {
"title": "Python Tips",
"body": "Learn the basics of Python programming.",
"word_count": 6,
"author": "James",
"is_draft": True,
"tags": ["beginner", "python"],
}
json_string = json.dumps(data, indent=2)
print(json_string)
Output:
{
"title": "Python Tips",
"body": "Learn the basics of Python programming.",
"word_count": 6,
"author": "James",
"is_draft": true,
"tags": [
"beginner",
"python"
]
}
indent=2 makes the output human-readable. Without it, everything appears on one line.
Converting JSON to Python: json.loads()
json.loads() converts a JSON string back to a Python dictionary:
import json
json_string = '{"title": "Python Tips", "word_count": 6}'
data = json.loads(json_string)
print(type(data))
print(data["title"])
print(data["word_count"])
Output:
<class 'dict'>
Python Tips
6
The result is a plain dict, not a Note object. JSON does not know about your dataclass. You get back the raw data and must convert it yourself.
From Dataclass to JSON and Back
Your Note is a dataclass. json.dumps() does not know how to convert a dataclass directly. You need to convert it to a dictionary first. Python provides dataclasses.asdict() for this: it takes a dataclass instance and returns a dictionary with all its fields as keys:
from dataclasses import dataclass, field, asdict
import json
@dataclass
class Note:
title: str
body: str
word_count: int
author: str = "Anonymous"
is_draft: bool = True
tags: list[str] = field(default_factory=list)
note = Note(
title="Python Tips",
body="Learn the basics of Python programming.",
word_count=6,
author="James",
tags=["beginner", "python"],
)
# Dataclass → dict → JSON string
note_dict = asdict(note)
json_string = json.dumps(note_dict, indent=2)
print(json_string)
Output:
{
"title": "Python Tips",
"body": "Learn the basics of Python programming.",
"word_count": 6,
"author": "James",
"is_draft": true,
"tags": [
"beginner",
"python"
]
}
Going back requires unpacking the dictionary into the dataclass constructor. The ** operator takes a dictionary and passes each key-value pair as a keyword argument. So Note(**{"title": "Tips", "body": "Learn"}) is the same as Note(title="Tips", body="Learn"):
# JSON string → dict → dataclass
loaded_dict = json.loads(json_string)
loaded_note = Note(**loaded_dict)
print(loaded_note)
print(loaded_note.title)
print(loaded_note.tags)
Output:
Note(title='Python Tips', body='Learn the basics of Python programming.', word_count=6, author='James', is_draft=True, tags=['beginner', 'python'])
Python Tips
['beginner', 'python']
Every key in the dictionary must match a field name in the dataclass. If the dictionary has an extra key that the dataclass does not have, Python raises a TypeError.
Saving a Notebook to a JSON File
Now combine everything into a function that saves a list of notes:
from pathlib import Path
from dataclasses import asdict
import json
def save_notebook(notes: list[Note], file_path: Path) -> None:
"""Save a list of notes to a JSON file.
- Creates parent directories if they do not exist
- Overwrites the file if it already exists
- Uses indent=2 for human-readable output
"""
file_path.parent.mkdir(parents=True, exist_ok=True)
data: list[dict] = []
for note in notes:
data.append(asdict(note))
file_path.write_text(json.dumps(data, indent=2))
And a function to load them back:
def load_notebook(file_path: Path) -> list[Note]:
"""Load a list of notes from a JSON file.
- Returns an empty list if the file does not exist
"""
if not file_path.exists():
return []
data = json.loads(file_path.read_text())
notes: list[Note] = []
for note_dict in data:
notes.append(Note(**note_dict))
return notes
Test the round trip:
notes = [
Note("Python Tips", "Learn basics", 2, "James", tags=["python"]),
Note("Debugging", "Fix errors", 2, "James", tags=["debug"]),
Note("Cooking Pasta", "Boil water", 2, "Emma", tags=["cooking"]),
]
save_path = Path("data") / "notebook.json"
save_notebook(notes, save_path)
loaded = load_notebook(save_path)
print(f"Saved {len(notes)} notes, loaded {len(loaded)} notes")
print(f"First note: {loaded[0].title} by {loaded[0].author}")
print(f"Tags preserved: {loaded[1].tags}")
Output:
Saved 3 notes, loaded 3 notes
First note: Python Tips by James
Tags preserved: ['debug']
Three notes saved. Three notes loaded. All fields preserved, including lists and booleans.
What JSON Cannot Store
JSON handles strings, numbers, booleans, lists, and dictionaries. It does not handle:
| Python type | What happens with json.dumps() |
|---|---|
datetime | Raises TypeError |
set | Raises TypeError |
Path | Raises TypeError |
| Custom classes | Raises TypeError |
If your Note dataclass gains a created_at: datetime field later, you will need to convert it to a string before saving and parse it back when loading. The json module only handles the types in the table at the top of this lesson.
PRIMM-AI+ Practice: Predict the Parse
Predict [AI-FREE]
Press Shift+Tab to enter Plan Mode.
What does this code print?
import json
text = '{"name": "Alice", "scores": [95, 87, 92], "active": true}'
data = json.loads(text)
print(type(data["scores"]))
print(data["scores"][1])
print(data["active"])
print(type(data["active"]))
Write your predictions. Rate your confidence from 1 to 5.
Check your predictions
<class 'list'>
87
True
<class 'bool'>
json.loads() converts JSON arrays to Python lists and JSON booleans (true/false) to Python booleans (True/False). The conversion is automatic.
Run
Press Shift+Tab to exit Plan Mode.
Create json_practice.py with the code above. Run it with uv run python json_practice.py. Compare to your predictions.
Investigate
If you want to go deeper, run /investigate @json_practice.py in Claude Code and ask: "What happens if my JSON file has extra keys that my dataclass does not have? Show me what happens with Note(**data) when the dict has an unknown key."
This is a common production issue. JSON files evolve over time, and old code encounters new fields.
Modify
Add a created_at: str field to the Note dataclass (use str, not datetime, to keep it JSON-compatible). Update save_notebook and load_notebook to handle the new field. Verify the round trip still works.
Make [Mastery Gate]
Write a function merge_notebooks(file_paths: list[Path]) -> list[Note] that reads multiple JSON notebook files and merges them into a single list, removing duplicates (notes with the same title). In Claude Code, type /tdg to guide you through the cycle:
- Write the stub with types and docstring
- Write 3+ tests (no files, one file, overlapping titles)
- Prompt AI to implement
- Run
uv run ruff check,uv run pyright,uv run pytest
Try With AI
If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.
Prompt 1: Pretty vs Compact JSON
Show me the difference between json.dumps(data) and
json.dumps(data, indent=2) and json.dumps(data, separators=(',', ':')).
When would I use each format? Which saves the most disk space?
What you're learning: JSON formatting is a tradeoff between readability and size. Compact JSON saves space in production. Indented JSON is easier to debug. You decide based on the use case.
Prompt 2: Handle Unknown Fields
My Note dataclass has 6 fields. But the JSON file might
have extra keys from a newer version of the app. How do
I load the JSON without crashing on unknown keys? Show
me a safe loading function.
What you're learning: Production code must handle forward compatibility. The AI shows you how to filter dictionary keys before unpacking, a pattern used in every real data pipeline.
Prompt 3: Test the Save/Load Cycle
In Claude Code, type:
/tdg
Use the TDG workflow to write and test notebook_stats(file_path: Path) -> dict[str, int] that reads a JSON notebook file and returns statistics: total notes, total word count, number of drafts, number of unique tags. Write tests first, then generate.
What you're learning: You are building analysis functions that operate on persistent data. The function reads from a file (not from memory), reinforcing the persistence pattern.
James opens notebook.json in his editor. The structure is clean: a list of objects, each with named fields. He edits one title directly in the JSON file, saves it, and runs load_notebook again.
"The edit survived," he says. "I can change data in the file and the program picks it up."
"That is the power of a structured format," Emma says. "The Markdown files from Lesson 1 are nice for reading. JSON is better for machines. But neither is ideal for spreadsheet users who want to sort, filter, and analyze data in columns."
She opens a blank spreadsheet. "CSV. Comma-separated values. The format that every spreadsheet application in the world can read."
James nods. "So Markdown for humans, JSON for programs, CSV for spreadsheets."
"Exactly. And by the end of this chapter, SmartNotes handles all three."