Structured Data: JSON

If you're new to programming

JSON (JavaScript Object Notation) is a text format for storing structured data. Unlike plain text, JSON preserves the structure of your data: field names, values, lists, and nested objects. Python's json module converts between Python objects and JSON strings. This lesson teaches you to save your entire SmartNotes notebook as a single JSON file and load it back.

If you've coded before

Python's json module handles serialization via dumps()/loads() and file I/O via dump()/load(). Dataclasses need manual dict conversion (dataclasses.asdict()). This lesson builds save/load functions for SmartNotes with typed deserialization.

James looks at the Markdown files from Lesson 1. Three notes, three files, three sets of metadata parsed by line position.

"What if I add a field to Note?" he asks. "Say I add a created_date. The Markdown reader breaks because it expects the body on line 7, but now there is an extra line."

"That is the problem with unstructured formats," Emma says. "Markdown is great for humans. It is fragile for machines. JSON solves this: every value has a name. If you add a field, existing readers still find the fields they know."

What JSON Looks Like

JSON stores data as key-value pairs, similar to a Python dictionary:

{
    "title": "Python Tips",
    "body": "Learn the basics of Python programming.",
    "word_count": 6,
    "author": "James",
    "is_draft": true,
    "tags": ["beginner", "python"]
}

Python type	JSON type	Example
`str`	string	`"hello"`
`int`, `float`	number	`42`, `3.14`
`bool`	boolean	`true`, `false`
`None`	null	`null`
`list`	array	`[1, 2, 3]`
`dict`	object	`{"key": "value"}`

Notice: JSON uses true/false (lowercase), not Python's True/False. JSON uses null, not None. The json module handles these conversions automatically.
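
To make the table concrete, here is a minimal sketch that sends one value of each Python type through the json module and back. The dictionary keys are made up for illustration; json.dumps() and json.loads() are explained in the next two sections.

```python
import json

# One sample value for each row of the type table (keys are illustrative)
python_values = {
    "text": "hello",
    "count": 42,
    "ratio": 3.14,
    "flag": True,
    "missing": None,
    "items": [1, 2, 3],
}

json_text = json.dumps(python_values)
print(json_text)
# {"text": "hello", "count": 42, "ratio": 3.14, "flag": true, "missing": null, "items": [1, 2, 3]}

# Reading the text back restores the original Python values
round_tripped = json.loads(json_text)
print(round_tripped == python_values)  # True
```

True becomes true and None becomes null in the JSON text, and both come back as their Python forms after loading.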

Converting Python to JSON: `json.dumps()`

json.dumps() converts a Python dictionary to a JSON string:

import json

data = {
    "title": "Python Tips",
    "body": "Learn the basics of Python programming.",
    "word_count": 6,
    "author": "James",
    "is_draft": True,
    "tags": ["beginner", "python"],
}

json_string = json.dumps(data, indent=2)
print(json_string)

Output:

{
  "title": "Python Tips",
  "body": "Learn the basics of Python programming.",
  "word_count": 6,
  "author": "James",
  "is_draft": true,
  "tags": [
    "beginner",
    "python"
  ]
}

indent=2 makes the output human-readable. Without it, everything appears on one line.
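
A short sketch of that difference, using a smaller dictionary than the one above so the one-line form fits on screen. Both strings hold the same data; only the whitespace differs.

```python
import json

data = {"title": "Python Tips", "tags": ["beginner", "python"]}

default_text = json.dumps(data)             # no indent: a single line
indented_text = json.dumps(data, indent=2)  # one field per line

print(default_text)
# {"title": "Python Tips", "tags": ["beginner", "python"]}

print("\n" in default_text)   # False
print("\n" in indented_text)  # True

# Parsing either string gives back the same dictionary
print(json.loads(default_text) == json.loads(indented_text))  # True
```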

Converting JSON to Python: `json.loads()`

json.loads() converts a JSON string back to a Python dictionary:

import json

json_string = '{"title": "Python Tips", "word_count": 6}'
data = json.loads(json_string)

print(type(data))
print(data["title"])
print(data["word_count"])

Output:

<class 'dict'>
Python Tips
6

The result is a plain dict, not a Note object. JSON does not know about your dataclass. You get back the raw data and must convert it yourself.

From Dataclass to JSON and Back

Your Note is a dataclass. json.dumps() does not know how to convert a dataclass directly. You need to convert it to a dictionary first. Python provides dataclasses.asdict() for this: it takes a dataclass instance and returns a dictionary with all its fields as keys:

from dataclasses import dataclass, field, asdict
import json

@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)

note = Note(
    title="Python Tips",
    body="Learn the basics of Python programming.",
    word_count=6,
    author="James",
    tags=["beginner", "python"],
)

# Dataclass → dict → JSON string
note_dict = asdict(note)
json_string = json.dumps(note_dict, indent=2)
print(json_string)

Output:

{
  "title": "Python Tips",
  "body": "Learn the basics of Python programming.",
  "word_count": 6,
  "author": "James",
  "is_draft": true,
  "tags": [
    "beginner",
    "python"
  ]
}

Going back requires unpacking the dictionary into the dataclass constructor. The ** operator takes a dictionary and passes each key-value pair as a keyword argument. So Note(**{"title": "Tips", "body": "Learn", "word_count": 1}) is the same as Note(title="Tips", body="Learn", word_count=1):

# JSON string → dict → dataclass
loaded_dict = json.loads(json_string)
loaded_note = Note(**loaded_dict)

print(loaded_note)
print(loaded_note.title)
print(loaded_note.tags)

Output:

Note(title='Python Tips', body='Learn the basics of Python programming.', word_count=6, author='James', is_draft=True, tags=['beginner', 'python'])
Python Tips
['beginner', 'python']

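Every key in the dictionary must match a field name in the dataclass. If the dictionary has an extra key that the dataclass does not have, Python raises a TypeError. The same happens if a required field (one without a default, such as title) is missing from the dictionary.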
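
One possible way to guard against extra keys is to keep only the keys that match a dataclass field before unpacking. The sketch below assumes a helper named note_from_dict and an extra "color" key; both are hypothetical and not part of SmartNotes. It uses dataclasses.fields() to read the field names from Note.

```python
from dataclasses import dataclass, field, fields
import json


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


def note_from_dict(raw: dict) -> Note:
    """Hypothetical helper: build a Note, ignoring keys it does not know."""
    known = {f.name for f in fields(Note)}
    filtered = {key: value for key, value in raw.items() if key in known}
    return Note(**filtered)


# "color" is an invented extra key that Note has no field for
raw = json.loads('{"title": "Tips", "body": "Learn", "word_count": 1, "color": "blue"}')

try:
    Note(**raw)  # direct unpacking fails on the unknown key
except TypeError as error:
    print(f"TypeError: {error}")

note = note_from_dict(raw)
print(note.title)   # Tips
print(note.author)  # Anonymous (default, because the key was absent)
```

Silently dropping unknown keys is a design choice: it keeps old code working with newer files, but it also hides typos in key names, so some projects prefer to log the ignored keys.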

Saving a Notebook to a JSON File

Now combine everything into a function that saves a list of notes:

from pathlib import Path
from dataclasses import asdict
import json


def save_notebook(notes: list[Note], file_path: Path) -> None:
    """Save a list of notes to a JSON file.

    - Creates parent directories if they do not exist
    - Overwrites the file if it already exists
    - Uses indent=2 for human-readable output
    """
    file_path.parent.mkdir(parents=True, exist_ok=True)
    data: list[dict] = []
    for note in notes:
        data.append(asdict(note))
    file_path.write_text(json.dumps(data, indent=2))

And a function to load them back:

def load_notebook(file_path: Path) -> list[Note]:
    """Load a list of notes from a JSON file.

    - Returns an empty list if the file does not exist
    """
    if not file_path.exists():
        return []
    data = json.loads(file_path.read_text())
    notes: list[Note] = []
    for note_dict in data:
        notes.append(Note(**note_dict))
    return notes

Test the round trip:

notes = [
    Note("Python Tips", "Learn basics", 2, "James", tags=["python"]),
    Note("Debugging", "Fix errors", 2, "James", tags=["debug"]),
    Note("Cooking Pasta", "Boil water", 2, "Emma", tags=["cooking"]),
]

save_path = Path("data") / "notebook.json"
save_notebook(notes, save_path)

loaded = load_notebook(save_path)
print(f"Saved {len(notes)} notes, loaded {len(loaded)} notes")
print(f"First note: {loaded[0].title} by {loaded[0].author}")
print(f"Tags preserved: {loaded[1].tags}")

Output:

Saved 3 notes, loaded 3 notes
First note: Python Tips by James
Tags preserved: ['debug']

Three notes saved. Three notes loaded. All fields preserved, including lists and booleans.
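
The json module also offers dump() and load(), which write to and read from an open file instead of a string. The sketch below is one way the same save/load pair could look with them. The function names save_notebook_stream and load_notebook_stream are hypothetical, and the temporary directory is used only so the example cleans up after itself.

```python
from dataclasses import dataclass, field, asdict
from pathlib import Path
import json
import tempfile


@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


def save_notebook_stream(notes: list[Note], file_path: Path) -> None:
    """Hypothetical variant of save_notebook that uses json.dump()."""
    file_path.parent.mkdir(parents=True, exist_ok=True)
    with file_path.open("w", encoding="utf-8") as handle:
        json.dump([asdict(note) for note in notes], handle, indent=2)


def load_notebook_stream(file_path: Path) -> list[Note]:
    """Hypothetical variant of load_notebook that uses json.load()."""
    if not file_path.exists():
        return []
    with file_path.open("r", encoding="utf-8") as handle:
        return [Note(**note_dict) for note_dict in json.load(handle)]


notes = [
    Note("Python Tips", "Learn basics", 2, "James", tags=["python"]),
    Note("Debugging", "Fix errors", 2, "James", tags=["debug"]),
]

with tempfile.TemporaryDirectory() as temp_dir:
    save_path = Path(temp_dir) / "data" / "notebook.json"
    save_notebook_stream(notes, save_path)
    loaded = load_notebook_stream(save_path)
    missing = load_notebook_stream(Path(temp_dir) / "missing.json")

# Dataclasses compare field by field, so == checks the whole round trip
print(loaded == notes)  # True
print(missing)          # []
```

Comparing the loaded list to the original with == is a compact way to confirm that every field survived, not just the ones printed.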

What JSON Cannot Store

JSON handles strings, numbers, booleans, null, lists, and dictionaries. It does not handle:

Python type	What happens with `json.dumps()`
`datetime`	Raises `TypeError`
`set`	Raises `TypeError`
`Path`	Raises `TypeError`
Custom classes	Raises `TypeError`

If your Note dataclass gains a created_at: datetime field later, you will need to convert it to a string before saving and parse it back when loading. By default, the json module only handles the types in the table under "What JSON Looks Like" (plus tuples, which it writes as arrays).
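
A minimal sketch of that conversion, assuming the ISO 8601 text form produced by datetime.isoformat() is an acceptable storage format. The timestamp is an arbitrary sample value.

```python
from datetime import datetime
import json

created_at = datetime(2024, 5, 17, 9, 30)  # arbitrary sample timestamp

try:
    json.dumps({"created_at": created_at})  # datetime is not a JSON type
except TypeError as error:
    print(f"TypeError: {error}")

# Convert to an ISO 8601 string before saving
json_text = json.dumps({"created_at": created_at.isoformat()})
print(json_text)  # {"created_at": "2024-05-17T09:30:00"}

# Parse the string back into a datetime after loading
restored = datetime.fromisoformat(json.loads(json_text)["created_at"])
print(restored == created_at)  # True
```

The same idea applies to the other rows of the table: a set can be saved as a list and a Path as a string, as long as the loading code converts them back.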

PRIMM-AI+ Practice: Predict the Parse

Predict [AI-FREE]

Press Shift+Tab to enter Plan Mode.

What does this code print?

import json

text = '{"name": "Alice", "scores": [95, 87, 92], "active": true}'
data = json.loads(text)
print(type(data["scores"]))
print(data["scores"][1])
print(data["active"])
print(type(data["active"]))

Write your predictions. Rate your confidence from 1 to 5.

Check your predictions

<class 'list'>
87
True
<class 'bool'>

json.loads() converts JSON arrays to Python lists and JSON booleans (true/false) to Python booleans (True/False). The conversion is automatic.

Run

Press Shift+Tab to exit Plan Mode.

Create json_practice.py with the code above. Run it with uv run python json_practice.py. Compare to your predictions.

Investigate

If you want to go deeper, run /investigate @json_practice.py in Claude Code and ask: "What happens if my JSON file has extra keys that my dataclass does not have? Show me what happens with Note(**data) when the dict has an unknown key."

This is a common production issue. JSON files evolve over time, and old code encounters new fields.

Modify

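Add a created_at: str field to the Note dataclass (use str, not datetime, to keep it JSON-compatible). If you add it after the fields that already have defaults, give it a default value too, such as an empty string, or Python will reject the class definition. Check whether save_notebook and load_notebook need any changes to handle the new field, then verify the round trip still works.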

Make [Mastery Gate]

Write a function merge_notebooks(file_paths: list[Path]) -> list[Note] that reads multiple JSON notebook files and merges them into a single list, removing duplicates (notes with the same title). In Claude Code, type /tdg to guide you through the cycle:

Write the stub with types and docstring
Write 3+ tests (no files, one file, overlapping titles)
Prompt AI to implement
Run uv run ruff check, uv run pyright, uv run pytest

Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Pretty vs Compact JSON

Show me the difference between json.dumps(data) and
json.dumps(data, indent=2) and json.dumps(data, separators=(',', ':')).
When would I use each format? Which saves the most disk space?

What you're learning: JSON formatting is a tradeoff between readability and size. Compact JSON saves space in production. Indented JSON is easier to debug. You decide based on the use case.

Prompt 2: Handle Unknown Fields

My Note dataclass has 6 fields. But the JSON file might
have extra keys from a newer version of the app. How do
I load the JSON without crashing on unknown keys? Show
me a safe loading function.

What you're learning: Production code must handle forward compatibility. The AI shows you how to filter dictionary keys before unpacking, a common pattern in code that reads data written by other versions of a program.

Prompt 3: Test the Save/Load Cycle

In Claude Code, type:

/tdg

Use the TDG workflow to write and test notebook_stats(file_path: Path) -> dict[str, int] that reads a JSON notebook file and returns statistics: total notes, total word count, number of drafts, number of unique tags. Write tests first, then generate.

What you're learning: You are building analysis functions that operate on persistent data. The function reads from a file (not from memory), reinforcing the persistence pattern.

James opens notebook.json in his editor. The structure is clean: a list of objects, each with named fields. He edits one title directly in the JSON file, saves it, and runs load_notebook again.

"The edit survived," he says. "I can change data in the file and the program picks it up."

"That is the power of a structured format," Emma says. "The Markdown files from Lesson 1 are nice for reading. JSON is better for machines. But neither is ideal for spreadsheet users who want to sort, filter, and analyze data in columns."

She opens a blank spreadsheet. "CSV. Comma-separated values. The format that every spreadsheet application in the world can read."

James nods. "So Markdown for humans, JSON for programs, CSV for spreadsheets."

"Exactly. And by the end of this chapter, SmartNotes handles all three."
