Dict and Set Comprehensions

If you're new to programming

You learned list comprehensions in Lesson 1: [expr for x in items]. Python uses the same pattern for dictionaries and sets. A dict comprehension builds a dictionary: {key: value for x in items}. A set comprehension builds a set of unique values: {expr for x in items}. The brackets tell them apart: square for lists, curly for dicts and sets. A dict comprehension also needs a key: value pair, which is how Python distinguishes it from a set.

If you've coded before

Dict comprehensions: {k: v for k, v in items if cond}. Set comprehensions: {expr for x in items if cond}. This lesson covers lookup tables, grouping, and unique value extraction using SmartNotes data.

James needs to look up a note by its title. With a list, he has to loop through every note until he finds the right one. With a dictionary, the lookup is instant.

"Build a lookup table," Emma says. "Title → Note. One expression."


Dict Comprehensions

A dict comprehension builds a dictionary from an iterable:

# List comprehension: [expression for item in iterable]
# Dict comprehension: {key: value for item in iterable}

names = ["Alice", "Bob", "Charlie"]
name_lengths = {name: len(name) for name in names}
print(name_lengths)

Output:

{'Alice': 5, 'Bob': 3, 'Charlie': 7}

The syntax is identical to a list comprehension except for the curly braces and the key: value pair before for.
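
The iterable can also supply key/value pairs directly. The sketch below assumes a hypothetical prices dictionary (not part of SmartNotes) to show the {k: v for k, v in items if cond} form, plus a reverse lookup built the same way:

```python
# Hypothetical data for illustration only (not part of SmartNotes)
prices = {"pen": 1.5, "notebook": 4.0, "stapler": 12.0}

# .items() yields (key, value) pairs, which unpack into item and price
cheap = {item: price for item, price in prices.items() if price < 5}
print(cheap)

# Swap keys and values to build a reverse lookup
# (this only works cleanly when the values are unique and hashable)
by_price = {price: item for item, price in prices.items()}
print(by_price)
```

The condition goes at the end, exactly as in a list comprehension.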


Building a Note Lookup Table

Create a dictionary that maps titles to notes:

from dataclasses import dataclass, field

@dataclass
class Note:
    title: str
    body: str
    word_count: int
    author: str = "Anonymous"
    is_draft: bool = True
    tags: list[str] = field(default_factory=list)


notes = [
    Note("Python Tips", "Learn basics", 50, "James", tags=["python"]),
    Note("Debug Guide", "Fix errors", 120, "James", tags=["debug"]),
    Note("Cooking", "Boil water", 30, "Emma", tags=["cooking"]),
]

# Build lookup: title → note
lookup = {note.title: note for note in notes}

# Instant access by title
print(lookup["Debug Guide"].word_count)
print(lookup["Cooking"].author)

Output:

120
Emma

No loop needed. The dictionary provides O(1) lookup by key on average, no matter how many notes it holds. Compare this to searching a list, which may have to check every note:

# Without lookup: loop through every note
for note in notes:
    if note.title == "Debug Guide":
        print(note.word_count)
        break

# With lookup: instant
print(lookup["Debug Guide"].word_count)
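
One difference is worth knowing: the loop prints nothing when no title matches, while indexing a dictionary with a missing key raises KeyError. The sketch below uses a simplified stand-in (plain title → word count pairs rather than Note objects) to show two common ways to handle a title that might not exist:

```python
# Simplified stand-in for the lookup table: title -> word count
word_counts = {"Python Tips": 50, "Debug Guide": 120, "Cooking": 30}

# .get() returns None (or a default you choose) instead of raising KeyError
print(word_counts.get("Debug Guide"))
print(word_counts.get("Missing Note"))
print(word_counts.get("Missing Note", 0))

# The in operator checks for a key without retrieving the value
if "Cooking" in word_counts:
    print(word_counts["Cooking"])
```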

Dict Comprehension with Filtering

Add a condition to include only certain items:

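# Mark two notes as published (every Note starts with is_draft=True)
lookup["Debug Guide"].is_draft = False
lookup["Cooking"].is_draft = False

# Word count lookup for non-draft notes only
published_counts = {
    note.title: note.word_count
    for note in notes
    if not note.is_draft
}
print(published_counts)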

Output:

{'Debug Guide': 120, 'Cooking': 30}

Grouping Notes by Tag

A common task: group notes so that each tag maps to the list of notes that have it. A loop is the clearest way to write this, because one note can appear in multiple groups:

def group_notes_by_tag(notes: list[Note]) -> dict[str, list[Note]]:
    """Group notes by tag. Each tag maps to notes that have it."""
    groups: dict[str, list[Note]] = {}
    for note in notes:
        for tag in note.tags:
            if tag not in groups:
                groups[tag] = []
            groups[tag].append(note)
    return groups


groups = group_notes_by_tag(notes)
for tag, tag_notes in groups.items():
    titles = [n.title for n in tag_notes]
    print(f"{tag}: {titles}")

Output:

python: ['Python Tips']
debug: ['Debug Guide']
cooking: ['Cooking']

This function uses a loop because the grouping logic (one note → many tags) is awkward to express as a single readable comprehension. That is fine. Not everything needs to be a comprehension.
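
The standard library's collections.defaultdict can remove the "if tag not in groups" check from a grouping loop. The following is a sketch rather than a replacement for the function above: it uses a trimmed-down Note with only title and tags so the block runs on its own, and the name group_notes_by_tag_dd is made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Note:
    # Trimmed-down version of the lesson's Note, for a self-contained example
    title: str
    tags: list[str] = field(default_factory=list)


def group_notes_by_tag_dd(notes: list[Note]) -> dict[str, list[Note]]:
    """Group notes by tag; a missing tag starts as an empty list automatically."""
    groups: defaultdict[str, list[Note]] = defaultdict(list)
    for note in notes:
        for tag in note.tags:
            groups[tag].append(note)
    return dict(groups)  # hand callers a plain dict


# Hypothetical sample data where one tag is shared by two notes
sample = [
    Note("Python Tips", tags=["python", "beginner"]),
    Note("Debug Guide", tags=["python", "debug"]),
]
grouped = group_notes_by_tag_dd(sample)
print({tag: [n.title for n in tag_notes] for tag, tag_notes in grouped.items()})
```

Converting back to a plain dict at the end is a design choice: a defaultdict creates a new empty entry whenever a missing key is indexed, which can surprise code that only wants to read.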

A dict comprehension works well for the simpler case where each key maps to one value:

# author → count of notes by that author (using .get() from Chapter 48)
author_counts: dict[str, int] = {}
for note in notes:
    author_counts[note.author] = author_counts.get(note.author, 0) + 1
print(author_counts)

# Simpler with a comprehension if you have unique keys:
author_to_first_note = {note.author: note.title for note in notes}
print(author_to_first_note)

Output:

{'James': 2, 'Emma': 1}
{'James': 'Debug Guide', 'Emma': 'Cooking'}

Notice that author_to_first_note has 'James': 'Debug Guide', not 'James': 'Python Tips'. When duplicate keys exist, the last value wins. Dict comprehensions process items in order, so the last note by James (Debug Guide) overwrites the first (Python Tips).
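
If you actually want the first value per key, one option is to iterate in reverse so that the earliest item is written last. A minimal sketch, using (author, title) tuples as a stand-in for the notes list; the variable names are chosen for this example only:

```python
# Stand-in data: (author, title) pairs in the same order as the notes list
pairs = [("James", "Python Tips"), ("James", "Debug Guide"), ("Emma", "Cooking")]

# Last value wins: the later pair for "James" overwrites the earlier one
last_by_author = {author: title for author, title in pairs}
print(last_by_author)

# Iterating in reverse makes the earliest pair the final write for each key
first_by_author = {author: title for author, title in reversed(pairs)}
print(first_by_author)
```

Note that the reversed version also inserts the keys in reverse order, so 'Emma' appears before 'James' when the second dictionary is printed.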


Set Comprehensions

A set is a collection of unique values. A set comprehension extracts unique values from an iterable:

# Set: automatically removes duplicates
unique_authors = {note.author for note in notes}
print(unique_authors)

Output:

{'James', 'Emma'}

The syntax is the same as a list comprehension, but with curly braces {} instead of square brackets []. Sets are unordered, so the elements may print in a different order on your machine.
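
A set comprehension can transform and filter in the same expression, just like a list comprehension. The sketch below assumes a hypothetical list of raw, user-typed tags (not part of the SmartNotes data) and normalizes them:

```python
# Hypothetical raw tags typed by users, with inconsistent capitalization
raw_tags = ["Python", "python", "DEBUG", "debug", "cooking", ""]

# Lowercase each tag and skip empty strings; duplicates collapse automatically
clean_tags = {tag.lower() for tag in raw_tags if tag}

# sorted() returns a list, which gives a predictable order for printing
print(sorted(clean_tags))
```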

Nested iteration: Sometimes you need to loop through a list inside a list. Each note has a tags list, and you want all tags across all notes. The loop version looks like this:

# Loop version: two nested for loops
all_tags_loop: list[str] = []
for note in notes:
    for tag in note.tags:
        all_tags_loop.append(tag)
print(all_tags_loop)

A comprehension can do the same thing with two for clauses. Read it left to right, in the same order as the nested loop:

# Comprehension version: same two loops, one line
all_tags_list = [tag for note in notes for tag in note.tags]
print(all_tags_list)

# Set version: same thing, but removes duplicates
all_tags = {tag for note in notes for tag in note.tags}
print(all_tags)

Output:

['python', 'debug', 'cooking']
{'python', 'debug', 'cooking'}

In this example, there are no duplicates. But if two notes share a tag:

notes_with_overlap = [
    Note("Tip 1", "Learn", 10, tags=["python", "beginner"]),
    Note("Tip 2", "More", 15, tags=["python", "advanced"]),
    Note("Recipe", "Cook", 20, tags=["cooking"]),
]

all_tags = {tag for note in notes_with_overlap for tag in note.tags}
print(all_tags)
print(f"Unique tags: {len(all_tags)}")

Output:

{'python', 'beginner', 'advanced', 'cooking'}
Unique tags: 4

"python" appears in two notes but only once in the set. Sets enforce uniqueness automatically.

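When the order of the unique tags matters, for example when displaying them, a set alone is not enough because it does not keep one. Two common options are sketched below, using plain lists of tags as a stand-in for notes_with_overlap; the variable names are illustrative only.

```python
# Stand-in for the tags of notes_with_overlap, one inner list per note
tag_lists = [["python", "beginner"], ["python", "advanced"], ["cooking"]]

# Option 1: build the set, then sort it alphabetically into a list
unique_sorted = sorted({tag for tags in tag_lists for tag in tags})
print(unique_sorted)

# Option 2: dict keys are unique and keep insertion order (Python 3.7+),
# so this keeps each tag at the position where it first appeared
unique_in_order = list(dict.fromkeys(tag for tags in tag_lists for tag in tags))
print(unique_in_order)
```
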

Practical Set Operations

Sets support mathematical operations that lists do not. You can find items that are in both sets, in either set, or in one but not the other:

# Build tag sets for each author using loops (clear and explicit)
james_tags: set[str] = set()
emma_tags: set[str] = set()
for note in notes:
    for tag in note.tags:
        if note.author == "James":
            james_tags.add(tag)
        elif note.author == "Emma":
            emma_tags.add(tag)

print(f"James: {james_tags}")
print(f"Emma: {emma_tags}")
print(f"Shared: {james_tags & emma_tags}")      # Intersection: in both
print(f"All: {james_tags | emma_tags}")         # Union: in either
print(f"James only: {james_tags - emma_tags}")  # Difference: in James but not Emma

Output:

James: {'python', 'debug'}
Emma: {'cooking'}
Shared: set()
All: {'python', 'debug', 'cooking'}
James only: {'python', 'debug'}

Operation      Syntax              Meaning
Intersection   a & b               Items in both sets
Union          a | b               Items in either set
Difference     a - b               Items in a but not b
Membership     "python" in tags    Check if item exists (very fast)
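
Per-author tag sets can also be built with a filtered, nested set comprehension instead of explicit loops. This is a sketch under assumptions: note_tags is stand-in data made of (author, tags) pairs rather than Note objects, chosen so that one tag is shared, and tags_for is a hypothetical helper. It also shows the ^ operator (symmetric difference), which is not in the table above.

```python
# Stand-in data: (author, tags) pairs instead of full Note objects
note_tags = [
    ("James", ["python"]),
    ("James", ["debug", "python"]),
    ("Emma", ["cooking", "python"]),
]


def tags_for(author: str) -> set[str]:
    """Collect one author's tags with a filtered, nested set comprehension."""
    return {tag for name, tags in note_tags if name == author for tag in tags}


james_tags = tags_for("James")
emma_tags = tags_for("Emma")

# sorted() turns each result into a list so the printed order is predictable
print(sorted(james_tags & emma_tags))  # intersection: in both
print(sorted(james_tags | emma_tags))  # union: in either
print(sorted(james_tags - emma_tags))  # difference: in James but not Emma
print(sorted(james_tags ^ emma_tags))  # symmetric difference: in exactly one
```

The if clause sits between the two for clauses, so notes by other authors are skipped before their tags are ever visited.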

PRIMM-AI+ Practice: Predict the Output

Predict [AI-FREE]

Press Shift+Tab to enter Plan Mode.

students = [
    {"name": "Alice", "grade": "A"},
    {"name": "Bob", "grade": "B"},
    {"name": "Charlie", "grade": "A"},
    {"name": "Diana", "grade": "C"},
]

# A
grades = {s["name"]: s["grade"] for s in students}

# B
unique_grades = {s["grade"] for s in students}

# C
a_students = {s["name"] for s in students if s["grade"] == "A"}

What are the values of grades, unique_grades, and a_students?

Check your predictions
grades = {'Alice': 'A', 'Bob': 'B', 'Charlie': 'A', 'Diana': 'C'}
unique_grades = {'A', 'B', 'C'}
a_students = {'Alice', 'Charlie'}

grades is a dict mapping name to grade. unique_grades is a set of the three distinct grades. a_students is a set of names with grade "A".

Run

Press Shift+Tab to exit Plan Mode.

Create dict_set_practice.py and verify your predictions.

Investigate

For the grades dict from the Predict exercise, write one sentence explaining what happens if two students have the same name. Does the dict keep both entries?

If you want to go deeper, run /investigate @dict_set_practice.py in Claude Code and ask: "What is the difference between a set and a frozenset? Can I use a set as a dictionary key?"

Modify

Write a function note_index(notes: list[Note]) -> dict[str, list[str]] that returns a dictionary mapping each tag to the list of note titles that have that tag. Use a loop (grouping does not fit cleanly into a single comprehension). Test it with sample data.

Make [Mastery Gate]

Write tag_overlap(notes: list[Note]) -> dict[str, set[str]] that returns a dictionary mapping each author to their set of tags. Then compute which tags are shared between authors using set intersection. In Claude Code, type /tdg to guide you through the cycle:

  1. Write the stub
  2. Write 3+ tests (single author, overlapping tags, no overlap)
  3. Prompt AI to implement
  4. Verify with ruff, pyright, pytest

Try With AI

Opening Claude Code

If Claude Code is not already running, open your terminal, navigate to your SmartNotes project folder, and type claude. If you need a refresher, Chapter 44 covers the setup.

Prompt 1: Comprehension vs Loop

Here is my group_notes_by_tag function that uses a loop.
Can this be written as a dict comprehension? If not, why?
What is the simplest way to write it?

[paste the function]

What you're learning: Not every loop can become a comprehension. The AI explains the structural reasons (one-to-many mapping) and suggests alternatives like defaultdict.

Prompt 2: Real-World Dict Comprehensions

Show me three real-world examples of dict comprehensions
in production Python code. For each one, explain what it
builds and why a dict comprehension is the right choice.

What you're learning: Seeing dict comprehensions in real code builds pattern recognition. The AI shows patterns you can apply to your own projects.

Prompt 3: SmartNotes Analytics

In Claude Code, type:

/tdg

Use the TDG workflow to write and test note_statistics(notes: list[Note]) -> dict[str, int] that returns {"total": N, "drafts": N, "published": N, "total_words": N, "unique_tags": N}. Use dict and set comprehensions where appropriate.

What you're learning: Combining dict comprehensions, set comprehensions, and list comprehensions in one function. Each tool handles a different part of the aggregation.


James builds a tag-to-notes index and a title lookup dictionary. Two data structures, one loop and one comprehension, instant access to any note by title or tag.

"At the warehouse," he says, "we had two lookup systems. The barcode scanner gave instant access to any item by its code. The zone map showed which items were in each storage area. The title lookup is the barcode scanner. The tag index is the zone map."

"Good analogy," Emma says. "And sets are the inventory count: how many unique items do we carry? Not how many boxes of each, just how many distinct products."

"So comprehensions handle the building part," James says. "Lists, dicts, sets, all built in one expression. But what happens when the data is too large for memory? If I have a million notes, building a list of all word counts means a million integers in RAM."

"That is where generators come in," Emma says. "They produce values one at a time, on demand, without storing the entire result in memory. Lesson 3."