Relationships & Joins
In L3, you created categories and expenses as separate records. But they're connected: every expense belongs to a category and a user.
Here's the problem: If you delete a category, what happens to its expenses? If you want all expenses for one user, how do you ask?
In raw SQL, you'd write JOIN statements. With SQLAlchemy, you define relationships once, and then access connected data like Python attributes. No JOIN syntax needed.
The Relationship Problem
You have three tables with foreign keys:
users
├── id (PK)
├── email
└── name
categories
├── id (PK)
├── name
└── color
expenses
├── id (PK)
├── user_id (FK → users.id)
├── category_id (FK → categories.id)
├── description
└── amount
The foreign keys create connections. But to USE those connections in Python, you need relationships.
Without relationships:
# To get a user's expenses, you'd write:
expenses = session.query(Expense).filter(Expense.user_id == user.id).all()
With relationships:
# Just access the attribute:
expenses = user.expenses
Same result. Less code. SQLAlchemy handles the query.
Defining Relationships
Add relationship() to your models. Here's the User model with a relationship to expenses:
from sqlalchemy.orm import relationship
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
email = Column(String(100), unique=True, nullable=False)
name = Column(String(100), nullable=False)
# Relationship: One user has many expenses
expenses = relationship("Expense", back_populates="user")
Output:
No SQL generated yet - relationship() is configuration.
SQLAlchemy now knows: User.expenses will return Expense objects.
And the Expense model needs the other side:
class Expense(Base):
__tablename__ = 'expenses'
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey('users.id'), nullable=False)
category_id = Column(Integer, ForeignKey('categories.id'), nullable=False)
description = Column(String(200), nullable=False)
amount = Column(Float, nullable=False)
# Relationships
user = relationship("User", back_populates="expenses")
category = relationship("Category", back_populates="expenses")
Output:
Expense.user points back to User object.
Expense.category points back to Category object.
Both directions connected via back_populates.
Let's decode this:
| Part | Meaning |
|---|---|
relationship("Expense", ...) | This User connects to Expense objects |
back_populates="user" | The Expense model has an attribute called user pointing back |
relationship("User", ...) | This Expense connects to a User object |
back_populates="expenses" | The User model has an attribute called expenses pointing back |
Bidirectional: Both sides know about each other. Change one, the other reflects it.
Complete Models with Relationships
Here's the full Budget Tracker model setup:
from sqlalchemy import Column, Integer, String, Float, Date, DateTime, ForeignKey, create_engine
from sqlalchemy.orm import declarative_base, relationship, Session
from datetime import datetime, date, timezone
Base = declarative_base()
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
email = Column(String(100), unique=True, nullable=False)
name = Column(String(100), nullable=False)
created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))
expenses = relationship("Expense", back_populates="user", cascade="all, delete-orphan")
class Category(Base):
__tablename__ = 'categories'
id = Column(Integer, primary_key=True)
name = Column(String(50), unique=True, nullable=False)
color = Column(String(7), default="#FF6B6B")
expenses = relationship("Expense", back_populates="category", cascade="all, delete-orphan")
class Expense(Base):
__tablename__ = 'expenses'
id = Column(Integer, primary_key=True)
user_id = Column(Integer, ForeignKey('users.id'), nullable=False)
category_id = Column(Integer, ForeignKey('categories.id'), nullable=False)
description = Column(String(200), nullable=False)
amount = Column(Float, nullable=False)
expense_date = Column(Date, default=date.today)
user = relationship("User", back_populates="expenses")
category = relationship("Category", back_populates="expenses")
Output:
Three connected models:
- User has many Expenses (user.expenses)
- Category has many Expenses (category.expenses)
- Expense belongs to one User (expense.user)
- Expense belongs to one Category (expense.category)
Using Relationships in Queries
With relationships defined, access related data through attributes:
From User to Expenses
with Session(engine) as session:
user = session.query(User).filter(User.email == 'alice@example.com').first()
# Access all expenses through relationship
print(f"{user.name}'s expenses:")
for expense in user.expenses:
print(f" ${expense.amount:.2f}: {expense.description}")
Output:
Alice's expenses:
$52.50: Groceries
$18.75: Lunch
$45.00: Gas
SQLAlchemy generated: SELECT * FROM expenses WHERE user_id = 1
You didn't write the query - the relationship did.
From Expense to User
with Session(engine) as session:
expense = session.query(Expense).filter(Expense.id == 1).first()
# Access user through relationship
print(f"Expense by: {expense.user.name}")
print(f"Category: {expense.category.name}")
Output:
Expense by: Alice
Category: Food
Both directions work. One query, related objects available.
Computing with Related Data
with Session(engine) as session:
user = session.query(User).filter(User.name == 'Alice').first()
# Calculate total spending
total = sum(expense.amount for expense in user.expenses)
print(f"Total spending: ${total:.2f}")
# Group by category
by_category = {}
for expense in user.expenses:
cat = expense.category.name
by_category[cat] = by_category.get(cat, 0) + expense.amount
print("\nBy category:")
for cat, amount in by_category.items():
print(f" {cat}: ${amount:.2f}")
Output:
Total spending: $116.25
By category:
Food: $71.25
Transportation: $45.00
When You Actually Need join()
Relationships let you navigate data you already have. But sometimes you need to filter on related tables:
with Session(engine) as session:
# Find all expenses in the "Food" category
food_expenses = session.query(Expense).join(Category).filter(
Category.name == 'Food'
).all()
for expense in food_expenses:
print(f"${expense.amount}: {expense.description}")
Output:
$52.50: Groceries
$18.75: Lunch
SQLAlchemy inferred the join from the relationship.
You didn't write: ON expenses.category_id = categories.id
When to use .join() vs relationship attributes:
| Situation | Approach |
|---|---|
| Navigate from object you have | user.expenses (relationship) |
| Filter query on related table | .join(Category).filter(Category.name == 'Food') |
| Complex multi-table query | .join() with explicit conditions |
Cascade: What Happens on Delete?
When you delete a user, what happens to their expenses? Three options:
| Option | What Happens | Use When |
|---|---|---|
| Error (default) | Delete blocked if children exist | Protect data from accidents |
| Set NULL | Children's FK becomes NULL | Keep orphaned records |
| Cascade delete | Children deleted too | Clean removal |
The cascade="all, delete-orphan" setting means:
class User(Base):
# ...
expenses = relationship("Expense", back_populates="user", cascade="all, delete-orphan")
What this does:
- If user is deleted → all their expenses are deleted too
- If an expense is removed from
user.expenseslist → it's deleted from database
with Session(engine) as session:
alice = session.query(User).filter(User.name == 'Alice').first()
expense_count = len(alice.expenses)
print(f"Alice has {expense_count} expenses")
# Delete Alice
session.delete(alice)
session.commit()
# Check expenses
remaining = session.query(Expense).count()
print(f"Expenses remaining: {remaining}")
Output:
Alice has 3 expenses
Expenses remaining: 0
All of Alice's expenses were deleted with her.
Without cascade, that delete would either:
- Fail with foreign key constraint error
- Leave orphaned expenses with invalid user_id
Working With AI on Relationship Queries
Here's where collaboration helps. You know WHAT you want. AI helps with HOW to express it.
Your request: "Show me total spending by category for user Alice"
AI suggests:
with Session(engine) as session:
user = session.query(User).filter(User.name == 'Alice').first()
category_totals = {}
for expense in user.expenses:
cat_name = expense.category.name
category_totals[cat_name] = category_totals.get(cat_name, 0) + expense.amount
return category_totals
Your refinement: "Good, but I need this sorted by highest spending first."
AI adapts:
sorted_totals = sorted(category_totals.items(), key=lambda x: x[1], reverse=True)
What emerged: A complete solution that neither of you had fully formed at the start. AI suggested the traversal pattern; you added the sorting requirement; together you converged on working code.
What Happens Next
You've now defined relationships between tables. User has many Categories. User has many Expenses. Category has many Expenses. You can navigate this connected data efficiently.
But real-world operations rarely touch just one table. What if you need to transfer money between categories? Create an expense and update a user's balance at the same time? If either operation fails, both must roll back. That's where L5 comes in.
| Lesson | What You Learn | What You Add to Your Skill |
|---|---|---|
| L4 (now) | Connect tables with relationships | Relationship and join patterns |
| L5 | Atomic multi-step operations | Transaction patterns |
| L6 | Move to production cloud database | Neon deployment and pooling |
| L7 | Combine SQL + bash hybrid patterns | Tool choice framework |
| L8 | Integrate everything | Complete working Budget Tracker app |
Relationships let you structure your data correctly. Transactions ensure your operations are safe.
Try With AI
Prompt 1: Predict Relationship Behavior
What you're learning: Understanding how relationships connect data.
Given this code:
user = session.query(User).filter(User.email == 'bob@example.com').first()
category_totals = {}
for expense in user.expenses:
cat = expense.category.name
category_totals[cat] = category_totals.get(cat, 0) + expense.amount
print(category_totals)
And this data:
- User: Bob (id=2)
- Categories: Food (id=1), Entertainment (id=2)
- Expenses:
- id=5, user_id=2, category_id=1, amount=25.00, description="Lunch"
- id=6, user_id=2, category_id=1, amount=30.00, description="Dinner"
- id=7, user_id=2, category_id=2, amount=50.00, description="Concert"
1. What does category_totals contain at the end?
2. How many database queries does this code execute?
3. What would happen if Bob had no expenses?
After AI explains, trace through the code yourself. Does your mental model match?
Prompt 2: Build a Relationship Query
What you're learning: Constructing queries that use relationships.
Write SQLAlchemy code to find all expenses in the "Food" category,
sorted by amount (highest first).
Requirements:
1. Use join() to filter by category name
2. Return only expenses (not categories)
3. Order by amount descending
4. Print: description, amount, and the user's name who made each expense
Use the Budget Tracker models (User, Category, Expense) with relationships.
After AI responds, check:
- Does it use
.join(Category)? - Does it access
expense.user.namethrough the relationship? - Would this work with your model definitions?
Prompt 3: Update Your Skill
What you're learning: Documenting relationship patterns for reuse.
Add to my /database-deployment skill:
## Relationships & Joins
Include:
1. How to define relationship() and back_populates (both sides)
2. When to use cascade="all, delete-orphan"
3. How to query through relationships (user.expenses)
4. When to use join() explicitly (filtering on related table)
Use Budget Tracker examples:
- User ↔ Expense (one-to-many)
- Category ↔ Expense (one-to-many)
Format as markdown for SKILL.md.
After AI responds, review the patterns. Are they general enough for other projects?
Safety reminder: Cascade delete is powerful. In production, always test cascade behavior on non-production data first. A misconfigured cascade can delete more data than intended.
Checkpoint
Before moving to L5:
- You understand:
relationship()connects models bidirectionally - You can access related data through attributes (
user.expenses) - You understand:
back_populateslinks both sides of a relationship - You know when to use
.join()vs relationship attributes - You can explain what
cascade="all, delete-orphan"does - You've iterated with AI to build a relationship query (Prompt 2)
- You've documented relationship patterns in your
/database-deploymentskill
Ready for L5: Transactions and safe multi-operation updates.