Concurrency: Two Workers, One Claim
The Reconciler runs nightly. It is supposed to run as a single instance.
One night two copies of the Reconciler started by accident at the same time. A cron entry got duplicated in the deploy. Both Workers connected to the same Postgres database. Both ran the same query against pending_transactions looking for rows to process. Both saw the same first row. Both processed it.
By morning the database had two expenses rows for the same bank transaction, two confirmation_emails rows, and two real emails sat in the user's inbox.
Idempotency keys (Lesson 3) helped a little. The duplicate inserts that used the same key were caught at the database. But the Workers also called the email API. Side effects do not respect database constraints. The fix is not to guard the writes harder. The fix is to make sure only one Worker claims the row in the first place.
This lesson teaches the smallest claim pattern: SELECT ... FOR UPDATE SKIP LOCKED.
- Race condition: Two processes read the same row at the same time and each proceeds as if it were the only one.
- Row lock: A flag Postgres puts on a row inside a transaction so other transactions cannot modify it until the lock is released.
- SELECT ... FOR UPDATE: A SELECT that takes a row lock. Other transactions that try to lock the same row block until the first transaction commits or rolls back.
- SKIP LOCKED: A modifier on
FOR UPDATEthat tells Postgres to silently skip rows already locked by another transaction, instead of waiting. - Advisory lock: A named lock the application asks Postgres to hold for the length of a transaction or session. Used for "only one Worker should run THIS named action at a time".
- Claim: The act of marking a queue row as "I'm working on this now" so no other Worker picks it up.
Why Idempotency Is Not Enough
Lesson 3 stopped duplicate writes at the database. That is half the story.
| What you stopped in Lesson 3 | What you have not stopped |
|---|---|
The second INSERT into expenses for the same bank transaction. | The second Worker also calling the email API. |
The second INSERT into confirmation_emails for the same (run_id, expense_id). | The second Worker also charging a card. |
| Any side effect mediated entirely through the database. | Any side effect mediated through an external system. |
Idempotency stops double WRITES. Concurrency control stops double WORK.
The two patterns work together. You want both. You need claims when the Worker is going to do something with the world (send an email, call an API, ship a package) that cannot be quietly undone by the database refusing a row.
Three Patterns at Recognition Level
Postgres gives you three locking patterns. You read them. You do not write them.
SELECT ... FOR UPDATE
BEGIN;
SELECT id, amount FROM expenses WHERE id = 42 FOR UPDATE;
-- ... do work, then UPDATE ...
COMMIT;
Translation: "I am about to update row 42. Anyone else who tries to lock row 42 must wait until I commit or roll back." The other transaction blocks. When the first one commits, the other one wakes up and sees the new state.
Use it when two writers are racing to update the same known row.
SELECT ... FOR UPDATE SKIP LOCKED
BEGIN;
SELECT id FROM pending_transactions
WHERE status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ... claim, then UPDATE status ...
COMMIT;
Translation: "Give me the next pending row that is not already locked. Do not wait. If the first row is taken, give me the second. If everything is taken, give me nothing." The losing Worker silently moves to the next row.
Use it when many Workers are pulling work from a queue. This is the dominant pattern for the Reconciler.
Advisory locks
BEGIN;
SELECT pg_advisory_xact_lock(hashtext('expense-reconciler'));
-- ... do the whole reconciler run while holding the lock ...
COMMIT;
Translation: "Only one transaction at a time may hold the named lock 'expense-reconciler'. Any other transaction asking for the same name waits." The lock is released automatically when the transaction ends.
Use it when only one Worker should be running THIS named action at a time, and you want the second instance to either wait or fail fast. The duplicate-cron problem in this lesson's opening would not have happened if the Reconciler took an advisory lock as its first SQL statement.
The Claim Recipe
This table is the most important reading aid in the lesson.
| Pattern | Use it for | What happens to the loser |
|---|---|---|
FOR UPDATE | Update the same known row in two writers | Loser blocks, then sees committed state |
FOR UPDATE SKIP LOCKED | Pull next item from a queue | Loser silently moves to next item |
| Advisory lock | "Only one Reconciler runs at a time" | Loser waits, or fails fast if requested |
For a queue, you want SKIP LOCKED. For a known shared row, you want FOR UPDATE. For a whole-Worker singleton, you want an advisory lock.
The Reconciler reading from pending_transactions is a queue. SKIP LOCKED is the right choice.
What the Reconciler's Claim Looks Like
The agent writes this. You read it.
BEGIN;
SELECT id
FROM pending_transactions
WHERE status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- The application captures the id from the result.
UPDATE pending_transactions
SET status = 'claimed',
claimed_by = $worker_id,
claimed_at = now()
WHERE id = $id;
COMMIT;
Five things to read:
- The SELECT carries
FOR UPDATE SKIP LOCKED. This is the claim. Skip the rest of the lesson if this clause is missing. - The SELECT has an
ORDER BY id. Without an order, two Workers can scan the table in different orders and both think they got "the first one". - The
LIMIT 1matches the policy of "one row per Worker per turn". Change this only if the brief actually asked for a batch claim. - The UPDATE flips the status to
claimed. After the COMMIT, no other Worker'sWHERE status = 'pending'will see this row. - Both statements run inside
BEGIN ... COMMIT. The lock is released only at COMMIT. If the Worker crashes between the SELECT and the UPDATE, the lock dies with the transaction and the row remainspendingfor the next attempt.
If any of these five are missing, the claim is not a claim. Reject and ask for the change.
PRIMM-AI+ Practice: Two Workers Race for One Pending Transaction
Predict [AI-FREE]
The pending_transactions table has three rows, all with status = 'pending', ordered by id. Two Reconciler Workers start at the same time. Each runs the claim block above.
Before running anything, write down:
- How many of the three rows end up
claimedafter both Workers finish their first turn? - Which Worker gets row 1, and which Worker gets row 2?
- Does row 3 stay
pendingor get claimed? - If a third Worker also started, would it get row 3 or wait?
- Your confidence score from 1 to 5.
You should be able to answer all four before running the demo.
Run
Open two separate sessions. Have the agent run the claim block in each.
In Claude Code, open two terminals against your Neon branch. In each, ask Claude to run the claim block from above. Start them as close to simultaneously as you can. After both have committed, ask for SELECT id, status, claimed_by FROM pending_transactions ORDER BY id; and read the result.
What you should see:
| Row | Status after first run | Claimed by |
|---|---|---|
| 1 | claimed | Worker A |
| 2 | claimed | Worker B |
| 3 | pending | (still available) |
Worker B did not error. Worker B did not block. Worker B silently moved to the next available row. That is the SKIP LOCKED contract.
Investigate
Write your own one-paragraph explanation:
- Worker A's
FOR UPDATE SKIP LOCKEDtook a row lock on row 1. - Worker B's
FOR UPDATE SKIP LOCKEDsaw row 1 was already locked, skipped it, and matched row 2 instead. - Both transactions committed independently. Each Worker now owns one distinct row.
- Row 3 is still
pendingbecause each Worker pulled onlyLIMIT 1.
Then ask the agent:
- "Why is silent skip the right default for a queue, instead of an error or a block?"
- "What would have happened if I removed
SKIP LOCKEDand leftFOR UPDATE? Walk through both Worker timelines." - "What would have happened if I removed
FOR UPDATEentirely? Show me the lost-update trap."
The third question is the dangerous one. Without FOR UPDATE, both Workers' SELECTs return row 1, both UPDATEs flip its status to claimed, and both Workers proceed to do the work. Two emails. Two API calls. The idempotency key catches the duplicate insert into expenses, but it cannot un-send the email.
Modify
Run the demo a second time. Replace FOR UPDATE SKIP LOCKED with FOR UPDATE (no SKIP).
Predict the difference. Then run it.
What you should see: Worker A locks row 1 and starts working. Worker B's SELECT blocks. It just sits there. When Worker A commits and the lock releases, Worker B's SELECT wakes up, but the row's status is now claimed, so the WHERE status = 'pending' filter excludes it. Worker B's SELECT returns zero rows.
Both behaviours are safe (no double work). SKIP LOCKED is faster because nobody waits. For a queue with many consumers, that throughput matters.
Now try one more variant: drop the FOR UPDATE clause entirely. Predict the disaster, then confirm by reading the claimed_by column on row 1 after both Workers finish. Whichever Worker committed second wrote its id, but both Workers also kicked off their downstream work. The lost-update trap.
Make [Mastery Gate]
The brief in business English:
"The Reconciler should claim a batch of up to 10 pending transactions per turn. It must never claim the same row as another Worker running in parallel. Order by id."
Hand this brief to the agent. Ask for the claim query.
You read what comes back. The gate passes when you can:
- Point at the
FOR UPDATE SKIP LOCKEDclause and say in plain English what it prevents. - Confirm the
LIMIT 10matches the brief. - Confirm the
ORDER BY idis present so two Workers do not race in different orders. - Confirm both the SELECT and the UPDATE are inside one
BEGIN ... COMMITblock.
If any of those four is missing, you reject and explain which failure mode the missing piece would let in.
Try With AI
Prompt 1: Claim-Pattern Recognition
For each of these three Worker scenarios, tell me which concurrency
pattern fits and why. Use the names FOR UPDATE, FOR UPDATE SKIP LOCKED,
or advisory lock.
A) Many Workers pulling next pending bank transaction from a queue.
B) Two Workers updating the same user's balance row.
C) Only one nightly cleanup job should ever run, even if cron fires twice.
What you're learning: Picking the right tool. The three patterns sound similar but solve different problems. Practising the match builds the reflex you need when the agent proposes the wrong one.
Prompt 2: Two-Session Race Drill
Walk me through what happens when two psql sessions run this block
at the same time:
BEGIN;
SELECT id FROM pending_transactions WHERE status = 'pending'
ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED;
-- imagine a 5-second pause here while the Worker does the work
UPDATE pending_transactions SET status = 'claimed' WHERE id = $id;
COMMIT;
For each session, describe what it sees and does, second by second.
End with the final state of the table.
What you're learning: Reasoning about the timeline. The lock lives for the length of the transaction. The pause inside the transaction is where most real Workers spend their time. Reading the timeline tells you why a long-running Worker holding a claim is a separate problem (covered in operations chapters).
Prompt 3: Advisory Lock Recognition
Show me the SQL line a Worker would run as its first statement to
ensure that only one instance of itself is running at any time. Use
pg_advisory_xact_lock. Then tell me, in one sentence each, what the
second instance experiences in three cases: (1) the first instance
is still running, (2) the first instance just committed, (3) the
first instance crashed.
What you're learning: Recognition of advisory locks. The crashed-instance case is the surprising one: an advisory transaction lock dies with the transaction, so the second instance can claim immediately. That is the right behaviour for a Worker singleton.
Checkpoint
- I can explain in one sentence why idempotency keys do not stop duplicate Worker work.
- I can name the three Postgres locking patterns and the Worker scenario each one fits.
- I can read a
SELECT ... FOR UPDATE SKIP LOCKEDblock and explain what each clause does. - I have run the two-session race demo and watched two Workers claim two distinct rows.
- I have predicted, then verified, the lost-update trap that occurs when
FOR UPDATEis missing. - I rejected at least one agent-proposed claim query during practice for missing
FOR UPDATEorORDER BY.