From One-Off to Worker: The Handoff to Manufacturing

1 Signal · 4 Promotions · 1 Fork

Two months ago, at the start of this track, Ana looked at her Monday task — sorting the week's customer messages into groups and writing a summary — and saw it had a Mode 2 future: a task she did every week, the same way, clearly worth building once. For now, though, it was still a Mode 1 job — she kept solving it by hand, using the seven principles, every Monday. It worked every time. It also ate every Monday morning.

Her colleague Diego has the same kind of weekly task. He also solves it by hand, well, every week. He is good at it.

A year later, Ana spends almost no Monday morning on it. She crossed: she turned her repeated Mode 1 solution into a worker that handles the routine on its own, and she only steps in to check the exceptions it flags. Diego has spent about fifty Mondays on his — roughly a hundred hours — and will spend fifty more next year. Same task, same skill. The only difference is that Ana stopped solving the problem and started manufacturing the solution.

This course is that crossing. It is the last stop in Mode 1 and the on-ramp to Mode 2.

Who this is for

Anyone who has a task they keep solving with an agent, the same way, over and over — and is starting to feel that doing it by hand every time is the wrong way to live. This course shows you when to stop solving it and turn it into a permanent worker, and what actually changes when you do.

A book for readers everywhere

This book is read all over the world, by people who work and study in many different languages. The examples here use plain English and everyday situations that mean the same thing no matter where you live. New words (like spec, eval, and runtime) are explained the first time they appear.

Prerequisites

Finish Problem Solving with General Agents first — this course assumes you can already solve a problem well in a single session using the seven principles. It also assumes you have done Is This an Agent Problem? and know what Mode 1 (solve it once) and Mode 2 (build a permanent worker) mean.

The rule in one line

You don't build a worker from scratch. You promote a solution you have already proven.

People hear "build an AI worker" and picture starting from nothing: a blank screen, a hard engineering project, weeks of work. That picture is wrong, and it scares people away from the most valuable thing they could do. The truth is the opposite: by the time a task is ready to become a worker, you have already done most of the work. Every Monday that Ana solved her task by hand, she was, without knowing it, figuring out exactly what the worker would need to do. Manufacturing is not invention. It is taking a solution you have already proven by hand and making it permanent.

That reframe is the whole course. Below, you will see the one signal that tells you a solution is ready to cross, the four parts of your Mode 1 work that get promoted into a worker, and the one fork in the road where you choose what kind of worker to build.

The short version (four bullets)

Cross only what you have already proven. You can only manufacture a worker from a solution you have solved cleanly, by hand, several times. The repetition is not wasted time before automating — it is how you discover what the worker should actually do.
You are not starting over. Four things you already did by hand in Mode 1 — your brief, your check, the steps you drove, and the session itself — each get promoted into a permanent form. That is the whole build.
Making it permanent forks two ways. You can own a personal worker (lighter, just for you) or manufacture a Digital FTE (heavier, for an organisation). The fork is decided by one question: who is the worker for?
The payoff is task versus asset. Solving by hand is labour as a task — you spend the hours every single time. A worker is labour as an asset — you build once, and it works while you sleep.

Part 1 — The signal: is it actually ready to cross?

The mistake this prevents: "I built a permanent worker for a task I had not really figured out yet, so I spent a week building the wrong thing — and then had to rebuild it."

In Is This an Agent Problem?, Gate 2 already gave you the first half of the signal: a task is Mode 2 when all three dials are up — you do it often, it has the same shape each time, and it is worth the effort. The trigger was simple: the third time you do the same task the same way, stop and check.

But there is a second half that Gate 2 could not check, and it is the one people skip: have you actually solved it well yet?

You can only manufacture from a proven solution. If you build a worker out of a method you are still figuring out, you build the wrong worker — and you find out only after you have spent the effort. So the full signal to cross has two parts:

Gate 2 says Mode 2 (often, same, worth it), and
You have solved the task cleanly in Mode 1 enough times that the method has stopped changing.

That second part is the real test. Ask yourself: the last three times I did this, did I do it the same way? If yes, the shape is stable — you have found the worker. If you still solve it a bit differently each week — different steps, different checks, fresh decisions — then you have not found the stable shape yet. Keep solving it in Mode 1 until it settles. Those repetitions are not you failing to automate. They are you doing the research that tells the worker what to do.
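The "last three times" test can be sketched in a few lines. This is only an illustration, assuming you keep a rough log of the steps you took on each run; the function name `method_is_stable` and the step labels are hypothetical, not part of the course.

```python
def method_is_stable(runs: list[list[str]], window: int = 3) -> bool:
    """Return True if the last `window` runs used the same steps in the same order."""
    if len(runs) < window:
        return False  # not enough repetitions to judge yet
    recent = runs[-window:]
    return all(run == recent[0] for run in recent)


# Hypothetical log: the steps taken on four consecutive Mondays.
mondays = [
    ["read", "sort", "summarise"],
    ["read", "sort", "count", "summarise"],
    ["read", "sort", "count", "summarise"],
    ["read", "sort", "count", "summarise"],
]

print(method_is_stable(mondays[:3]))  # False: the method changed after week one
print(method_is_stable(mondays))      # True: the last three runs match
```

A real judgment is looser than an exact list comparison, but the question is the same: has the shape stopped changing?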

The repetitions are the research, not the waste

It feels inefficient to keep doing a task by hand once you know it is a Mode 2 job. It is not. Each time you solve it, you discover an edge case, a better step order, a check that matters. A worker built on week one would miss all of that. A worker built on week five is built on five weeks of hard-won knowledge. Cross when the learning slows down, not before.

Part 2 — The reframe: the worker is hiding in the work

Here is the part that changes how the whole thing feels. Every time you solved the task well in Mode 1, you left a trail — and that trail is the raw material for the worker.

Think about what a good Mode 1 session actually produced. You wrote a brief (what to work from, what you wanted, what "done" meant). You asked for the output in a clear shape. You ran a check to make sure it was right. You saved the result to a file. None of that was throwaway. So manufacturing is two moves, not a from-scratch build: harvest those pieces, then harden each one so it can run without you. Be clear-eyed about that second move: hardening is real work — designing the exits, building an eval that grades itself, and standing up a runtime are genuine engineering, not just writing down what you already do. The reframe spares you the blank page; it does not make the build trivial.

This is also where the economics flip, and it is the heart of the whole book. When you solve a task by hand, your labour is a task: you spend the time, you get one result, and the time is gone. When you promote that solution into a worker, the same labour becomes an asset: you spend the time once to build it, and it produces results again and again, while you do something else. That is the entire difference between Diego, who spends a hundred hours a year, and Ana, who spent a few hours once.

Figure: a line chart of cumulative hours spent over a year. Diego's line rises steadily and never stops, because he solves the task by hand every week. Ana's line jumps up once (the cost of building the worker) and then stays almost flat. The two lines cross after a few weeks; from then on Ana is far ahead, and the gap keeps widening. Labour as a task keeps costing you. Labour as an asset costs once, then pays you back. The lines cross sooner than people expect.
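The arithmetic behind the chart can be worked through directly. Diego's two hours per week follows from the story (about a hundred hours over fifty Mondays). Ana's build cost of six hours and her fifteen minutes of weekly exception checking are assumptions chosen only to illustrate the shape; the course does not give those numbers.

```python
import math
from typing import Optional


def break_even_week(manual_hours: float, build_hours: float, upkeep_hours: float) -> Optional[int]:
    """First week in which building once has cost no more than solving by hand."""
    if upkeep_hours >= manual_hours:
        return None  # the worker never pays back
    return math.ceil(build_hours / (manual_hours - upkeep_hours))


MANUAL_HOURS_PER_WEEK = 2.0   # from the story: ~100 hours over ~50 Mondays
BUILD_HOURS = 6.0             # assumption: one-time cost of building the worker
UPKEEP_HOURS_PER_WEEK = 0.25  # assumption: checking the flagged exceptions

week = break_even_week(MANUAL_HOURS_PER_WEEK, BUILD_HOURS, UPKEEP_HOURS_PER_WEEK)
diego_year = 50 * MANUAL_HOURS_PER_WEEK
ana_year = BUILD_HOURS + 50 * UPKEEP_HOURS_PER_WEEK

print(week)        # 4
print(diego_year)  # 100.0
print(ana_year)    # 18.5
```

Under these assumed numbers the lines cross in week four. A larger build cost moves the crossing later, but as long as the weekly upkeep is well below the manual time, the gap still widens every week after it.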

Part 3 — The four promotions

Crossing is not one big build. It is four specific upgrades, and you already did the hard thinking for each one in Mode 1. Each promotion takes something you did by hand and turns it into something the worker does on its own. Each one also hands you to the Mode 2 course that teaches it in full.

You are not adding four new things. You are upgrading four things you already have.

Promotion 1 — Your brief becomes a spec

In Mode 1 you wrote a quick brief each time: works from, want at the end, done when (the three lines from Gate 3). It lived in your head or a scratch note, and you could adjust it on the fly.

A worker cannot read your mind or ask what you meant, so that brief has to become a spec — short for specification, a written document the worker reads on every single run that says exactly what it does, on what, and to what standard. The spec is the same three lines you already wrote, made explicit, complete, and permanent. Skip it, and the worker fills the gaps with guesses — a little differently every run.
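One way to picture a spec is as a small, fixed record with the same three lines as the brief. The sketch below is illustrative only: the class name `Spec`, its field names, and the wording of each line (adapted from Ana's task as filled in later in this course) are assumptions, and a real spec is usually a longer written document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the worker reads the spec, it never edits it
class Spec:
    works_from: str
    want_at_end: str
    done_when: str

    def missing_fields(self) -> list[str]:
        """Names of any lines left blank, which the worker would have to guess."""
        return [name for name, value in vars(self).items() if not value.strip()]


ana_spec = Spec(
    works_from="New messages in the Support folder",
    want_at_end="Each message in exactly one group: complaint, question, order, or other",
    done_when="A one-page summary shows the count per group and the three most common complaints",
)

print(ana_spec.missing_fields())  # []
```

The `missing_fields` check is the point of the sketch: a gap in the spec is visible before the worker runs, instead of showing up as a guess in its output.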

Learn it in: Spec-Driven Development.

Promotion 2 — Your check becomes an eval

In Mode 1 you verified the output yourself (the third principle): you read it, you checked the numbers against the source, you trusted it because you looked.

A worker runs without you watching, often many times a day. "You reading it each time" does not scale, and it does not catch the moment the worker quietly starts getting things wrong. So your check becomes an eval — short for evaluation, a saved set of example inputs paired with their known-good answers. (For fuzzier work, a "known-good answer" may be a label, a rubric score, or a checklist the output must satisfy — not always one perfect block of text.) The worker's results are graded against them automatically, so the checking happens without you and warns you the moment the worker starts to drift. Your one-time reading becomes a test that runs forever. Skip it, and the drift is silent — you hear about it from an unhappy customer, not from a check.
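A minimal sketch of such an eval, under stated assumptions: `keyword_sorter` is a toy stand-in for the real worker, the four example messages are invented for illustration, and the pass rule (at most one mis-sort) mirrors the rule Ana uses in her blueprint later in this course.

```python
from typing import Callable

Example = tuple[str, str]  # (message text, known-good group)


def run_eval(sort_message: Callable[[str], str], examples: list[Example], max_misses: int = 1):
    """Grade a sorter against saved examples; return (passed, list of misses)."""
    misses = []
    for text, expected in examples:
        actual = sort_message(text)
        if actual != expected:
            misses.append((text, expected, actual))
    return len(misses) <= max_misses, misses


def keyword_sorter(text: str) -> str:
    """Toy stand-in for the worker, so the eval has something to grade."""
    lowered = text.lower()
    if "refund" in lowered or "broken" in lowered:
        return "complaint"
    if "?" in lowered:
        return "question"
    if "order" in lowered:
        return "order"
    return "other"


# Hypothetical saved examples, each paired with its known-good group.
EXAMPLES = [
    ("My kettle arrived broken.", "complaint"),
    ("Do you ship to Lisbon?", "question"),
    ("I would like to order two more filters.", "order"),
    ("Thanks for the quick reply!", "other"),
]

passed, misses = run_eval(keyword_sorter, EXAMPLES)
print(passed, len(misses))  # True 0
```

Because the examples and the grading live in code, the same check can run every time the worker's instructions change, with no one reading the output by hand.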

Learn it in: Eval-Driven Development.

Promotion 3 — You leave the loop, and design the exits

In Mode 1 you were in the loop (the sixth and seventh principles): you watched each step, redirected when it wandered, approved before it moved on. You were the safety net.

A worker runs the steps itself, with no one watching. Here is the part people get wrong: the routine is the easy part — your Mode 1 method already handles that. The hard part is the edges — the unusual input, the case your method never had to deal with. You must decide, in advance, what the worker does when it hits something it cannot handle. Almost always the answer is: stop and call a human. Designing those exits — when to escalate, to whom, with what information — is the real work of this promotion, and it is the work that keeps a worker safe to trust. And every task has edges, even the ones that feel simple: a worker that writes code stops and asks you when its change breaks the tests; a worker that pays invoices flags anything above a set amount instead of paying it; a worker that files documents sets aside the one it cannot confidently sort rather than guessing. The shape never changes — handle the routine, escalate the exception. Skip it, and the worker will, sooner or later, handle the one case it should have flagged — confidently and wrongly.
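The "handle the routine, escalate the exception" shape can be sketched as a guard that runs before any routine work. The two exit conditions are taken from Ana's blueprint later in this course; the names `Message`, `exit_reason`, and `handle`, the supported-language set, and the refund limit of 100 are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

SUPPORTED_LANGUAGES = {"en"}  # assumption: the worker only handles English
REFUND_LIMIT = 100.0          # assumption: the limit Ana chose


@dataclass
class Message:
    text: str
    language: str
    refund_amount: float = 0.0


def exit_reason(message: Message) -> Optional[str]:
    """Return why the worker must stop, or None if the message is routine."""
    if message.language not in SUPPORTED_LANGUAGES:
        return f"unsupported language: {message.language}"
    if message.refund_amount > REFUND_LIMIT:
        return f"refund {message.refund_amount:.2f} exceeds limit {REFUND_LIMIT:.2f}"
    return None


def handle(message: Message, sort_message: Callable[[str], str]) -> dict:
    """Check the exits first; only routine messages reach the sorter."""
    reason = exit_reason(message)
    if reason is not None:
        # Escalate with enough information for the human to act on.
        return {"status": "escalated", "to": "Ana", "reason": reason, "text": message.text}
    return {"status": "handled", "group": sort_message(message.text)}


result = handle(Message("Quiero un reembolso", "es"), lambda text: "other")
print(result["status"], "|", result["reason"])  # escalated | unsupported language: es
```

Note the order: the exit check runs before the routine path, so the worker never gets the chance to guess on a case it should have flagged.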

Learn it in: Build AI Agents and Building a Digital FTE.

Promotion 4 — Your session becomes a runtime

In Mode 1 the work lived in a session you opened. You closed the laptop and it stopped existing. Saving anything for next time (the fifth principle) was you, by hand, putting things in files.

A worker has to keep existing when you are not there. That needs a runtime — software that keeps the worker alive and running on its own — and somewhere for it to live so it is reachable and reliable. Its memory persists by itself, not because you remembered to save it. Skip it, and there is no worker — only you, opening a session by hand, which is exactly where you began.
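The "memory persists by itself" part can be sketched as a small state file the worker reads and writes on every run. This is only one piece of a runtime: the scheduler that wakes the worker and the machine it lives on are outside the sketch. The names `load_processed` and `run_once`, the JSON file, and the message IDs are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_processed(state_file: Path) -> set[str]:
    """Read the IDs handled on earlier runs; an absent file means a first run."""
    if not state_file.exists():
        return set()
    return set(json.loads(state_file.read_text(encoding="utf-8")))


def run_once(messages: dict[str, str], state_file: Path, handle: Callable[[str], None]) -> list[str]:
    """Handle only messages not seen before, then save the updated memory."""
    processed = load_processed(state_file)
    new_ids = [message_id for message_id in messages if message_id not in processed]
    for message_id in new_ids:
        handle(messages[message_id])
        processed.add(message_id)
    state_file.write_text(json.dumps(sorted(processed)), encoding="utf-8")
    return new_ids


# Two simulated Mondays, using a temporary folder in place of a real machine.
handled: list[str] = []
with tempfile.TemporaryDirectory() as folder:
    state = Path(folder) / "processed.json"
    inbox = {"msg-1": "Where is my parcel?", "msg-2": "My kettle arrived broken."}
    first = run_once(inbox, state, handled.append)
    inbox["msg-3"] = "Thanks for the quick reply!"
    second = run_once(inbox, state, handled.append)

print(first)   # ['msg-1', 'msg-2']
print(second)  # ['msg-3']
```

The second run skips what the first already handled without anyone telling it to, which is the difference between a session and a runtime in miniature.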

Learn it in: Deploy the Agent Harness. (If the worker is just for you, there is a lighter path — see the fork below.)

The four promotions at a glance:

What you had in Mode 1	What it becomes in Mode 2	Where to learn it
Your brief (works from / want at end / done when)	A spec the worker reads every run	Spec-Driven Development
Your own eyeball check	An eval that grades the worker automatically	Eval-Driven Development
You watching and redirecting	The worker runs the loop and escalates at the edges	Build AI Agents · Building a Digital FTE
A session you open and close	A runtime the worker lives on	Deploy the Agent Harness

That is the whole crossing. Four upgrades, each one a thing you already understood from doing it by hand.

What you carry over, not rebuild: your plugins

The four promotions are what changes when you cross. One important thing does not change much: your plugins — the skills (packaged know-how an agent reuses) and connectors (links to your other apps and data) you already used while solving in Mode 1. Because they are built on open, cross-runtime formats, the same skills and connectors carry across claude.ai, the general agents you drove (Claude Code, OpenCode, Cowork, OpenWork), and personal harnesses — and, often with only light adaptation, into the workers you manufacture. As long as a plugin sticks to those open formats, it comes across the crossing largely as-is — one more reason building a worker is mostly promotion, not invention. New to these? See Skills & Connectors.

Why this works (the research behind it) — optional

Two old ideas explain why "promote, don't build from scratch" is the right order.

The first is from Fred Brooks, who led one of the largest software projects of the 1960s and wrote about it in The Mythical Man-Month (1975). His famous advice: plan to throw one away — you will, anyway. The first thing you build teaches you what you should have built; the version worth keeping is the second one, made with what you learned. Your repeated Mode 1 solves are exactly those throwaways. You are not wasting weeks before automating — you are running the experiment that tells you what the permanent worker should be. Building the worker on week one would mean building the version you were always going to throw away.

The second is from Lisanne Bainbridge's Ironies of Automation (1983), one of the most-cited papers on automating work. Her finding: when you automate the routine parts of a job, you do not remove the human — you leave the human responsible for exactly the rare, difficult cases the automation cannot handle, and the more reliable the automation, the more important (and harder) those rare interventions become. That is precisely why Promotion 3 is about designing the exits, not the routine. The routine is the easy part to automate; the value and the danger both live at the edges, so you design the escalation deliberately rather than hoping it never comes up.

Sources: Brooks, F. P. (1975). The Mythical Man-Month. Addison-Wesley. Bainbridge, L. (1983). "Ironies of Automation," Automatica, 19(6), 775–779.

Part 4 — The fork: two ways to make it permanent

Once you decide to cross, "build a durable worker" forks into two different roads. They use the same four promotions, but they are built for different people, and the difference matters.

Same proven solution, two destinations. The deciding question is who relies on the worker.

Own it — a personal harness. If the worker is for you — your inbox, your code, your errands — the lighter path is a personal harness (software you run and own yourself that keeps a worker alive for you). You do the four promotions, but lightly: the spec is your own notes, the eval is a handful of your own examples, the escalation is the worker messaging you. You may never need the full Mode 2 track to get there. This is the road the Personal Agent Harnesses section teaches, using OpenClaw and Hermes.

Manufacture it — a Digital FTE. If the worker is for an organisation — something other people will rely on, that must run reliably, be governed, and scale (and perhaps be sold) — that is a Digital FTE (a "digital full-time employee"), and you do the four promotions rigorously: the spec is shared and reviewed, the eval is a gate the whole team trusts, the escalation goes to a named human or team, and the runtime is real production infrastructure. This is the full Mode 2 — Manufacturing track.

The deciding question is one line: who is the worker for, and who relies on it? For you alone → personal harness. For an organisation → Digital FTE. Same crossing, two depths of rigour.

What you build with. The two roads are the constant; the tools are the variable, and they change often. Here is the variable as it stands in 2026. You have already chosen the road, so you only need the row that matches it:

The road	What you build with (2026)	Where to learn it
Own it — a personal harness	OpenClaw or Hermes — open-source harnesses you run and own yourself	Personal Agent Harnesses
Manufacture it — a Digital FTE	The OpenAI Agents SDK, or a managed Claude agent setup	The Mode 2 track; Choosing Agentic Architectures helps you pick

The general agent you have been driving in Mode 1 (Claude Code, OpenCode, Cowork, or OpenWork) does not disappear here — on either road it is the tool you use to build and install the worker. It just stops being the thing that does the task each time, and becomes the thing that makes the thing that does the task.

A running cost appears when you cross

One thing to budget for. A durable worker runs on the API, where every model call is metered and paid for — and that is true on both roads: a personal harness (OpenClaw, Hermes) runs on the API just as much as a manufactured Digital FTE does. This is different from using AI inside claude.ai, the web app, where a plugin's calls come out of your subscription or the free tier with no separate per-call bill. So crossing turns "thinking" from something bundled into your plan into a real, per-call cost. It is one more reason the "worth it?" signal matters: a worker has to repay not just your build time, but the model bill it will run up every time it works. (For a personal harness you pay that bill yourself; for a Digital FTE you or the organisation do — and if you sell the worker, that per-call cost is the number you price around.)
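A rough way to estimate that bill before you cross is sketched below. Every number in it is a placeholder: the token counts, the calls per run, and the per-million-token prices are assumptions, not real prices for any provider, so substitute the current figures for the model you actually use.

```python
def monthly_model_cost(
    runs_per_month: int,
    calls_per_run: int,
    input_tokens_per_call: int,
    output_tokens_per_call: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Estimated monthly bill for a worker that is charged per token."""
    cost_per_call = (
        input_tokens_per_call * price_in_per_million
        + output_tokens_per_call * price_out_per_million
    ) / 1_000_000
    return runs_per_month * calls_per_run * cost_per_call


# Placeholder figures for a weekly worker that makes one call per message.
cost = monthly_model_cost(
    runs_per_month=4,
    calls_per_run=50,
    input_tokens_per_call=800,
    output_tokens_per_call=100,
    price_in_per_million=3.0,    # assumption, not a quoted price
    price_out_per_million=15.0,  # assumption, not a quoted price
)
print(f"{cost:.2f}")  # 0.78
```

For a small weekly task the bill is likely to be modest, but the same formula scales linearly with calls, which is why a worker that runs many times a day deserves a cost estimate before it is built.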

This is not a third mode

Owning a personal harness is not a new mode sitting between Mode 1 and Mode 2 — it is the same "build a durable worker" activity, scaled down to one person. Mode asks whether you solve once or build to last; ownership asks whether the worker is yours or an organisation's. Two separate questions. You can run either mode on a harness you own.

Ana's blueprint, filled in

Before you do your own, here is Ana's Monday task, crossed — the same four promotions, filled in. This is what "done" looks like, and notice that every line is just something she already did by hand for two months.

Brief → spec. Every Monday, read the new messages in the Support folder. Put each one in exactly one group — complaint, question, order, or other — and write a one-page summary with the count per group and the three most common complaints. (Her Gate 3 three lines, written down for good.)
Check → eval. Twelve past messages she had already sorted by hand, each saved next to its correct group. Whenever she changes the worker's instructions, it is run against those twelve first; if it mis-sorts more than one, she fixes it before it ever touches real mail.
You → exits. If a message is in a language the worker does not handle, or asks for a refund bigger than a limit she set, the worker does not guess — it flags the message and pings Ana. Everything else it handles alone.
Session → runtime. The worker runs every Monday morning on a small always-on machine and keeps its own list of what it has already processed — so Ana opens nothing and remembers nothing.

None of that was invented on crossing day. It is two months of Mondays, made permanent.

Your turn

Take a real task you already solve with an agent, the same way, more than once. Run it through the crossing, filling in the three steps below.

1. Your Work

Fill in the three steps for your own task. The grader checks whether the task is genuinely proven, whether your four promotions are concrete, and, most important, whether your exit is a real escalation case rather than a skipped edge.

Step 1: Is it proven? The last three times you did this, did you do it the same way? (If the method still changes, it is not ready; say so.)

Step 2: The four promotions, one line each. Brief to spec; check to eval (2-3 examples plus known-good answers); you to exits (the one case where it should stop and call a human); session to runtime (where it lives so it can run without you).

Step 3: Pick the fork. Who is this worker for? You alone (personal harness) or an organisation (Digital FTE)?

2. Get Your Score

Discuss with an AI. Question your scores.
Come back when you have your BEST evaluation.

If you can fill in all three steps, you are not "thinking about maybe building an agent someday." You are holding the blueprint for one. The Mode 2 track is just the four promotions, done for real.

Where this hands you off

This is the crossing. Mode 1 was labour as a task — you spent the hours every time. Mode 2 is labour as an asset — you build once, and it works while you sleep. This course is where one becomes the other.

You now enter the Mode 2 — Manufacturing track with a blueprint, not a blank page. The courses there build the four promotions for real:

Python in the AI Era — the language manufacturing is built in (you direct; the agent writes most of it).
Build AI Agents — the worker that runs the loop and escalates.
Eval-Driven Development — the eval that grades it.
Building a Digital FTE — all four promotions assembled into a worker an organisation can trust.
Deploy the Agent Harness — the runtime it lives on.

Walk in with a proven Mode 1 solution and a filled-in blueprint, and you are not starting from nothing. You are promoting something that already works.

References

The ideas behind this course's claims, for anyone who wants the originals.

Brooks, F. P. (1975). The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley. The "plan to throw one away" argument, which explains why your proven Mode 1 solves are the prototype the real worker is built from, is in the chapter of that name.
Bainbridge, L. (1983). "Ironies of Automation." Automatica, 19(6), 775–779. doi:10.1016/0005-1098(83)90046-8. Why automating the routine leaves the human responsible for the rare, hard cases, which is the reason Promotion 3 is about designing the exits, not the routine.
Munroe, R. "Is It Worth the Time?" xkcd 1205. The break-even logic behind the task-versus-asset chart: how often a task repeats decides whether building the worker is worth it.

