p1: Prompt chaining
You start the Patterns track here. In Foundations each challenge was one model call. From now on you compose several of them, in code you write. This one teaches the simplest composition: a chain. You break a task into small steps, run them in order, and put a gate between them so a bad result never flows downstream.
Quick path
In a hurry? These three steps are the whole challenge. Everything below is the why and the how.
- Run `make p1` and read the draft, which usually skips the £500 budget or the call to action, with no review and no repair path.
- Edit `start/agent.py`: do TODO 1 (call the reviewer), TODO 2 (the plain-code gate plus the `missing` list), TODO 3 (the editor’s instructions plus its run), and TODO 4 (re-review the rewrite).
- Done when a failing draft is repaired and the final re-review reports both checks true.
The task is to produce a trip pitch that mentions the traveller’s £500 budget and ends with a call to action. Asking one prompt for all of that at once is hopeful: you cannot tell whether the model did it. A chain makes the work easier to debug. Draft the pitch, judge the draft, fix only what the judge flagged, then check the fix.
The gate is what matters. The reviewer returns a typed verdict, so the decision
to ship or to fix is an ordinary if. That is what makes a workflow a workflow: you hold
the control flow, in code you can read and test.
Mental model
The reviewer returns a typed verdict, so a plain if in your code decides whether to ship or to fix.
The mechanic, in another domain
Forget travel. Say you gate a pull-request description: it must summarise the change and give a test plan. A reviewer returns a typed verdict, so the decision is a plain if, not more prose.
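A minimal sketch of that pull-request gate, under stated assumptions: `PRCheck`, `review_pr`, and `gate` are hypothetical names, and `review_pr` is a keyword stub standing in for a reviewer agent that would return the same typed shape from a model call.

```python
from dataclasses import dataclass


@dataclass
class PRCheck:
    """Typed verdict a reviewer would return (hypothetical shape)."""
    summarises_change: bool
    has_test_plan: bool
    notes: str = ""


def review_pr(description: str) -> PRCheck:
    """Stand-in for the reviewer call: keyword checks, not a real model."""
    text = description.lower()
    return PRCheck(
        summarises_change="summary" in text,
        has_test_plan="test plan" in text,
    )


def gate(verdict: PRCheck) -> list[str]:
    """Plain-code gate: list only the failed checks, in words an editor can act on."""
    missing = []
    if not verdict.summarises_change:
        missing.append("a summary of the change")
    if not verdict.has_test_plan:
        missing.append("a test plan")
    return missing


description = "Summary: cache the user lookup to cut one query per request."
missing = gate(review_pr(description))
if not missing:
    print("[gate] passed")
else:
    print("[gate] failed, missing:", ", ".join(missing))
```

Because the verdict is two booleans, the gate can be unit-tested without calling a model at all.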
Three moves. The reviewer’s output_type turns its judgement into booleans (the f3 trick), so the gate is an ordinary if. The missing list holds only the failed checks, so the editor repairs those and leaves the rest alone. And the reviewer is a separate call from the writer, so its verdict is independent of the thing it judges. Below you write that gate and the editor’s brief for TripMate.
The setup
Open start/agent.py. The writer and the reviewer are provided. The writer is deliberately never told about the £500 budget or the call to action, so the gate has something real to catch; the reviewer returns a typed QualityCheck (mentions_budget, has_call_to_action, notes) verdict. What is blank: the gate, the editor’s instructions, and the calls that wire them.
Run it
The writer produces a vivid Lisbon pitch. Read it back: it often skips the £500 budget, and it often does not ask you to book anything. Nothing checks its work yet, so you have a nice paragraph and no way to repair it.
The rest of the file fixes that dead end.
Build it
1. Run it and read the draft. Run `make p1` and look at what comes back. The writer was never asked for the budget or a call to action, so it usually drops one or both. There is no review, no gate, no repair path.
2. Call the reviewer and read the verdict (TODO 1). Run the `reviewer` on the `draft` and read its typed `.output`, the same call-an-agent, read-`.output` move you used in f3. Print the two booleans and the notes, so you can see what the gate will act on.
3. Write the gate (TODO 2). This is the heart of the pattern, and it is plain code, not a model call. Read the two booleans on `verdict`. If both pass, print `[gate] passed` and `return`, and the draft ships as is. Otherwise build a `missing` list naming only the failed checks, in words the editor can act on (the budget mention, the call to action), and drop the ones that passed. That `missing` list is the whole point of the gate: it carries forward what still needs fixing and nothing else.
4. Write the editor’s instructions and call it (TODO 3). First fill in the empty `instructions` on the `editor` agent: keep its brief tight so the edit stays faithful, fix only the missing points, preserve the draft’s voice, keep the length, return only the rewritten pitch. Then call `editor.run(...)` with a prompt that hands it two things, the `missing` points to fix and the `draft` to rewrite, and read `.output` into `improved`. Sending only the failed points, not “make it better”, is what keeps the edit small and faithful.
5. Re-review the edit (TODO 4). Run the same review call you made in step 2, this time on `improved` instead of `draft`, and read its `.output` into a final verdict you can print. Without this you only know you attempted a repair, not whether it worked.
6. Try to skip the chain (poke it). Add "mention the £500 budget and end with a call to action" to the writer’s instructions, so the draft attempts everything in one shot. Run it a few times. A small model still drops one requirement now and then. The gate turns “usually” into “every time”, and it is the part a single prompt cannot give you.
7. Check you’ve got it. You should be able to point at the gate line and say why it is code and not a model call, and say in one sentence how a workflow differs from an agent. Scroll up to the trace too: you will see separate `agent run` spans in sequence (writer, reviewer, maybe editor, then final reviewer), each with its own token usage. When the gate passes first time, the editor and final-review spans are absent, because the repair path never ran.
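The shape of the steps above can be sketched without any model calls. This is an illustration, not the contents of finish/agent.py: `review`, `edit`, `build_editor_prompt`, and `run_chain` are hypothetical stand-ins, with string checks in place of the reviewer and a canned rewrite in place of the editor. Only the `QualityCheck` field names come from the challenge.

```python
from dataclasses import dataclass


@dataclass
class QualityCheck:
    mentions_budget: bool
    has_call_to_action: bool
    notes: str = ""


def review(pitch: str) -> QualityCheck:
    """Stand-in reviewer: naive string checks in place of a model call."""
    return QualityCheck(
        mentions_budget="£500" in pitch,
        has_call_to_action="book" in pitch.lower(),
    )


def build_editor_prompt(missing: list[str], draft: str) -> str:
    """Hand the editor exactly two things: the failed points and the draft."""
    points = "\n".join(f"- {point}" for point in missing)
    return f"Fix only these missing points:\n{points}\n\nDraft:\n{draft}"


def edit(prompt: str) -> str:
    """Stand-in editor: appends one canned sentence per requested point."""
    draft = prompt.split("Draft:\n", 1)[1]
    if "- the budget mention" in prompt:
        draft += " It all fits a £500 budget."
    if "- the call to action" in prompt:
        draft += " Book your trip today."
    return draft


def run_chain(draft: str) -> tuple[str, QualityCheck]:
    verdict = review(draft)
    # The gate: plain code reading two booleans.
    missing = []
    if not verdict.mentions_budget:
        missing.append("the budget mention")
    if not verdict.has_call_to_action:
        missing.append("the call to action")
    if not missing:
        print("[gate] passed")
        return draft, verdict
    improved = edit(build_editor_prompt(missing, draft))
    final = review(improved)  # re-review: did the repair actually work?
    return improved, final


draft = "Lisbon glows at sunset, all trams and tiled rooftops."
improved, final = run_chain(draft)
print(improved)
print(final.mentions_budget, final.has_call_to_action)
```

The editor stub only ever sees the points named in `missing`, which mirrors why the real editor's rewrite stays small.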
Stuck? finish/agent.py is the canonical version. Read it after you’ve had a real go.
- A gate that is not plain code. If you find yourself asking a model “should I rewrite this?”, stop. The reviewer already returned booleans; the decision is an `if`.
- Feeding the editor everything. Send only the failed points, not “make it better”. A specific instruction produces a faithful edit; a vague one rewrites the voice away.
- Over-checking. Two concrete, checkable criteria beat ten fuzzy ones. The reviewer is only as reliable as its criteria are concrete.
- No exit. This chain runs each step once. If you ever loop a fix-and-recheck (the p4 evaluator-optimizer), bound the loop so it cannot run forever.
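For the last pitfall, one way to bound a fix-and-recheck loop is a hard cap on rounds. The sketch below is an assumption about how you might write it, not the p4 solution: `repair_with_bound`, `needs_budget`, and the two fixers are hypothetical, and the cap of 3 is arbitrary.

```python
from typing import Callable

MAX_ROUNDS = 3  # assumed cap; pick one that suits your cost and latency budget


def repair_with_bound(
    draft: str,
    find_missing: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], str],
    max_rounds: int = MAX_ROUNDS,
) -> tuple[str, bool]:
    """Fix and recheck at most max_rounds times; report whether the gate passed."""
    for _ in range(max_rounds):
        missing = find_missing(draft)
        if not missing:
            return draft, True
        draft = fix(draft, missing)
    # Out of rounds: check the last fix once more, then stop either way.
    return draft, not find_missing(draft)


def needs_budget(text: str) -> list[str]:
    """Stand-in reviewer plus gate: one concrete check."""
    return [] if "£500" in text else ["the budget mention"]


def good_fix(text: str, missing: list[str]) -> str:
    return text + " It fits a £500 budget."


def useless_fix(text: str, missing: list[str]) -> str:
    return text + "!"


print(repair_with_bound("Visit Lisbon.", needs_budget, good_fix))
print(repair_with_bound("Visit Lisbon.", needs_budget, useless_fix))
```

Returning the pass/fail flag alongside the text lets the caller decide what to do with a draft that never passed, instead of looping until it does.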
A couple of things worth knowing
Workflows vs agents
This is the distinction the rest of the workshop turns on, from Anthropic’s “Building effective agents”.
A workflow orchestrates model calls through code paths you write. You decide the order and the branches. That is this challenge.
An agent is a model that directs its own process and tool use in a loop. The model decides the order. That is the agentic challenge later in the workshop.
Workflows give you predictability and a place to put checks; agents give you flexibility when you cannot predict the steps. You are learning to choose between them.
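The difference in who holds the control flow can be shown in a few lines. This is a toy sketch: `run_workflow` and `run_agent` are hypothetical helpers, and `choose` is a scripted function standing in for the model that would pick the next tool in a real agent.

```python
from typing import Callable

Step = Callable[[str], str]


def run_workflow(text: str, steps: list[Step]) -> str:
    """Workflow: your code fixes the order; every input takes the same path."""
    for step in steps:
        text = step(text)
    return text


def run_agent(
    text: str,
    tools: dict[str, Step],
    choose: Callable[[str], str],
    max_steps: int = 5,
) -> str:
    """Agent loop: the chooser picks the next tool or stops, within a step cap."""
    for _ in range(max_steps):
        name = choose(text)
        if name == "stop":
            break
        text = tools[name](text)
    return text


def exclaim(text: str) -> str:
    return text + "!"


def choose(text: str) -> str:
    """Scripted policy in place of a model: decide from the current state."""
    if not text.isupper():
        return "upper"
    if not text.endswith("!"):
        return "exclaim"
    return "stop"


tools = {"upper": str.upper, "exclaim": exclaim}

print(run_workflow("HI!", [str.upper, exclaim]))  # both steps always run
print(run_agent("HI!", tools, choose))            # chooser stops immediately
```

The workflow applies both steps even when the input needs neither; the agent loop takes a path that depends on the input, which is the flexibility you pay for with less predictability.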
Why split the writer and the reviewer at all?
You could ask one agent to “write a pitch, on budget, with a call to action, and tell me if you succeeded”. On a large model that often works. It degrades on a small one, and it hides the failure: the same call that writes the pitch also grades it, so a miss and a false “looks good” arrive together.
Splitting the roles gives each call one job and gives you an independent verdict you can gate on. Anthropic’s article makes the same case: chaining makes each model call an easier task, and routing is described as giving you separation of concerns.
When is a chain the wrong tool?
When you cannot predict the steps. A chain is a fixed path: draft, review, edit, in that order, always. If the work needs a different number or order of steps depending on the input (search this, then maybe search that, then maybe call a tool), a fixed chain fights you and an agent fits better. The agentic challenge is that case.
Next up is p2, where the gate becomes a fork. You classify the input first, then send it down a different path depending on what it is.