f1: Hello, and the two inputs
Make your first agent call. You need two things to get a reply out of a model: a system prompt for the agent, and the thing you want to ask it right now.
The AI SDK calls these instructions and prompt. Learning to tell them apart is
most of this first challenge. The last move changes how the reply arrives: streamed
token by token, the way every shipped AI app delivers text.
Quick path
In a hurry? These three steps are the whole challenge. Everything below is the why and the how.
- Run `npm run f1` and watch it print two pitches: both invent details, and both are shaped by the same shared `instructions`.
- Edit `start/agent.ts`: TODO 1 (add a rule to `instructions` and watch both pitches change), TODO 2 (add a one-off rule to the first `prompt` only), TODO 3 (swap both `.generate()` calls for `.stream()` loops), TODO 4 (optional: set `temperature` to 0 then 1.2).
- Done when a rule in `instructions` changes both answers, a one-off rule in the first `prompt` affects only pitch `[1]`, both pitches type themselves out token by token, and you can explain why the price was invented.
Mental model
Change the instructions and every answer shifts; change the prompt and only this one does.
Open start/agent.ts. We build an agent once with new ToolLoopAgent(...), hand it a model and its instructions, then call it with a
prompt:
The instructions are the agent’s system prompt: its role, its tone, its rules. You set
them once and they apply to every call the agent makes.
The prompt is the one thing you’re asking this time.
Change the instructions and you change how it answers everything. Change the prompt and you change what it answers just this once.
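To make the difference in reach concrete, here is a minimal sketch. It does not use the AI SDK: `SketchAgent` and `buildMessages` are hypothetical stand-ins, and the two prompts are placeholders. It only assumes what this page describes, that the instructions travel with every call as the system prompt and the prompt is the per-call ask.

```typescript
// Hypothetical stand-in for illustration only; this is not the AI SDK's ToolLoopAgent.
type Message = { role: "system" | "user"; content: string };

class SketchAgent {
  // Standing rules: set once, sent with every call.
  private readonly instructions: string;

  constructor(instructions: string) {
    this.instructions = instructions;
  }

  // Builds what one call would send: the shared instructions plus this call's prompt.
  buildMessages(prompt: string): Message[] {
    return [
      { role: "system", content: this.instructions },
      { role: "user", content: prompt },
    ];
  }
}

const sketchAgent = new SketchAgent(
  "You are TripMate, a friendly trip planner. Answer as three bullets."
);

// Two different asks through the same agent (placeholder prompts).
// The one-off rule "Answer in one sentence." is appended to the first prompt only.
const firstCall = sketchAgent.buildMessages("Pitch me a weekend away. Answer in one sentence.");
const secondCall = sketchAgent.buildMessages("Pitch me a week away.");

console.log(firstCall);
console.log(secondCall);
```

Both calls carry the same system text, so a rule written there reaches both; the one-sentence rule appears only in the first call's user message.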
Run it with `npm run f1`.
You’ll get two pitches from the same agent.
Read the invented price and hotel back from pitch [1]. The agent has no way to know either
one. It has no tools and no data, so it filled the gaps with plausible-sounding text.
That is fabrication, and it is the reason the rest of the workshop exists. Tools start fixing it in f4.
Now look across both outputs. They are different asks, but they share one thing: the same
instructions. That is the setup for the whole exercise.
Build it
- Run it and spot the invention. Run `npm run f1` and read the pitch. Pick out the price and the hotel, and notice you have no way to check either. The model made them up.
- Add a rule to the instructions (TODO 1). Write a rule of your own into `instructions`, after “You are TripMate, a friendly trip planner.” Keep both prompts exactly the same, pick a direction, predict what it’ll do, then run. Reach for whatever you like: a length limit, a format (“answer as three bullets”), or a persona (“talk like a pirate”). The wording is yours. Both pitches change, even though you only wrote the rule once. That’s the instructions doing their job on every call.
- Add a rule to one prompt (TODO 2). Leave your instructions rule in place, and append a one-off rule (say “Answer in one sentence.”) to the first `prompt` only. Run it again. Pitch `[1]` obeys it and pitch `[2]` does not, while both still obey the instructions rule. A standing rule lives in `instructions`; a one-off ask lives in the `prompt`. Same kind of words, different reach.
- Make it stream (TODO 3). So far each pitch lands all at once after a silent wait. No shipped AI app works that way: the reply types itself out as the model writes it, and the silent wait reads as hung. The swap is `.generate()` for `.stream()`. `.stream()` returns right away, and its `textStream` is an async iterable that yields the reply a few characters at a time, so you `for await` and write each chunk as it arrives. On a different agent, the shape is:

  Convert both calls. There is no `.text` to log at the end any more; the chunks are the text, so print the `[1]` label before the loop and a newline after it. Run it and watch the pitch type itself. Same agent, same work, same wait; it just stops feeling like one.
- Turn the temperature up (TODO 4, optional stretch). `temperature` controls how much the model varies its wording. Add it to the agent config and run the same prompt a few times at each value, predicting before each run:

  At `0` the pitch barely moves. At `1.2` it swings around. Watch the one thing that never improves, though: a higher temperature gives you a different invented price, not a truer one. Wording is all temperature touches.
- Check you’ve got it. You can add a rule to `instructions` and watch both answers change, append a one-off rule to the first `prompt` and see `[1]` obey it while `[2]` does not (with the instructions rule still on both), make both pitches stream token by token, and explain why the invented price or hotel was made up.
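The streaming step (TODO 3) comes down to one loop: label first, chunks as they arrive, newline last. The sketch below shows that loop without calling a model. `fakeTextStream` is a hypothetical stand-in for a real text stream, the pitch text is a placeholder, and `printStreamed` and its `write` sink are names made up for this example.

```typescript
// Hypothetical stand-in for a text stream: an async iterable that yields the reply in chunks.
// A real stream yields chunks as the model produces them; this one just slices a fixed string.
async function* fakeTextStream(reply: string, chunkSize = 4): AsyncGenerator<string> {
  for (let i = 0; i < reply.length; i += chunkSize) {
    yield reply.slice(i, i + chunkSize);
  }
}

// Writes the label first, then each chunk as it arrives, then a closing newline.
async function printStreamed(
  label: string,
  textStream: AsyncIterable<string>,
  write: (text: string) => void
): Promise<void> {
  write(`${label} `);
  for await (const chunk of textStream) {
    write(chunk);
  }
  write("\n");
}

// Collects everything that was written so the result can be inspected in one piece.
async function streamDemo(): Promise<string> {
  const written: string[] = [];
  await printStreamed("[1]", fakeTextStream("A placeholder pitch."), (text) => {
    written.push(text);
  });
  return written.join("");
}

streamDemo().then((output) => console.log(JSON.stringify(output)));
```

In a Node script you would pass `(text) => process.stdout.write(text)` as the sink, so the chunks land on one line instead of one line per chunk.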
Stuck? finish/agent.ts is the canonical version. Read it after you’ve had a real go.
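If the temperature step (TODO 4) felt like magic, a toy sampler can help. This is not how the AI SDK or any provider implements it; it is the textbook idea of dividing scores by the temperature before turning them into probabilities. The candidate prices and their scores are made up for illustration.

```typescript
// Toy illustration of temperature, not the AI SDK's or any provider's implementation.
// Dividing scores by the temperature before softmax sharpens (low T) or flattens (high T)
// the probabilities a sampler would draw from.
function softmaxWithTemperature(scores: number[], temperature: number): number[] {
  if (temperature <= 0) {
    // Treat 0 as "always pick the top score" to avoid dividing by zero.
    const top = scores.indexOf(Math.max(...scores));
    return scores.map((_, i) => (i === top ? 1 : 0));
  }
  const scaled = scores.map((s) => s / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Made-up scores for three candidate prices; none of them is backed by real data.
const candidatePrices = ["$499", "$650", "$1,200"];
const candidateScores = [2.0, 1.5, 0.5];

const probsAtZero = softmaxWithTemperature(candidateScores, 0);
const probsAtHigh = softmaxWithTemperature(candidateScores, 1.2);

console.log(candidatePrices, probsAtZero, probsAtHigh);
```

At `0` the top candidate always wins, so repeated runs barely move. At `1.2` the probability spreads across the other candidates. Every candidate is still invented, which is why a higher temperature changes the price you see without making it truer.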
A couple of things worth knowing
Why ToolLoopAgent and not a one-off function?
The AI SDK can do a single call with generateText, and plenty of tutorials start there.
We use ToolLoopAgent from the first challenge because it’s the shape you keep: an
object that holds a model, instructions, and soon some tools, that you call with
.generate() or .stream(). Meeting it now means nothing structural changes when tools
show up later. It also lines up with the Python path, where the same idea is
Agent(instructions=...).
When do I stream, and when do I .generate()?
Stream when a human is watching the reply arrive; .generate() when code consumes it.
Streaming is the industry default for anything user-facing, which is why it’s in lesson
one. But several later challenges feed one call’s reply into more code: a JSON schema
parse (f3), a one-character verdict gate (f5), a chain of edits (p1). Code can’t act on
half a sentence, so those use .generate() and wait for the whole text. Same agent
either way; you pick the delivery per call.
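Here is a small sketch of why code wants the whole text. It uses no SDK calls: `chunkedReply` is a hypothetical stand-in for a streamed reply, and the JSON payload is a placeholder. Parsing the first chunk alone fails; parsing the collected text works.

```typescript
// Hypothetical chunked reply, standing in for a stream; the JSON payload is a placeholder.
async function* chunkedReply(): AsyncGenerator<string> {
  yield '{"verdict":';
  yield ' "Y"}';
}

// Gathers every chunk into the complete text, which is what code needs before parsing.
async function collectText(textStream: AsyncIterable<string>): Promise<string> {
  let text = "";
  for await (const chunk of textStream) {
    text += chunk;
  }
  return text;
}

// Returns true if the text parses as JSON, false otherwise.
function parsesAsJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Compares acting on the first chunk alone with acting on the whole reply.
async function compareParsing(): Promise<{ firstChunkParses: boolean; wholeTextParses: boolean }> {
  const whole = await collectText(chunkedReply());
  return {
    firstChunkParses: parsesAsJson('{"verdict":'),
    wholeTextParses: parsesAsJson(whole),
  };
}

compareParsing().then((result) => console.log(result));
```

A person can read half a sentence and keep going; a parser cannot. That is the practical line between the two delivery modes.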
Is instructions just a system prompt?
Same idea, nicer name. Some SDKs call this text the “system prompt”; the AI SDK’s
ToolLoopAgent calls it instructions, which reads better and matches the Python path.
Either way, it’s the text the model treats as its role on every turn, separate from the
user’s message.
What is experimental_telemetry doing here?
It tells the AI SDK to emit a trace for every call: a structured record of what went out,
what came back, how long it took, and how many tokens it used. The flag is on from f1, but
a trace only goes somewhere once something is listening, and f1 isn’t listening yet. f2
adds the listener by loading instrument.ts, and the console traces begin there. For now the
flag costs nothing.
That is the foundation the whole workshop sits on. Next up is f2, where we open the hood: you’ll see the message loop underneath this single call, count its tokens, and read the run as a trace in your console instead of guessing at what happened.