f3: Structured output

So far TripMate has answered in prose. That reads fine to a person, but it’s awkward for code: if your app needs the packing list as an actual array, you’re left scanning the text and hoping the model formatted it the way you expected.

Most of the time you want data from a model: a shape your code can read field by field, with nothing to parse. This challenge makes the same call return a typed object. Output.object({ schema }) is the switch that turns structured output on; the .describe() text you write on each field tells the model what to put in each slot.

Quick path

In a hurry? Here’s the whole challenge. Everything below is the why and the how.

Run npm run f3 and watch it print a flowing paragraph you’d have to scrape the packing list out of.
Read the worked example below to learn the mechanic on a tiny throwaway shape.
Edit start/agent.ts to make TripMate a recommendation agent: wire Output.object, then write a .describe() for each field of the recommendation schema (the field names are given; the descriptions are yours to author).
Done when it prints labelled fields instead of prose, result.output.packingEssentials is a real array your code could .map over, and you’ve watched a vague .describe() produce a worse value than a sharp one.

Mental model

prompt  ->  model  ->  "Lisbon is lovely, pack layers..."     (prose: you parse it and hope)
                   \->  { destination, whyToVisit, packingEssentials: [...] }  (schema: typed, use it directly)

The schema turns a paragraph you have to scrape into fields your code can read. Each field’s .describe() text is what tells the model what to put there.

The mechanic, on a throwaway shape

Before you touch the recommendation agent, here’s the whole mechanic on something tiny and unrelated, a country fact card. Two moves: describe the shape with Zod, pass it as output.

import { Output, ToolLoopAgent } from "ai";
import { z } from "zod";

const factCard = z.object({
  destination: z.string().describe("the country's capital city"),
  languagesSpoken: z.array(z.string()).describe("the official languages, most common first"),
});

const agent = new ToolLoopAgent({
  model,                                         // assumed to be defined earlier in the file (your workshop model setup)
  output: Output.object({ schema: factCard }),   // <- the lever: prose mode -> typed mode
  instructions: "Give a short fact card for the country the user names.",
});

const result = await agent.generate({ prompt: "Portugal" });

const card = result.output;          // typed: no parsing
card.languagesSpoken;                // a real string[]

Three things to take from this:

Output.object({ schema }) is the lever. It switches the agent from text mode to typed mode, and the result lands on result.output instead of result.text. Without it, result.output is undefined.
This is .generate(), not the .stream() you ended f1 on. A typed object is consumed by code, and code can’t act on half a JSON object. Stream when a human watches; .generate() when code consumes.
Field names matter (the model reads them), but the .describe() text does the heavy lifting. "the country's capital city" is why destination comes back as a city and not the country name. A vague description gives a vague fill.

The shape above is throwaway. Your job is to apply the same two moves to a schema that actually matters.
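
The streaming point above is easy to check in isolation. The sketch below uses plain TypeScript with no SDK involved, and the JSON payload is made up for illustration. It cuts a complete payload off partway, the way it would look mid-stream, and shows that the fragment cannot be parsed.

```typescript
// A complete JSON payload, as it might look once a call has finished (made-up data).
const completeJson = '{"destination":"Lisbon","languagesSpoken":["Portuguese"]}';

// The same payload cut off partway, standing in for what you'd hold mid-stream.
const partialJson = completeJson.slice(0, 30);

// Returns the parsed value, or null when the text is not valid JSON.
// (A literal JSON null would also come back as null; fine for this sketch.)
function tryParse(jsonText: string): unknown {
  try {
    return JSON.parse(jsonText);
  } catch {
    return null;
  }
}

console.log(tryParse(partialJson));  // null: half an object is not an object
console.log(tryParse(completeJson)); // the full object, ready to read field by field
```

This is the reason the worked example waits for .generate() to finish before any code touches the result.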

Run it first

npm run f3

Out of the box you get a flowing paragraph under result.text. It reads well. Now ask how you’d pull the packing list out of it in code: search for a “Packing:” line, split on commas or bullets, and redo it every time the model rewords things. That fragility is the problem the schema fixes.
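
To make that fragility concrete, here is a minimal sketch of the kind of scraper you would end up writing. The function name scrapePackingList and both sample replies are hypothetical; real output from npm run f3 will be worded differently.

```typescript
// Brittle on purpose: finds a line starting with "Packing:" and splits it on commas.
function scrapePackingList(prose: string): string[] {
  const packingLine = prose
    .split("\n")
    .find((line) => line.startsWith("Packing:"));
  if (packingLine === undefined) return [];
  return packingLine
    .slice("Packing:".length)
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Two made-up replies carrying the same information in different wording.
const replyWithLabel =
  "Lisbon is lovely in spring.\nPacking: light jacket, walking shoes, sunscreen";
const replyReworded =
  "Lisbon is lovely in spring. Bring a light jacket, walking shoes and sunscreen.";

console.log(scrapePackingList(replyWithLabel)); // three items
console.log(scrapePackingList(replyReworded));  // empty: the list is there, the scraper can't see it
```

Nothing in the second reply is wrong. The scraper fails only because the wording moved, which is exactly the failure a schema removes.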

Your challenge: TripMate the recommendation agent

Open start/agent.ts. The aim is a TripMate that takes a traveller and returns a typed recommendation. The schema’s field names are given so your output lines up with the person next to you, but every field ships with no description, and that’s the part you write.

Build it

Wire structured output on (TODO 1). Import Output and z, define the recommendation schema, add output: Output.object({ schema: recommendation }) to the agent, and swap the prose log for a typed read off result.output. You did each of these moves in the worked example above; apply them here. Don’t copy the factCard schema; you’re building recommendation.
Write the descriptions (TODO 2). This is the real work. Each field starts bare:
```
const recommendation = z.object({
  destination: z.string(),
  whyToVisit: z.string(),                  // .describe(...), what should this say?
  packingEssentials: z.array(z.string()),  // .describe(...), how many? how concrete?
});
```
Add a .describe() to each field. Predict what each one will produce before you run. Want two sentences of reasoning? Say so. Want 3–5 concrete items, not “appropriate clothing”? Say that. The description is the only instruction the model has for that field.
Break a description on purpose (TODO 3). Once it works, make one field’s description deliberately vague, for example whyToVisit: z.string().describe("some text"), and run again. Watch that field get vaguer or drift while the well-described fields hold. Same model, same prompt; the only thing you changed was the words in .describe(). That’s the lesson: the description is the interface to the field. Tighten it and the value sharpens.
Check you’ve got it. You can run npm run f3 and watch it print a paragraph first, then labelled fields after your edits; point at result.output.packingEssentials as an array you could pass straight to a component; and show a sharp description and a vague one producing visibly different values for the same field.

Stuck on the wiring? finish/agent.ts is one complete version. Read it after you’ve tried, and notice that your .describe() text reads differently from the reference. There’s no single right answer; that’s the point.
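
To see what “an array you could pass straight to a component” buys you, here is a small sketch in plain TypeScript. The Recommendation interface is a hand-written stand-in for the type your schema would produce, sampleOutput is made-up data standing in for result.output, and formatRecommendation is a hypothetical helper, not part of the starter code.

```typescript
// Hand-written stand-in for the type the recommendation schema would give you.
interface Recommendation {
  destination: string;
  whyToVisit: string;
  packingEssentials: string[];
}

// Made-up sample standing in for result.output; a real run will differ.
const sampleOutput: Recommendation = {
  destination: "Lisbon",
  whyToVisit: "Mild spring weather and walkable neighbourhoods.",
  packingEssentials: ["light jacket", "walking shoes", "sunscreen"],
};

// Turn the typed object into labelled lines. No text parsing involved.
function formatRecommendation(rec: Recommendation): string[] {
  return [
    `Destination: ${rec.destination}`,
    `Why visit: ${rec.whyToVisit}`,
    "Packing essentials:",
    ...rec.packingEssentials.map((item, index) => `  ${index + 1}. ${item}`),
  ];
}

console.log(formatRecommendation(sampleOutput).join("\n"));
```

The .map call works because packingEssentials is already a string[]; there is no splitting on commas and no guessing at bullet styles.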

A couple of things worth knowing

Why a schema instead of asking for JSON in the prompt?

You could write “reply as JSON with these fields” in the prompt, but then you’re trusting the model to format it perfectly every time and parsing whatever comes back.

Output.object does better. The Zod schema is converted to JSON Schema and sent to the provider, which constrains the model’s output format. Whatever comes back is validated against the same schema before your code sees it. If the model puts a number where a string should go, you get a clear validation error rather than a half-parsed string.
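
For a sense of the work the schema takes off your hands, here is roughly what you would own if you asked for JSON in the prompt instead. This is a plain TypeScript sketch, not how the SDK implements validation; the names RecommendationShape, isRecommendation, and parseRecommendation are hypothetical, and both sample replies are made up.

```typescript
// Hand-written mirror of the recommendation fields.
interface RecommendationShape {
  destination: string;
  whyToVisit: string;
  packingEssentials: string[];
}

// Hand-rolled check: is this unknown value shaped like a recommendation?
function isRecommendation(value: unknown): value is RecommendationShape {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.destination === "string" &&
    typeof candidate.whyToVisit === "string" &&
    Array.isArray(candidate.packingEssentials) &&
    candidate.packingEssentials.every((item) => typeof item === "string")
  );
}

// Parse a reply yourself; throws on malformed JSON or on the wrong shape.
function parseRecommendation(raw: string): RecommendationShape {
  const parsed: unknown = JSON.parse(raw);
  if (!isRecommendation(parsed)) {
    throw new Error("Reply is valid JSON but not a recommendation");
  }
  return parsed;
}

// Made-up replies: one well-formed, one with a number where a string belongs.
const goodReply =
  '{"destination":"Lisbon","whyToVisit":"Mild weather.","packingEssentials":["sunscreen"]}';
const badReply =
  '{"destination":"Lisbon","whyToVisit":42,"packingEssentials":["sunscreen"]}';

console.log(parseRecommendation(goodReply).destination);
try {
  parseRecommendation(badReply);
} catch (error) {
  console.log((error as Error).message);
}
```

Every field you add to the shape means another line in the guard, and nothing keeps the guard and the prompt in sync. With a schema, the shape, the instructions to the model, and the validation all come from one definition.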

Where did result.text go?

In typed mode the model’s job is to fill the schema, so result.output is the thing you read. result.text may be empty or hold the raw JSON. Read output when you set a schema, text when you don’t. The console trace from f2 shows the structured response attached to the call.

What are the other Output types?

Output.object is the one you reach for most, but there’s also Output.array for a list, Output.text for plain text (the default), and a few others. The pattern is the same in each: you describe the shape, and the SDK fills and validates it.

If a field comes back the wrong length or vague, that’s the .describe() text doing, or failing to do, its job; tighten it. Small local models honour descriptions loosely, so a “two sentences” field will sometimes run long. If a run errors mentioning JSON or schema, the model produced something the schema rejected; loosening the offending field’s type with .nullable() or .optional() usually gets it through.

You just learned that descriptions are an interface: the words you write on a field decide what the model puts there. Hold onto that: in f6 the same idea shows up one level out, where a tool’s description decides which tool the model reaches for. Descriptions steer the model everywhere; here it’s fields, there it’s tools.

Next up is f4, where we give the agent a tool. You’ll watch it call out for information it cannot know, and watch a tool result change its answer.