Vercel AI SDK path (TypeScript)
- Patterns (p1–p7): orchestrate & compose. Live, in the room.
- RAG (r1–r2): ground it in your data. Live, in the room.
- Full-Stack: the agent behind a UI. Live, in the room.
- MCP: tools from a server. Self-serve, take-home.
- Resilience: errors as data. Self-serve, take-home.
- Output Guardrails: redact on the way out. Self-serve, take-home.
Build a trip-planner agent, TripMate, with the Vercel AI SDK in a 90-minute block.
It runs locally on Ollama, so you need no API key to start. If you have a Google Gemini key, swap to that instead.
You need no prior AI SDK experience. The challenges build on each other, so each one adds a single idea to the one before it.
First five minutes
- cd vercel-ai-sdk
- npm install
- ollama pull granite4.1:3b (the default local model; skip if you’ll use Gemini)
- npm run verify: don’t go further until you see every check pass
- npm run f1: your first agent call
What you’re building
By the end you have built that agent, streamed it, given it tools from a server you did not write, put it behind a web UI, and made it survive a tool failure.
How it runs
You pick a provider by editing one line in shared/model.ts.
Ollama is the default and needs no API key.
For Gemini, set GOOGLE_GENERATIVE_AI_API_KEY in your environment and swap the line.
Already have a key for another provider (OpenAI, Anthropic, Mistral, …)? Install its
AI SDK package (e.g. npm install @ai-sdk/openai) and assign model the same way; the
providers list has them all.
One edit moves every challenge onto your provider.
Every challenge from f2 on is traced. The best way to read a run is autotel-devtools, a local browser trace viewer. Start it in one terminal, then run challenges in another.
Open http://127.0.0.1:4445 and you get the run as a tree of spans you can click into: the model, the prompt and response, the timings, and the token usage.
The same spans also print to your console as a tree, so a run is readable even with no
viewer open. This is autotel (debug: "pretty" for the console, devtools: true for
the browser), wired in through instrument.ts. The Python path does the same through
logfire, so a run looks the same whichever language you are in.
Running the challenges
Start with npm run f1, which runs the first challenge once and prints the
result.
For an interactive runner, npm run dev opens a menu. Pick a challenge and it watches that
file, re-running on every save. If you already know which one you want,
npm run dev -- p1 skips the menu and watches p1 directly.
Challenges
The challenges start with a common Foundations trunk, then split into tracks you choose between. Everyone does Foundations, then picks one track for the room session (Patterns, RAG, or Full-Stack). The others are take-home. Jump straight to a track if you already know the basics.
The workshop ends with a 20-minute discussion (see DISCUSSION.md). Wherever you got to,
the closing questions are the same: was working with the model what you expected, what
surprised you, and what did you learn? You can stop after Foundations and still
contribute fully.
The arc follows Anthropic’s “Building effective agents”: first the augmented LLM, then workflows + agents (the Patterns track), retrieval (the RAG track), or a real app (the Full-Stack track).
Foundations (f1–f7): the augmented LLM
| # | Challenge | Goal | Command |
| - | --------- | ---- | ------- |
| f1 | Hello + the two inputs | Call an agent; shape it with instructions vs prompt; stream the reply | npm run f1 |
| f2 | See the loop + tokens | Read the message loop, count tokens, read the console trace | npm run f2 |
| f3 | Structured output | Typed TripPitch via Output.object, no parsing | npm run f3 |
| f4 | Tools | A tool fills a gap the model cannot fill on its own; the loop needs steps to use it | npm run f4 |
| f5 | Guardrails | A cheap check runs first and refuses off-topic or unsafe requests | npm run f5 |
| f6 | Descriptions (authoring lab) | A description routes the model to the right tool | npm run f6 |
| f7 | Testing | Prove the gate’s branches with a fake agent you inject, no real model call | npm run f7 |
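
The f5 and f7 rows fit together: a cheap check decides whether the agent is called at all, and a fake agent lets you prove both branches without a model. The sketch below is a minimal illustration of that idea, not the workshop's code. The Agent type, the keyword lists, guardedAsk, and makeFakeAgent are all hypothetical names, and the agent is synchronous here for brevity, whereas a real AI SDK call is async.

```typescript
// Hypothetical shape for illustration: an agent is anything that turns a prompt into text.
type Agent = (prompt: string) => string;

type GateResult =
  | { allowed: true; reply: string }
  | { allowed: false; reason: "unsafe" | "off-topic" };

// Made-up keyword lists; a real gate would use whatever cheap check suits the app.
const BLOCKED = ["password", "credit card"];
const ON_TOPIC = ["trip", "flight", "hotel", "travel", "itinerary"];

// The gate runs first. The agent is only called when both checks pass.
function guardedAsk(agent: Agent, prompt: string): GateResult {
  const text = prompt.toLowerCase();
  if (BLOCKED.some((word) => text.includes(word))) {
    return { allowed: false, reason: "unsafe" };
  }
  if (!ON_TOPIC.some((word) => text.includes(word))) {
    return { allowed: false, reason: "off-topic" };
  }
  return { allowed: true, reply: agent(prompt) };
}

// A fake agent records every prompt it receives, so a test can show
// that a refused request never reached the model.
function makeFakeAgent(): { agent: Agent; calls: string[] } {
  const calls: string[] = [];
  const agent: Agent = (prompt) => {
    calls.push(prompt);
    return `fake reply to: ${prompt}`;
  };
  return { agent, calls };
}
```

Because the agent is passed in as an argument, the same guardedAsk runs unchanged against a real agent in the app and against the fake in a test.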
Don’t just complete Foundations: experiment. Rerun each challenge with different instructions and watch what changes. How short can a prompt get before the agent loses the plot? Find where you have to spell things out, and where the model works it out on its own.
Then your choice of track. (The Discussion closes the workshop at the end.)
Patterns track (p1–p7): you orchestrate, then the model does
| # | Challenge | Goal | Command |
| - | --------- | ---- | ------- |
| p1 | Prompt chaining | Draft, check with a code gate, then fix only what failed | npm run p1 |
| p2 | Routing | Classify the input, then branch in code to a specialist | npm run p2 |
| p3 | Parallelization | Fan out independent reviewers at once, then aggregate | npm run p3 |
| p4 | Evaluator-optimizer | Score and improve in a loop until a bar or a cap | npm run p4 |
| p5 | Agentic | One agent sequences three tools itself, no if/else | npm run p5 |
| p6 | Delegation | An orchestrator agent whose tools are other agents | npm run p6 |
| p7 | Conversation | A multi-turn chat loop that streams and remembers | npm run p7 |
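
As a rough picture of the p1 row, the sketch below chains a draft step, a gate written in plain code, and a fix step that names only the failed checks. It assumes a hypothetical TextModel function standing in for a model call (synchronous for brevity), and checkPitch and draftCheckFix are illustrative names, not the workshop's.

```typescript
// Hypothetical stand-in for a model call: prompt in, text out.
type TextModel = (prompt: string) => string;

interface GateReport {
  passed: boolean;
  failures: string[];
}

// The gate is ordinary code, so it is cheap, deterministic, and easy to test.
function checkPitch(pitch: string, maxWords: number, mustMention: string): GateReport {
  const failures: string[] = [];
  const wordCount = pitch.trim().split(/\s+/).length;
  if (wordCount > maxWords) {
    failures.push(`shorten to at most ${maxWords} words`);
  }
  if (!pitch.toLowerCase().includes(mustMention.toLowerCase())) {
    failures.push(`mention ${mustMention}`);
  }
  return { passed: failures.length === 0, failures };
}

// Step 1 drafts, step 2 gates in code, step 3 asks for a fix listing only the failures.
function draftCheckFix(
  model: TextModel,
  destination: string,
): { pitch: string; modelCalls: number } {
  const draft = model(`Write a one-line trip pitch for ${destination}.`);
  const report = checkPitch(draft, 12, destination);
  if (report.passed) {
    return { pitch: draft, modelCalls: 1 };
  }
  const fixed = model(
    `Revise this pitch: "${draft}". Fix only: ${report.failures.join("; ")}.`,
  );
  return { pitch: fixed, modelCalls: 2 };
}
```

The second model call is skipped entirely when the draft passes, which is the point of putting the gate in code rather than asking the model to judge itself.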
RAG track (r1–r2): ground the model in your own data
Needs a local embedding model: ollama pull embeddinggemma (separate from the chat model,
and needed even if you chat on Gemini).
| # | Challenge | Goal | Command |
| - | --------- | ---- | ------- |
| r1 | Retrieval | Embed your docs, rank by similarity, hand the top matches to the model as a tool | npm run r1 |
| r2 | Chunking | Split long documents into passages so a specific question matches a specific paragraph | npm run r2 |
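
The two RAG rows can be sketched together: split documents into passages, score each passage against the question, and keep the best few. The example below is a toy under loud assumptions. toyEmbed counts words instead of calling an embedding model such as embeddinggemma, and chunk, cosine, and topMatches are hypothetical helpers. Only the ranking logic carries over to real embeddings.

```typescript
type Vector = Map<string, number>;

// Toy stand-in for an embedding model: word counts, not a learned vector.
function toyEmbed(text: string): Vector {
  const vector: Vector = new Map();
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    vector.set(word, (vector.get(word) ?? 0) + 1);
  }
  return vector;
}

function norm(v: Vector): number {
  let sumOfSquares = 0;
  v.forEach((count) => {
    sumOfSquares += count * count;
  });
  return Math.sqrt(sumOfSquares);
}

// Cosine similarity: 1 for the same direction, 0 when nothing overlaps.
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  a.forEach((count, word) => {
    dot += count * (b.get(word) ?? 0);
  });
  const denominator = norm(a) * norm(b);
  return denominator === 0 ? 0 : dot / denominator;
}

// Chunking: split on blank lines so each paragraph is scored on its own.
function chunk(document: string): string[] {
  return document
    .split(/\n\s*\n/)
    .map((passage) => passage.trim())
    .filter((passage) => passage.length > 0);
}

// Retrieval: embed every passage, score it against the question, keep the top k.
function topMatches(question: string, documents: string[], k: number): string[] {
  const passages: string[] = [];
  for (const document of documents) {
    passages.push(...chunk(document));
  }
  const questionVector = toyEmbed(question);
  return passages
    .map((text) => ({ text, score: cosine(questionVector, toyEmbed(text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((match) => match.text);
}
```

Without chunking, a long document is scored as a whole, so one relevant paragraph is diluted by everything around it. Scoring per passage lets a specific question match a specific paragraph.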
Full-stack track: the agent behind a web UI (assumes some frontend comfort)
You start from a ready, production-shaped template, a single Hono server on Node
serving both the React useChat UI and the AI SDK agent, and build on it. Clone it
with the workshop CLI, then add your own tool and watch its tool-call card render. See
app/fullstack/ for the full lesson.
Every challenge has a solution:<id> script (e.g. npm run solution:f3). If the room moves on, run
the solution and keep going; pace beats perfect completion.
Three self-serve tracks go further, top-level siblings of patterns and rag (do them
any time after Foundations, not in the live block): npm run mcp (tools from a separate
server), npm run resilience (errors as data on its own), and
npm run guardrails-middleware (the output half of f5’s guardrail).
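
One way to picture "errors as data" from the resilience track: a tool that throws is wrapped so the failure comes back as an ordinary value the loop can hand to the model. This is a sketch under assumptions, not the track's code. ToolResult, asData, the retryable heuristic, and the getWeather tool are all hypothetical.

```typescript
type ToolResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string; retryable: boolean };

// Wrap a tool so a thrown exception becomes a value instead of ending the run.
function asData<Args, T>(tool: (args: Args) => T): (args: Args) => ToolResult<T> {
  return (args) => {
    try {
      return { ok: true, value: tool(args) };
    } catch (cause) {
      const message = cause instanceof Error ? cause.message : String(cause);
      // Made-up heuristic: treat timeouts and outages as worth retrying.
      const retryable = /timeout|unavailable/i.test(message);
      return { ok: false, error: message, retryable };
    }
  };
}

// Hypothetical tool for illustration: it throws for any city it has no data for.
const FORECASTS = new Map<string, string>([["lisbon", "sunny, 24C"]]);

function getWeather(args: { city: string }): { city: string; forecast: string } {
  const forecast = FORECASTS.get(args.city.toLowerCase());
  if (forecast === undefined) {
    throw new Error(`no forecast for ${args.city}`);
  }
  return { city: args.city, forecast };
}

// The wrapped tool has the same arguments but never throws.
const safeGetWeather = asData(getWeather);
```

With the failure returned as a value, the model sees the error text on its next turn and can apologise, try another tool, or ask the user, rather than the whole run ending on an exception.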
A conference talk in talk/ previews the same TripMate and the same tool
shapes. Run npx tsx talk/demo.ts to list its patterns.
How a challenge is laid out
Each one has:
- start/agent.ts: runs as-is, with TODOs to do
- finish/agent.ts: the reference solution (npm run solution:<id>)
- README.md: the lesson. It walks through the real code, gives you numbered build steps, lists the common traps, and ends with a verify step
Each challenge is self-contained. Open one file and the whole world is there, with the
tools and data inlined. The only shared file is shared/model.ts.
Tools are duplicated across challenges on purpose, so you never have to dig through shared folders to follow a lesson.
The full-stack track is shaped differently: it is a separate template you clone with the
workshop CLI (npx @jagreehal/ai-workshop fullstack-hono) and build on, not a start/
finish pair. See app/fullstack/.
The mcp track is also multi-process: a standalone MCP server in one terminal
(npm run mcp:server) and the agent in another (npm run mcp).
The mental model
If you get lost, ask four questions:
- What does the model know right now?
- What tool is available to help it?
- What shape of input does that tool expect?
- What does the tool return back into the loop?
Everything in the workshop is a variation on that loop: the model generates, decides when it needs a tool, you run the tool, the result goes back, and it repeats until the model can answer.
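
The loop described above can be written out in a few lines. This is a hand-rolled sketch for intuition only, assuming a hypothetical ChatModel function in place of a real model and simplified message shapes. In the challenges the AI SDK's ToolLoopAgent runs this loop for you, and it is async.

```typescript
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string }
  | { role: "tool"; name: string; content: string };

type ModelTurn =
  | { type: "answer"; text: string }
  | { type: "tool-call"; name: string; args: Record<string, string> };

// Hypothetical stand-ins: a model reads the messages so far and either answers or asks for a tool.
type ChatModel = (messages: Message[]) => ModelTurn;
type Tool = (args: Record<string, string>) => string;

function runLoop(
  model: ChatModel,
  tools: Map<string, Tool>,
  prompt: string,
  maxSteps: number,
): { answer: string; messages: Message[] } {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let step = 0; step < maxSteps; step++) {
    // The model generates from everything it knows right now.
    const turn = model(messages);
    if (turn.type === "answer") {
      messages.push({ role: "assistant", content: turn.text });
      return { answer: turn.text, messages };
    }
    // It asked for a tool: you run it, and the result goes back into the loop.
    const tool = tools.get(turn.name);
    const result = tool ? tool(turn.args) : `error: unknown tool ${turn.name}`;
    messages.push({ role: "tool", name: turn.name, content: result });
  }
  // The step cap stops a model that never settles on an answer.
  throw new Error(`no answer after ${maxSteps} steps`);
}
```

The four questions map onto this directly: messages is what the model knows, tools is what is available, args is the input shape, and the pushed tool message is what returns into the loop.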
Prerequisites
- Node.js 22+
- One of: Ollama with granite4.1:3b pulled, or a Google Gemini API key
- For the RAG track only: Ollama with embeddinggemma pulled (a local embedding model, needed even on Gemini)
- 8GB RAM recommended (for the local model)
npm run verify checks that embeddinggemma is pulled when Ollama is up; that check is for the RAG track only and is not required if you are doing Foundations, Patterns, or Full-Stack.
Tech stack
- Vercel AI SDK (ai): ToolLoopAgent, the agent class used in every challenge
- Zod: typed tool argument schemas
- Ollama (ai-sdk-ollama) or Google (@ai-sdk/google): the model
- autotel: OpenTelemetry tracing, printed to the console with debug: "pretty"
- Model Context Protocol (@ai-sdk/mcp, @modelcontextprotocol/sdk): the mcp track
- Vite + React (@ai-sdk/react): the full-stack track’s chat UI
- Playwright: the full-stack track’s end-to-end check
Stuck? See TROUBLESHOOTING.md.
License
- Code (starter and solution files, scripts): MIT; the LICENSE file ships alongside it.
- Lessons (this and the challenge READMEs, diagrams): CC BY-NC 4.0. Share and adapt with attribution, not commercially. Build on the code freely; don’t resell the lessons.
© 2026 Jag Reehal.