Output guardrails (middleware)
Self-serve track. The output half of f5’s guardrail. Not part of the live 90-minute block; do it any time after Foundations f5. Run with npm run guardrails-middleware (reference: npm run solution:guardrails-middleware).
In f5 you built an input guardrail: a cheap check that refused off-topic or unsafe requests before the model ran. This is the other half. An output guardrail inspects what the model generated and cleans it before the reply leaves your system, here by redacting personal data from the text.
Quick path
In a hurry? These three steps are the whole challenge. Everything below is the why and the how.
- Run npm run guardrails-middleware and watch TripMate read the traveller’s email and phone number straight back into the reply.
- Edit start/agent.ts: write a wrapGenerate middleware that redacts email and phone from the text, wrap model with wrapLanguageModel, and give the wrapped model to the agent.
- Done when the same reply comes back with <redacted email> and <redacted phone> in place of the real values, and you did not change the agent or its prompt to get there.
Mental model
f5 gated the input; this filters the output. The agent does not know the filter is there: you wrap the model once and hand it over.
The mechanic, on a throwaway middleware
Middleware is the AI SDK’s language-model-agnostic way to add cross-cutting behavior: guardrails, logging, caching, RAG. You build it once against the model interface and it works with any provider. Here’s the entire shape on a throwaway middleware that just SHOUTS every reply, nothing to do with redaction:
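A minimal sketch of that shape, written so it runs without any packages installed. The types and the wrapModel function are local stand-ins (assumptions for illustration), not the SDK's own definitions; in the challenge you would import the LanguageModelMiddleware type and wrapLanguageModel from ai instead. The helper name shoutParts and the fake model are also hypothetical.

```typescript
// Local stand-ins for the SDK types, reduced to what this sketch needs.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolName: string };
type GenerateResult = { content: ContentPart[] };
type Model = { doGenerate: () => Promise<GenerateResult> };
type Middleware = {
  wrapGenerate: (opts: {
    doGenerate: () => Promise<GenerateResult>;
  }) => Promise<GenerateResult>;
};

// Upper-case the text parts and pass every other part through untouched.
function shoutParts(content: ContentPart[]): ContentPart[] {
  return content.map((part) =>
    part.type === "text" ? { ...part, text: part.text.toUpperCase() } : part,
  );
}

// Move 1: wrapGenerate awaits doGenerate() and returns a mapped copy.
const shout: Middleware = {
  wrapGenerate: async ({ doGenerate }) => {
    const result = await doGenerate();
    return { ...result, content: shoutParts(result.content) };
  },
};

// Move 2: wrap the model. Stand-in for wrapLanguageModel({ model, middleware }).
function wrapModel(opts: { model: Model; middleware: Middleware }): Model {
  const { model, middleware } = opts;
  return {
    doGenerate: () =>
      middleware.wrapGenerate({ doGenerate: () => model.doGenerate() }),
  };
}

// A fake model so the sketch has something to wrap.
const fakeModel: Model = {
  doGenerate: async () => ({
    content: [{ type: "text", text: "hello from TripMate" }],
  }),
};

const loudModel = wrapModel({ model: fakeModel, middleware: shout });
loudModel.doGenerate().then((result) => console.log(result.content));
```

The caller only ever sees loudModel, which has the same interface as the model it wraps. That is why the agent needs no changes.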
Two moves to carry over to the redaction guardrail: a wrapGenerate that awaits
doGenerate() and maps the text parts of result.content (passing non-text parts
through), and a wrapLanguageModel that wraps your model so the agent never knows the filter
is there. Your version is the same shape with redact(part.text) in place of
.toUpperCase(), plus the redact helper you write.
Run it
Build it
- Run it and watch the leak. Run npm run guardrails-middleware. The prompt hands TripMate an email and a phone number and asks it to confirm them back. With the raw model, it does exactly that, and the contact details land in the reply.
- Write the middleware (TODO). Add a redact helper (two regexes are enough: one for email, one for phone) and a piiGuardrail middleware whose wrapGenerate awaits doGenerate() and returns the result with its text parts redacted. Import wrapLanguageModel and the LanguageModelMiddleware type from ai.
- Wrap the model and hand it over. Build safeModel with wrapLanguageModel and pass it as the agent’s model. Run again: the reply now shows <redacted email> and <redacted phone>, and you did not touch the agent’s instructions or the prompt to get there. That is the point of middleware: the guardrail is a property of the model, not something every caller has to remember.
Stuck? finish/agent.ts is the canonical version. Read it after you’ve had a real go.
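If the redact helper is the part holding you up, here is one possible sketch of it. The two patterns are assumptions chosen for illustration and are not taken from finish/agent.ts. They are deliberately simple and will not cover every email or phone format.

```typescript
// Assumed patterns, for illustration only.
// Email: local part, "@", domain, a dot, and a TLD of two or more letters.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
// Phone: optional "+", then at least nine digits/separators, starting and ending on a digit.
const PHONE = /\+?\d[\d\s().-]{7,}\d/g;

// Replace emails first, then phone numbers, with the placeholders the challenge expects.
function redact(text: string): string {
  return text
    .replace(EMAIL, "<redacted email>")
    .replace(PHONE, "<redacted phone>");
}

console.log(redact("Email ada@example.com or call +1 415-555-0132."));
// prints "Email <redacted email> or call <redacted phone>."
```

Inside wrapGenerate this slots in where the throwaway middleware called .toUpperCase(), as redact(part.text). Note that the phone pattern is loose: it matches any run of nine or more digits and separators, so a long booking reference made of digits would be redacted too.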
- Nothing is redacted. You wrapped the model but left the agent pointing at the raw model, or you returned the original result instead of the mapped copy. The middleware only matters if the wrapped model is the one the agent uses.
- You looked for result.text. Older examples destructure const { text } = await doGenerate(). In AI SDK v6 the text is in result.content as { type: "text", text } parts; there is no top-level text on the generate result.
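To make the second point concrete, here is a small sketch of reading text out of a content array. The ContentPart type is a reduced local stand-in (an assumption, not the SDK's full type), and textOf is a hypothetical helper name.

```typescript
// Reduced stand-in for the parts a generate result can carry.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolName: string };

// Join the text parts in order and ignore everything else.
function textOf(content: ContentPart[]): string {
  return content
    .filter(
      (part): part is { type: "text"; text: string } => part.type === "text",
    )
    .map((part) => part.text)
    .join("");
}

// A result shaped like the one doGenerate() resolves to: content only, no top-level text.
const result: { content: ContentPart[] } = {
  content: [
    { type: "text", text: "Booked. " },
    { type: "tool-call", toolName: "lookupFlight" },
    { type: "text", text: "Confirmation sent." },
  ],
};

console.log("text" in result); // prints false
console.log(textOf(result.content)); // prints "Booked. Confirmation sent."
```

A guardrail should map over the parts, not join them, so that non-text parts survive in the returned result.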
A couple of things worth knowing
Why middleware instead of editing the reply after `.generate()`?
You could redact result.text yourself at the call site, and for one script that is fine.
Middleware wins when the rule has to hold everywhere: wrap the model once and every agent
and every call that uses it inherits the filter, with no caller able to forget it. It is
also reusable and provider-agnostic, so the same guardrail works whether you are on Ollama
or Google. The same seam is where logging, caching, and RAG belong for the same reason.
Why is streaming harder?
wrapGenerate sees the whole reply at once, so redacting it is a string replace. The
streaming hook, wrapStream, sees the text in pieces as it is produced, and a value you
want to catch (an email, a card number) can be split across chunks. Doing it properly means
buffering and matching across the stream, which is why output guardrails are easiest to
reason about on the non-streaming path, and why f5 chose an input gate over filtering a
stream.
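The split-across-chunks problem is easy to reproduce with plain strings. The sketch below does not use wrapStream or any SDK type: the chunks are a plain array, the email pattern is an assumed one, and the function names are hypothetical. It compares redacting each chunk on its own with redacting the joined text.

```typescript
// Assumed email pattern, for illustration only.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function redactEmail(text: string): string {
  return text.replace(EMAIL_PATTERN, "<redacted email>");
}

// Naive: redact every chunk independently, as a per-chunk stream filter would.
function redactPerChunk(chunks: string[]): string {
  return chunks.map(redactEmail).join("");
}

// Buffered: wait for all chunks, then redact the whole text once.
function redactBuffered(chunks: string[]): string {
  return redactEmail(chunks.join(""));
}

// The email is split in the middle of the domain.
const chunks = ["Reach me at ada@exam", "ple.com today"];

console.log(redactPerChunk(chunks));
// prints "Reach me at ada@example.com today"
console.log(redactBuffered(chunks));
// prints "Reach me at <redacted email> today"
```

Buffering everything is correct but gives up the benefit of streaming. A practical stream filter would hold back only enough trailing text to cover a partial match, which is the extra work the paragraph above refers to.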
That is the output half of guardrails. With f5’s input gate in front and a filter like this behind, you have both sides of the pattern a production agent usually wants.