# Framework Integrations

## Vercel AI SDK
The Vercel AI SDK doesn't have a native retriever concept; retrievals typically happen inside tool calls or before the generation. `buzo-sdk/vercel-ai` therefore focuses on LLM output capture, which is what unlocks CITED_FLAGGED attribution. Pair it with `recordRetrieval` for the retrieval side.
### Install
```bash
npm install buzo-sdk ai
```

### generateText
Wrap the call with `withBuzoGeneration`. It runs your original call, captures text, usage, and `modelId`, and records a generation trace after the promise resolves.
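In isolation, the wrapper call looks roughly like this (a minimal sketch, assuming a client constructed with only `apiKey` is sufficient; the prompt and ids are placeholders):

```ts
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { Buzo } from 'buzo-sdk'
import { withBuzoGeneration } from 'buzo-sdk/vercel-ai'
import { randomUUID } from 'node:crypto'

const buzo = new Buzo({ apiKey: process.env.BUZO_API_KEY! })
const runId = randomUUID() // one id per request, reused as parentQueryId on the retrieval

// withBuzoGeneration runs the thunk, reads text, usage, and modelId from the
// resolved result, records the trace, and hands the result back unchanged.
const result = await withBuzoGeneration(
  buzo,
  { runId, collectionId: 'prod-support-kb' },
  () => generateText({ model: openai('gpt-4o'), prompt: 'How do I rotate my API key?' }),
)
console.log(result.text)
```

The full route handler below adds retrieval recording and output redaction around this same call.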
**Next.js route handler**
```ts
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { Buzo } from 'buzo-sdk'
import { withBuzoGeneration } from 'buzo-sdk/vercel-ai'
import { randomUUID } from 'node:crypto'

export const buzo = new Buzo({
  apiKey: process.env.BUZO_API_KEY!,
  outputCapture: 'redacted',
  outputRedactPatterns: [
    { pattern: /[\w.-]+@[\w.-]+\.\w+/g, replacement: '<EMAIL>' },
  ],
})

export async function POST(req: Request) {
  const { question } = await req.json()

  // One correlation id per request: stamped as parentQueryId on the
  // retrieval and as runId on the generation so Buzo can correlate them.
  const correlationId = randomUUID()

  // 1. Record retrievals separately (wrap your vector store client or call
  //    recordRetrieval directly). vectorStore, embed, and buildPrompt stand
  //    in for your app's own helpers.
  const hits = await vectorStore.query({ topK: 5, vector: embed(question) })
  buzo.recordRetrieval({
    collectionId: 'prod-support-kb',
    parentQueryId: correlationId,
    query: { text: question },
    results: hits.matches.map((m) => ({ id: m.id, score: m.score })),
    latencyMs: hits.latencyMs,
  })

  // 2. Wrap the generation.
  const result = await withBuzoGeneration(
    buzo,
    { runId: correlationId, collectionId: 'prod-support-kb', agentId: 'support-v3' },
    () =>
      generateText({
        model: openai('gpt-4o'),
        prompt: buildPrompt(question, hits.matches),
      }),
  )

  return Response.json({ answer: result.text })
}
```

### streamText
For streaming endpoints, use `attachBuzoGeneration`: it records the trace after the stream's text promise resolves, without interfering with your response.
**Edge runtime**
```ts
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { attachBuzoGeneration } from 'buzo-sdk/vercel-ai'
// Reuse the Buzo client configured in the generateText example (e.g. import
// it from a shared module).

export const runtime = 'edge'

export async function POST(
  req: Request,
  { waitUntil }: { waitUntil: (p: Promise<unknown>) => void },
) {
  const { question } = await req.json()
  const runId = crypto.randomUUID()

  const stream = streamText({
    model: openai('gpt-4o'),
    prompt: question,
  })

  attachBuzoGeneration(buzo, { runId, collectionId: 'prod-support-kb' }, stream)
  waitUntil(buzo.flush()) // ship traces before the isolate is evicted

  return stream.toDataStreamResponse()
}
```

### What gets captured
| Field | Source |
|---|---|
| `output.text` | `result.text` (post-redaction if enabled) |
| `promptTokens` / `completionTokens` | `result.usage` |
| `model` | `result.response.modelId` |
| `runId` | Your correlation id. Use the same value as `parentQueryId` on the matching `recordRetrieval`. |
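All of these come straight off the AI SDK's `generateText` result, so you can cross-check them locally. A sketch of the mapping (the `captured` object is only illustrative, not a Buzo API; field names follow the AI SDK's usage reporting):

```ts
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'

const result = await generateText({
  model: openai('gpt-4o'),
  prompt: 'How do I rotate my API key?',
})

// The same fields Buzo records once the promise resolves:
const captured = {
  outputText: result.text,                         // -> output.text (post-redaction if enabled)
  promptTokens: result.usage.promptTokens,         // -> promptTokens
  completionTokens: result.usage.completionTokens, // -> completionTokens
  model: result.response.modelId,                  // -> model (resolved model id from the provider)
}
console.log(captured)
```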
Correlation is on you. Unlike LangChain, the Vercel AI SDK does not generate a shared run id across retrieval and generation. Mint a UUID at the start of each request and stamp it as `parentQueryId` on `recordRetrieval` and as `runId` on the `withBuzoGeneration` / `attachBuzoGeneration` context.
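A request-scoped helper makes the pairing harder to get wrong (a sketch; `ragTraceIds` is a hypothetical name, not part of buzo-sdk):

```ts
import { randomUUID } from 'node:crypto'

// Mint one id per request and hand the same value to both sides of the trace.
export function ragTraceIds() {
  const id = randomUUID()
  return { parentQueryId: id, runId: id }
}

// const { parentQueryId, runId } = ragTraceIds()
// buzo.recordRetrieval({ collectionId, parentQueryId, query: { text }, results })
// await withBuzoGeneration(buzo, { runId, collectionId }, () => generateText({ ... }))
```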