# Framework Integrations

## Vercel AI SDK
The Vercel AI SDK doesn't have a native retriever concept; retrievals typically happen inside tool calls or before the generation. `buzo-sdk/vercel-ai` therefore focuses on LLM output capture, which is what unlocks CITED_FLAGGED attribution. Pair it with `recordRetrieval` for the retrieval side.
### Install
```bash
npm install buzo-sdk ai
```

### generateText
Wrap the call with `withBuzoGeneration`. It runs your original call, captures text, usage, and `modelId`, and records a generation trace after the promise resolves.
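In isolation, the wrapper call looks roughly like this (a minimal sketch, assuming a client constructed with only `apiKey` is sufficient; the prompt and ids are placeholders):

```ts
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { Buzo } from 'buzo-sdk'
import { withBuzoGeneration } from 'buzo-sdk/vercel-ai'
import { randomUUID } from 'node:crypto'

const buzo = new Buzo({ apiKey: process.env.BUZO_API_KEY! })
const runId = randomUUID() // one id per request, reused as parentQueryId on the retrieval

// withBuzoGeneration runs the thunk, reads text, usage, and modelId from the
// resolved result, records the trace, and hands the result back unchanged.
const result = await withBuzoGeneration(
  buzo,
  { runId, collectionId: 'prod-support-kb' },
  () => generateText({ model: openai('gpt-4o'), prompt: 'How do I rotate my API key?' }),
)
console.log(result.text)
```

The full route handler below adds retrieval recording and output redaction around this same call.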
**Next.js route handler**
```ts
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { Buzo } from 'buzo-sdk'
import { withBuzoGeneration } from 'buzo-sdk/vercel-ai'
import { randomUUID } from 'node:crypto'

export const buzo = new Buzo({
  apiKey: process.env.BUZO_API_KEY!,
  outputCapture: 'redacted',
  outputRedactPatterns: [
    { pattern: /[\w.-]+@[\w.-]+\.\w+/g, replacement: '<EMAIL>' },
  ],
})

export async function POST(req: Request) {
  const { question } = await req.json()

  // One correlation id per request: stamped as parentQueryId on the
  // retrieval and as runId on the generation so Buzo can correlate them.
  const correlationId = randomUUID()

  // 1. Record retrievals separately (wrap your vector store client or call
  //    recordRetrieval directly). vectorStore, embed, and buildPrompt stand
  //    in for your app's own helpers.
  const hits = await vectorStore.query({ topK: 5, vector: embed(question) })
  buzo.recordRetrieval({
    collectionId: 'prod-support-kb',
    parentQueryId: correlationId,
    query: { text: question },
    results: hits.matches.map((m) => ({ id: m.id, score: m.score })),
    latencyMs: hits.latencyMs,
  })

  // 2. Wrap the generation.
  const result = await withBuzoGeneration(
    buzo,
    { runId: correlationId, collectionId: 'prod-support-kb', agentId: 'support-v3' },
    () =>
      generateText({
        model: openai('gpt-4o'),
        prompt: buildPrompt(question, hits.matches),
      }),
  )

  return Response.json({ answer: result.text })
}
```

### streamText
For streaming endpoints, use `attachBuzoGeneration`: it records the trace after the stream's text promise resolves, without interfering with your response.
**Edge runtime**
```ts
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { attachBuzoGeneration } from 'buzo-sdk/vercel-ai'
// Reuse the Buzo client configured in the generateText example (e.g. import
// it from a shared module).

export const runtime = 'edge'

export async function POST(
  req: Request,
  { waitUntil }: { waitUntil: (p: Promise<unknown>) => void },
) {
  const { question } = await req.json()
  const runId = crypto.randomUUID()

  const stream = streamText({
    model: openai('gpt-4o'),
    prompt: question,
  })

  attachBuzoGeneration(buzo, { runId, collectionId: 'prod-support-kb' }, stream)
  waitUntil(buzo.flush()) // ship traces before the isolate is evicted

  return stream.toDataStreamResponse()
}
```

### What gets captured
| Field | Source |
|---|---|
| `output.text` | `result.text` (post-redaction if enabled) |
| `promptTokens` / `completionTokens` | `result.usage` |
| `model` | `result.response.modelId` |
| `runId` | Your correlation id. Use the same value as `parentQueryId` on the matching `recordRetrieval`. |
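All of these come straight off the AI SDK's `generateText` result, so you can cross-check them locally. A sketch of the mapping (the `captured` object is only illustrative, not a Buzo API; field names follow the AI SDK's usage reporting):

```ts
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'

const result = await generateText({
  model: openai('gpt-4o'),
  prompt: 'How do I rotate my API key?',
})

// The same fields Buzo records once the promise resolves:
const captured = {
  outputText: result.text,                         // -> output.text (post-redaction if enabled)
  promptTokens: result.usage.promptTokens,         // -> promptTokens
  completionTokens: result.usage.completionTokens, // -> completionTokens
  model: result.response.modelId,                  // -> model (resolved model id from the provider)
}
console.log(captured)
```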
Correlation is on you. Unlike LangChain, the Vercel AI SDK does not generate a shared run id across retrieval and generation. Mint a UUID at the start of each request and stamp it as `parentQueryId` on `recordRetrieval` and as `runId` on the `withBuzoGeneration` / `attachBuzoGeneration` context.
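A request-scoped helper makes the pairing harder to get wrong (a sketch; `ragTraceIds` is a hypothetical name, not part of buzo-sdk):

```ts
import { randomUUID } from 'node:crypto'

// Mint one id per request and hand the same value to both sides of the trace.
export function ragTraceIds() {
  const id = randomUUID()
  return { parentQueryId: id, runId: id }
}

// const { parentQueryId, runId } = ragTraceIds()
// buzo.recordRetrieval({ collectionId, parentQueryId, query: { text }, results })
// await withBuzoGeneration(buzo, { runId, collectionId }, () => generateText({ ... }))
```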