Concepts

Capture modes

Three independent dials control what leaves the SDK: queryCapture for retrieval queries, outputCapture for LLM generations, and resultsCapture for retrieved-vector content. Queries default to plaintext because that is where the analytical value is. Generations default to off and results default to ids-only, because both commonly carry data that should only egress deliberately.

queryCapture

The user's raw query text is the single most useful signal for grouping findings by topic, surfacing common retrieval failure modes, and attributing retrieval health over time.

Mode	What ships	When to use
`plaintext` (default)	The full query text, UTF-8.	Default. Maximum analytical value. The customer is responsible for PII handling under the DPA with Buzo.
`hash`	SHA-256 hex of the query. No plaintext ever leaves the SDK.	Regulated deployments where queries may carry direct identifiers. Buzo can still detect repeated queries and join with reads, but cannot see the text.
`redact`	Query text with customer-supplied `redactPatterns` replaced in-place (e.g. emails → `<EMAIL>`).	Most queries are safe but some patterns must be scrubbed. The regexes run client-side before any network I/O.
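To make the `hash` row concrete, here is a minimal sketch of a hex-encoded SHA-256 over the UTF-8 query text, using Node's built-in `crypto` module. The helper name `hashQuery` is hypothetical, and whether the SDK normalizes the query (trimming, lowercasing) before hashing is not stated here, so this sketch hashes the text exactly as given.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper, not part of the Buzo SDK:
// hex-encoded SHA-256 of the UTF-8 query text, with no normalization.
function hashQuery(query: string): string {
  return createHash("sha256").update(query, "utf8").digest("hex");
}

const first = hashQuery("reset my password");
const repeat = hashQuery("reset my password");
const other = hashQuery("reset my passwords");

console.log(first.length);     // 64 (hex characters)
console.log(first === repeat); // true: a repeated query yields the same digest
console.log(first === other);  // false: any change to the text changes the digest
```

Identical queries produce identical digests, which is why repeated queries stay detectable and joinable even though the text itself never ships.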

Configuration

new Buzo({
  apiKey: process.env.BUZO_API_KEY!,
  queryCapture: 'redact',
  redactPatterns: [
    { pattern: /[\w.-]+@[\w.-]+\.\w+/g, replacement: '<EMAIL>' },
    { pattern: /\b\d{3}-\d{2}-\d{4}\b/g, replacement: '<SSN>' },
  ],
})
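
As a rough illustration of what client-side redaction with these patterns amounts to, the sketch below applies each pattern in order with `String.prototype.replace`. The `RedactPattern` interface and the `redact` helper are assumptions made for this example; the SDK's internal implementation may differ.

```typescript
// Assumed shape, mirroring the entries passed to redactPatterns above.
interface RedactPattern {
  pattern: RegExp;
  replacement: string;
}

// Hypothetical helper: apply every pattern in order to the text.
function redact(text: string, patterns: RedactPattern[]): string {
  return patterns.reduce(
    (acc, { pattern, replacement }) => acc.replace(pattern, replacement),
    text,
  );
}

const patterns: RedactPattern[] = [
  { pattern: /[\w.-]+@[\w.-]+\.\w+/g, replacement: "<EMAIL>" },
  { pattern: /\b\d{3}-\d{2}-\d{4}\b/g, replacement: "<SSN>" },
];

const redacted = redact(
  "Email jane.doe@example.com about SSN 123-45-6789",
  patterns,
);
console.log(redacted); // "Email <EMAIL> about SSN <SSN>"
```

Note the `g` flag on each regex: with `String.prototype.replace`, a pattern without it replaces only the first match in the text.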

outputCapture v0.2+

LLM outputs are richer and more sensitive than queries — they often echo back parts of the user's message or include PII the user authored. Output capture is opt-in on purpose.

Mode	What ships	When to use
`off` (default)	Nothing. The LangChain `handleLLMStart`/`handleLLMEnd` hooks short-circuit — no map allocation, no timing tracked.	Default. Pick this unless you have actively decided to ship generations to Buzo.
`redacted`	Output text with `outputRedactPatterns` replaced before egress. Separate from `redactPatterns`.	You want `CITED_FLAGGED` attribution but must scrub known PII patterns (emails, SSNs, card numbers, etc.) on the way out.
`plaintext`	The full LLM generation text.	Maximum signal for citation matching. Reserve for environments with a DPA in place and customer-approved handling.

No hash mode for outputs. A SHA-256 of a long LLM generation cannot be substring-matched against retrieved vector content — it carries no analytical value for attribution. Use redacted instead.
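
The point about hashing can be seen in a few lines. The sketch below uses a plain substring check as a stand-in for citation matching (an assumption for illustration, not Buzo's actual matching algorithm) and shows that the overlap visible in plaintext disappears once the output is reduced to a digest.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of a UTF-8 string.
const sha256 = (s: string): string =>
  createHash("sha256").update(s, "utf8").digest("hex");

// Illustrative data only.
const chunk = "refunds are processed within 5 business days";
const output = "Per our policy, refunds are processed within 5 business days.";

// Plaintext: the retrieved chunk is visibly quoted in the generation.
console.log(output.includes(chunk)); // true

// Digests: the hash of the output says nothing about the chunk it contains.
console.log(sha256(output).includes(sha256(chunk))); // false
```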

resultsCapture v0.4+

Controls whether each retrieved vector's pageContent is shipped alongside its id and score. Defaults to ids-only, the pre-0.4 wire format, so upgrading the SDK does not change what leaves the SDK.

Citation matching relies on comparing retrieved content against the LLM output. With ids-only, Buzo can only match vectors that have a server-side content_snapshot from a prior scan. Opting in to plaintext or redacted makes every retrieved vector matchable, including those never scanned by Buzo.

Mode	What ships	When to use
`ids-only` (default)	Only `{ id, score }` per result. Identical to pre-0.4 behaviour.	Default. Zero change to wire payload. Citation matching works for the scanned subset of the corpus.
`redacted`	Each result's `content` with `resultsRedactPatterns` replaced before egress.	Retrieved chunks may echo user-authored data (prior messages, support tickets). Scrub known patterns on the way out.
`plaintext`	Full `content` per retrieved vector, UTF-8.	Maximum signal for citation matching across the entire corpus. Reserve for environments with a DPA in place.

Configuration

new Buzo({
  apiKey: process.env.BUZO_API_KEY!,
  resultsCapture: 'redacted',
  resultsRedactPatterns: [
    { pattern: /[\w.-]+@[\w.-]+\.\w+/g, replacement: '<EMAIL>' },
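    // Matches 16 contiguous digits only; card numbers written with spaces or dashes need their own pattern.
    { pattern: /\b\d{16}\b/g, replacement: '<CC>' },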
  ],
})

Payload cap. Per-result content is capped server-side at 16 KB. Oversized chunks are rejected rather than truncated — truncate retriever-side if your chunks exceed that limit.
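
One way to truncate retriever-side is to cut on a UTF-8 byte boundary so that no multi-byte character is split. The sketch below is illustrative: `truncateUtf8` is a hypothetical helper, and it assumes "16 KB" means 16 × 1024 bytes of UTF-8-encoded content, which should be confirmed before relying on the exact limit.

```typescript
// Assumption: the cap is 16 * 1024 bytes of UTF-8; verify against the API reference.
const MAX_CONTENT_BYTES = 16 * 1024;

// Hypothetical helper: trim text to at most maxBytes of UTF-8
// without cutting through a multi-byte character.
function truncateUtf8(text: string, maxBytes: number = MAX_CONTENT_BYTES): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;

  let end = maxBytes;
  // bytes[end] is the first excluded byte. If it is a continuation byte
  // (bit pattern 10xxxxxx), the cut falls inside a character, so back up.
  while (end > 0 && (bytes[end]! & 0xc0) === 0x80) end--;

  return new TextDecoder().decode(bytes.subarray(0, end));
}

// "€" is 3 bytes in UTF-8, so a 4-byte budget keeps exactly one of them.
console.log(truncateUtf8("€€", 4)); // "€"
```

Applying a helper like this to each chunk's content before it reaches the SDK keeps oversized results from being rejected, at the cost of losing the tail of the chunk for citation matching.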

What the SDK does with each mode

Capture is synchronous. Redaction and hashing happen before the event enters the buffer, so a subsequent flush() is guaranteed to include it.
Modes are independent. You can run queryCapture: 'plaintext' with outputCapture: 'redacted' and resultsCapture: 'ids-only', or any combination.
Disabled entirely. Set disabled: true in tests or local dev — no network I/O, no buffer growth, capture modes are irrelevant.