fix(ai): undo strict-mode null-widening before structured-output validation#732
fix(ai): undo strict-mode null-widening before structured-output validation#732DrewHoo wants to merge 2 commits into
Conversation
…dation Optional fields are widened to required+nullable for strict structured output, so providers return `null` for an absent optional. Validating that `null` against the original schema failed (`.optional()` is `T | undefined`, not `T | null`), surfacing as a StandardSchemaValidationError — most visibly through @tanstack/ai-openrouter, whose adapter preserves provider nulls. Add `undoNullWidening(value, schema)` to @tanstack/ai-utils: a schema-aware counterpart to `transformNullsToUndefined` that drops only synthesized nulls (those the original JSON Schema disallows) while preserving the ones a `.nullable()`/`.nullish()` field genuinely allows. The chat activity runs it on the structured-output result before Standard Schema validation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR adds a schema-aware undoNullWidening utility to ai-utils, re-exports it, integrates it into chat structured-output validation (applies before Standard Schema parsing when outputSchema is a Standard JSON Schema), adds unit and end-to-end tests, and records the change in a changeset with dependency updates. ChangesStructured Output Null Normalization
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/ai-utils/src/transforms.ts`:
- Around line 84-95: resolveSchema currently returns the first
object/array-shaped variant from schema.anyOf/schema.oneOf which can be wrong
when multiple non-null branches match; change the logic in resolveSchema to
collect all non-null matching variants (using the same shape-check: array vs
object heuristics), and only return a variant when exactly one non-null match is
found, otherwise return the original schema to avoid destructive rewrites that
strip valid null branches; reference the resolveSchema function and the local
variables variants, match, isArray when making this change.
- Around line 110-115: The array handling currently applies resolved.items[0] to
every element; change the mapping to pick the schema per index when
resolved.items is an array: compute itemSchemaForIndex =
Array.isArray(resolved.items) ? resolved.items[i] ||
resolved.items[resolved.items.length - 1] || resolved.additionalItems ||
resolved.items[0] : resolved.items, then call value.map((item, i) => walk(item,
itemSchemaForIndex)); update the block around resolveSchema, resolved, and walk
to use the index-aware schema selection so tuple-style schemas are handled
correctly.
In `@packages/ai-utils/tests/transforms.test.ts`:
- Around line 2-6: The import order violates ESLint import/order: combine the
type and value imports into a single statement or place the value imports before
the type import so types don't come first; specifically adjust the imports
involving JsonSchemaNode, transformNullsToUndefined, and undoNullWidening so
they are imported in one consolidated line or with transformNullsToUndefined and
undoNullWidening imported before the type JsonSchemaNode to satisfy the linter.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dd2ec226-feb0-42be-b3f6-5e7d26a784d9
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (7)
.changeset/structured-output-undo-null-widening.mdpackages/ai-utils/src/index.tspackages/ai-utils/src/transforms.tspackages/ai-utils/tests/transforms.test.tspackages/ai/package.jsonpackages/ai/src/activities/chat/index.tspackages/ai/tests/chat-structured-output-null-normalization.test.ts
…oNullWidening Addresses CodeRabbit review on TanStack#732: - resolveSchema now descends only when exactly one non-null anyOf/oneOf branch matches the value's shape; ambiguous unions keep the original schema rather than risk stripping a null a sibling branch allows. - Array walking applies tuple-style `items: [a, b, …]` schemas per index instead of always using the first. Adds coverage for both and fixes the test's import order. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Thanks @coderabbitai — all three addressed in 3f44678:
Added unit coverage for both the tuple and ambiguous-union cases. |
|
Tip For best results, initiate chat on the files or code changes.
(ノ✧▽✧)ノ 🐇 |
Summary
Structured output rejects a valid provider response when the schema has an optional field. To satisfy OpenAI-style strict schemas, optional fields are widened to
required+ nullable, so the provider returnsnullfor an absent optional — but thatnullis then validated against the original schema, where.optional()meansT | undefined, notT | null. Validation throws.This adds a schema-aware
undoNullWideningto@tanstack/ai-utilsand runs it on the structured-output result before Standard Schema validation, so the synthesized nulls are dropped while genuine.nullable()/.nullish()nulls are preserved.Fixes #731
Reproduction
The outbound schema sent to the provider widens
noteto{ "type": ["string", "null"] }and adds it torequired, so the model returns{ "title": "…", "note": null }. The original Valibot schema rejects thatnull.Root cause
convertSchemaToJsonSchema(schema, { forStructuredOutput: true })widens optional →required+ nullable (correct, and required for strict mode). The round trip just never undoes it: the response is validated against the un-widened original viaparseWithStandardSchema, which rejects thenullthe widening told the model to send.Fix
@tanstack/ai-utilsgainsundoNullWidening(value, schema)— a schema-aware counterpart totransformNullsToUndefined. It consults the original (un-widened) JSON Schema and drops anullonly where the schema disallows one (a synthesized widening), preserving nulls a.nullable()/.nullish()field genuinely allows, and leaving nulls under shapes the schema doesn't describe untouched.@tanstack/airuns it inside the structured-outputvalidatecallback (activities/chat), beforeparseWithStandardSchema, usingconvertSchemaToJsonSchema(outputSchema)(noforStructuredOutput) as the nullability source of truth.Why core, not the adapter
By the time the schema reaches the adapter it's already been widened (
forStructuredOutput: true), so the adapter can no longer tell a synthesized null from a genuine one. Only the activity layer still holds the original schema, so the normalization has to live there. This also fixes it for every adapter on the validated path at once, rather than per-adapter.Tests
@tanstack/ai-utils: unit coverage forundoNullWidening— optional vs. nullable, mixed, nested nullable objects (viaanyOf), array items, schemaless passthrough.@tanstack/ai: end-to-endchat({ outputSchema })regression — a providernullon an optional field now resolves (key absent), and anullon a.nullable()field is preserved.@tanstack/aisuite (1005 tests) +tsc+ eslint pass.Out of scope / follow-up
There's a symmetric latent issue in the
@tanstack/openai-baseadapters: they blind-strip all nulls (transformNullsToUndefined), so a genuinely.nullable()required field that returnsnullis stripped and then fails validation as missing — the inverse of the bug fixed here.optional(T | undefined)nullablerequired (T | null)Unifying that is intentionally not in this PR: the openai-base null-strip runs inside
structuredOutputStream(shared by the non-streaming Promise path and the streamed path, which re-emits the adapter chunk), so fixing it cleanly means either threading the original schema down to the adapters or normalizing the streaming chunk pipeline — a larger, more invasive change. Happy to follow up with that if you'd prefer a single unified pass;undoNullWideningis exported and ready to be reused there.🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
New Features
Tests