
Fix tuple schema compatibility with OpenAI structured output API #619

Open
datvo06 wants to merge 48 commits into master from dn-fix-tuple-encoding-openai

Conversation

Contributor

@datvo06 datvo06 commented Mar 23, 2026

Addresses the prefixItems rejection from OpenAI for fixed-length tuple types (newly rediscovered in #617) by porting the Annotated-based approach.

Changes:

  • Add _safe_tuple_type: rewrites tuple[T1, T2, ...] into Annotated[..., PlainValidator, PlainSerializer, WithJsonSchema] so schema, validation, and serialization are handled in one place via Pydantic's extension API
  • Add _rewrite_tuple_annotations: applies _safe_tuple_type recursively for
    dataclasses containing tuple fields
  • TupleEncodable.encode/deserialize now use TupleItems model instances consistently;
    shared field access via _extract_items
  • SequenceEncodable.encode/deserialize return lists to preserve encode idempotency
  • _inline_refs: inlines $ref pointers in WithJsonSchema values (workaround for
    TypeAdapter failing when object.__get_pydantic_json_schema__ emits $ref and
    $defs, pydantic/pydantic#12145)

_safe_tuple_type is a self-contained port of _pydantic_type_tuple from #584, scoped to the current Encodable system. _rewrite_tuple_annotations handles the dataclass-with-tuple-field case until #584 lands.
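For illustration, a minimal self-contained sketch of the Annotated-based rewrite (the helper name and the item_N object layout are assumptions based on the description above; the real _safe_tuple_type may differ):

```python
from typing import Annotated, Any

from pydantic import PlainSerializer, PlainValidator, TypeAdapter, WithJsonSchema


def safe_tuple_type(*args: type) -> Any:
    # Hypothetical sketch: rewrite tuple[T1, T2, ...] into an Annotated type
    # whose JSON schema is an object with item_N properties, avoiding the
    # prefixItems form that OpenAI structured output rejects.
    keys = [f"item_{i}" for i in range(len(args))]
    adapters = [TypeAdapter(t) for t in args]

    def validate(v: Any) -> tuple:
        if isinstance(v, tuple):  # already decoded
            return v
        return tuple(a.validate_python(v[k]) for k, a in zip(keys, adapters))

    def serialize(v: tuple) -> dict:
        return {k: a.dump_python(x, mode="json") for k, a, x in zip(keys, adapters, v)}

    schema = {
        "type": "object",
        "properties": {k: a.json_schema() for k, a in zip(keys, adapters)},
        "required": keys,
        "additionalProperties": False,
    }
    return Annotated[
        tuple[args],
        PlainValidator(validate),
        PlainSerializer(serialize),
        WithJsonSchema(schema),
    ]
```

Schema, validation, and serialization all hang off the one Annotated alias, which is what lets Pydantic's extension API handle them in one place.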

datvo06 added 17 commits March 22, 2026 14:50
… env var

Replace provider-specific environment variable checks (OPENAI_API_KEY,
ANTHROPIC_API_KEY) and skip markers (requires_openai, requires_anthropic)
with a single EFFECTFUL_LLM_MODEL environment variable that controls
which model is used for all LLM integration tests.

- Remove hardcoded model names from all test files in favor of LLM_MODEL
  read from EFFECTFUL_LLM_MODEL env var
- Replace requires_openai/requires_anthropic markers with requires_llm
- Remove model parametrization that was cross-provider; tests now use
  whichever model the env var specifies
- Use litellm.supports_vision() to conditionally skip vision tests
- Remove default model from LiteLLMProvider (make model required)
- Update CI workflow to pass EFFECTFUL_LLM_MODEL as a matrix parameter,
  making it easy to add parallel CI stages for different providers
- Rename/remove fixture files to match updated test names

Closes #589
…ename test

- Move LLM_MODEL and requires_llm definitions to tests/conftest.py;
  all four LLM test files now import from there
- Fix import ordering in test_handlers_llm_encoding.py (stdlib before local)
- Rename test_agent_tool_names_are_openai_compatible_integration to
  test_agent_tool_names_are_valid_integration since it's no longer
  OpenAI-specific
- Restore LiteLLMProvider default model via env var fallback so existing
  callers (template.py, test_handlers_llm_template.py, notebook) are not
  broken: model=os.environ.get("EFFECTFUL_LLM_MODEL", "gpt-4o")
- Move LLM_MODEL/requires_llm from conftest.py to tests/_llm_helpers.py
  since conftest.py should not be imported directly
- Fix import placement in test_handlers_llm_provider.py (was between
  module-level constants instead of grouped with imports)
- Remove redundant @requires_llm on vision tests where the skipif
  condition already covers the not-LLM_MODEL case
The env var belongs in test infrastructure, not the library API.
LiteLLMProvider should have a clean, explicit default.
Remove provider-specific environment variable checks and skip markers
from LLM tests in favor of a single EFFECTFUL_LLM_MODEL env var.

- Add LLM_MODEL and requires_llm to tests/conftest.py; LLM_MODEL
  defaults to "gpt-4o-mini" and is overridable via EFFECTFUL_LLM_MODEL
- Live tests (tool calling, encoding, agent tool names) use LLM_MODEL
  and are gated by requires_llm which checks for any provider API key
- Integration tests use make_provider() which returns a live
  LiteLLMProvider when API keys are available, else falls back to
  ReplayLiteLLMProvider for offline replay from fixtures
- Replay-only tests (simple_prompt_multiple_models, cross_endpoint,
  caching) keep hardcoded model names and always run since they never
  call the API
- Update CI workflow to pass EFFECTFUL_LLM_MODEL as a matrix parameter
  for easy parallel stages with different providers

Closes #589
Three fixes in encoding.py:

1. TupleEncodable.encode() now returns a TupleItems model instance
   (not a raw tuple), and deserialize() returns the model directly.
   This fixes pydantic validation in litellm integration tests for
   NamedTuple and fixed-tuple types.

2. Add _TupleSafeJsonSchema that overrides pydantic's tuple_schema()
   to produce object schemas (item_0, item_1 properties) instead of
   prefixItems arrays. Applied via _BoxEncoding.model_json_schema()
   so dataclasses containing tuple fields produce OpenAI-compatible
   schemas.

3. SequenceEncodable.encode() returns a list (not tuple) to preserve
   encode idempotency — nested_type on a list dispatches to the
   sequence encoder, avoiding a mismatch with TupleEncodable.

Also adds test_handlers_llm_encoding.py back to CI workflow.
Revert encoding.py to master and remove encoding tests from CI
workflow. Tuple schema fixes will be in a dedicated PR.
Three fixes in encoding.py:

1. TupleEncodable.encode() returns a TupleItems model instance (not a
   raw tuple), and deserialize() returns the model directly. This fixes
   pydantic validation in litellm integration tests for NamedTuple and
   fixed-tuple types. Extract shared field access into _extract_items().

2. Add _TupleSafeJsonSchema that overrides pydantic's tuple_schema() to
   produce object schemas (item_0, item_1 properties) instead of
   prefixItems arrays. Applied via _BoxEncoding.model_json_schema() so
   dataclasses containing tuple fields produce OpenAI-compatible schemas.
   Can be removed once #584 replaces the Encodable system.

3. SequenceEncodable.encode()/deserialize() return lists (not tuples) to
   preserve encode idempotency — nested_type on a list dispatches to the
   sequence encoder, avoiding a mismatch with TupleEncodable.
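For context, overriding Pydantic's schema generator as described might look roughly like this (a hedged sketch, not the PR's actual _TupleSafeJsonSchema; variadic tuples are left to the default handler):

```python
from typing import Any

import pydantic
from pydantic.json_schema import GenerateJsonSchema, JsonSchemaValue


class TupleSafeJsonSchema(GenerateJsonSchema):
    # Hypothetical sketch: emit object schemas (item_0, item_1, ...) for
    # fixed-length tuples instead of prefixItems arrays.
    def tuple_schema(self, schema: Any) -> JsonSchemaValue:
        if "variadic_item_index" in schema:
            # tuple[T, ...] keeps the default array treatment
            return super().tuple_schema(schema)
        props = {
            f"item_{i}": self.generate_inner(item)
            for i, item in enumerate(schema.get("items_schema", []))
        }
        return {
            "type": "object",
            "properties": props,
            "required": list(props),
            "additionalProperties": False,
        }
```

Passed as `model_json_schema(schema_generator=TupleSafeJsonSchema)`, this changes only schema generation, so dataclasses containing tuple fields get OpenAI-compatible schemas without touching validation.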
Use Pydantic's Annotated extension points (PlainValidator, PlainSerializer,
WithJsonSchema) to handle tuples — adapted from _pydantic_type_tuple in #584.

- _safe_tuple_type: rewrites fixed-length tuple[T1, T2, ...] into an
  Annotated type with custom validation (item_N dict → tuple), serialization
  (tuple → item_N dict), and JSON schema (object with item_0/item_1
  properties instead of prefixItems). Single mechanism for schema, validation,
  and serialization.

- _rewrite_tuple_annotations: recursively applies _safe_tuple_type to
  dataclass fields, creating an Annotated proxy so nested tuples inside
  objects also get safe schemas.

- TupleEncodable.encode() returns TupleItems model instances; deserialize()
  returns the model directly. Shared field access via _extract_items().

- SequenceEncodable.encode()/deserialize() return lists to preserve encode
  idempotency with the new TupleEncodable return type.
datvo06 force-pushed the dn-fully-parametric-model-test branch from be46f6a to 7aad873 on March 23, 2026 17:22
datvo06 requested a review from eb8680 on March 23, 2026 17:24
Use Pydantic's Annotated extension points (PlainValidator, PlainSerializer,
WithJsonSchema) to handle standalone tuple types — adapted from
_pydantic_type_tuple in #584.

- _safe_tuple_type: rewrites fixed-length tuple[T1, T2, ...] into an
  Annotated type with custom validation (item_N dict -> tuple), serialization
  (tuple -> item_N dict), and JSON schema (object properties instead of
  prefixItems). Single mechanism for schema, validation, and serialization.

- TupleEncodable.encode() returns TupleItems model instances; deserialize()
  returns the model directly; shared field access via _extract_items()

- _inline_refs applied to tool parameter schemas to avoid $ref rejection

- SequenceEncodable.encode()/deserialize() return lists to preserve encode
  idempotency with the new TupleEncodable return type

Dataclasses containing tuple fields (dc-with-tuple) are xfailed — fixing
them requires recursive type rewriting that would duplicate the systematic
approach in #584. Better to wait for that PR than to build a piecemeal
shadow type system on top of Pydantic.
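A minimal sketch of the kind of $ref inlining described (hypothetical helper; the actual _inline_refs may differ, and recursive schemas would loop forever here):

```python
from typing import Any


def inline_refs(schema: dict) -> Any:
    # Replace internal "#/$defs/..." references with the referenced
    # definition and drop $defs entirely.
    defs = dict(schema.get("$defs", {}))

    def walk(node: Any) -> Any:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                # Inline the definition, then keep walking in case it
                # itself contains further references.
                return walk(dict(defs[ref.rsplit("/", 1)[-1]]))
            return {k: walk(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [walk(v) for v in node]
        return node

    return walk(schema)
```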
datvo06 force-pushed the dn-fix-tuple-encoding-openai branch from bd50bda to afe8338 on March 23, 2026 20:57
Contributor

@eb8680 eb8680 left a comment

I am a bit confused by the purpose of this PR. There are no new tests that were failing without and passing with the behavior changes, and there are existing tests xfailing with the changes that previously passed. Did we not fix the OpenAI API tuple encoding already? What is new here?

It also seems to have introduced a fragile new invariant that Encodable.encode must never return tuples or values that contain tuples that are not fields of some enclosing BaseModel, but that's not documented anywhere and can't be checked statically by a type checker.

Comment thread tests/test_handlers_llm_encoding.py Outdated
Comment on lines +713 to +722
# Dataclass with tuple field: Pydantic produces prefixItems schema that
# OpenAI rejects. Proper fix requires recursive type rewriting (#584).
if case_id == "dc-with-tuple":
    marks.append(_provider_response_format_xfail)
# SequenceEncodable.enc is a generic alias, not a BaseModel — litellm rejects it.
if case_id in ("tuple-bare", "tuple-variadic"):
    marks.append(_provider_response_format_xfail)
# LLM may return a URL instead of base64 for image tuples.
if case_id == "tuple-img-str":
    marks.append(_provider_response_format_xfail)
Contributor

Why are these new xfails being added? If this PR fixes the issue, I would expect the relevant tests to fail before the PR and pass after, but this seems to be saying the opposite?

Contributor Author

Sorry I should have noted this down. During #617 I recognized that test_llm.yml (the live integration workflow with real API keys) runs test_handlers_llm_provider.py, test_handlers_llm_tool_calling_*.py, but not test_handlers_llm_encoding.py. So the provider-path tests in test_handlers_llm_encoding.py were never run with a real API key in CI.
My plan to fix this:

  • dc-with-tuple: deferring to Replace Encodable implementations with Pydantic #584; I should say so explicitly in the xfail reason string.
  • tuple-bare / tuple-variadic: SequenceEncodable.enc is a list[T] generic alias, not a BaseModel subclass, so litellm rejects it as response_format. I can open a follow-up PR if the structure introduced here works.
  • tuple-img-str: this is supported. Let me remove this xfail.

Comment thread effectful/handlers/llm/encoding.py Outdated
return tuple(self.element_encoder.encode(elem) for elem in value)
# Return a list so that nested_type routes to the sequence dispatcher
# (not the tuple dispatcher), preserving encode idempotency.
return list(self.element_encoder.encode(elem) for elem in value)
Contributor

Why is this change and similar ones necessary?

Contributor Author

The encode idempotency law (encode(encode(v)) == encode(v)) requires that nested_type on the output dispatches back to the same Encodable. If encode returns a tuple, nested_type infers tuple[T1, T2, ...] and dispatches to TupleEncodable instead of SequenceEncodable, breaking idempotency. Returning a list keeps the dispatch consistent. Agreed this is fragile; I'll add a note to the Encodable.encode contract and consider whether the dispatcher should be more explicit.

Contributor Author

(Noting that this needed to change due to the safe_tuple_type fix)

Contributor Author

datvo06 commented Apr 1, 2026

This PR does two distinct things, which I should have been clearer about in the description:

  1. The fix: _safe_tuple_type rewrites fixed-length tuple[T1, T2, ...] annotations into an Annotated form that produces a valid OpenAI schema (no prefixItems). This was the original failing case from Parametric LLM model tests #617. I discovered that test_handlers_llm_encoding.py was previously excluded from the live integration workflow, so the interaction between encoding and the LLM was never exercised.
  2. The xfails are newly-failing cases exposed by this PR's changes, not pre-existing failures. Specifically:
    - dc-with-tuple: this PR handles top-level tuple fields, but _rewrite_tuple_annotations doesn't recurse into nested dataclasses. This is a known gap, and I was concerned about overlap with the pending Replace Encodable implementations with Pydantic #584.
    - tuple-bare / tuple-variadic: SequenceEncodable.enc is a list[T] generic alias, which litellm rejects as response_format. This was masked before because the old code path hit a different error first.

The invariant is forced by how nested_type works: it has a dedicated handler for tuple values that infers a fixed-length tuple[T1, T2, ...] type:

  @nested_type.register
  def _(value: tuple):
      if type(value) != tuple or len(value) == 0:
          return nested_type.dispatch(collections.abc.Sequence)(value)
      else:
          return Box(tuple[tuple(nested_type(item).value for item in value)])

If SequenceEncodable.encode returns a tuple, and that encoded value is later passed through Encodable.define, nested_type routes it to TupleEncodable instead of SequenceEncodable, which breaks the semantics. Returning a list sidesteps this because a list hits the Collection handler, which routes correctly.

I'll add a docstring to Encodable.encode to note this.
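The dispatch asymmetry can be illustrated with a simplified stand-in (plain singledispatch, not the project's actual nested_type):

```python
from functools import singledispatch


@singledispatch
def nested_type_demo(value):
    # Fallback: the runtime type itself.
    return type(value)


@nested_type_demo.register
def _(value: tuple):
    # A tuple infers a fixed-length tuple[T1, T2, ...] type...
    return tuple[tuple(type(v) for v in value)]


@nested_type_demo.register
def _(value: list):
    # ...while a list infers a homogeneous sequence type, so re-encoding
    # a list routes back to the same (sequence) handler.
    return list[type(value[0])] if value else list


assert nested_type_demo((1, 2)) == tuple[int, int]
assert nested_type_demo([1, 2]) == list[int]
```

Encoding a sequence to a tuple would therefore change which handler sees the result on the second pass, which is exactly the idempotency break described above.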

Base automatically changed from dn-fully-parametric-model-test to master April 1, 2026 22:40
datvo06 added 2 commits April 9, 2026 13:46
Replace the hand-rolled Encodable ABC and its 10+ subclasses with a
recursive type rewriter (TypeToPydanticType) that produces Annotated
wrappers with Pydantic validators/serializers. This fixes #626 (tuple
fields in dataclasses) and #631 (Callable fields in dataclasses) by
recursively rewriting field annotations before Pydantic sees them.

- Add TypeEvaluator ABC to unification.py for recursive type traversal
- Rewrite encoding.py: Encodable[T] now returns Annotated types via
  TypeToPydanticType, used with pydantic.TypeAdapter
- Update completions.py: Encodable.define() -> TypeAdapter(Encodable[T])
- Update template.py: extract _collect_tools to completions.py
- Update all test files for new API
Contributor Author

datvo06 commented Apr 9, 2026

Update: Fixes #626, #631. Supersedes #584.

  • Summary
    • Replace the hand-rolled Encodable ABC and its 10+ subclasses with TypeToPydanticType, a recursive type rewriter that produces Annotated wrappers with Pydantic validators/serializers
    • Add TypeEvaluator ABC to unification.py - a generic recursive type-expression walker using singledispatchmethod
    • Migrate all Encodable.define(T).encode/decode call sites to TypeAdapter(Encodable[T]).dump_python/validate_python
    • Extract _collect_tools() and to_content_blocks() helpers in completions.py

Why

- Add field rewriting for dataclass and BaseModel types whose fields
  contain special types (Callable, tuple, Image, etc.). Uses a proxy
  Pydantic model for validation/serialization while preserving the
  original type identity on roundtrip. Fixes #631.
- Add additionalProperties:false to all object schemas in _inline_refs
  for OpenAI strict mode compatibility.
- Update test_handlers_llm_provider.py for new encoding API:
  response_format=Encodable.define(T) -> response_type=T, tools= -> env=
- Add composite type regression tests for #626 and #631.
- Fix ruff formatting in test_internals_unification.py.
datvo06 added 12 commits April 9, 2026 15:30
- Remove unused type: ignore in unification.py
- Fix _call_assistant mock signature (tools/response_format -> env/response_type)
- Fix mock tool call args to remove old {"value": ...} wrapping
- Fix test_synthesized_function_roundtrip to use dump_python(mode="json")
- Rebuild all provider test fixtures with gpt-4o for new API format
- Fix caching test fixtures for deterministic replay
- Update template formatting tests: int values no longer wrapped in {"value": N}
- Fix tool call mock args in template test (remove {"value": ...} wrapping)
- Add _strict_json_schema() to completions.py (provider boundary)
- Apply to tool specs in call_assistant
- Wrap response_format schema in object wrapper for encoding integration tests
- Apply _strict_json_schema to tool spec tests
- Enhance _strict_json_schema to inline $ref/$defs recursively
- Add items fallback for arrays (required by OpenAI strict mode)
- Handle prefixItems → items conversion for fixed-length tuples
- Use dual type: ignore[attr-defined,unused-ignore] for CI/local mypy compat
- Fix xfail pattern matching to use startswith for precise case filtering
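The prefixItems → items conversion could plausibly be sketched like this (an assumption about the helper's semantics, not the actual code):

```python
def prefix_items_to_items(schema: dict) -> dict:
    # Hypothetical: replace per-position prefixItems schemas with a single
    # anyOf items schema, which OpenAI strict mode accepts for arrays.
    prefix = schema.pop("prefixItems", None)
    if prefix:
        schema.setdefault(
            "items", prefix[0] if len(prefix) == 1 else {"anyOf": prefix}
        )
    return schema
```

The trade-off is that the per-position typing of the tuple is lost; the array schema only promises that each element matches one of the member schemas.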
Image types can't roundtrip through LLM (returns URLs, not data URIs).
Tool/DecodedToolCall types have schemas incompatible with OpenAI strict
mode when used as nested parameter types. Apply PROVIDER_CASES (with
xfails) to tool-as-param and tool-as-return tests, and extend the
pattern to catch composite image types (e.g. tuple-img-str, list-img).
Return types don't affect the tool spec schema sent to OpenAI, so
all cases pass without xfails. Only tool-as-param needs PROVIDER_CASES.
- response_model: xfail img/tool/dtc (LLM can't return images, Tool
  schema incompatible with strict mode)
- tool-as-param: xfail only tool/dtc (image types produce valid param
  schemas)
- tool-as-return: no xfails needed (return type not in tool spec)
The mypy type-check tests call mypy in-process. When xdist schedules
them on separate workers simultaneously, mypy's C extensions segfault.
Use xdist_group("mypy") to ensure all mypy tests run on the same
worker, and set --dist loadgroup as default to respect the grouping.
Contributor

@eb8680 eb8680 left a comment

I don't think we should be attempting to transform arbitrary record types. One motivation for the new Encodable interface in #584 was to avoid the need for this, at least to start.

There is also a bunch of schema-munging happening that is likely not necessary.

Comment thread tests/test_handlers_llm_encoding.py Outdated
enc: pydantic.TypeAdapter[Any] = pydantic.TypeAdapter(
    Encodable[type(tool)]  # type: ignore[misc]
)
tool_spec = _strict_json_schema(
Contributor

This should happen internally when applying Encodable to Tool. Tests and user code should never have to invoke _strict_json_schema directly.

Comment thread tests/test_handlers_llm_encoding.py Outdated
PROVIDER_CASES = _cases_with_provider_xfails(ROUNDTRIP_CASES)

# response_model: image types can't roundtrip (LLM returns URLs, not data URIs),
# and Tool/DecodedToolCall schemas are incompatible with OpenAI strict mode.
Contributor

This comment about Tool seems like a bug, it should be fixed rather than xfailed.

Contributor Author

On master, Tool/DecodedToolCall were xfailed for response_format integration tests. We maintain the same xfails. We still support encoding/deserializing them for tool calling.

Tool as response_format fails because OpenAI rejects ChatCompletionToolParam's schema (bare "type": "object" without additionalProperties: false).
Fixing it is non-trivial and gory because we'd have to write a recursive schema traversal and add that additionalProperties annotation.

I tried finding tools from Litellm, OpenAI, or Pydantic, but none of them provide exact tools for this.
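The recursive traversal mentioned here is mechanical; a minimal sketch (hypothetical helper, not the project's code) might be:

```python
from typing import Any


def add_additional_properties_false(node: Any) -> Any:
    # Mark every object schema with additionalProperties: false, as
    # required by OpenAI strict mode, recursing through dicts and lists.
    if isinstance(node, dict):
        if node.get("type") == "object":
            node.setdefault("additionalProperties", False)
        for value in node.values():
            add_additional_properties_false(value)
    elif isinstance(node, list):
        for item in node:
            add_additional_properties_false(item)
    return node
```

The hard part, as noted, is not the traversal itself but deciding which schemas it is safe to apply it to (e.g. schemas with intentionally open objects).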

Comment thread tests/test_handlers_llm_encoding.py Outdated
elif hasattr(tool_spec_obj, "model_dump"):
    return dict(tool_spec_obj.model_dump())
raise TypeError(f"Unexpected encoded tool spec type: {type(tool_spec_obj)}")
# tool-as-param: only Tool/DecodedToolCall schemas fail (nested function spec).
Contributor

This comment about Tool seems like a bug, it should be fixed rather than xfailed.

Comment thread tests/test_handlers_llm_encoding.py Outdated
)
response = litellm.completion(
    model=EFFECTFUL_LLM_MODEL,
    response_format={
Contributor

This should probably be a Pydantic model rather than a raw JSON schema.

Comment thread effectful/handlers/llm/encoding.py Outdated

@TypeToPydanticType.register(object)
def _pydantic_type_base(ty: type) -> Any:
    if dataclasses.is_dataclass(ty) and isinstance(ty, type):
Contributor

I am skeptical of the correctness of this transformation for general polymorphic dataclasses, and it has almost no test coverage. I would strongly prefer to remove it and require users to add Encodable annotations manually unless/until we come up with a more convincing specification for this kind of automated behavior.

Contributor Author

I updated it to only handle base types (returning the type unchanged) instead of rewriting record types.

Comment thread effectful/handlers/llm/encoding.py Outdated

def decode(self, encoded_value: str) -> str:
    return encoded_value

def _rewrite_fields(
Contributor

I don't think this is correct in general, especially not shared across different record type families, and I suspect it would be very difficult to cover all the edge cases, especially once polymorphism is included. I would strongly prefer to remove it from this PR in order to keep the changes tractable and sound.

Contributor Author

Got it. I removed it and added dataclass encodable in PairManual (dataclass with tuple) in the test.

Comment thread effectful/handlers/llm/encoding.py Outdated
@dataclass
class NamedTupleEncodable[T](TupleEncodable[T]):
    """Tuple encodable that reconstructs the original NamedTuple type on decode."""

def _strict_json_schema(schema: dict) -> dict:
Contributor

There seem to be two versions of this function. I suspect this transformation is not actually necessary to begin with given that it has not been a problem before - perhaps litellm or pydantic enforces it somehow.

Contributor Author

I found one provided by OpenAI and replaced it!

Comment thread effectful/handlers/llm/completions.py Outdated
value: T


def _strict_json_schema(schema: dict) -> dict:
Contributor

There seem to be two versions of this function (this one and one in encoding.py). I also suspect it is not necessary or at least is already available in some litellm utility, otherwise nothing would have worked before with the OpenAI API.

Comment thread effectful/handlers/llm/completions.py Outdated

Applies all transformations required by the OpenAI strict-mode API:

* Inlines ``$ref``/``$defs`` (OpenAI does not support JSON Schema references).
Contributor

We have never had issues with this before, either, which strengthens my suspicion that litellm or pydantic are doing this somewhere already. The _inline_refs I originally implemented in #584 was a workaround for a Pydantic bug, not a necessary invariant for OpenAI API compatibility.

datvo06 added 9 commits April 13, 2026 21:38
…operties_true

Symmetric to litellm.utils._remove_additional_properties (which removes false
for Vertex AI). Strips additionalProperties: true from litellm/OpenAI models
that use extra="allow", then lets _ensure_strict_json_schema apply false.

Also switch DecodedToolCall schema to OpenAI's ChatCompletionMessageToolCall
(has actual fields: id, function, type) instead of litellm's (empty dict).

Remove tool/dtc xfails — schemas are now strict-mode compatible.
Litellm's ChatCompletionMessageToolCall has no fields (extra="allow"),
so model_validate is a no-op. Switch to OpenAI's type which validates
id, function.name, function.arguments fields. Also fix serializer to
use type="function" (OpenAI const) and json.dumps for arguments.
…codedToolCall

_ensure_strict_json_schema forces all properties into required, breaking
schemas with optional fields (e.g. ChatCompletionToolParam's parameters,
description, strict, cache_control). _make_strict_safe makes optional
properties nullable before requiring them, so they're safe for OpenAI
strict mode.

Also removes _ensure_strict_json_schema from Image and Callable handlers
where it was unnecessary, and provides encoded tool context in the
response_model integration test so the LLM can produce valid
DecodedToolCall instances.
Tool/DecodedToolCall as response_model or tool parameter are
fundamentally incompatible with OpenAI strict mode (optional fields,
bare "type": "object" without properties). This matches master's
behavior which also xfailed these cases.

Removes _make_strict_safe, restores _ensure_strict_json_schema from
openai for Tool/DecodedToolCall WithJsonSchema schemas.
Only used for Tool/DecodedToolCall WithJsonSchema schemas which are
already xfailed in integration tests. No test depends on it.
Contributor Author

datvo06 commented Apr 14, 2026

I did more cleaning up. The dataclass encodable is now defined by the user; it works as long as all the dataclass's fields are encodable:

@dataclass
class _PairManual:
    values: Encodable[tuple[int, str]]
    count: int

But there is still one test marked xfail (both here and on master), which is the round-trip Tool and DecodedToolCall.

On master, Tool/DecodedToolCall were xfailed for response_format integration tests. We maintain the same xfails. We still support encoding/deserializing them for tool calling.

Tool as response_format fails because OpenAI rejects ChatCompletionToolParam's schema (bare "type": "object" without additionalProperties: false).
Fixing it is non-trivial and gory because we'd have to write a recursive schema traversal and add that additionalProperties annotation.

I tried finding tools from Litellm, OpenAI, or Pydantic, but none of them provide exact tools for this.

@datvo06 datvo06 requested a review from eb8680 April 14, 2026 14:26
