refactor(llm): add LiteLLM-backed provider abstraction (#2363)
Conversation
Introduce an internal LLMProvider helper that wraps LiteLLM provider parsing, provider model info lookup, and provider-aware API base inference. Use the helper in LLM and Telemetry so Bedrock auth handling and cost-calculation model/provider routing stop manually splitting model strings. Extend model_features to accept an LLMProvider so provider-aware rules can operate on parsed provider/model data while keeping the raw model string available. Add focused tests for the new helper and the updated feature lookup paths.

Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSED
REST API breakage checks (OpenAPI) — ✅ PASSED
Rework the new LLMProvider helper so it no longer keeps both raw and parsed model strings internally. Accept full model strings at the SDK boundary, then immediately normalize to LiteLLM's parsed provider/model pair for transport, telemetry, and feature checks. Unknown models still remain opaque strings, but the provider abstraction itself now avoids duplicate raw-vs-parsed state. Also route chat/responses calls through parsed provider kwargs and tighten the associated tests.

Co-authored-by: openhands <openhands@all-hands.dev>
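The boundary normalization described above can be sketched as follows. The real SDK delegates parsing to LiteLLM, so the helper name, the `ParsedModel` type, and the default-provider fallback here are illustrative assumptions, not the actual implementation:

```python
from typing import NamedTuple


class ParsedModel(NamedTuple):
    provider: str
    model: str


def parse_model(raw: str, default_provider: str = "openai") -> ParsedModel:
    """Normalize a full model string into a parsed provider/model pair.

    "ollama/llama2" -> ("ollama", "llama2"); bare model names fall back to a
    default provider, while truly unknown strings stay opaque in the SDK.
    """
    if "/" in raw:
        provider, model = raw.split("/", 1)
        return ParsedModel(provider, model)
    return ParsedModel(default_provider, raw)


print(parse_model("ollama/llama2"))  # ParsedModel(provider='ollama', model='llama2')
print(parse_model("gpt-4o"))         # ParsedModel(provider='openai', model='gpt-4o')
```

Normalizing once at the boundary is what lets downstream transport and telemetry code consume the parsed pair without ever re-splitting the raw string.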
Keep LLMProvider for LiteLLM-facing transport logic only and use the canonical model string for capability/feature lookups.

Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands Babysit this PR, monitor CI, fix until green.
I'm on it! enyst can track my progress at all-hands.dev |
Final summary:
Conciseness:
Current state:
PR link: #2363
@OpenHands /codereview-roasted
I'm on it! enyst can track my progress at all-hands.dev |
enyst left a comment
🔴 Needs improvement
[CRITICAL ISSUES]
- [openhands-sdk/openhands/sdk/llm/llm.py, Lines 495-499, 893-899, 1049-1056] Breaking change / duplicate state:
`_provider_info` is cached once during initialization and then reused for every transport call. That would be fine if `LLM` were actually immutable, but it isn't: `model_copy(update=...)` shallow-copies private attrs and does not rerun validators. So a copy with an updated `model` or `base_url` now keeps stale provider metadata. I verified this locally: copying a `gpt-4o` LLM with `model_copy(update={"model": "ollama/llama2"})` leaves `_provider_info` as `openai/gpt-4o`, so transport kwargs are wrong. The old code re-inferred provider from `self.model`/`self.base_url` on each call, so this is a real regression introduced by the refactor.
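The stale-cache failure mode can be reproduced with a minimal pydantic v2 stand-in; the class name and the simplified provider parsing below mimic the review's description, not the SDK's actual code:

```python
from pydantic import BaseModel, PrivateAttr, model_validator


class StubLLM(BaseModel):
    """Minimal stand-in that caches provider metadata at construction time."""

    model: str
    _provider_info: str = PrivateAttr(default="")

    @model_validator(mode="after")
    def _init_provider(self) -> "StubLLM":
        # Cached once here; model_copy(update=...) does NOT rerun validators.
        self._provider_info = (
            self.model.split("/", 1)[0] if "/" in self.model else "openai"
        )
        return self


llm = StubLLM(model="gpt-4o")
copied = llm.model_copy(update={"model": "ollama/llama2"})
print(copied.model)           # ollama/llama2
print(copied._provider_info)  # openai  <- stale: private attrs were copied as-is
```

Because `model_copy` bypasses validation and carries private attrs over unchanged, the copy's cached provider no longer matches its public `model` field.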
[TESTING GAPS]
- [tests/sdk/llm/test_llm.py, Lines 419-430] Missing regression test: the new tests only prove `_provider_info` is initialized on construction. Add coverage for `model_copy(update={"model": ...})` and `model_copy(update={"base_url": ...})` (or make those operations impossible). Right now the exact stale-cache failure mode above isn't exercised at all.
VERDICT: ❌ Needs rework
KEY INSIGHT: this refactor adds a second source of truth for provider routing, and because LLM instances are still copyable/mutable, that cached state can drift away from the public model/base_url fields.
Leaving this as a comment review rather than approval because this touches LLM runtime behavior and could affect eval-facing execution paths.
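One way to remove the second source of truth the review points at, sketched in plain Python with illustrative names: derive the provider from the public field on access instead of caching it at init, so copies and field updates can never drift:

```python
class LazyLLM:
    """Toy model holder that never stores derived provider state."""

    def __init__(self, model: str):
        self.model = model

    @property
    def provider(self) -> str:
        # Recomputed from self.model on every access: a single source of truth.
        return self.model.split("/", 1)[0] if "/" in self.model else "openai"


llm = LazyLLM("ollama/llama2")
print(llm.provider)  # ollama
llm.model = "gpt-4o"
print(llm.provider)  # openai
```

The trade-off is a small recomputation per call, which is usually negligible next to an LLM request; the benefit is that no copy mechanism can leave cached metadata stale.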
[Automatic Post]: It has been a while since there was any activity on this PR. @enyst, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.
Summary
Refactor SDK LLM provider handling around an internal `LLMProvider` helper backed by LiteLLM, so provider-specific logic stops re-splitting raw `provider/model` strings across the SDK.

This PR now takes the simpler direction discussed in issue #2274 and PR review:

- keep a single parsed `provider` + `model` view for LiteLLM-facing runtime paths
- resolve provider state once per `LLM` instance instead of refreshing provider state during each transport call

Concretely, this PR:

- adds `openhands.sdk.llm.utils.litellm_provider.LLMProvider` as a thin wrapper around LiteLLM provider parsing, provider model info lookup, API base inference, and call kwargs generation
- keeps `LLMProvider` focused on LiteLLM's parsed runtime shape (provider, parsed `model`, resolved API base) rather than storing duplicate `requested_*` fields
- routes `LLM` chat/responses transport calls to use provider-owned LiteLLM kwargs generation (`model` + `custom_llm_provider`, plus Bedrock-aware `api_key` forwarding) instead of repeatedly passing or re-splitting the full string
- moves Bedrock `api_key` handling into `LLMProvider` so provider-sensitive transport behavior lives alongside provider parsing
- keeps `model_features` lookups on the canonical model string (`model_canonical_name` or `model`) instead of introducing a second provider abstraction
- resolves provider state once during `LLM` setup, so chat/responses transport paths reuse the same resolved provider metadata for the lifetime of the instance

Closes #2274.
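The provider-owned kwargs generation the summary describes can be sketched as below; the function name, parameters, and the shape of the Bedrock check are illustrative assumptions, not the SDK's actual API:

```python
from typing import Optional


def litellm_call_kwargs(
    provider: str, model: str, api_key: Optional[str] = None
) -> dict:
    # Pass the parsed model plus custom_llm_provider instead of re-splitting
    # the full "provider/model" string at every call site.
    kwargs: dict = {"model": model, "custom_llm_provider": provider}
    if provider == "bedrock" and api_key is not None:
        # Bedrock-aware api_key forwarding, per the PR summary.
        kwargs["api_key"] = api_key
    return kwargs


print(litellm_call_kwargs("bedrock", "anthropic.claude-3", api_key="sk-test"))
```

Centralizing this in one place means the Bedrock special case lives next to provider parsing rather than being duplicated across chat and responses call sites.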
Checklist
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22
- golang:1.21-bookworm

Pull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:2d5b6e4-python

Run
All tags pushed for this build
About Multi-Architecture Support
- The default tag (2d5b6e4-python) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. 2d5b6e4-python-amd64) are also available if needed