UN-3114 FIX: Handle LLM refusal responses to prevent NoneType errors #1799
chandrasekharan-zipstack merged 8 commits into main
Conversation
When Anthropic (or other providers) refuse to generate a response due to safety restrictions, LiteLLM returns content=None with finish_reason="refusal". Previously, this caused a crash in _post_process_response when calling .find() on None, producing the unhelpful error: "'NoneType' object has no attribute 'find'". Now detect None content early in complete(), acomplete(), and stream_complete(), check finish_reason, and raise a descriptive LLMError that clearly indicates the refusal or empty response cause. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Walkthrough: Centralized streamed-chunk handling into `_process_stream_chunk()`.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant SDK_LLM
    participant StreamSource
    participant CallbackManager
    Caller->>SDK_LLM: stream_complete(request)
    SDK_LLM->>StreamSource: open stream / receive chunk
    StreamSource-->>SDK_LLM: chunk (choices[0].delta, finish_reason?)
    SDK_LLM->>SDK_LLM: _process_stream_chunk(chunk, callback_manager, has_yielded_content)
    alt delta empty
        SDK_LLM-->>SDK_LLM: return None (suppress empty delta)
    else delta present
        SDK_LLM->>CallbackManager: on_stream(text) (if present)
        SDK_LLM-->>Caller: yield LLMResponseCompat (delta=text)
        SDK_LLM->>SDK_LLM: set has_yielded_content = true
    end
    alt finish_reason in REFUSAL_FINISH_REASONS and not has_yielded_content
        SDK_LLM-->>Caller: raise _raise_for_empty_response(finish_reason)
    else finish_reason in REFUSAL_FINISH_REASONS and has_yielded_content
        SDK_LLM->>SDK_LLM: suppress raise, continue streaming
    end
    loop until stream ends
        StreamSource-->>SDK_LLM: next chunk...
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Greptile Summary: This PR adds explicit handling for LLM provider refusal responses (e.g. Anthropic's `finish_reason="refusal"`).

Key changes:
Notable issue:
Confidence Score: 4/5
Important Files Changed
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[LLM.complete / acomplete] --> B[litellm.completion]
    B --> C{response_text is None?}
    C -- No --> D[_post_process_response]
    D --> E[Return LLMResponseCompat]
    C -- Yes --> F[_raise_for_empty_response]
    F --> G{finish_reason in REFUSAL_FINISH_REASONS?}
    G -- Yes --> H[LLMError 400\nrefusal / content_filter]
    G -- No --> I[LLMError 500\nempty response]
    J[LLM.stream_complete] --> K[litellm.completion stream=True]
    K --> L[For each chunk]
    L --> M{chunk.get choices empty?}
    M -- Yes --> L
    M -- No --> N{finish_reason in\nREFUSAL_FINISH_REASONS?}
    N -- Yes, content\nalready yielded --> O[Log warning\nreturn None]
    N -- Yes, no content\nyielded yet --> P[_raise_for_empty_response\nLLMError 400]
    N -- No --> Q{text empty?}
    Q -- Yes --> L
    Q -- No --> R[Yield LLMResponseCompat\nhas_yielded_content = True]
    R --> L
    O --> L
```
Prompt To Fix All With AI

This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/llm.py
Line: 314-336
Comment:
**Silent empty stream — asymmetry with non-streaming path**
In `complete()` and `acomplete()`, any `None` content (regardless of `finish_reason`) raises an `LLMError` with a clear message. In `stream_complete()`, however, if a provider returns a non-refusal empty stream (e.g. `finish_reason="stop"` but all delta chunks have `content=None`), `_process_stream_chunk` returns `None` for every chunk and the generator simply exhausts with no yields and no exception.
A caller iterating over `stream_complete()` has no way to distinguish "LLM produced an empty response due to a server-side issue" from "the prompt genuinely produced no output". To match the non-streaming behaviour, consider detecting the "nothing was ever yielded" case and raising an appropriate error after the loop:
```python
has_yielded_content = False
for chunk in litellm.completion(...):
...
response = self._process_stream_chunk(chunk, callback_manager, has_yielded_content)
if response is not None:
has_yielded_content = True
yield response
# After the loop – mirror the non-streaming empty-response check
if not has_yielded_content:
# Use the finish_reason from the last usage/terminal chunk if available
self._raise_for_empty_response(finish_reason=None)
```
This would surface the same descriptive `LLMError` (status 500) that non-streaming callers already receive.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/utils/common.py
Line: 150-165
Comment:
**Constructor type annotation doesn't match defensive guards**
`__init__` is annotated `text: str`, but `__str__` now returns `self.text or ""` and `__repr__` explicitly guards against `self.text is None`. These defensive guards are only necessary if `text` can be `None` — yet the constructor annotation says it cannot.
This inconsistency means:
- Static type checkers will flag `LLMResponseCompat(None)` as an error, giving false confidence that `None` can never be stored.
- In practice `text` is a public, mutable attribute (`self.text = text`), so any code that sets it to `None` after construction is silently allowed at runtime while being a type error.
Either update the annotation to match the defensive behaviour:
```suggestion
def __init__(self, text: str | None) -> None:
```
...or, if `None` text is truly never expected at construction time and the guards are purely defensive, document that clearly so future readers understand why they exist.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/llm.py
Line: 499-502
Comment:
**Class variable defined mid-class after instance methods**
`REFUSAL_FINISH_REASONS` is a class-level constant but is placed after several instance methods (`__init__`, `test_connection`, `complete`, `stream_complete`, `acomplete`, `get_context_window_size`, `get_max_tokens`, `get_model_name`, `get_metrics`, `get_usage_reason`, `_record_usage`). Python resolves class variable access at runtime so there is no functional issue, but the convention is to declare class-level attributes at the top of the class (alongside `SYSTEM_PROMPT`, `MAX_TOKENS`, `JSON_REGEX`, etc.), making them easy to discover and configure.
Consider moving `REFUSAL_FINISH_REASONS` next to the other class-level constants near line 39.
How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Update unstract/sdk1..."
Actionable comments posted: 1
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/llm.py (1)
539-540: Consider documenting the `on_stream` callback interface or removing it if unused. The `on_stream` method is called only at this location and is not defined anywhere in the codebase. The `hasattr` check safely guards against `AttributeError`, but the interface is undocumented and has no implementations. If this is meant to support custom callback manager implementations, define a Protocol to document the expected interface. Otherwise, remove the call as potentially dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 539 - 540, The call to callback_manager.on_stream(text) is undocumented and has no implementations; either formally declare the expected interface or remove the dead call. Fix option A: define a Protocol (e.g., StreamCallbackProtocol with on_stream(self, text: str) -> None) in your callbacks/typing module, annotate callback_manager accordingly in the LLM class/function and add a brief docstring describing on_stream semantics so custom managers can implement it. Fix option B: if streaming callbacks are not used, remove the hasattr check and the on_stream call from the code path (in the function where callback_manager is referenced) and delete any related unused imports or comments to avoid confusing dead code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 485-512: In _raise_for_empty_response change the finish_reason
comparison from "refusal" to "content_filter" so the LLMError branch triggers
for LiteLLM-normalized safety blocks; update the identical check in the
streaming path (same file) to use "content_filter" as well, ensuring both
_raise_for_empty_response and the streaming handler use the valid LiteLLM
finish_reason string.
---
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 539-540: The call to callback_manager.on_stream(text) is
undocumented and has no implementations; either formally declare the expected
interface or remove the dead call. Fix option A: define a Protocol (e.g.,
StreamCallbackProtocol with on_stream(self, text: str) -> None) in your
callbacks/typing module, annotate callback_manager accordingly in the LLM
class/function and add a brief docstring describing on_stream semantics so
custom managers can implement it. Fix option B: if streaming callbacks are not
used, remove the hasattr check and the on_stream call from the code path (in the
function where callback_manager is referenced) and delete any related unused
imports or comments to avoid confusing dead code.
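Fix option A above might be sketched like this. `StreamCallback` and `PrintingCallbackManager` are illustrative names, not existing SDK types; the sketch only shows how a `Protocol` can replace the bare `hasattr` check with a documented interface.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class StreamCallback(Protocol):
    """Interface a callback manager may implement to observe streamed text."""

    def on_stream(self, text: str) -> None: ...


class PrintingCallbackManager:
    """Example implementation that collects streamed deltas."""

    def __init__(self) -> None:
        self.chunks: list[str] = []

    def on_stream(self, text: str) -> None:
        self.chunks.append(text)


def emit(callback_manager: object, text: str) -> None:
    # Documented structural check instead of hasattr(callback_manager, "on_stream").
    if isinstance(callback_manager, StreamCallback):
        callback_manager.on_stream(text)
```

Because the Protocol is `runtime_checkable`, the `isinstance` check behaves like the old `hasattr` guard at runtime while also giving type checkers and readers a named, documented interface.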
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5940690f-63ac-4d94-8fad-f930d9a5eddb
📒 Files selected for processing (2)
unstract/sdk1/src/unstract/sdk1/llm.py
unstract/sdk1/src/unstract/sdk1/utils/common.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Praveen Kumar <praveen@zipstack.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 224-226: In acomplete() move the usage recording to occur before
raising on empty/refusal responses: when response_text is None (or when
_raise_for_empty_response would be called using finish_reason), call
_record_usage(...) and Audit().push_usage_data(...) first—matching
stream_complete()'s order—then call
self._raise_for_empty_response(finish_reason); update both occurrences around
lines referencing response_text/finish_reason so all three APIs record usage for
the same provider outcome.
- Line 485: The function _raise_for_empty_response currently uses a
string-quoted return annotation "NoReturn" but NoReturn is not imported (only
cast is imported from typing), causing an F821; remove the quotes around
NoReturn in the def signature to use a real type and add NoReturn to the typing
imports (e.g., alongside cast) so the annotation is valid and resolves correctly
for the _raise_for_empty_response method.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bb107552-dd56-49cf-9141-6922a4a1f9ae
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/llm.py
If a provider streams partial content then emits a final chunk with finish_reason="refusal", raising an error after the caller has already consumed chunks would be confusing. Now track whether content has been yielded and log a warning instead of raising in that case. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…te() Move _record_usage() call before the refusal check so that token usage is always audited, even when the provider refuses. This matches stream_complete() where usage recording happens per-chunk before refusal detection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/llm.py (1)
539-551: Guard against a missing `"delta"` key in malformed stream chunks. Line 549 assumes the `"delta"` key exists in `chunk["choices"][0]`. If a malformed or edge-case chunk arrives without this key, a `KeyError` will be raised. Consider using defensive access:

```diff
- text = chunk["choices"][0]["delta"].get("content", "")
+ delta = chunk["choices"][0].get("delta", {})
+ text = delta.get("content", "") if delta else ""
```

This also applies to line 539 where `chunk["choices"][0]` is accessed, though this is less likely to fail since usage chunks also have this structure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 539 - 551, The code in the LLM streaming handler assumes chunk["choices"][0]["delta"] exists and will KeyError on malformed chunks; update the handler around the lines using chunk, finish_reason, and text (the block that reads chunk["choices"][0] and chunk["choices"][0]["delta"]) to defensively validate the chunk structure: first safely obtain choice = (chunk.get("choices") or [None])[0] or check isinstance(chunk.get("choices"), list) and that choice is a dict, then read finish_reason = choice.get("finish_reason") and delta = choice.get("delta", {}); if choice is missing or not a dict treat it as a non-content chunk (log a warning via logger and return None) and preserve the existing has_yielded_content and self._raise_for_empty_response behavior when finish_reason == "refusal".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 539-551: The code in the LLM streaming handler assumes
chunk["choices"][0]["delta"] exists and will KeyError on malformed chunks;
update the handler around the lines using chunk, finish_reason, and text (the
block that reads chunk["choices"][0] and chunk["choices"][0]["delta"]) to
defensively validate the chunk structure: first safely obtain choice =
(chunk.get("choices") or [None])[0] or check isinstance(chunk.get("choices"),
list) and that choice is a dict, then read finish_reason =
choice.get("finish_reason") and delta = choice.get("delta", {}); if choice is
missing or not a dict treat it as a non-content chunk (log a warning via logger
and return None) and preserve the existing has_yielded_content and
self._raise_for_empty_response behavior when finish_reason == "refusal".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 190b9fd9-ff44-495a-84e6-270a9f734335
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/llm.py
- Extend refusal detection to include OpenAI/Azure's "content_filter" finish_reason alongside Anthropic's "refusal", returning status 400 instead of a misleading 500. - Guard against empty choices list in _process_stream_chunk to handle trailing usage-only chunks from providers like OpenAI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🧹 Nitpick comments (2)
unstract/sdk1/src/unstract/sdk1/llm.py (2)
559-561: Guard against a missing `delta` key for defensive coding. Direct access to `["delta"]` could raise `KeyError` if an unexpected chunk structure is received. Since this method already defensively handles empty `choices`, consider extending that pattern to `delta`.

🛡️ Suggested defensive fix

```diff
- text = chunk["choices"][0]["delta"].get("content", "")
+ text = chunk["choices"][0].get("delta", {}).get("content", "")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 559 - 561, The code directly indexes chunk["choices"][0]["delta"] which can raise KeyError; update the extraction in the method that processes streaming chunks to safely handle missing delta by using choices[0].get("delta", {}) (or check for "delta" in choices[0]) and then .get("content", "") so text = chunk["choices"][0].get("delta", {}).get("content", "") and retain the existing empty-text early return; alternatively wrap the access in a try/except KeyError around the same extraction to return None on malformed chunks.
489-492: Clarify the comment about LiteLLM finish_reason normalization. The comment states `"refusal": Anthropic`, but LiteLLM normalizes all provider-specific finish_reasons to OpenAI-compatible values (`"stop"`, `"length"`, `"tool_calls"`, `"content_filter"`, `"function_call"`). Anthropic refusals are typically mapped to `"content_filter"` or `"stop"` by LiteLLM, not exposed as `"refusal"`. Including `"refusal"` defensively is harmless, but the comment is misleading about why it's there.

📝 Suggested comment update

```diff
- # Finish reasons indicating a safety/policy refusal across providers:
- # - "refusal": Anthropic
- # - "content_filter": OpenAI / Azure OpenAI
+ # Finish reasons indicating a safety/policy refusal.
+ # LiteLLM normalizes to OpenAI-compatible values; "content_filter" is the
+ # standard indicator. "refusal" is included defensively for edge cases.
  REFUSAL_FINISH_REASONS = {"refusal", "content_filter"}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 489 - 492, Update the comment above REFUSAL_FINISH_REASONS to clarify that LiteLLM normalizes provider-specific finish_reason values to OpenAI-compatible ones (e.g., "stop", "length", "tool_calls", "content_filter", "function_call") so Anthropic refusals are typically mapped to "content_filter" or "stop"; note that "refusal" is retained here defensively for completeness. Keep the constant name REFUSAL_FINISH_REASONS as-is and ensure the comment directly references LiteLLM normalization behavior and why "refusal" is present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 559-561: The code directly indexes chunk["choices"][0]["delta"]
which can raise KeyError; update the extraction in the method that processes
streaming chunks to safely handle missing delta by using choices[0].get("delta",
{}) (or check for "delta" in choices[0]) and then .get("content", "") so text =
chunk["choices"][0].get("delta", {}).get("content", "") and retain the existing
empty-text early return; alternatively wrap the access in a try/except KeyError
around the same extraction to return None on malformed chunks.
- Around line 489-492: Update the comment above REFUSAL_FINISH_REASONS to
clarify that LiteLLM normalizes provider-specific finish_reason values to
OpenAI-compatible ones (e.g., "stop", "length", "tool_calls", "content_filter",
"function_call") so Anthropic refusals are typically mapped to "content_filter"
or "stop"; note that "refusal" is retained here defensively for completeness.
Keep the constant name REFUSAL_FINISH_REASONS as-is and ensure the comment
directly references LiteLLM normalization behavior and why "refusal" is present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9fab3dd6-bfed-477c-a404-9b29a921159c
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/llm.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
Test Results

Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
What
- Handles LLM provider refusal responses (e.g. Anthropic's `stop_reason: "refusal"`) in SDK1's `LLM.complete()`, `acomplete()`, and `stream_complete()` methods.
- Raises a descriptive `LLMError` instead of crashing with `'NoneType' object has no attribute 'find'`.

Why
- When a provider refuses to respond due to safety restrictions, LiteLLM returns `content=None` with `finish_reason="refusal"`. The SDK did not check for `None` content before passing it to `_post_process_response()`, which called `.find()` on `None`.
- The resulting error was `"Error from LLM provider 'anthropic': 'NoneType' object has no attribute 'find'"`, making it very hard to diagnose the actual issue (a safety refusal).

How
- `llm.py` — `complete()` and `acomplete()`: After extracting `response_text` from the LiteLLM response, check if it's `None`. If so, inspect `finish_reason` and raise `LLMError` with a clear message.
- `llm.py` — `stream_complete()`: Extracted chunk processing into a `_process_stream_chunk()` helper. Checks `finish_reason` on each chunk for refusal.
- `llm.py` — `_raise_for_empty_response()`: New helper method that raises `LLMError` with status 400 for refusals, 500 for other empty responses.
- `common.py` — `LLMResponseCompat`: Hardened `__str__()` and `__repr__()` to handle `None` text without crashing.

Before:
`"Error from LLM provider 'anthropic': 'NoneType' object has no attribute 'find'"`

After:
"The LLM refused to generate a response due to safety restrictions. Please review your prompt and try again."Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No. The fix only adds a check for `None` content before existing processing logic runs. Normal responses (where `content` is a string) are completely unaffected — the new check is skipped. The only behavioral change is that refusal/empty responses now raise a descriptive `LLMError` instead of an opaque `AttributeError`.

Database Migrations
Env Config
Relevant Docs
- `stop_reason: "refusal"`
- `"refusal"` passed through as `finish_reason`

Related Issues or PRs
Dependencies Versions
Notes on Testing
- Added unit tests covering refusal responses (`content=None`, `finish_reason="refusal"`), empty responses, and normal responses. All 13 new tests pass alongside the existing 70 SDK1 tests (83 total).

Checklist
I have read and understood the Contribution Guidelines.