feat(llmrails): add check_async method for input/output rails validation#1604
Add a new `check_async` method to `LLMRails` that allows standalone validation of messages against input/output rails without requiring a full conversation flow.

**Key features:**

- Automatically determines which rails to run based on message roles:
  - User messages only → input rails
  - Assistant messages only → output rails
  - Both user and assistant → input and output rails
- Returns a simple `RailsResult` with status (PASSED/MODIFIED/BLOCKED), content, and blocking rail name
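The role-based selection described above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the function name `select_rails` and its return shape (a plain list, or `None` when there is nothing to check) are assumptions made for the example.

```python
from typing import Dict, List, Optional


def select_rails(messages: List[Dict[str, str]]) -> Optional[List[str]]:
    """Pick which rails to run from the roles present in `messages`.

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    roles = {m.get("role") for m in messages}
    has_user = "user" in roles
    has_assistant = "assistant" in roles

    if has_user and has_assistant:
        return ["input", "output"]
    if has_user:
        return ["input"]
    if has_assistant:
        return ["output"]
    # No user or assistant messages: nothing to validate.
    return None
```

Only the set of roles matters here, not the number or order of messages.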
Greptile Overview
Greptile Summary
This PR adds a new `check_async` method to `LLMRails`.
Issues found:
| Filename | Overview |
|---|---|
| nemoguardrails/rails/llm/llmrails.py | Added check_async method and helper functions for standalone rails validation. Critical logic bug in modification detection when both input/output rails run - only tracks output changes, missing input modifications. |
| nemoguardrails/rails/llm/options.py | Added RailStatus enum and RailsResult model for rails validation results. Minor documentation inconsistency about rail field usage. |
| tests/test_llmrails_check_async.py | Comprehensive test coverage for check_async method and helper functions. Missing test case for detecting input modifications when both rails run with unchanged output. |
Sequence Diagram
sequenceDiagram
participant Client
participant LLMRails
participant DetermineRails as _determine_rails_from_messages
participant NormalizeMsg as _normalize_messages_for_rails
participant GenerateAsync as generate_async
participant InputRails as Input Rails Flow
participant OutputRails as Output Rails Flow
participant GetBlocking as _get_blocking_rail
participant GetContent as _get_response_content
Client->>LLMRails: check_async(messages)
LLMRails->>DetermineRails: Analyze message roles
alt No user/assistant messages
DetermineRails-->>LLMRails: None
LLMRails-->>Client: RailsResult(PASSED)
else Has user and assistant
DetermineRails-->>LLMRails: {"rails": ["input", "output"]}
else Has only user
DetermineRails-->>LLMRails: {"rails": ["input"]}
else Has only assistant
DetermineRails-->>LLMRails: {"rails": ["output"]}
end
Note over LLMRails: target_role = "assistant" if output in rails<br/>original_content = last message with target_role
LLMRails->>NormalizeMsg: Normalize messages for rails
alt Output rails only without user message
NormalizeMsg-->>LLMRails: Add empty user message
else All other cases
NormalizeMsg-->>LLMRails: Return unchanged
end
LLMRails->>GenerateAsync: Process with rails enabled
alt Input rails enabled
GenerateAsync->>InputRails: Validate/modify user message
InputRails-->>GenerateAsync: Modified user message or block
end
alt Output rails enabled
GenerateAsync->>OutputRails: Validate/modify bot message
OutputRails-->>GenerateAsync: Modified bot message or block
end
GenerateAsync-->>LLMRails: GenerationResponse
LLMRails->>GetBlocking: Check for blocking rails
GetBlocking-->>LLMRails: blocking_rail name or None
LLMRails->>GetContent: Extract final content
GetContent-->>LLMRails: result_content
alt Blocking rail exists
LLMRails-->>Client: RailsResult(BLOCKED, rail=name)
else result_content != original_content
LLMRails-->>Client: RailsResult(MODIFIED)
else No changes
LLMRails-->>Client: RailsResult(PASSED)
end
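The normalization step in the diagram (add an empty user message when only output rails run and no user message is present) might look roughly like this. The name `normalize_for_rails` is hypothetical, and placing the empty user message at the front of the list is an assumption; the diagram only says that one is added.

```python
from typing import Dict, List


def normalize_for_rails(
    messages: List[Dict[str, str]], rails: List[str]
) -> List[Dict[str, str]]:
    """Return a copy of `messages` suitable for running the given rails.

    Hypothetical sketch: when only output rails run and there is no user
    message, add an empty one (assumed here to go first).
    """
    has_user = any(m.get("role") == "user" for m in messages)
    if rails == ["output"] and not has_user:
        return [{"role": "user", "content": ""}] + list(messages)
    # All other cases: messages are passed through unchanged.
    return list(messages)
```

Returning a new list in both branches keeps the caller's input untouched.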
@greptileai re-review
    target_role = "assistant" if "output" in options["rails"] else "user"
    original_content = _get_content_by_role(messages, target_role)

When both rails run (["input", "output"]), target_role is always "assistant", so only output modifications are tracked. If input rails modify the user message but output rails don't change the assistant message, the comparison at line 1500 will miss the input modification and incorrectly return PASSED instead of MODIFIED.
Example:

    messages = [
        {"role": "user", "content": "modify"},  # input rail changes to "modified input"
        {"role": "assistant", "content": "hello"},  # output rail doesn't change
    ]

- `original_content` = "hello" (assistant)
- `result_content` = "hello" (unchanged assistant)
- Result: `PASSED` ❌ (should be `MODIFIED`)
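One possible shape for a fix is to compare the content for every role whose rail ran, rather than a single `target_role`. The sketch below is an illustration under a strong assumption: that the caller can obtain the post-rail messages (including the possibly modified user message). The source does not show how `generate_async` exposes that, and the helper names `last_content_by_role` and `was_modified` are hypothetical.

```python
from typing import Dict, List, Optional

# Which message role each rail type can modify.
ROLE_FOR_RAIL = {"input": "user", "output": "assistant"}


def last_content_by_role(
    messages: List[Dict[str, str]], role: str
) -> Optional[str]:
    """Return the content of the last message with the given role, if any."""
    for message in reversed(messages):
        if message.get("role") == role:
            return message.get("content")
    return None


def was_modified(
    rails: List[str],
    original: List[Dict[str, str]],
    processed: List[Dict[str, str]],
) -> bool:
    """True if any rail that ran changed the content for its role.

    Assumes `processed` holds the messages after rails were applied.
    """
    return any(
        last_content_by_role(original, ROLE_FOR_RAIL[rail])
        != last_content_by_role(processed, ROLE_FOR_RAIL[rail])
        for rail in rails
    )
```

With the example from the comment, the user message changes while the assistant message does not, so a check over both roles reports a modification where the single-role comparison would not.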

    async def test_input_output_both_passed(self, mock_rails):
        messages = [
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": "hi there"},
        ]
        result = await mock_rails.check_async(messages)
        assert result.status == RailStatus.PASSED

    @pytest.mark.asyncio
    async def test_input_output_input_blocked(self, mock_rails):
        messages = [
            {"role": "user", "content": "block"},
            {"role": "assistant", "content": "hi there"},
        ]
        result = await mock_rails.check_async(messages)
        assert result.status == RailStatus.BLOCKED

    @pytest.mark.asyncio
    async def test_input_output_output_blocked(self, mock_rails):
        messages = [
            {"role": "user", "content": "hello"},
            {"role": "assistant", "content": "block output"},
        ]
        result = await mock_rails.check_async(messages)
        assert result.status == RailStatus.BLOCKED

Missing test case for input modification with unchanged output. Add:

    @pytest.mark.asyncio
    async def test_input_output_input_modified_output_passed(self, mock_rails):
        messages = [
            {"role": "user", "content": "modify"},
            {"role": "assistant", "content": "hello"},
        ]
        result = await mock_rails.check_async(messages)
        assert result.status == RailStatus.MODIFIED  # Currently fails - returns PASSED
        assert result.content == "hello"

This would catch the modification detection bug in `llmrails.py:1485-1500`.

    class RailsResult(BaseModel):
        status: RailStatus = Field(description="Status of the rails check: passed, modified, or blocked.")
        content: str = Field(description="The content after rails processing.")
        rail: Optional[str] = Field(default=None, description="Name of the rail that blocked or modified the content.")

The rail field description says "Name of the rail that blocked or modified", but in llmrails.py:1500-1501, the rail field is only set for BLOCKED status, not for MODIFIED status. Either update the description to "Name of the rail that blocked" or populate this field when status is MODIFIED.
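The first option (narrowing the description so `rail` refers only to blocking) could be expressed as below. This is a stand-in built on the standard library's `dataclasses` rather than the Pydantic model used in the PR, so that it runs without dependencies. The class name `RailsResultSketch`, the enum string values, and the consistency check in `__post_init__` are all assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RailStatus(str, Enum):
    # String values are assumed; the source only names the three states.
    PASSED = "passed"
    MODIFIED = "modified"
    BLOCKED = "blocked"


@dataclass
class RailsResultSketch:
    """Stand-in for the PR's RailsResult, for illustration only."""

    status: RailStatus
    content: str
    # Name of the rail that blocked the content; None unless status is BLOCKED.
    rail: Optional[str] = None

    def __post_init__(self) -> None:
        # Hypothetical guard that keeps the field consistent with its description.
        if self.rail is not None and self.status is not RailStatus.BLOCKED:
            raise ValueError("rail may only be set when status is BLOCKED")
```

The second option from the comment, populating `rail` for `MODIFIED` results as well, would drop this guard and keep the original description.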