feat(llmrails): add check_async method for input/output rails validation#1604

Closed
Pouyanpi wants to merge 2 commits into develop from feat/check-method-draft

Conversation

@Pouyanpi
Collaborator

Description

Add a new check_async method to LLMRails that allows standalone validation of messages against input/output rails without requiring a full conversation flow.

Key features:

  • Automatically determines which rails to run based on message roles:
    • User messages only → input rails
    • Assistant messages only → output rails
    • Both user and assistant → input and output rails
  • Returns a simple RailsResult with status (PASSED/MODIFIED/BLOCKED), content, and blocking rail name
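
The role-based selection described above could be sketched roughly as follows. This is an illustration only, not the PR's code: `determine_rails_sketch` is a hypothetical name, and returning `None` when no user or assistant message is present is an assumption made for the sketch.

```python
from typing import Dict, List, Optional


def determine_rails_sketch(messages: List[dict]) -> Optional[Dict[str, List[str]]]:
    """Pick which rails to run from the roles present in `messages`.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    roles = {m.get("role") for m in messages}
    rails = []
    if "user" in roles:
        rails.append("input")   # user messages are checked by input rails
    if "assistant" in roles:
        rails.append("output")  # assistant messages are checked by output rails
    # Assumption: with neither role present there is nothing to check.
    return {"rails": rails} if rails else None
```

Keeping the selection in a small pure function like this makes the three cases easy to unit test without a configured rails instance.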

@greptile-apps
Contributor

greptile-apps Bot commented Jan 27, 2026

Greptile Overview

Greptile Summary

This PR adds a new check_async method to LLMRails that enables standalone validation of messages against input/output rails without requiring a full conversation flow. The method intelligently determines which rails to run based on message roles and returns a structured RailsResult with validation status.

Key additions:

  • New check_async method that auto-detects whether to run input, output, or both rails based on message roles
  • RailsResult model with PASSED/MODIFIED/BLOCKED status tracking
  • Helper functions for message processing and rail determination
  • Comprehensive test coverage (392 lines)

Issues found:

  • Critical logic bug: When both input and output rails run, only output modifications are tracked. If input rails modify the user message but output rails don't change the assistant message, the method incorrectly returns PASSED instead of MODIFIED (lines 1485-1500)
  • Minor documentation inconsistency in RailsResult.rail field description
  • Missing test case for the input modification scenario

Confidence Score: 2/5

  • This PR has a critical logic bug that causes incorrect modification detection when both input and output rails run together
  • The modification detection logic at lines 1485-1500 only compares the assistant message content, missing modifications made by input rails to user messages. This breaks the core functionality when both rails are active. While the code is well-tested, the tests don't cover this critical edge case.
  • Pay close attention to nemoguardrails/rails/llm/llmrails.py lines 1485-1500 where the modification detection logic needs fixing

Important Files Changed

| Filename | Overview |
|---|---|
| nemoguardrails/rails/llm/llmrails.py | Added check_async method and helper functions for standalone rails validation. Critical logic bug in modification detection when both input/output rails run - only tracks output changes, missing input modifications. |
| nemoguardrails/rails/llm/options.py | Added RailStatus enum and RailsResult model for rails validation results. Minor documentation inconsistency about rail field usage. |
| tests/test_llmrails_check_async.py | Comprehensive test coverage for check_async method and helper functions. Missing test case for detecting input modifications when both rails run with unchanged output. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant LLMRails
    participant DetermineRails as _determine_rails_from_messages
    participant NormalizeMsg as _normalize_messages_for_rails
    participant GenerateAsync as generate_async
    participant InputRails as Input Rails Flow
    participant OutputRails as Output Rails Flow
    participant GetBlocking as _get_blocking_rail
    participant GetContent as _get_response_content

    Client->>LLMRails: check_async(messages)
    
    LLMRails->>DetermineRails: Analyze message roles
    alt No user/assistant messages
        DetermineRails-->>LLMRails: None
        LLMRails-->>Client: RailsResult(PASSED)
    else Has user and assistant
        DetermineRails-->>LLMRails: {"rails": ["input", "output"]}
    else Has only user
        DetermineRails-->>LLMRails: {"rails": ["input"]}
    else Has only assistant
        DetermineRails-->>LLMRails: {"rails": ["output"]}
    end
    
    Note over LLMRails: target_role = "assistant" if output in rails<br/>original_content = last message with target_role
    
    LLMRails->>NormalizeMsg: Normalize messages for rails
    alt Output rails only without user message
        NormalizeMsg-->>LLMRails: Add empty user message
    else All other cases
        NormalizeMsg-->>LLMRails: Return unchanged
    end
    
    LLMRails->>GenerateAsync: Process with rails enabled
    
    alt Input rails enabled
        GenerateAsync->>InputRails: Validate/modify user message
        InputRails-->>GenerateAsync: Modified user message or block
    end
    
    alt Output rails enabled
        GenerateAsync->>OutputRails: Validate/modify bot message
        OutputRails-->>GenerateAsync: Modified bot message or block
    end
    
    GenerateAsync-->>LLMRails: GenerationResponse
    
    LLMRails->>GetBlocking: Check for blocking rails
    GetBlocking-->>LLMRails: blocking_rail name or None
    
    LLMRails->>GetContent: Extract final content
    GetContent-->>LLMRails: result_content
    
    alt Blocking rail exists
        LLMRails-->>Client: RailsResult(BLOCKED, rail=name)
    else result_content != original_content
        LLMRails-->>Client: RailsResult(MODIFIED)
    else No changes
        LLMRails-->>Client: RailsResult(PASSED)
    end
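
The normalization step in the diagram (adding an empty user message when only output rails run and no user message is present) might look roughly like the sketch below. `normalize_messages_sketch` is a hypothetical name, and placing the empty user message at the front of the list is an assumption; the diagram does not say where it goes.

```python
from typing import List


def normalize_messages_sketch(messages: List[dict], rails: List[str]) -> List[dict]:
    """Return a message list that an output-only check can run against.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    has_user = any(m.get("role") == "user" for m in messages)
    if rails == ["output"] and not has_user:
        # Assumption: the placeholder user turn is prepended.
        return [{"role": "user", "content": ""}] + list(messages)
    # All other cases: contents unchanged (returned as a shallow copy).
    return list(messages)
```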

Contributor

@greptile-apps greptile-apps Bot left a comment


1 file reviewed, 1 comment


Comment thread nemoguardrails/rails/llm/llmrails.py Outdated
@codecov

codecov Bot commented Jan 27, 2026

Codecov Report

❌ Patch coverage is 97.05882% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| nemoguardrails/rails/llm/llmrails.py | 96.61% | 2 Missing ⚠️ |


@Pouyanpi
Collaborator Author

@greptileai re-review

Contributor

@greptile-apps greptile-apps Bot left a comment


3 files reviewed, 3 comments


Comment on lines +1485 to +1486
target_role = "assistant" if "output" in options["rails"] else "user"
original_content = _get_content_by_role(messages, target_role)

When both rails run (`["input", "output"]`), `target_role` is always `"assistant"`, so only output modifications are tracked. If input rails modify the user message but output rails don't change the assistant message, the comparison at line 1500 will miss the input modification and incorrectly return `PASSED` instead of `MODIFIED`.

Example:
```
messages = [
    {"role": "user", "content": "modify"},      # input rail changes to "modified input"
    {"role": "assistant", "content": "hello"}   # output rail doesn't change
]
```
- `original_content` = "hello" (assistant)
- `result_content` = "hello" (unchanged assistant)
- Result: `PASSED` ❌ (should be `MODIFIED`)
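
One possible direction for a fix is to compare content for every role whose rails actually ran, instead of a single `target_role`. The sketch below is a minimal illustration under stated assumptions, not the PR's code: `classify_result` and `last_content` are hypothetical names, `RailStatus` and `RailsResult` are standard-library stand-ins for the models in `options.py` (the lowercase enum values are assumed), and it assumes the post-rails user and assistant contents are available as a message list, which the real `GenerationResponse` may not expose directly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class RailStatus(str, Enum):  # stand-in; string values are an assumption
    PASSED = "passed"
    MODIFIED = "modified"
    BLOCKED = "blocked"


@dataclass
class RailsResult:  # stand-in for the pydantic model in options.py
    status: RailStatus
    content: str
    rail: Optional[str] = None  # only populated when a rail blocked


def last_content(messages: List[dict], role: str) -> Optional[str]:
    """Content of the last message with the given role, or None."""
    for message in reversed(messages):
        if message.get("role") == role:
            return message.get("content", "")
    return None


def classify_result(
    original: List[dict],
    processed: List[dict],  # assumption: messages as they look after the rails ran
    rails: List[str],
    blocking_rail: Optional[str] = None,
) -> RailsResult:
    target_role = "assistant" if "output" in rails else "user"
    content = last_content(processed, target_role) or ""
    if blocking_rail is not None:
        return RailsResult(RailStatus.BLOCKED, content, blocking_rail)
    # Compare every role whose rails ran, not only the target role.
    checked_roles = []
    if "input" in rails:
        checked_roles.append("user")
    if "output" in rails:
        checked_roles.append("assistant")
    modified = any(
        last_content(original, role) != last_content(processed, role)
        for role in checked_roles
    )
    status = RailStatus.MODIFIED if modified else RailStatus.PASSED
    return RailsResult(status, content)
```

With the example from this comment (user content changed by an input rail, assistant content untouched), this sketch reports `MODIFIED` while still returning the assistant content as `content`.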

Comment on lines +324 to +348
async def test_input_output_both_passed(self, mock_rails):
    messages = [
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "hi there"},
    ]
    result = await mock_rails.check_async(messages)
    assert result.status == RailStatus.PASSED

@pytest.mark.asyncio
async def test_input_output_input_blocked(self, mock_rails):
    messages = [
        {"role": "user", "content": "block"},
        {"role": "assistant", "content": "hi there"},
    ]
    result = await mock_rails.check_async(messages)
    assert result.status == RailStatus.BLOCKED

@pytest.mark.asyncio
async def test_input_output_output_blocked(self, mock_rails):
    messages = [
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "block output"},
    ]
    result = await mock_rails.check_async(messages)
    assert result.status == RailStatus.BLOCKED

Missing test case for input modification with unchanged output. Add:

```python
@pytest.mark.asyncio
async def test_input_output_input_modified_output_passed(self, mock_rails):
    messages = [
        {"role": "user", "content": "modify"},
        {"role": "assistant", "content": "hello"}
    ]
    result = await mock_rails.check_async(messages)
    assert result.status == RailStatus.MODIFIED  # Currently fails - returns PASSED
    assert result.content == "hello"
```

This would catch the modification detection bug in `llmrails.py:1485-1500`.


class RailsResult(BaseModel):
    status: RailStatus = Field(description="Status of the rails check: passed, modified, or blocked.")
    content: str = Field(description="The content after rails processing.")
    rail: Optional[str] = Field(default=None, description="Name of the rail that blocked or modified the content.")

The `rail` field description says "Name of the rail that blocked or modified", but in `llmrails.py:1500-1501`, the `rail` field is only set for `BLOCKED` status, not for `MODIFIED` status. Either update the description to "Name of the rail that blocked" or populate this field when status is `MODIFIED`.

@Pouyanpi Pouyanpi closed this Jan 28, 2026
@Pouyanpi Pouyanpi deleted the feat/check-method-draft branch January 28, 2026 10:38