Closed
Labels
area: review-pipeline (Review pipeline, context, prompts), enhancement (New feature or request), priority: medium (Medium priority)
Description
Problem
LLMs frequently produce malformed structured output. Qodo Merge has 9 sequential fallback strategies for parsing YAML output from LLMs — this is a hard-won production lesson. DiffScope parses LLM output via parsing/ modules but may not be as resilient to malformed responses.
How Qodo Does It
From pr_agent/algo/utils.py, function load_yaml() with try_fix_yaml():
- Direct yaml.safe_load(): the happy path
- YAML literal block (|-) conversion
- Pipe character replacement (| → |2)
- Root-level indentation fixes
- Snippet extraction between backticks (```yaml ... ```)
- Curly bracket removal (JSON-like output from LLM)
- Key-range extraction (grab just the relevant YAML section)
- Leading + character removal (diff artifacts bleeding into output)
- Tab-to-space conversion and encoding fallbacks (latin-1, utf-16)
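Three of the simpler repairs in the list above (fenced-snippet extraction, leading + removal, tab-to-space conversion) can be sketched in Rust using only the standard library. This is an illustration of the idea, not a port of Qodo's Python code: the function names are hypothetical, and the four-space tab width is an assumption.

```rust
/// Return the body of the first fenced block (for example ```yaml ... ```), if any.
/// Hypothetical helper; the language tag on the opening fence is ignored.
fn extract_fenced_block(raw: &str) -> Option<String> {
    let start = raw.find("```")?;
    // Skip the remainder of the opening fence line (the optional language tag).
    let after_fence = &raw[start + 3..];
    let body_start = after_fence.find('\n')? + 1;
    let body = &after_fence[body_start..];
    // The body ends at the next fence.
    let end = body.find("```")?;
    Some(body[..end].trim_end().to_string())
}

/// Drop a single leading '+' from each line (diff markers leaking into the output).
fn strip_leading_plus(raw: &str) -> String {
    raw.lines()
        .map(|line| line.strip_prefix('+').unwrap_or(line))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Replace each tab with four spaces (assumed width); YAML does not allow
/// tabs for indentation, so this can rescue otherwise valid output.
fn tabs_to_spaces(raw: &str) -> String {
    raw.replace('\t', "    ")
}
```

Each helper returns a new candidate string, so a caller can attempt a normal parse on the candidate and move on to the next repair only if that parse fails.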
Proposed Solution
Audit and harden DiffScope's LLM output parsing:
1. Audit Current Parsing
- Review parsing/llm_response.rs and parsing/smart_review_response.rs
- Identify failure modes from production use
- Add error tracking to measure parse failure rates
2. Implement Fallback Chain
```rust
use anyhow::{anyhow, Result};

// Proposed shape only: `Comment` and the helper functions called below
// are placeholders that still need to be implemented.
fn parse_llm_response(raw: &str) -> Result<Vec<Comment>> {
    // Strategy 1: Direct JSON parse
    if let Ok(comments) = serde_json::from_str(raw) { return Ok(comments); }

    // Strategy 2: Extract JSON from markdown code blocks
    if let Some(json_block) = extract_code_block(raw, "json") {
        if let Ok(comments) = serde_json::from_str(&json_block) { return Ok(comments); }
    }

    // Strategy 3: Fix common JSON issues (trailing commas, single quotes)
    let fixed = fix_common_json_issues(raw);
    if let Ok(comments) = serde_json::from_str(&fixed) { return Ok(comments); }

    // Strategy 4: Extract individual comment objects with regex
    if let Ok(comments) = extract_comments_regex(raw) { return Ok(comments); }

    // Strategy 5: Line-by-line structured extraction
    if let Ok(comments) = parse_structured_text(raw) { return Ok(comments); }

    // Strategy 6 (last resort, not shown): ask the LLM to reformat its output.
    // Until that exists, report the failure to the caller.
    Err(anyhow!("Failed to parse LLM output after all strategies"))
}
```
3. Track Parse Failures
- Log parse failure rate per strategy
- Surface in metrics/analytics
- Feed back into prompt engineering (if certain models consistently produce bad output)
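One way to get per-strategy numbers is to run the fallback chain through a small wrapper that records which strategy produced the result. The sketch below uses only the standard library. `ParseStats`, `Strategy`, and `parse_with_stats` are hypothetical names, and the two integer-parsing functions are stand-ins for real strategies such as the direct JSON parse.

```rust
use std::collections::BTreeMap;

/// A named parsing strategy that returns Some(parsed) on success.
type Strategy<T> = (&'static str, fn(&str) -> Option<T>);

/// Counts which strategy produced each successful parse, plus total failures.
#[derive(Default, Debug)]
struct ParseStats {
    successes: BTreeMap<&'static str, u64>,
    failures: u64,
}

impl ParseStats {
    /// Fraction of inputs that no strategy could parse (0.0 when nothing was seen).
    fn failure_rate(&self) -> f64 {
        let ok: u64 = self.successes.values().sum();
        let total = ok + self.failures;
        if total == 0 { 0.0 } else { self.failures as f64 / total as f64 }
    }
}

/// Try each strategy in order and record which one (if any) succeeded.
fn parse_with_stats<T>(
    raw: &str,
    strategies: &[Strategy<T>],
    stats: &mut ParseStats,
) -> Option<T> {
    for &(name, strategy) in strategies {
        if let Some(parsed) = strategy(raw) {
            *stats.successes.entry(name).or_insert(0) += 1;
            return Some(parsed);
        }
    }
    stats.failures += 1;
    None
}

// Stand-in strategies for illustration: parse an integer directly,
// then retry after trimming surrounding whitespace.
fn parse_direct(raw: &str) -> Option<i64> {
    raw.parse().ok()
}

fn parse_trimmed(raw: &str) -> Option<i64> {
    raw.trim().parse().ok()
}
```

A caller would build a slice such as `[("direct", parse_direct), ("trimmed", parse_trimmed)]`, pass it to `parse_with_stats` for every response, and periodically export `ParseStats` to whatever metrics backend is in use. A high count for a late strategy on one model is the signal to revisit that model's prompt.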
Priority
Medium — production reliability. Low effort, high resilience. Every tool in the space has learned this lesson the hard way.