fix(llm+swe): add tool_calls support to LiteLlmClient and fix pipeline bugs #17

Merged
echobt merged 8 commits into main from fix/litellm-tool-calls-and-pipeline-bugs on Feb 18, 2026

echobt commented Feb 18, 2026

Summary

Adds full function-calling (tool_calls) support to LiteLlmClient and fixes several bugs across the SWE mining pipeline: patch extraction, workspace validation, and test generation.

Changes

LiteLlmClient (src/llm/litellm.rs)

  • Add tools and tool_choice fields to internal ApiRequest struct (with skip_serializing_if)
  • Add ApiToolCall and ApiToolCallFunction deserialization structs
  • Add reasoning, reasoning_content, and tool_calls fields to ApiMessage
  • Make finish_reason optional in ApiChoice (some providers omit it)
  • Parse and return tool_calls in response conversion, matching OpenRouterProvider behavior
  • Handle content extraction priority: content > reasoning_content > reasoning
  • Update serialization tests to cover new optional fields
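
The content extraction priority above can be sketched with the standard library alone. This is a minimal illustration, not the code in src/llm/litellm.rs: the `ApiMessage` here is a reduced, hypothetical stand-in that only carries the three text fields, and treating an empty or whitespace-only string as missing is an assumption.

```rust
/// Hypothetical, reduced stand-in for the ApiMessage fields named above.
#[derive(Default)]
struct ApiMessage {
    content: Option<String>,
    reasoning_content: Option<String>,
    reasoning: Option<String>,
}

/// Returns the field's text unless it is absent or blank.
/// Treating blank strings as missing is an assumption of this sketch.
fn non_empty(field: &Option<String>) -> Option<&str> {
    field.as_deref().filter(|text| !text.trim().is_empty())
}

/// Picks the first usable field in the order
/// content > reasoning_content > reasoning.
fn extract_content(msg: &ApiMessage) -> Option<&str> {
    non_empty(&msg.content)
        .or_else(|| non_empty(&msg.reasoning_content))
        .or_else(|| non_empty(&msg.reasoning))
}
```

With this shape, a message whose `content` is empty but whose `reasoning_content` is set yields the reasoning text, and a message with none of the three fields yields `None`, leaving the caller free to fall back to something else.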

Patch Extraction (src/swe/extractor.rs)

  • Fix diff extraction to use git diff instead of git show for proper base-to-merge diffs
  • Handle all three cases: base+merge, merge-only, and fallback (HEAD~1..HEAD)
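
As a rough sketch of the three cases, the range passed to git diff could be chosen as below. The function name `diff_args` and its signature are hypothetical; the ranges follow the commit message in this PR (base..merge, HEAD..merge, HEAD~1..HEAD). Treating empty strings as missing refs, and sending a base without a merge commit to the fallback, are assumptions.

```rust
/// Builds the argument list for `git diff` from optional base and merge commits.
/// Hypothetical helper; the real extractor may structure this differently.
fn diff_args(base: Option<&str>, merge: Option<&str>) -> Vec<String> {
    // Assumption: an empty string means "no ref was provided".
    let base = base.filter(|s| !s.is_empty());
    let merge = merge.filter(|s| !s.is_empty());

    let range = match (base, merge) {
        // Both commits known: diff from the PR base to the merge commit.
        (Some(b), Some(m)) => format!("{b}..{m}"),
        // Only the merge commit known: diff from the current checkout.
        (None, Some(m)) => format!("HEAD..{m}"),
        // Nothing usable: fall back to the most recent commit.
        _ => "HEAD~1..HEAD".to_string(),
    };
    vec!["diff".to_string(), range]
}
```

Returning an argument vector rather than a single shell string means the range can be handed to `std::process::Command::args` without passing through a shell.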

Workspace Validator (src/swe/workspace_validator.rs)

  • Install language runtimes (Go, Node.js, Rust, Java) in validation containers before running install/test commands

Docker Sandbox & Test Generator

  • Formatting cleanup for readability (rustfmt-consistent style)
  • Fix serde_json::from_str call to avoid borrowing a temporary reference

Miscellaneous

  • Add mine_test.log and test-easy-output/ to .gitignore

…e bugs

LiteLlmClient tool_calls support (src/llm/litellm.rs):
- Add tools and tool_choice fields (as serde_json::Value) to ApiRequest with
  skip_serializing_if to avoid sending null to providers
- Add ApiToolCall and ApiToolCallFunction deserialization structs for parsing
  function call responses from the API
- Add reasoning and reasoning_content fields to ApiMessage for reasoning models
- Make finish_reason optional in ApiChoice (some providers omit it)
- Implement full tool_calls parsing in generate() response conversion, mapping
  API tool calls to ToolCallInfo/ToolCallFunction types matching OpenRouterProvider
- Add content extraction priority: content > reasoning_content > reasoning,
  with fallback to tool call arguments when content is empty
- Update test to verify tools/tool_choice are properly excluded when None

Patch extractor fix (src/swe/extractor.rs):
- Replace git show with git diff for extracting PR patches, which produces
  cleaner unified diffs without commit metadata that confused downstream parsing
- Handle all three cases: base..merge, HEAD..merge, and HEAD~1..HEAD

Test generator clippy fix (src/swe/test_generator.rs):
- Remove unnecessary reference on result.stdout.trim() call

Workspace validator runtime install (src/swe/workspace_validator.rs):
- Add language runtime installation step before running validation tests
  in Docker containers (Go, Node.js, Rust, Java) since the base image
  may not include them, causing false setup_error failures

Formatting (docker_sandbox.rs, harness.rs, test_generator.rs, extractor.rs):
- Apply cargo fmt to fix formatting across all modified SWE pipeline files

Gitignore (.gitignore):
- Add mine_test.log and test-easy-output/ to prevent committing test artifacts
… library code

- Add validate_git_ref() and validate_repo_name() to prevent command
  injection via shell metacharacters in user-controlled inputs
- Validate commit hashes in extractor.rs before interpolating into
  git fetch/diff commands
- Validate repo name and base_commit in docker_sandbox.rs before
  interpolating into git clone/checkout commands
- Validate repo name and base_commit in harness.rs before
  interpolating into git clone/checkout commands
- Replace expect() with Result return types in LiteLlmClient::new()
  and LiteLlmClient::new_with_defaults() per AGENTS.md rules
- Add comprehensive unit tests for validation functions
…chBlock.lines field

- Remove #[allow(dead_code)] from ApiToolCall: all fields are either
  accessed in generate() or suppressed by underscore prefix (_tool_type)
- Remove #[allow(dead_code)] from ApiToolCallFunction: both fields
  (name, arguments) are accessed in generate()
- Remove unused PatchBlock.lines field in extractor.rs: accumulated
  but never read (split_solution_and_tests only returns .patch)
…results

Replace 'let _ = sandbox.write_file(...)' with proper error logging
using tracing::warn in test_generator.rs and workspace_validator.rs.
These were silently swallowing errors during test file restoration
in validation cleanup paths.
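
A minimal sketch of the pattern, assuming a best-effort restore loop. The `write` and `warn` closures are hypothetical stand-ins for `sandbox.write_file(...)` and a `tracing::warn!` call, so that the example compiles with the standard library only; the real signatures in test_generator.rs and workspace_validator.rs are not shown in this PR text.

```rust
use std::io;

/// Restores files on a best-effort basis and reports each failure
/// instead of discarding it with `let _ = ...`.
/// Returns the number of files that could not be restored.
fn restore_files<W, L>(files: &[(&str, &str)], mut write: W, mut warn: L) -> usize
where
    W: FnMut(&str, &str) -> io::Result<()>, // stand-in for sandbox.write_file
    L: FnMut(&str, &io::Error),             // stand-in for tracing::warn!
{
    let mut failures = 0;
    for &(path, content) in files {
        if let Err(e) = write(path, content) {
            // Cleanup continues, but the error is now visible.
            warn(path, &e);
            failures += 1;
        }
    }
    failures
}
```

The loop still visits every file, so cleanup semantics are unchanged; the only difference from `let _ =` is that a failed write leaves a trace.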
…th traversal

- validate_git_ref: reject empty refs, '..' sequences, and leading '-' (flag injection)
- validate_repo_name: reject parts starting with '.' or '-' (path traversal, flag injection)
- Add validate_file_path: reject shell metacharacters, null bytes, '..', absolute paths
- docker_sandbox: validate file paths in write_file, write_file_abs, read_file
- docker_sandbox: restrict write_file_abs to /tools/ prefix, validate tool_name
- harness: validate file paths in docker_write_file
- litellm: replace unwrap_or(Null) with proper error propagation for tools serialization
- extractor: guard validate_git_ref calls with empty checks for optional refs
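
The rules listed above might look roughly like the following. The function names come from this PR, but the bodies, the `Result<(), String>` return type, the character allow-lists, and the owner/repo shape are assumptions made for illustration.

```rust
/// Rejects empty refs, leading '-', ".." sequences, and any character
/// outside a conservative allow-list (the allow-list is an assumption).
fn validate_git_ref(git_ref: &str) -> Result<(), String> {
    if git_ref.is_empty() {
        return Err("git ref is empty".into());
    }
    if git_ref.starts_with('-') {
        return Err("git ref starts with '-' (flag injection)".into());
    }
    if git_ref.contains("..") {
        return Err("git ref contains '..'".into());
    }
    let allowed = |c: char| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-' | '/');
    if !git_ref.chars().all(allowed) {
        return Err("git ref contains disallowed characters".into());
    }
    Ok(())
}

/// Expects "owner/repo" (assumed shape) and rejects parts starting with '.' or '-'.
fn validate_repo_name(repo: &str) -> Result<(), String> {
    let parts: Vec<&str> = repo.split('/').collect();
    if parts.len() != 2 {
        return Err("repo name must look like owner/repo".into());
    }
    for part in parts {
        if part.is_empty() || part.starts_with('.') || part.starts_with('-') {
            return Err("repo name part is empty or starts with '.' or '-'".into());
        }
        if !part.chars().all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-')) {
            return Err("repo name contains disallowed characters".into());
        }
    }
    Ok(())
}

/// Rejects absolute paths, null bytes, "..", and common shell metacharacters.
fn validate_file_path(path: &str) -> Result<(), String> {
    // Illustrative list; the real set of rejected characters may differ.
    const SHELL_META: &[char] = &[
        ';', '|', '&', '$', '`', '\'', '"', '<', '>', '(', ')', '\\', '\n', '*', '?',
    ];
    if path.is_empty() || path.starts_with('/') {
        return Err("path is empty or absolute".into());
    }
    if path.contains('\0') || path.contains("..") {
        return Err("path contains a null byte or '..'".into());
    }
    if path.contains(SHELL_META) {
        return Err("path contains shell metacharacters".into());
    }
    Ok(())
}
```

Calling the matching validator immediately before each value is interpolated into a shell command keeps the check next to the point of use, which is the ordering the mkdir fix in this PR is concerned with.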

Add validate_file_path() check before interpolating tf.path into the
mkdir shell command in evaluate_task(). Previously, the mkdir command
at line 340 used the unvalidated path from deserialized test files,
while docker_write_file() on the next line did validate. This created
a window where a malicious path could inject shell commands via the
mkdir step before validation occurred.

Replace 3 instances of 'let _ =' that silently discard Docker command
errors with 'if let Err(e)' + tracing::debug! for proper error
observability while maintaining best-effort cleanup semantics.

Files changed:
- src/swe/docker_sandbox.rs: start() stale container removal, destroy()
- src/swe/harness.rs: docker_rm() helper
echobt merged commit 0c643e3 into main Feb 18, 2026
9 checks passed
echobt deleted the fix/litellm-tool-calls-and-pipeline-bugs branch February 18, 2026 14:53