Merged
92 changes: 92 additions & 0 deletions .claude/skills/fix-integration-bug/SKILL.md
@@ -0,0 +1,92 @@
---
name: fix-integration-bug
description: Workflow for fixing bugs in Ruby SDK integrations. Covers reproducing the bug, using appraisals, adding test cases, and TDD-based fixes.
---

# Fixing Integration Bugs

**This skill is for fixing bugs in existing integrations.** Follow this workflow to reproduce, test, and fix integration issues.

## 1. Reproduce the Bug

First, understand and reproduce the issue:

```bash
# Run with console logging to see actual trace output
BRAINTRUST_ENABLE_TRACE_CONSOLE_LOG=true bundle exec appraisal provider ruby examples/provider.rb

# Or create a minimal reproduction script
BRAINTRUST_ENABLE_TRACE_CONSOLE_LOG=true bundle exec appraisal provider ruby -e '
require "braintrust"
# minimal reproduction code
'
```

## 2. Appraisal Commands

Test against specific gem versions:

```bash
# Install dependencies for all appraisals
bundle exec appraisal install

# Run tests for specific appraisal
bundle exec appraisal provider rake test

# Run single test file
bundle exec appraisal provider rake test TEST=test/braintrust/trace/provider_test.rb

# Run with specific seed (useful for reproducing flaky test failures from CI)
bundle exec appraisal provider rake test[12345]

# Run all appraisals
bundle exec appraisal rake test

# Re-record VCR cassettes
VCR_MODE=all bundle exec appraisal provider rake test
```

## 3. Add Failing Test Case

Write a test that reproduces the bug:

```ruby
def test_bug_description
  # Arrange: Set up the scenario that triggers the bug

  # Act: Call the method

  # Assert: Verify expected behavior (this should FAIL initially)
end
```
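
As a concrete illustration of the template above, a regression test for a content-handling bug might look like the following sketch. `FakeContent` and `extract_text` are hypothetical stand-ins defined inline so the snippet runs on its own; they are not part of the SDK or of RubyLLM, and a real test would call the integration code instead.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for a provider content object (not the real RubyLLM class).
FakeContent = Struct.new(:text)

# Hypothetical helper under test: returns plain text for both strings and
# content-like objects that expose #text.
def extract_text(content)
  content.respond_to?(:text) ? content.text : content
end

class ContentExtractionTest < Minitest::Test
  def test_content_object_is_reduced_to_its_text
    # Arrange: a content object rather than a bare string
    content = FakeContent.new("What color is this image?")

    # Act: run the code path that used to mishandle the object
    result = extract_text(content)

    # Assert: before a fix, the object itself would come back and this would fail
    assert_equal "What color is this image?", result
  end

  def test_plain_string_is_passed_through
    assert_equal "hello", extract_text("hello")
  end
end
```

Keeping a second test for the already-working input (the plain string) helps confirm the fix does not regress the common case.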

## 4. Add Example Case (if applicable)

Add a case to the internal example that exercises the buggy code path:

- **Location**: `examples/internal/provider.rb`
- **Purpose**: Demonstrates the fix works end-to-end
- Follow existing example patterns (nest under root span, print output)

## 5. TDD Fix Cycle

1. Run failing test: `bundle exec appraisal provider rake test`
2. Implement minimal fix in `lib/braintrust/trace/contrib/provider.rb`
3. Run tests again (should pass)
4. Lint: `bundle exec rake lint:fix`
5. Run all appraisals: `bundle exec appraisal rake test`

## 6. Verify with MCP

Query traces to confirm the fix:

```ruby
mcp__braintrust__btql_query(query: "SELECT input, output, metrics FROM project_logs LIMIT 5")
```

## Reference Files

- Integrations: `lib/braintrust/trace/contrib/{openai,anthropic,ruby_llm}.rb`
- Tests: `test/braintrust/trace/{openai,anthropic,ruby_llm}_test.rb`
- Examples: `examples/internal/{openai,anthropic,ruby_llm}.rb`
- VCR cassettes: `test/fixtures/vcr_cassettes/provider/`
51 changes: 51 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,51 @@
# Braintrust Ruby SDK

Ruby SDK for Braintrust observability (tracing, evals, logging).

## Code Structure

- `lib/braintrust/` - Main SDK code
  - `trace.rb`, `trace/` - Tracing/spans for LLM calls
  - `eval.rb`, `eval/` - Evaluation framework
  - `api.rb`, `api/` - API client
  - `state.rb` - Global state management
- `test/` - Tests mirror lib/ structure
- `examples/` - Usage examples

## Commands

```bash
rake # Run lint + all appraisal tests (CI)
rake test # Run tests
rake lint # StandardRB linter
rake lint:fix # Auto-fix lint issues
rake -T # List all tasks
```

## TDD

Reproduce the issue in a failing test before fixing it.

## Testing

**Prefer real code over mocks.** Use VCR to record/replay HTTP interactions.

```bash
rake test # Run with VCR cassettes
VCR_MODE=all rake test # Re-record all cassettes
VCR_MODE=new_episodes rake test # Record new, keep existing
```
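
For orientation, the sketch below shows one way a test helper could translate `VCR_MODE` into a VCR record mode. The helper name `vcr_record_mode` and the `:once` default are assumptions made for illustration; the repository's actual test helper may be wired differently. The four symbols listed are VCR's standard record modes.

```ruby
# VCR's standard record modes.
VALID_VCR_MODES = %i[once new_episodes none all].freeze

# Hypothetical helper: read VCR_MODE from the environment and validate it.
# The :once default is an assumption, not taken from this repository.
def vcr_record_mode(env = ENV)
  mode = env.fetch("VCR_MODE", "once").to_sym
  raise ArgumentError, "unknown VCR_MODE: #{mode}" unless VALID_VCR_MODES.include?(mode)
  mode
end

puts vcr_record_mode({})                               # once
puts vcr_record_mode({"VCR_MODE" => "all"})            # all
puts vcr_record_mode({"VCR_MODE" => "new_episodes"})   # new_episodes
```

The returned symbol could then be passed as the `record:` option when configuring cassettes.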

### Appraisals (Optional Dependencies)

The SDK integrates with optional gems (openai, anthropic, ruby_llm). Tests run against multiple versions:

```bash
bundle exec appraisal list # Show scenarios
bundle exec appraisal openai rake test # Test with openai gem
bundle exec appraisal openai-uninstalled rake test # Test without the openai gem
```

## Linting

Uses StandardRB. Run `rake lint:fix` before committing.
47 changes: 47 additions & 0 deletions examples/internal/ruby_llm.rb
@@ -236,6 +236,53 @@ def execute(operation:, a:, b:)
     puts "✓ Gracefully caught error: #{e.class.name}"
     puts " Message: #{e.message}"
   end
+
+  # Feature 8: Image Attachments (Issue #71 fix)
+  # This demonstrates proper handling of RubyLLM Content objects with attachments
+  puts "\n" + "=" * 80
+  puts "Feature 8: Image Attachments"
+  puts "=" * 80
+
+  tracer.in_span("feature_image_attachments") do
+    require "tempfile"
+
+    # Create a minimal valid PNG image (10x10 red square)
+    png_data = [
+      0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00, 0x00, 0x0d,
+      0x49, 0x48, 0x44, 0x52, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00, 0x0a,
+      0x08, 0x02, 0x00, 0x00, 0x00, 0x02, 0x50, 0x58, 0xea, 0x00, 0x00, 0x00,
+      0x12, 0x49, 0x44, 0x41, 0x54, 0x78, 0xda, 0x63, 0xf8, 0xcf, 0xc0, 0x80,
+      0x07, 0x31, 0x8c, 0x4a, 0x63, 0x43, 0x00, 0xb7, 0xca, 0x63, 0x9d, 0xd6,
+      0xd5, 0xef, 0x74, 0x00, 0x00, 0x00, 0x00, 0x49, 0x45, 0x4e, 0x44, 0xae,
+      0x42, 0x60, 0x82
+    ].pack("C*")
+
+    # Create a temp PNG file
+    tmpfile = Tempfile.new(["test_image", ".png"])
+    tmpfile.binmode
+    tmpfile.write(png_data)
+    tmpfile.close
+
+    begin
+      puts "\n[OpenAI - gpt-4o-mini with Image Attachment]"
+      chat_openai = RubyLLM.chat(model: "gpt-4o-mini")
+
+      # Use RubyLLM's Content class with attachment
+      # This triggers the Content object behavior (issue #71)
+      content = RubyLLM::Content.new("What color is this image? Reply in one word.")
+      content.add_attachment(tmpfile.path)
+
+      chat_openai.add_message(role: :user, content: content)
+      response = chat_openai.complete
+
+      puts "Q: What color is this image? (with PNG attachment)"
+      puts "A: #{response.content}"
+      puts "Tokens: #{response.to_h[:input_tokens]} in, #{response.to_h[:output_tokens]} out"
+      puts "Note: The trace includes the base64-encoded image attachment"
+    ensure
+      tmpfile.unlink
+    end
+  end
 end

 puts "\n" + "=" * 80
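
The inline PNG bytes in the example can be sanity-checked without any image library. The sketch below is not part of the PR; it copies only the first 26 bytes from the example and decodes the signature and the start of the IHDR chunk with `String#unpack`, which is enough to confirm the 10x10 dimensions stated in the comment.

```ruby
# First 26 bytes of the PNG from the example: the 8-byte signature followed by
# the IHDR chunk length, chunk type, width, height, bit depth and color type.
header = [
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00, 0x00, 0x0d,
  0x49, 0x48, 0x44, 0x52, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00, 0x0a,
  0x08, 0x02
].pack("C*")

# a8 = 8 raw bytes, N = 32-bit big-endian unsigned, a4 = 4 raw bytes, C = one byte
signature, ihdr_length, chunk_type, width, height, bit_depth, color_type =
  header.unpack("a8 N a4 N N C C")

puts signature == "\x89PNG\r\n\x1a\n".b   # true
puts "#{chunk_type} (#{ihdr_length} bytes)" # IHDR (13 bytes)
puts "#{width}x#{height}"                 # 10x10
puts "bit depth #{bit_depth}, color type #{color_type}" # bit depth 8, color type 2
```

Color type 2 with bit depth 8 is 8-bit truecolor RGB, which is consistent with a solid-color test image.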
97 changes: 84 additions & 13 deletions lib/braintrust/trace/contrib/github.com/crmne/ruby_llm.rb
@@ -4,6 +4,7 @@
 require "json"
 require_relative "../../../tokens"
 require_relative "../../../../logger"
+require_relative "../../../../internal/encoding"

 module Braintrust
   module Trace
@@ -422,18 +423,14 @@ def self.format_message_for_input(msg)

   # Handle content
   if msg.respond_to?(:content) && msg.content
-    # Convert Ruby hash notation to JSON string for tool results
-    content = msg.content
-    if msg.role.to_s == "tool" && content.is_a?(String) && content.start_with?("{:")
-      # Ruby hash string like "{:location=>...}" - try to parse and re-serialize as JSON
-      begin
-        # Simple conversion: replace Ruby hash syntax with JSON
-        content = content.gsub(/(?<=\{|, ):(\w+)=>/, '"\1":').gsub("=>", ":")
-      rescue
-        # Keep original if conversion fails
-      end
+    raw_content = msg.content
+
+    # Check if content is a Content object with attachments (issue #71)
+    formatted["content"] = if raw_content.respond_to?(:text) && raw_content.respond_to?(:attachments) && raw_content.attachments&.any?
+      format_multipart_content(raw_content)
+    else
+      format_simple_content(raw_content, msg.role.to_s)
     end
-    formatted["content"] = content
   end

   # Handle tool_calls for assistant messages
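
The new branch in this hunk decides between the multipart and simple paths by duck typing rather than by class. The sketch below isolates that check so its edge cases are easy to see. `FakeContent` and `multipart?` are hypothetical names introduced here for illustration; the PR inlines the condition instead of extracting a predicate.

```ruby
# Hypothetical stand-in for RubyLLM::Content: just the two readers the check uses.
FakeContent = Struct.new(:text, :attachments)

# Mirrors the condition in the hunk above: multipart only when the object
# exposes both readers AND has at least one attachment.
def multipart?(raw_content)
  raw_content.respond_to?(:text) &&
    raw_content.respond_to?(:attachments) &&
    !!raw_content.attachments&.any?
end

puts multipart?("plain string")                        # false (String has no #text)
puts multipart?(FakeContent.new("hi", nil))            # false (&. guards against nil)
puts multipart?(FakeContent.new("hi", []))             # false (no attachments)
puts multipart?(FakeContent.new("hi", ["/tmp/a.png"])) # true
```

Content objects without attachments therefore fall through to the simple path, where only their text is kept.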
Expand All @@ -450,6 +447,74 @@ def self.format_message_for_input(msg)
formatted
end

# Format multipart content with text and attachments
# @param content_obj [Object] Content object with text and attachments
# @return [Array<Hash>] array of content parts
def self.format_multipart_content(content_obj)
content_parts = []

# Add text part
content_parts << {"type" => "text", "text" => content_obj.text} if content_obj.text

# Add attachment parts (convert to Braintrust format)
content_obj.attachments.each do |attachment|
content_parts << format_attachment_for_input(attachment)
end

content_parts
end

+# Format simple text content
+# @param raw_content [Object] String or Content object with text
+# @param role [String] the message role
+# @return [String] formatted text content
+def self.format_simple_content(raw_content, role)
+  content = raw_content
+  content = content.text if content.respond_to?(:text)
+
+  # Convert Ruby hash string to JSON for tool results
+  if role == "tool" && content.is_a?(String) && content.start_with?("{:")
+    begin
+      content = content.gsub(/(?<=\{|, ):(\w+)=>/, '"\1":').gsub("=>", ":")
+    rescue
+      # Keep original if conversion fails
+    end
+  end
+
+  content
+end
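
To make the tool-result conversion concrete, the sketch below applies the same two `gsub` calls to a sample string. The wrapper name `ruby_hash_string_to_json` and the sample input are made up for illustration. Note that this is a textual rewrite, not a parser: values such as `nil` or symbols would still not be valid JSON, and on Ruby 3.4 and later `Hash#inspect` prints symbol keys as `{location: "Paris"}`, so the `start_with?("{:")` guard would not match such strings.

```ruby
require "json"

# Hypothetical wrapper around the two substitutions used in format_simple_content.
def ruby_hash_string_to_json(str)
  str
    .gsub(/(?<=\{|, ):(\w+)=>/, '"\1":') # :key=>  becomes  "key":
    .gsub("=>", ":")                     # any remaining hash rockets
end

converted = ruby_hash_string_to_json('{:location=>"Paris", :temp=>20}')
puts converted # {"location":"Paris", "temp":20}

# The result parses as JSON for simple string and number values.
parsed = JSON.parse(converted)
puts parsed["location"] # Paris
puts parsed["temp"]     # 20
```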

+# Format a RubyLLM attachment to OpenAI-compatible format
+# @param attachment [Object] the RubyLLM attachment
+# @return [Hash] OpenAI image_url format for consistency with other integrations
+def self.format_attachment_for_input(attachment)
+  # RubyLLM Attachment has: source (Pathname), filename, mime_type
+  if attachment.respond_to?(:source) && attachment.source
+    begin
+      data = File.binread(attachment.source.to_s)
+      encoded = Internal::Encoding::Base64.strict_encode64(data)
+      mime_type = attachment.respond_to?(:mime_type) ? attachment.mime_type : "application/octet-stream"
+
+      # Use OpenAI's image_url format for consistency
+      {
+        "type" => "image_url",
+        "image_url" => {
+          "url" => "data:#{mime_type};base64,#{encoded}"
+        }
+      }
+    rescue => e
+      Log.debug("Failed to read attachment file: #{e.message}")
+      # Return a placeholder if we can't read the file
+      {"type" => "text", "text" => "[attachment: #{attachment.respond_to?(:filename) ? attachment.filename : "unknown"}]"}
+    end
+  elsif attachment.respond_to?(:to_h)
+    # Try to use attachment's own serialization
+    attachment.to_h
+  else
+    {"type" => "text", "text" => "[attachment]"}
+  end
+end
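
The core of the attachment path is building a `data:` URL from the file's bytes. The following standalone sketch shows that step in isolation. `data_url_for` is a hypothetical helper, and `[data].pack("m0")` is used as a stand-in for the SDK's `Internal::Encoding::Base64.strict_encode64`, whose implementation is not shown in this diff; `pack("m0")` produces Base64 without line breaks.

```ruby
require "tempfile"

# Hypothetical helper: read a file as binary and wrap it in a data URL.
def data_url_for(path, mime_type)
  data = File.binread(path)
  encoded = [data].pack("m0") # Base64 with no newlines (stand-in for the SDK encoder)
  "data:#{mime_type};base64,#{encoded}"
end

# Write the first four bytes of a PNG signature to a temp file and encode it.
Tempfile.create(["sample", ".png"]) do |f|
  f.binmode
  f.write("\x89PNG".b)
  f.flush
  puts data_url_for(f.path, "image/png") # data:image/png;base64,iVBORw==
end
```

Reading with `File.binread` avoids any encoding or newline translation, so the encoded payload matches the file byte for byte.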

 # Capture streaming output and metrics
 # @param span [OpenTelemetry::Trace::Span] the span
 # @param aggregated_chunks [Array] the aggregated chunks
@@ -458,8 +523,11 @@ def self.capture_streaming_output(span, aggregated_chunks, result)
   return if aggregated_chunks.empty?

   # Aggregate content from chunks
+  # Extract text from Content objects if present (issue #71)
   aggregated_content = aggregated_chunks.map { |c|
-    c.respond_to?(:content) ? c.content : c.to_s
+    content = c.respond_to?(:content) ? c.content : c.to_s
+    content = content.text if content.respond_to?(:text)
+    content
   }.join

   output = [{
@@ -490,8 +558,11 @@ def self.capture_non_streaming_output(span, chat, response, messages_before_coun
   }

   # Add content if it's a simple text response
+  # Extract text from Content objects if present (issue #71)
   if response.respond_to?(:content) && response.content && !response.content.empty?
-    message["content"] = response.content
+    content = response.content
+    content = content.text if content.respond_to?(:text)
+    message["content"] = content
   end

   # Check if there are tool calls in the messages history