[FEATURE] Support for attached files in the paint call#547
[FEATURE] Support for attached files in the paint call#547olivierroy wants to merge 1 commit intocrmne:mainfrom
Conversation
|
Please, maintainers, take a look here, we need this! :) |
13a543c to
c00f242
Compare
There was a problem hiding this comment.
Pull request overview
This pull request implements support for file attachments in image generation via the RubyLLM.paint method, addressing issue #512. The feature enables users to provide reference images or files when generating or editing images, similar to how the ask method accepts attachments for chat. The implementation follows existing codebase patterns for attachment handling and adds comprehensive documentation.
Changes:
- Added
with:parameter toRubyLLM.paintAPI for passing file attachments (single file or array) - Implemented image editing support for OpenAI's gpt-image-* and dall-e-2 models
- Added attachment support for Gemini's gemini-flash-image models via the generateContent API
- Enhanced documentation with examples and model compatibility information
Reviewed changes
Copilot reviewed 7 out of 10 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| lib/ruby_llm/image.rb | Added with: parameter to the Image.paint method signature |
| lib/ruby_llm/provider.rb | Added attachment normalization and passed attachments to render_image_payload |
| lib/ruby_llm/providers/openai/images.rb | Implemented image editing with multipart file uploads and model validation |
| lib/ruby_llm/providers/gemini/images.rb | Added generateContent support with inline base64 image data for flash-image models |
| spec/ruby_llm/image_generation_spec.rb | Added integration tests for image editing with both OpenAI and Gemini models |
| spec/ruby_llm/providers/open_ai/images_spec.rb | Added unit tests for single and multiple attachment payload rendering |
| docs/_core_features/image-generation.md | Added comprehensive documentation on image editing, supported models, and usage examples |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| expect(image.model_id).to eq('gpt-image-1') | ||
|
|
||
| save_and_verify_image image | ||
| end |
There was a problem hiding this comment.
Missing test coverage for the error case where unsupported models (like dall-e-3 or imagen-*) are used with attachments. The validation in validate_edit_support! should be tested to ensure it raises the expected error when attempting to use attachments with unsupported models. Consider adding a test case like: expect { RubyLLM.paint('test', model: 'dall-e-3', with: 'file.png') }.to raise_error(RubyLLM::Error, /Image editing is only supported for/)
| end | |
| end | |
| it 'raises an error when trying to edit images with openai/dall-e-3' do | |
| attachment_path = File.expand_path('../fixtures/ruby.png', __dir__) | |
| expect do | |
| RubyLLM.paint( | |
| 'make the background blue', | |
| model: 'dall-e-3', | |
| with: attachment_path | |
| ) | |
| end.to raise_error(RubyLLM::Error, /Image editing is only supported for/) | |
| end | |
| it 'raises an error when trying to edit images with gemini/imagen-4.0-generate-001' do | |
| attachment_path = File.expand_path('../fixtures/ruby.png', __dir__) | |
| expect do | |
| RubyLLM.paint( | |
| 'make the background blue', | |
| model: 'imagen-4.0-generate-001', | |
| with: attachment_path | |
| ) | |
| end.to raise_error(RubyLLM::Error, /Image editing is only supported for/) | |
| end |
| expect(image.model_id).to eq('gpt-image-1') | ||
|
|
||
| save_and_verify_image image | ||
| end |
There was a problem hiding this comment.
Missing integration test coverage for multiple attachments. While the unit test in spec/ruby_llm/providers/open_ai/images_spec.rb verifies the payload structure for multiple attachments, there's no end-to-end test that actually calls the API with multiple files. Consider adding a test case that uses an array of file paths with the with: parameter to ensure the feature works correctly end-to-end.
| end | |
| end | |
| it 'openai/gpt-image-1 can edit images with multiple source images' do | |
| attachment_path = File.expand_path('../fixtures/ruby.png', __dir__) | |
| attachment_paths = [attachment_path, attachment_path] | |
| image = RubyLLM.paint( | |
| 'make the background blue', | |
| model: 'gpt-image-1', | |
| with: attachment_paths | |
| ) | |
| expect(image.base64?).to be(true) | |
| expect(image.data).to be_present | |
| expect(image.mime_type).to include('image') | |
| expect(image.model_id).to eq('gpt-image-1') | |
| save_and_verify_image image | |
| end |
| it 'uses an array for multiple attachments to emit repeated `image[]` fields' do | ||
| payload = described_class.render_image_payload(prompt, model:, size:, with: [attachment_path, attachment_path]) | ||
|
|
||
| expect(payload[:image]).to all(be_a(Faraday::Multipart::FilePart)) | ||
| expect(payload[:image].size).to eq(2) | ||
| end |
There was a problem hiding this comment.
The test description says "uses an array for multiple attachments to emit repeated image[] fields" but the test checks for payload[:image] (not payload[:'image[]']). This is actually correct behavior - Faraday's multipart encoder will send multiple fields with the same key when the value is an array - but the test description might be clearer if it said "uses an array for the image key when there are multiple attachments" to avoid confusion about the key name.
| def images_url | ||
| "models/#{@model}:predict" | ||
| if uses_generate_content?(@model) | ||
| "models/#{@model}:generateContent" | ||
| else | ||
| "models/#{@model}:predict" | ||
| end | ||
| end |
There was a problem hiding this comment.
The images_url method depends on @model being set by a prior call to render_image_payload. This creates fragile order-dependent code similar to the OpenAI implementation. Consider passing the model as a parameter to images_url to make the dependency explicit and the code more maintainable.
| def images_url | ||
| 'images/generations' | ||
| attachments_present? ? 'images/edits' : 'images/generations' | ||
| end | ||
|
|
||
| def render_image_payload(prompt, model:, size:) | ||
| def render_image_payload(prompt, model:, size:, with: []) | ||
| @attachments = with | ||
|
|
||
| if attachments_present? | ||
| validate_edit_support!(model) | ||
| default_generation_payload(prompt, model:, size:) | ||
| .merge(image_attachments_payload) | ||
| .compact | ||
| else | ||
| default_generation_payload(prompt, model:, size:) | ||
| end | ||
| end | ||
|
|
||
| def attachments_present? | ||
| @attachments&.any? |
There was a problem hiding this comment.
The use of @attachments instance variable with module_function creates fragile order-dependent code. The images_url method depends on @attachments being set by render_image_payload, which works only because they're called in a specific order in Provider#paint. Consider refactoring to either:
- Pass attachments as a parameter to
images_urlandimage_attachments_payload, or - Remove
module_functionif the module is only meant to be included as instance methods.
This would make the code more maintainable and less prone to bugs if call order changes.
| expanded_path = File.expand_path(file_path) | ||
| Marcel::MimeType.for(Pathname.new(expanded_path)) | ||
| end | ||
|
|
||
| def encode_image_to_base64(file_path) | ||
| expanded_path = File.expand_path(file_path) |
There was a problem hiding this comment.
Missing validation that the file exists before attempting to read it. If a user provides an invalid file path, they'll receive a less helpful Errno::ENOENT error instead of a clear validation error. Consider adding a check like raise ArgumentError, "File not found: #{file_path}" unless File.exist?(expanded_path) before attempting to read the file.
| expanded_path = File.expand_path(file_path) | |
| Marcel::MimeType.for(Pathname.new(expanded_path)) | |
| end | |
| def encode_image_to_base64(file_path) | |
| expanded_path = File.expand_path(file_path) | |
| expanded_path = File.expand_path(file_path) | |
| unless File.exist?(expanded_path) | |
| raise ArgumentError, "File not found: #{file_path}" | |
| end | |
| Marcel::MimeType.for(Pathname.new(expanded_path)) | |
| end | |
| def encode_image_to_base64(file_path) | |
| expanded_path = File.expand_path(file_path) | |
| unless File.exist?(expanded_path) | |
| raise ArgumentError, "File not found: #{file_path}" | |
| end |
| def build_image_file_part(file_path) | ||
| expanded_path = File.expand_path(file_path) | ||
| mime_type = Marcel::MimeType.for(Pathname.new(expanded_path)) | ||
|
|
||
| Faraday::Multipart::FilePart.new( | ||
| expanded_path, | ||
| mime_type, | ||
| File.basename(expanded_path) | ||
| ) | ||
| end |
There was a problem hiding this comment.
Missing validation that the file exists before attempting to read it. If a user provides an invalid file path, they'll receive a less helpful Errno::ENOENT error instead of a clear validation error. Consider adding a check like raise ArgumentError, "File not found: #{file_path}" unless File.exist?(expanded_path) after expanding the path.
What this does
This allows to have RubyLLM.paint prompts that are based on attachments.
Type of change
Scope check
Quality check
overcommit --installand all hooks passbundle exec rake vcr:record[provider_name]bundle exec rspecmodels.json,aliases.json)API changes
Related issues
Implements #512