Add Gemini provider support#324

Open
aoki-ryusei wants to merge 3 commits into activeagents:main from aoki-ryusei:feature/gemini-provider

Conversation

@aoki-ryusei

Summary

Adds support for Google Gemini as a provider, allowing agents to use Gemini models via the OpenAI-compatible API endpoint. This is useful for teams that want to use Google's latest AI models while maintaining a consistent ActiveAgent interface.
The implementation follows the same pattern established by the Ollama provider: inherit from OpenAI::ChatProvider and override only what differs - in this case, streaming role handling and API key resolution.

How it works

GeminiProvider inherits from OpenAI::ChatProvider, reusing all existing functionality (streaming, tool use, structured output). The key difference is handling Gemini's streaming behavior: unlike OpenAI which sends role only in the first chunk, Gemini sends role: "assistant" in every chunk, which would cause role concatenation ("assistantassistant...") without the message_merge_delta override.
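The role-handling fix can be sketched with plain Hashes. This is a hypothetical stand-in: `merge_delta` below is not ActiveAgent's actual `message_merge_delta` signature, just an illustration of assign-vs-concatenate.

```ruby
# Hypothetical stand-in for the message_merge_delta override, using plain
# Hashes instead of ActiveAgent's real message objects.
def merge_delta(message, delta)
  # Gemini repeats role: "assistant" in every chunk, so the role is
  # assigned (=) rather than appended, avoiding "assistantassistant...".
  message[:role] = delta[:role] if delta[:role]
  # Content genuinely arrives in pieces, so it is concatenated.
  message[:content] = (message[:content] || "") + delta[:content].to_s
  message
end

message = {}
chunks = [
  { role: "assistant", content: "Hel" },
  { role: "assistant", content: "lo" }  # role repeated, as Gemini does
]
chunks.each { |delta| merge_delta(message, delta) }
message  # => { role: "assistant", content: "Hello" }
```

With concatenation (`+=`) on the role, the same two chunks would produce `"assistantassistant"`; assignment keeps the role stable no matter how many chunks arrive.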

Configuration

# config/active_agent.yml
production:
  gemini:
    service: "Gemini"
    model: "gemini-2.0-flash"
    api_key: <%= Rails.application.credentials.dig(:gemini, :api_key) %>

Agent usage

class ChatAgent < ApplicationAgent
  generate_with :gemini, model: "gemini-2.0-flash"

  def ask
    prompt(message: params[:message])
  end
end

Authentication

API key is resolved in order:

  1. Explicit api_key option in config
  2. access_token alias (for consistency with other providers)
  3. GEMINI_API_KEY environment variable
  4. GOOGLE_API_KEY environment variable
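That order can be sketched as a single short-circuiting expression. This is a hypothetical helper (the real resolution lives in `Gemini::Options`), and the key values below are made up for illustration.

```ruby
# Hypothetical sketch of the documented four-step resolution order;
# not the actual Gemini::Options implementation.
def resolve_api_key(api_key: nil, access_token: nil)
  api_key ||                   # 1. explicit api_key option in config
    access_token ||            # 2. access_token alias
    ENV["GEMINI_API_KEY"] ||   # 3. Gemini-specific environment variable
    ENV["GOOGLE_API_KEY"]      # 4. generic Google environment variable
end

ENV["GEMINI_API_KEY"] = "gemini-env-key"   # illustrative values only
ENV["GOOGLE_API_KEY"] = "google-env-key"
from_env = resolve_api_key                     # => "gemini-env-key"
explicit = resolve_api_key(api_key: "my-key")  # => "my-key"
```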

Files changed

Production code (4 files, ~110 lines):

File Purpose
gemini_provider.rb Provider class: inherits OpenAI::ChatProvider; overrides message_merge_delta for streaming role handling, and api_prompt_execute / api_embed_execute to add connection error handling
gemini/options.rb API key configuration with ENV fallbacks (GEMINI_API_KEY, GOOGLE_API_KEY); disables organization_id and project_id (not used by Gemini)
gemini/_types.rb Reuses OpenAI's Chat::RequestType and Embedding::RequestType - no new types needed since Gemini implements OpenAI-compatible format

Test code (3 files, ~320 lines, 22 tests):

File Tests
gemini_provider_test.rb Service name, options class, prompt/embed request type delegation, client construction, inheritance
options_test.rb API key validation, all ENV variable fallbacks (GEMINI_API_KEY, GOOGLE_API_KEY), explicit-over-ENV precedence, access_token alias, organization/project return nil
streaming_lifecycle_test.rb Inherits :open event from parent, idempotency, message_merge_delta handles role duplication correctly, full lifecycle event ordering (open → update → close), streaming flag state transitions

Supporting files:

File Purpose
test/dummy/app/agents/providers/gemini_agent.rb Example agent for the test dummy app
test/dummy/config/active_agent.yml Gemini config entry for dev/test environments

Design decisions

  • Inheritance over composition: GeminiProvider < OpenAI::ChatProvider means Gemini automatically benefits from any future improvements to streaming, tool use, or message handling in the OpenAI provider — zero maintenance burden.
  • message_merge_delta override: Gemini's OpenAI-compatible API sends role in every streaming chunk (unlike OpenAI which sends it only in the first). The override uses assignment (=) instead of concatenation to prevent "assistantassistant..." corruption.
  • No new RequestType: Gemini uses the exact same OpenAI Chat/Embedding API format — the endpoint handles all protocol differences internally. So gemini/_types.rb simply delegates to open_ai/chat/_types.rb and open_ai/embedding/_types.rb.
  • Dual ENV support: Both GEMINI_API_KEY and GOOGLE_API_KEY are supported, with GEMINI_API_KEY taking precedence. This accommodates different naming conventions across teams.
  • organization_id / project_id disabled: These OpenAI-specific fields are not used by Gemini API, so the resolver methods return nil.
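The "no new RequestType" decision amounts to plain constant aliasing. A self-contained illustration, using stand-in modules rather than the real ActiveAgent::Providers namespaces:

```ruby
# Stand-in modules illustrating the aliasing approach; the real code
# reuses OpenAI's request types under ActiveAgent::Providers.
module OpenAI
  module Chat
    RequestType = Struct.new(:model, :messages)
  end
end

module Gemini
  module Chat
    # A bare constant alias: both names point at the very same class,
    # so future improvements to the OpenAI type are picked up for free.
    RequestType = OpenAI::Chat::RequestType
  end
end

same_class = Gemini::Chat::RequestType.equal?(OpenAI::Chat::RequestType)  # => true
```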

Test plan

  • All 22 new Gemini tests pass (36 assertions, 0 failures, 0 errors)
  • Full existing test suite unaffected (0 new failures)
  • Verified end-to-end with real Gemini API (chat and streaming)
  • Reviewer: verify CI passes on all matrix combinations

Enables using Google's Gemini models through the OpenAI-compatible API endpoint at generativelanguage.googleapis.com. The provider inherits from OpenAI::ChatProvider, reusing streaming, tool use, and structured output functionality while overriding only Gemini-specific behaviors.

Implementation:
- GeminiProvider inherits OpenAI::ChatProvider, overrides message_merge_delta to fix Gemini's streaming role duplication (Gemini sends role in every chunk, causing "assistantassistant...")
- Gemini::Options handles API key resolution: explicit api_key, access_token alias, then environment variables (GEMINI_API_KEY, GOOGLE_API_KEY in priority order)
- Reuses OpenAI::Chat::RequestType — no protocol translation needed as Gemini implements OpenAI-compatible format
- organization_id and project_id disabled (not used by Gemini API)
- Connection error handling with instrumentation logging

Follows the same pattern established by OllamaProvider which also inherits from OpenAI::ChatProvider for OpenAI-compatible endpoints.
Comprehensive test coverage for GeminiProvider, Gemini::Options, and streaming lifecycle behaviors.

Test coverage (21 tests, 35 assertions):

- Provider class (6 tests): service_name, options_klass, prompt_request_type delegation to OpenAI::Chat::RequestType, initialization, inheritance from OpenAI::ChatProvider, client construction
- Options (8 tests): api_key validation, GEMINI_API_KEY env resolution, GOOGLE_API_KEY env resolution, GEMINI over GOOGLE precedence, explicit-over-ENV precedence, access_token alias, organization_id returns nil, project_id returns nil
- Streaming lifecycle (7 tests): inherits :open event emission from OpenAI::ChatProvider, broadcast_stream_open idempotency, message_merge_delta handles Gemini role duplication correctly, full lifecycle event ordering (open -> update -> close), streaming flag state transitions

The streaming tests specifically verify the message_merge_delta override prevents role concatenation when Gemini sends role in every chunk.
Enables text embedding functionality using Gemini's OpenAI-compatible embeddings endpoint.

Implementation:
- Add embed_request_type class method returning OpenAI::Embedding::RequestType (Gemini uses same request format as OpenAI)
- Add api_embed_execute with connection error handling and instrumentation
- Add Gemini::Embedding::RequestType alias in _types.rb

Test coverage (1 test, 1 assertion):
- embed_request_type returns OpenAI::Embedding::RequestType instance
@superconductor-for-github
Contributor

superconductor-for-github bot commented Mar 4, 2026

@aoki-ryusei Superconductor finished. View implementation | Guided Review


Standing by for instructions.


Copilot AI left a comment

Pull request overview

Adds a new Gemini provider to ActiveAgent by reusing the existing OpenAI Chat provider implementation against Google’s OpenAI-compatible Gemini endpoint, plus accompanying tests and dummy app configuration.

Changes:

  • Introduces ActiveAgent::Providers::GeminiProvider inheriting from OpenAI::ChatProvider, with a streaming delta merge override to prevent role duplication.
  • Adds Gemini::Options (API key resolution + Gemini base URL default) and Gemini type aliases reusing OpenAI request types.
  • Adds Gemini-focused tests (provider basics, options/env fallback behavior, streaming lifecycle) and dummy app config/agent examples.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
lib/active_agent/providers/gemini_provider.rb New Gemini provider inheriting OpenAI Chat provider; overrides streaming role merge and adds connection error instrumentation.
lib/active_agent/providers/gemini/options.rb Gemini options with base_url fallback and API key resolution (config + ENV).
lib/active_agent/providers/gemini/_types.rb Aliases Gemini request types to existing OpenAI Chat/Embedding request types.
test/providers/gemini/gemini_provider_test.rb Provider-level tests (service name, options class, request types, inheritance, client).
test/providers/gemini/options_test.rb Options tests for API key validation and ENV fallback precedence.
test/providers/gemini/streaming_lifecycle_test.rb Streaming tests ensuring lifecycle events and role-deduplication behavior.
test/dummy/config/active_agent.yml Adds Gemini configuration anchor and enables it in dev/test dummy environments.
test/dummy/app/agents/providers/gemini_agent.rb Example dummy agent demonstrating generate_with :gemini.

Comment on lines +5 to +15
begin
  require "openai"
rescue LoadError
  puts "OpenAI gem not available, skipping Gemini provider tests"
  return
end

require_relative "../../../lib/active_agent/providers/gemini_provider"

class GeminiProviderTest < ActiveSupport::TestCase
  setup do

Copilot AI Mar 4, 2026

return at the top level will raise LocalJumpError if the rescue path is hit (i.e., when the openai gem is missing), so these tests won't actually skip cleanly. Prefer defining the test class and calling skip in setup, or conditionally defining the tests only when require "openai" succeeds.

Suggested change

  - begin
  -   require "openai"
  - rescue LoadError
  -   puts "OpenAI gem not available, skipping Gemini provider tests"
  -   return
  - end
  - require_relative "../../../lib/active_agent/providers/gemini_provider"
  - class GeminiProviderTest < ActiveSupport::TestCase
  -   setup do
  + OPENAI_AVAILABLE = begin
  +   require "openai"
  +   true
  + rescue LoadError
  +   warn "OpenAI gem not available, skipping Gemini provider tests"
  +   false
  + end
  + require_relative "../../../lib/active_agent/providers/gemini_provider" if OPENAI_AVAILABLE
  + class GeminiProviderTest < ActiveSupport::TestCase
  +   setup do
  +     skip "OpenAI gem not available, skipping Gemini provider tests" unless OPENAI_AVAILABLE

Comment on lines +5 to +10
begin
  require "openai"
rescue LoadError
  puts "OpenAI gem not available, skipping Gemini options tests"
  return
end

Copilot AI Mar 4, 2026

return is not valid at file scope; if the openai gem isn't available this will raise LocalJumpError instead of skipping. Use a Minitest/Rails skip mechanism (e.g., skip in setup) or wrap the test class definition in a conditional so the file loads safely without the dependency.

Comment on lines +5 to +10
begin
  require "openai"
rescue LoadError
  puts "OpenAI gem not available, skipping Gemini streaming lifecycle tests"
  return
end

Copilot AI Mar 4, 2026

Top-level return will raise LocalJumpError if executed, so the intended "skip when openai gem missing" behavior won't work. Consider conditionally defining the test class only when require "openai" succeeds, or call skip inside setup/tests when the dependency isn't present.

Comment on lines +58 to +63
test "client is configured with Gemini base_url" do
  provider = ActiveAgent::Providers::GeminiProvider.new(@valid_config)
  client = provider.client

  assert_kind_of ::OpenAI::Client, client
end

Copilot AI Mar 4, 2026

This test name claims it verifies the Gemini base_url, but it only asserts the client is an OpenAI::Client. Either assert the configured base URL (via the provider options or client config) or rename the test to match what it actually checks.
