@zc277584121

Summary

  • Add Milvus as a vector database option for knowledge base RAG workflows, following the existing Spanner integration pattern (BaseToolset + utility class).
  • MilvusVectorStore: Core utility class supporting collection setup, batch data ingestion (add_contents / add_contents_async), similarity search, and connection management.
  • MilvusToolset: BaseToolset implementation that provides a similarity_search tool ready for use with LLM agents.
  • MilvusTool: FunctionTool subclass that injects the vector_store parameter (hidden from LLM via _ignore_params), analogous to GoogleTool.
  • MilvusVectorStoreSettings / MilvusToolSettings: Pydantic configuration classes for connection URI, collection, dimension, metric type, index type, etc.
  • Registered MILVUS_TOOLSET and MILVUS_VECTOR_STORE as experimental features in the feature registry.
  • Added pymilvus>=2.5.0 to test, extensions, and new standalone milvus optional dependency groups.
  • The embedding_fn is a required parameter (no built-in default) — users provide their own embedding function (e.g., Google GenAI text-embedding-004).
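The MilvusTool bullet above describes injecting vector_store while keeping it out of the declaration the LLM sees. A minimal standard-library sketch of that general pattern follows. `HiddenParamTool`, `visible_signature`, and the stand-in `similarity_search` body are hypothetical names for illustration and are not the PR's implementation:

```python
import inspect
from typing import Any, Callable


class HiddenParamTool:
    """Wraps a function and injects one argument the caller never sees."""

    def __init__(self, func: Callable[..., Any], hidden_name: str, hidden_value: Any):
        self._func = func
        self._hidden_name = hidden_name
        self._hidden_value = hidden_value

    def visible_signature(self) -> inspect.Signature:
        """Signature with the injected parameter removed."""
        sig = inspect.signature(self._func)
        params = [p for name, p in sig.parameters.items() if name != self._hidden_name]
        return sig.replace(parameters=params)

    def __call__(self, **llm_args: Any) -> Any:
        # Reject attempts to override the injected value from the model side.
        if self._hidden_name in llm_args:
            raise TypeError(f"{self._hidden_name!r} is injected and cannot be passed")
        return self._func(**llm_args, **{self._hidden_name: self._hidden_value})


def similarity_search(query: str, vector_store: Any, top_k: int = 3) -> list:
    # Stand-in body: a real tool would query the vector store with an embedding.
    return vector_store[:top_k]


tool = HiddenParamTool(similarity_search, "vector_store", ["doc-a", "doc-b", "doc-c", "doc-d"])
print(tool.visible_signature())
print(tool(query="what is milvus", top_k=2))
```

The model-facing signature only lists `query` and `top_k`, so the store cannot be named or replaced by a generated tool call.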

New Files

| Path | Description |
|---|---|
| src/google/adk/tools/milvus/__init__.py | Module entry, exports MilvusToolset |
| src/google/adk/tools/milvus/settings.py | Pydantic config classes |
| src/google/adk/tools/milvus/milvus_vector_store.py | Core vector store utility |
| src/google/adk/tools/milvus/search_tool.py | similarity_search function |
| src/google/adk/tools/milvus/milvus_tool.py | FunctionTool subclass |
| src/google/adk/tools/milvus/milvus_toolset.py | BaseToolset implementation |
| tests/unittests/tools/milvus/ | 29 unit tests (all passing) |
| contributing/samples/milvus_rag_agent/ | Sample RAG agent |

Test Plan

  • 29 Milvus-specific unit tests pass (settings, vector store, search tool, toolset)
  • All tests mock MilvusClient — no live Milvus instance or API key required
  • Full test suite: 4153 passed, 1 skipped, 0 failures — no regressions
  • Code formatted with pyink + isort
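For readers unfamiliar with the mocking approach in the test plan, here is a minimal sketch of a test that needs no live Milvus instance. `TinyStore` is a hypothetical stand-in, not the PR's MilvusVectorStore; the mocked `search` keyword arguments and the hit shape (`id`, `distance`, `entity`) follow what `pymilvus.MilvusClient.search` uses, to the best of my knowledge:

```python
from unittest.mock import MagicMock


class TinyStore:
    """Hypothetical vector store that delegates searches to a client."""

    def __init__(self, client, embedding_fn, collection_name):
        self._client = client
        self._embedding_fn = embedding_fn
        self._collection_name = collection_name

    def similarity_search(self, query, top_k=3):
        vector = self._embedding_fn([query])[0]
        hits = self._client.search(
            collection_name=self._collection_name,
            data=[vector],
            limit=top_k,
            output_fields=["content"],
        )
        # One result list per query vector; only one query is sent here.
        return [hit["entity"]["content"] for hit in hits[0]]


def test_similarity_search_uses_mocked_client():
    client = MagicMock()
    client.search.return_value = [[
        {"id": 1, "distance": 0.9, "entity": {"content": "Milvus is a vector database."}},
    ]]
    store = TinyStore(client, lambda texts: [[0.1, 0.2] for _ in texts], "kb")

    assert store.similarity_search("what is milvus", top_k=1) == [
        "Milvus is a vector database."
    ]
    client.search.assert_called_once_with(
        collection_name="kb", data=[[0.1, 0.2]], limit=1, output_fields=["content"]
    )


test_similarity_search_uses_mocked_client()
```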

🤖 Generated with Claude Code

@google-cla

google-cla bot commented Feb 8, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello @zc277584121, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the framework's capabilities by integrating Milvus, a high-performance vector database, to support advanced RAG applications. The changes provide a structured and configurable way for developers to incorporate vector search functionalities into their LLM agents, facilitating the creation of knowledge-aware systems. This integration follows existing patterns within the framework, ensuring consistency and ease of adoption for users familiar with other toolsets.

Highlights

  • Milvus Vector Store Integration: Introduced a comprehensive integration for Milvus, enabling its use as a vector database for Retrieval-Augmented Generation (RAG) workflows within the framework.
  • Core Utility Classes: Added MilvusVectorStore for managing Milvus collections, including setup, batch data ingestion (sync and async), and similarity search capabilities.
  • LLM Agent Tooling: Implemented MilvusToolset as a BaseToolset to expose a similarity_search tool to LLM agents, and MilvusTool to inject the vector_store parameter while keeping it hidden from the LLM.
  • Configuration and Dependencies: Provided Pydantic configuration classes (MilvusVectorStoreSettings, MilvusToolSettings) for flexible Milvus connection and collection settings. The pymilvus>=2.5.0 dependency has been added to relevant groups.
  • Experimental Feature Registration: Registered MILVUS_TOOLSET and MILVUS_VECTOR_STORE as experimental features in the feature registry, indicating their readiness for broader testing and feedback.
  • Sample RAG Agent: Included a new sample RAG agent (contributing/samples/milvus_rag_agent/) demonstrating how to build an agent using Milvus for knowledge base retrieval with Google GenAI embeddings.
  • Comprehensive Testing: Added 29 dedicated unit tests for Milvus components (settings, vector store, search tool, toolset), all passing and mocking MilvusClient to avoid live Milvus instances.
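The batch ingestion mentioned in the highlights can be pictured roughly as follows. This is a sketch under assumptions: the row field names (`content`, `embedding`), the default batch size, and the helper name `iter_row_batches` are illustrative and not taken from the PR:

```python
from typing import Callable, Iterator, List, Sequence

EmbeddingFn = Callable[[List[str]], List[List[float]]]


def iter_row_batches(
    contents: Sequence[str], embedding_fn: EmbeddingFn, batch_size: int = 100
) -> Iterator[List[dict]]:
    """Yield lists of rows, embedding at most batch_size texts per call."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(contents), batch_size):
        batch = list(contents[start:start + batch_size])
        vectors = embedding_fn(batch)
        if len(vectors) != len(batch):
            raise ValueError("embedding_fn must return one vector per input text")
        yield [
            {"content": text, "embedding": vector}
            for text, vector in zip(batch, vectors)
        ]


def fake_embedding_fn(texts: List[str]) -> List[List[float]]:
    # Deterministic stand-in so the sketch runs without an API key.
    return [[float(len(text)), 0.0] for text in texts]


batches = list(iter_row_batches(["a", "bb", "ccc"], fake_embedding_fn, batch_size=2))
print([len(batch) for batch in batches])  # → [2, 1]
```

Each yielded batch could then be handed to a single insert call, which keeps both the embedding request and the database write bounded in size.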

Changelog
  • contributing/samples/milvus_rag_agent/README.md
    • Added a new README file detailing the setup, prerequisites, data ingestion, and execution instructions for the Milvus RAG agent sample.
  • contributing/samples/milvus_rag_agent/__init__.py
    • Added an __init__.py file to define the milvus_rag_agent directory as a Python package.
  • contributing/samples/milvus_rag_agent/agent.py
    • Added a sample RAG agent (LlmAgent) that utilizes the new MilvusToolset for similarity search, configured with Google GenAI embeddings.
  • pyproject.toml
    • Added pymilvus>=2.5.0 to the test and extensions dependency groups.
    • Created a new milvus optional dependency group for pymilvus>=2.5.0.
  • src/google/adk/features/_feature_registry.py
    • Registered MILVUS_TOOLSET and MILVUS_VECTOR_STORE as experimental features with default_on=True.
  • src/google/adk/tools/milvus/__init__.py
    • Created the milvus tools module, exporting MilvusToolset.
  • src/google/adk/tools/milvus/milvus_tool.py
    • Added MilvusTool, a FunctionTool subclass designed to inject a MilvusVectorStore instance into wrapped functions, hiding it from the LLM's function declaration.
  • src/google/adk/tools/milvus/milvus_toolset.py
    • Implemented MilvusToolset, a BaseToolset that provides Milvus vector similarity search capabilities to agents, managing the MilvusVectorStore lifecycle.
  • src/google/adk/tools/milvus/milvus_vector_store.py
    • Introduced MilvusVectorStore, a utility class for interacting with Milvus, including collection creation, batch content ingestion (sync and async), and similarity search.
  • src/google/adk/tools/milvus/search_tool.py
    • Defined the similarity_search function, which performs vector similarity search using MilvusVectorStore and formats results as JSON for agent consumption.
  • src/google/adk/tools/milvus/settings.py
    • Added Pydantic models MilvusVectorStoreSettings and MilvusToolSettings for configuring Milvus connection, collection parameters, and toolset behavior, including validation logic.
  • tests/unittests/tools/milvus/__init__.py
    • Added an __init__.py file for the Milvus unit tests directory.
  • tests/unittests/tools/milvus/test_milvus_toolset.py
    • Added unit tests for MilvusToolset, verifying tool registration, filtering, and proper closing of the vector store.
  • tests/unittests/tools/milvus/test_milvus_vector_store.py
    • Added unit tests for MilvusVectorStore, covering initialization, collection setup, content addition (batching, additional fields), similarity search (custom top_k, filters, empty results), and client closing.
  • tests/unittests/tools/milvus/test_search_tool.py
    • Added unit tests for the similarity_search function, including scenarios with and without search results, and with filter expressions.
  • tests/unittests/tools/milvus/test_settings.py
    • Added unit tests for MilvusVectorStoreSettings and MilvusToolSettings, validating default values, custom configurations, and error handling for invalid parameters.
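The search_tool.py entry above says results are formatted as JSON for agent consumption. A minimal sketch of what such formatting might look like is shown here; the output keys (`status`, `results`) and the helper name `format_hits` are assumptions, while the input hit shape (`distance`, `entity`) matches what `MilvusClient.search` returns per hit:

```python
import json
from typing import Any, Dict, List


def format_hits(hits: List[Dict[str, Any]], content_field: str = "content") -> str:
    """Turn one query's hits into a compact JSON string for the agent."""
    results = [
        {
            "content": hit["entity"].get(content_field, ""),
            "distance": hit["distance"],
        }
        for hit in hits
    ]
    # An explicit status lets the agent tell "nothing found" from an error.
    status = "ok" if results else "no_results"
    return json.dumps({"status": status, "results": results})


sample_hits = [
    {"id": 7, "distance": 0.12, "entity": {"content": "Milvus is a vector database."}},
]
print(format_hits(sample_hits))
print(format_hits([]))
```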
Activity
  • The author zc277584121 initiated this pull request to add Milvus vector store integration.
  • New files were added to introduce the Milvus toolset, vector store, settings, and a sample RAG agent.
  • Dependency pymilvus was added to pyproject.toml for test, extensions, and a new milvus optional group.
  • Milvus-related features were registered as experimental in the feature registry.
  • A total of 29 unit tests were added for the new Milvus components, all of which are passing and utilize mocking for MilvusClient.
  • The full test suite passed without any regressions, indicating stability of existing features.
  • Code formatting with pyink and isort was applied to maintain code style consistency.
  • The PR description indicates that the changes were generated with Claude Code.

@adk-bot
Collaborator

adk-bot commented Feb 8, 2026

Response from ADK Triaging Agent

Hello @zc277584121, thank you for creating this PR!

This PR is a feature request. Could you please associate the GitHub issue with this PR? If there is no existing issue, could you please create one?

In addition, could you please provide logs or a screenshot after the fix is applied?

This information will help reviewers to review your PR more efficiently. Thanks!

@adk-bot added the tools label ([Component] This issue is related to tools) on Feb 8, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive integration for Milvus as a vector store for RAG workflows. The changes are well-structured, following existing patterns in the codebase with a new MilvusToolset and MilvusVectorStore utility. The implementation includes configuration via Pydantic settings, data ingestion, similarity search, and proper connection management. The addition of a sample agent and extensive unit tests is commendable.

My feedback includes a few suggestions for improvement:

  • Correcting a likely typo in a model name in the sample agent.
  • Improving the robustness of the data ingestion method by adding validation.
  • Minor stylistic and clarity improvements in the sample code and documentation.

@zc277584121 force-pushed the feat/milvus-vector-store-integration branch 2 times, most recently from 6bd77ee to 949a888 on February 9, 2026 03:18
Add Milvus as a vector database option for knowledge base RAG workflows.
This follows the existing Spanner integration pattern with BaseToolset,
providing data ingestion, similarity search, and a ready-to-use toolset
for LLM agents.

- MilvusVectorStoreSettings / MilvusToolSettings (Pydantic config)
- MilvusVectorStore: setup, add_contents, search, close
- MilvusTool: FunctionTool subclass injecting vector_store param
- MilvusToolset: BaseToolset providing similarity_search tool
- Registered MILVUS_TOOLSET / MILVUS_VECTOR_STORE as experimental features
- Added pymilvus>=2.5.0 to test, extensions, and new milvus extras
- 30 unit tests (all passing)
- Sample milvus_rag_agent under contributing/samples/
@zc277584121 force-pushed the feat/milvus-vector-store-integration branch from 949a888 to 1e7e5ca on February 9, 2026 03:34
@zc277584121
Author

Review Feedback Addressed

Thanks for the review! Here's a summary of how each item was handled:

| # | Feedback | Action |
|---|---|---|
| 1 | gemini-2.5-flash typo | No change: gemini-2.5-flash is a valid model released in 2025, not a typo. |
| 2 | add_contents should validate additional_fields length | Fixed: added a ValueError check when lengths mismatch, with a corresponding unit test. |
| 3 | Combine two imports in README | No change: the project uses isort with one-import-per-line style. |
| 4 | embedding_fn undefined in README snippet | Fixed: added a complete embedding_fn definition example using Google GenAI in the README. |
| 5 | list(e.values) is redundant | No change: e.values returns a protobuf RepeatedScalarFieldContainer, not a native Python list. The list() conversion is necessary. |
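For item 2, the length check could look roughly like the sketch below. The helper name `pair_contents_with_fields`, the row layout, and the error message are illustrative assumptions, not the code in this PR:

```python
from typing import Any, Dict, List, Optional


def pair_contents_with_fields(
    contents: List[str],
    additional_fields: Optional[List[Dict[str, Any]]] = None,
) -> List[Dict[str, Any]]:
    """Merge each text with its extra fields, rejecting mismatched lengths."""
    if additional_fields is None:
        additional_fields = [{} for _ in contents]
    if len(additional_fields) != len(contents):
        raise ValueError(
            f"additional_fields has {len(additional_fields)} entries, "
            f"but contents has {len(contents)}"
        )
    return [
        {"content": text, **extra}
        for text, extra in zip(contents, additional_fields)
    ]


print(pair_contents_with_fields(["a", "b"], [{"source": "faq"}, {"source": "blog"}]))
```

Without the check, `zip` would silently drop the unmatched tail, so some documents would be ingested without the caller noticing.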

Additionally, refactored MilvusVectorStore.setup() to use MilvusClient.create_schema() API instead of ORM-layer classes (FieldSchema, CollectionSchema).
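To illustrate the MilvusClient-style schema setup mentioned above, here is a hedged sketch. The field names (`id`, `content`, `embedding`), the `AUTOINDEX` index type, and the `COSINE` metric are assumptions chosen for the example; the PR's actual setup() may differ. Running it requires pymilvus and a reachable Milvus backend:

```python
def setup_collection(client, collection_name: str, dimension: int) -> None:
    """Create the collection through the MilvusClient schema API if missing."""
    # Imported lazily so this module can be loaded without pymilvus installed.
    from pymilvus import DataType, MilvusClient

    if client.has_collection(collection_name):
        return

    # Assumed layout: auto-generated primary key, raw text, and its embedding.
    schema = MilvusClient.create_schema(auto_id=True, enable_dynamic_field=True)
    schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
    schema.add_field(field_name="content", datatype=DataType.VARCHAR, max_length=65535)
    schema.add_field(
        field_name="embedding", datatype=DataType.FLOAT_VECTOR, dim=dimension
    )

    index_params = client.prepare_index_params()
    index_params.add_index(
        field_name="embedding", index_type="AUTOINDEX", metric_type="COSINE"
    )

    client.create_collection(
        collection_name=collection_name, schema=schema, index_params=index_params
    )
```

A call such as `setup_collection(MilvusClient(uri="./milvus.db"), "knowledge_base", 3072)` would create the collection once and return early on later runs.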

End-to-End Verification

The full agent pipeline has been tested end-to-end with real Google GenAI embeddings (gemini-embedding-001) and Gemini LLM:

[1/7] Setting up embedding function...
      Using REAL Google GenAI embedding (dim=3072)
[2/7] Importing modules and creating settings... OK
[3/7] Creating collection... OK
[4/7] Ingesting documents... OK - 8 documents ingested
[5/7] Similarity search... OK - semantically correct results
[6/7] MilvusToolset... OK
[7/7] Full LLM Agent call...
      User: What is Milvus?
      Agent: Milvus is an open-source vector database for AI applications.
      OK - Full agent pipeline works!

Usage with LLM Agent

from google.adk.agents.llm_agent import LlmAgent
from google.adk.tools.milvus import MilvusToolset
from google.adk.tools.milvus.settings import MilvusToolSettings, MilvusVectorStoreSettings
from google.genai import Client

# 1. Define embedding function
genai_client = Client()
def embedding_fn(texts):
    resp = genai_client.models.embed_content(
        model="gemini-embedding-001", contents=texts)
    return [list(e.values) for e in resp.embeddings]

# 2. Configure Milvus toolset
toolset = MilvusToolset(
    milvus_tool_settings=MilvusToolSettings(
        vector_store_settings=MilvusVectorStoreSettings(
            uri="http://localhost:19530",  # see backend options below
            collection_name="knowledge_base",
            dimension=3072,
        ),
    ),
    embedding_fn=embedding_fn,
)

# 3. Create agent with Milvus RAG
agent = LlmAgent(
    model="gemini-2.5-flash",
    name="rag_agent",
    instruction="Use similarity_search to answer questions from the knowledge base.",
    tools=[toolset],
)

Supported Backends

All three Milvus deployment modes are supported — just change uri and token:

| Backend | uri | token |
|---|---|---|
| Milvus Lite (local, no server) | ./milvus.db | |
| Milvus Server (self-hosted) | http://localhost:19530 | |
| Zilliz Cloud (fully managed) | https://in01-xxx.serverless.gcp-us-west1.cloud.zilliz.com | your-api-key |
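One way to switch between these backends without code changes is to read the connection values from the environment. The sketch below only builds the keyword arguments that `pymilvus.MilvusClient` accepts (`uri`, `token`); the environment variable names `MILVUS_URI` and `MILVUS_TOKEN` and the helper name are hypothetical:

```python
import os
from typing import Dict, Mapping


def milvus_client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build MilvusClient keyword arguments from (hypothetical) env variables."""
    # Default to a local Milvus Lite file when nothing is configured.
    kwargs = {"uri": env.get("MILVUS_URI", "./milvus.db")}
    token = env.get("MILVUS_TOKEN", "")
    if token:
        # Only managed or authenticated deployments need a token.
        kwargs["token"] = token
    return kwargs


print(milvus_client_kwargs({}))
print(milvus_client_kwargs({"MILVUS_URI": "http://localhost:19530"}))
```

The resulting dictionary can be passed as `MilvusClient(**milvus_client_kwargs())`.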

Enhance the docstring of the similarity_search function to provide
a richer tool description for the LLM. The previous description was
too brief ("Search for similar content in Milvus vector store.").
The new description explains when to use the tool, how to write
effective queries, filter expression syntax, and return format.
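As an illustration of the kind of docstring this commit describes, here is a sketch. The wording, the `filter_expr` parameter name, the `vector_store.similarity_search` call, and the return shape are all assumptions for the example and are not copied from the PR:

```python
import json
from typing import Any, Optional


def similarity_search(
    query: str, vector_store: Any, filter_expr: Optional[str] = None
) -> str:
    """Search the knowledge base for passages relevant to a question.

    Use this tool when the user asks something that the knowledge base may
    answer and you need supporting passages before replying.

    Args:
      query: A self-contained natural-language question or statement. Prefer
        full sentences over single keywords.
      filter_expr: Optional Milvus boolean expression over scalar fields,
        for example 'source == "faq"' or 'year >= 2024 and lang in ["en"]'.

    Returns:
      A JSON string with a "results" list; each item has "content" and
      "distance".
    """
    hits = vector_store.similarity_search(query, filter_expr=filter_expr)
    return json.dumps({"results": hits})


class FakeStore:
    """Stand-in store so the sketch runs without Milvus."""

    def similarity_search(self, query, filter_expr=None):
        return [{"content": "Milvus is a vector database.", "distance": 0.12}]


print(similarity_search("What is Milvus?", FakeStore()))
```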
Implement BaseMemoryService backed by Milvus vector database,
enabling semantic search across past conversation history.

- MilvusMemoryService: stores session events as vector-embedded
  text, with app_name/user_id scoping and deduplication
- Lazy collection setup (auto-creates on first use)
- 13 unit tests with mocked MilvusClient
- E2E verified: cross-session recall and user isolation work
  with real Google GenAI embedding + Gemini LLM
@zc277584121
Author

New: MilvusMemoryService — cross-session memory backed by Milvus

In addition to the MilvusToolset for RAG knowledge base search, this PR now also includes MilvusMemoryService, an implementation of BaseMemoryService that stores conversation history in Milvus for semantic recall across sessions.

What it does

  • Implements add_session_to_memory() and search_memory() from BaseMemoryService
  • Stores session events as vector-embedded text with app_name/user_id scoping
  • Deduplicates on repeated add_session_to_memory() calls
  • Works with PreloadMemoryTool (auto-inject past memories) and LoadMemoryTool
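One common way to get the deduplication described above is a deterministic primary key combined with upsert semantics. The sketch below shows that idea with an in-memory stand-in; whether the PR uses this exact approach is an assumption, and `memory_id` and `InMemoryUpsertStore` are hypothetical names:

```python
import hashlib
from typing import Dict, List


def memory_id(app_name: str, user_id: str, session_id: str, event_id: str) -> str:
    """Derive a stable key so the same event always maps to the same row."""
    # A separator that cannot appear in normal identifiers avoids collisions
    # such as ("ab", "c") versus ("a", "bc").
    key = "\x1f".join([app_name, user_id, session_id, event_id])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


class InMemoryUpsertStore:
    """Stand-in for a store with upsert semantics keyed on 'id'."""

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def upsert(self, rows: List[dict]) -> None:
        for row in rows:
            self.rows[row["id"]] = row


store = InMemoryUpsertStore()
rows = [
    {"id": memory_id("my_app", "alice", "s1", "e1"), "content": "I work at Acme."},
    {"id": memory_id("my_app", "alice", "s1", "e2"), "content": "My cat is Tom."},
]
store.upsert(rows)
store.upsert(rows)  # Saving the same session again overwrites, not duplicates.
print(len(store.rows))  # → 2
```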

Quick E2E usage

from google.genai import Client
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.memory.milvus_memory_service import MilvusMemoryService
from google.adk.tools.preload_memory_tool import preload_memory_tool

# Embedding function
client = Client()
def embedding_fn(texts):
    resp = client.models.embed_content(model="gemini-embedding-001", contents=texts)
    return [list(e.values) for e in resp.embeddings]

# Create memory service (Milvus Lite / Server / Zilliz Cloud)
memory_service = MilvusMemoryService(
    embedding_fn=embedding_fn,
    uri="./memory.db",          # or "http://localhost:19530" or Zilliz Cloud URL
    collection_name="my_memory",
    dimension=3072,
)

# Agent with memory
agent = LlmAgent(
    model="gemini-2.5-flash",
    name="assistant",
    instruction="You are a helpful assistant with long-term memory.",
    tools=[preload_memory_tool],
)

runner = Runner(
    agent=agent,
    app_name="my_app",
    session_service=InMemorySessionService(),
    memory_service=memory_service,
)

# After a session ends, save to Milvus:
# await memory_service.add_session_to_memory(session)
# Next session: preload_memory_tool auto-searches Milvus and injects relevant memories.

E2E test results

Tested with real Google GenAI embedding (gemini-embedding-001, dim=3072) + gemini-2.5-flash:

  • Session 1: User tells agent 3 facts (workplace, favorite language, cat name)
  • Session 2 (new session): Agent recalls all 3 facts from Milvus memory ✅
  • User isolation: Different user cannot access another user's memories ✅
  • Deduplication: Repeated add_session_to_memory() does not duplicate data ✅
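User isolation of this kind is typically enforced by adding a scalar filter to every search. A minimal sketch of building such a filter string follows; the field names `app_name` and `user_id` come from the description above, while the helper name and the backslash-escaping rule for quotes are assumptions about how the expression would be written:

```python
def scope_filter(app_name: str, user_id: str) -> str:
    """Build a filter expression restricting results to one app and user."""

    def quote(value: str) -> str:
        # Escape backslashes first, then double quotes, so user-supplied
        # values cannot terminate the string literal early.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    return f"app_name == {quote(app_name)} and user_id == {quote(user_id)}"


print(scope_filter("my_app", "alice"))
# → app_name == "my_app" and user_id == "alice"
```

Passing this string as the search filter means a query issued for one user never scores rows stored for another user.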

Files added/changed

| File | Description |
|---|---|
| src/google/adk/memory/milvus_memory_service.py | Core implementation |
| src/google/adk/memory/__init__.py | Conditional export |
| src/google/adk/features/_feature_registry.py | Register MILVUS_MEMORY_SERVICE |
| tests/unittests/memory/test_milvus_memory_service.py | 13 unit tests |

cc @ahamedjobayer551-debug — would appreciate your continued review on this new addition. Thanks!

@zc277584121 changed the title from "feat: Add Milvus vector store integration for RAG" to "feat: Add Milvus vector store integration for RAG tool and memory" on Feb 10, 2026
Labels

tools [Component] This issue is related to tools
