Description
Problem:
ADK's built-in retrieval tools (VertexAiRagRetrieval, LlamaIndexRetrieval) require specific backends. Users with custom retrieval logic such as Elasticsearch, Pinecone, pgvector, or any custom API must fall back to FunctionTool to wrap their retrieval callable. This works at runtime, but loses retrieval semantics:
- No standardized query parameter schema (must manually define it and hope the LLM respects it)
- isinstance(tool, BaseRetrievalTool) returns False, so downstream code cannot identify the tool as a retrieval tool
- Agent graph visualization shows it as a generic function tool instead of a retrieval tool
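To make the identity problem concrete, here is a minimal sketch that uses stand-in classes rather than the real ADK ones (the class bodies and the `retrieval_tools` helper are hypothetical and exist only for this illustration). It shows how an isinstance-based filter, of the kind downstream code might use, skips a retriever wrapped as a generic function tool:

```python
class BaseTool:
    """Stand-in for ADK's BaseTool (illustration only, not the real class)."""

    def __init__(self, name: str):
        self.name = name


class BaseRetrievalTool(BaseTool):
    """Stand-in for ADK's BaseRetrievalTool."""


class FunctionTool(BaseTool):
    """Stand-in for ADK's FunctionTool, which wraps an arbitrary callable."""

    def __init__(self, func):
        super().__init__(func.__name__)
        self.func = func


def search_docs(query: str) -> list[str]:
    # Placeholder retrieval logic.
    return [f"result for {query}"]


def retrieval_tools(tools):
    """Hypothetical helper: keep only tools that are retrieval tools by type."""
    return [t for t in tools if isinstance(t, BaseRetrievalTool)]


tools = [FunctionTool(search_docs), BaseRetrievalTool("vertex_rag")]
found = [t.name for t in retrieval_tools(tools)]
print(found)  # search_docs is missing even though it performs retrieval
```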
Solution:
A new CallableRetrieval class in google.adk.tools.retrieval that extends BaseRetrievalTool and wraps any user-provided sync or async callable as a first-class retrieval tool.
from google.adk.tools.retrieval import CallableRetrieval

def search_docs(query: str) -> list[str]:
    return my_pinecone_index.query(query)

tool = CallableRetrieval(
    name="search_docs",
    description="Search the knowledge base.",
    retriever=search_docs,
)
The callable must accept a query: str as its first argument. It may optionally accept a tool_context: ToolContext parameter (matching FunctionTool's injection pattern). Both sync and async callables should be supported.
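The contract above could be checked at construction time. The following is a rough sketch using only the standard library. The helper name `describe_retriever` is hypothetical, and it assumes that "accepts a query" means the first parameter is literally named `query`, which the proposal does not spell out:

```python
import inspect
from typing import Any, Callable


def describe_retriever(retriever: Callable[..., Any]) -> dict[str, bool]:
    """Validate a retriever's signature and report how it should be called."""
    params = list(inspect.signature(retriever).parameters.values())
    # Assumption: the first parameter must be named `query`.
    if not params or params[0].name != "query":
        raise TypeError("retriever must accept `query` as its first argument")
    return {
        # tool_context is optional and detected by parameter name.
        "wants_tool_context": any(p.name == "tool_context" for p in params[1:]),
        # Coroutine functions must be awaited; plain functions are called directly.
        "is_async": inspect.iscoroutinefunction(retriever),
    }


def sync_search(query: str) -> list[str]:
    return [query]


async def async_search(query: str, tool_context: Any) -> list[str]:
    return [query]


sync_info = describe_retriever(sync_search)
async_info = describe_retriever(async_search)
```

Failing early with a TypeError would surface a mis-shaped callable when the tool is built rather than when the agent first calls it.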
Impact:
This is needed to integrate custom retrieval backends (Elasticsearch, Pinecone, pgvector) into ADK agents while preserving retrieval semantics. Without it, users must choose between proper retrieval identity (BaseRetrievalTool) and a custom backend; they cannot have both.
Contribution:
I have an implementation ready and will submit a PR.
Alternatives Considered:
- Why not just use FunctionTool: Works at runtime but the tool loses its retrieval identity. No isinstance(tool, BaseRetrievalTool), no standardized query schema, no retrieval-specific graph visualization. If ADK ever adds retrieval-specific behavior (caching, logging, grounding metadata), FunctionTool-wrapped retrievers won't get it.
- Factory classmethod on BaseRetrievalTool (e.g., BaseRetrievalTool.from_callable(...)): Avoids a new class but returns a hidden anonymous subclass. Harder to debug, less discoverable, and doesn't fit ADK's Pydantic-based tool patterns.
Proposed Implementation:
from google.adk.tools.retrieval import CallableRetrieval
from google.adk.tools.tool_context import ToolContext

# Simple case: sync callable
def search_docs(query: str) -> list[str]:
    return my_db.search(query)

tool = CallableRetrieval(name="search_docs", description="Search docs.", retriever=search_docs)

# Async callable with tool_context access
async def search_with_context(query: str, tool_context: ToolContext) -> list[str]:
    user_id = tool_context.state.get("user_id")
    return await my_db.search(query, user_id=user_id)

tool = CallableRetrieval(name="search", description="Search.", retriever=search_with_context)
The implementation follows the same pattern as LlamaIndexRetrieval: it extends BaseRetrievalTool, overrides run_async(), uses inspect.signature to detect the optional tool_context parameter, and uses inspect.iscoroutinefunction for sync/async dispatch. It adds zero external dependencies.
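The dispatch logic described above might look roughly like the sketch below. It is not the actual implementation: `CallableRetrievalSketch` is a standalone stand-in that does not extend the real BaseRetrievalTool, the `run_async(*, args, tool_context)` shape is assumed to mirror the existing retrieval tools, and a plain dict stands in for ToolContext so the example runs without ADK installed.

```python
import asyncio
import inspect
from typing import Any, Callable


class CallableRetrievalSketch:
    """Standalone stand-in; the proposed class would extend BaseRetrievalTool."""

    def __init__(self, name: str, description: str, retriever: Callable[..., Any]):
        self.name = name
        self.description = description
        self.retriever = retriever
        # Inspect the signature once so each call does not repeat the work.
        params = inspect.signature(retriever).parameters
        self._wants_tool_context = "tool_context" in params

    async def run_async(self, *, args: dict[str, Any], tool_context: Any) -> Any:
        # Inject tool_context only if the callable declared that parameter.
        kwargs = {"tool_context": tool_context} if self._wants_tool_context else {}
        if inspect.iscoroutinefunction(self.retriever):
            return await self.retriever(args["query"], **kwargs)
        return self.retriever(args["query"], **kwargs)


def search_docs(query: str) -> list[str]:
    return [f"doc about {query}"]


async def search_with_context(query: str, tool_context: Any) -> list[str]:
    # A dict stands in for ToolContext here.
    return [f"{tool_context['user_id']}: {query}"]


sync_tool = CallableRetrievalSketch("search_docs", "Search docs.", search_docs)
async_tool = CallableRetrievalSketch("search", "Search.", search_with_context)

sync_result = asyncio.run(sync_tool.run_async(args={"query": "adk"}, tool_context=None))
async_result = asyncio.run(
    async_tool.run_async(args={"query": "adk"}, tool_context={"user_id": "u1"})
)
```

Note that in this sketch a sync retriever runs directly inside the coroutine, so a slow sync backend would block the event loop for the duration of the call; whether to offload it to a thread is a design choice the proposal leaves open.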
Summary:
This fills a gap in the retrieval tool family. The existing hierarchy is:
- BaseRetrievalTool (abstract) → VertexAiRagRetrieval (Vertex AI), LlamaIndexRetrieval (LlamaIndex), FilesRetrieval (local files)
CallableRetrieval is the framework-agnostic entry point, analogous to how FunctionTool is the generic wrapper for BaseTool, but scoped to retrieval semantics.