diff --git a/.github/workflows/python-merge-tests.yml b/.github/workflows/python-merge-tests.yml index 8704ec56c1..966b5ad361 100644 --- a/.github/workflows/python-merge-tests.yml +++ b/.github/workflows/python-merge-tests.yml @@ -71,8 +71,10 @@ jobs: OPENAI_API_KEY: ${{ secrets.OPENAI__APIKEY }} ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} ANTHROPIC_CHAT_MODEL_ID: ${{ vars.ANTHROPIC_CHAT_MODEL_ID }} + OPENAI_EMBEDDING_MODEL_ID: ${{ vars.OPENAI__EMBEDDINGMODELID }} AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ vars.AZUREOPENAI__CHATDEPLOYMENTNAME }} AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }} + AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME: ${{ vars.AZUREOPENAI__EMBEDDINGDEPLOYMENTNAME }} AZURE_OPENAI_ENDPOINT: ${{ vars.AZUREOPENAI__ENDPOINT }} LOCAL_MCP_URL: ${{ vars.LOCAL_MCP__URL }} # For Azure Functions integration tests diff --git a/docs/features/vector-stores-and-embeddings/README.md b/docs/features/vector-stores-and-embeddings/README.md new file mode 100644 index 0000000000..851e47afe1 --- /dev/null +++ b/docs/features/vector-stores-and-embeddings/README.md @@ -0,0 +1,385 @@ +# Vector Stores and Embeddings + +## Overview + +This feature ports the vector store abstractions, embedding generator abstractions, and their implementations from Semantic Kernel into Agent Framework. The ported code follows AF's coding standards, feels native to AF, and is structured to allow data models/schemas to be reusable across both frameworks. The embedding abstraction combines the best of SK's `EmbeddingGeneratorBase` and MEAI's `IEmbeddingGenerator`. 
+ +| Capability | Description | +| --- | --- | +| Embedding generation | Generic embedding client abstraction supporting text, image, and audio inputs | +| Vector store collections | CRUD operations on vector store collections (upsert, get, delete) | +| Vector search | Unified search interface with `search_type` parameter (`"vector"`, `"keyword_hybrid"`) | +| Data model decorator | `@vectorstoremodel` decorator for defining vector store data models (supports Pydantic, dataclasses, plain classes, dicts) | +| Agent tools | `create_search_tool`, `create_upsert_tool`, `create_get_tool`, `create_delete_tool` for agent-usable vector store operations | +| In-memory store | Zero-dependency vector store for testing and development | +| 13+ connectors | Azure AI Search, Qdrant, Redis, PostgreSQL, MongoDB, Cosmos DB, Pinecone, Chroma, Weaviate, Oracle, SQL Server, FAISS | + +## Key Design Decisions + +### Embedding Abstractions (combining SK + MEAI) +- **Both Protocol and Base class** (matching AF's `SupportsChatGetResponse` + `BaseChatClient` pattern): + - `SupportsGetEmbeddings` — Protocol for duck-typing + - `BaseEmbeddingClient` — ABC base class for implementations (similar to `BaseChatClient`) +- **Generic input type** (`EmbeddingInputT`, default `str`) from MEAI — allows image/audio embeddings in the future +- **Generic output type** (`EmbeddingT`, default `list[float]`) from MEAI — supports `list[float]`, `list[int]`, `bytes`, etc. +- **Generic order**: `[EmbeddingInputT, EmbeddingT, EmbeddingOptionsT]` — options last, matching MEAI's `IEmbeddingGenerator` with options appended +- **TypeVar naming convention**: Use `SuffixT` per AF standard (e.g., `EmbeddingInputT`, `EmbeddingT`, `ModelT`, `KeyT`) +- `EmbeddingGenerationOptions` TypedDict (inspired by MEAI, matching AF's `ChatOptions` pattern) — `total=False`, includes `dimensions`, `model_id`. No `additional_properties` since each implementation extends with its own fields. 
+- Protocol and base class are generic over input, output, and options: `SupportsGetEmbeddings[EmbeddingInputT, EmbeddingT, OptionsContraT]`, `BaseEmbeddingClient[EmbeddingInputT, EmbeddingT, OptionsCoT]` +- **`Embedding[EmbeddingT]` type** in `_types.py` — a lightweight generic class (not Pydantic) with `vector: EmbeddingT`, `model_id: str | None`, `dimensions: int | None` (explicit or computed from vector), `created_at: datetime | None`, `additional_properties: dict[str, Any]` +- **`GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT]` type** — a list-like container of `Embedding[EmbeddingT]` objects with `options: EmbeddingOptionsT | None` (stores the options used to generate), `usage: dict[str, Any] | None`, `additional_properties: dict[str, Any]` +- **No numpy dependency** — return `list[float]` by default; users cast as needed + +### Vector Store Abstractions +- **Port core abstractions without Pydantic for internal classes** — use plain classes +- **Both Protocol and Base class** for vector store operations (matching AF pattern): + - `SupportsVectorUpsert` / `SupportsVectorSearch` — Protocols for duck-typing (follows `Supports` naming convention) + - `BaseVectorCollection` / `BaseVectorSearch` — ABC base classes for implementations + - `BaseVectorStore` — ABC base class for store operations (factory for collections, no protocol needed) +- **TypeVar naming convention**: `ModelT`, `KeyT`, `FilterT` (suffix T, per AF standard) +- **Support Pydantic for user-facing data models** — the `@vectorstoremodel` decorator and `VectorStoreCollectionDefinition` should work with Pydantic models, dataclasses, plain classes, and dicts +- **Remove SK-specific dependencies** — no `KernelBaseModel`, `KernelFunction`, `KernelParameterMetadata`, `kernel_function`, `PromptExecutionSettings` +- **Embedding types in `_types.py`**, embedding protocol/base class in `_clients.py` +- **All vector store specific types, enums, protocols, base classes** in `_vectors.py` +- **Error handling** 
uses AF's exception hierarchy (e.g., `IntegrationException` variants) + +### Package Structure +- **Embedding types** (`Embedding`, `GeneratedEmbeddings`, `EmbeddingGenerationOptions`) in `agent_framework/_types.py` +- **Embedding protocol + base class** (`SupportsGetEmbeddings`, `BaseEmbeddingClient`) in `agent_framework/_clients.py` +- **All vector store specific code** in a new `agent_framework/_vectors.py` module — this includes: + - Enums: `FieldTypes`, `IndexKind`, `DistanceFunction` + - `VectorStoreField`, `VectorStoreCollectionDefinition` + - `SearchOptions`, `SearchResponse`, `RecordFilterOptions` + - `@vectorstoremodel` decorator + - Serialization/deserialization protocols + - `VectorStoreRecordHandler`, `BaseVectorCollection`, `BaseVectorStore`, `BaseVectorSearch` + - `SupportsVectorUpsert`, `SupportsVectorSearch` protocols +- **OpenAI embeddings** in `agent_framework/openai/` (built into core, like OpenAI chat) +- **Azure OpenAI embeddings** in `agent_framework/azure/` (built into core, follows `AzureOpenAIChatClient` pattern) +- **Each vector store connector** in its own AF package under `packages/` +- **In-memory store** in core (no external deps) +- **TextSearch and its implementations** (Brave, Google) — last phase, separate work + +## Naming: SK → AF + +### Names that change + +| SK Name | AF Name | Rationale | +|---------|---------|-----------| +| `VectorStoreCollection` | `BaseVectorCollection` | Drop redundant `Store`, add `Base` prefix per AF pattern | +| `VectorStore` | `BaseVectorStore` | Add `Base` prefix per AF pattern | +| `VectorSearch` | `BaseVectorSearch` | Add `Base` prefix per AF pattern | +| `VectorSearchOptions` | `SearchOptions` | Shorter — context is already vector search | +| `VectorSearchResult` | `SearchResponse` | Align with `ChatResponse`/`AgentResponse` | +| `GetFilteredRecordOptions` | `RecordFilterOptions` | Shorter, more natural | +| `EmbeddingGeneratorBase` | `BaseEmbeddingClient` | Matches AF `BaseChatClient` pattern | 
+| `VectorStoreCollectionProtocol` | `SupportsVectorUpsert` | AF `Supports*` naming convention | +| `VectorSearchProtocol` | `SupportsVectorSearch` | AF `Supports*` naming convention | +| `__kernel_vectorstoremodel__` | `__vectorstoremodel__` | Drop SK `kernel` prefix | +| `__kernel_vectorstoremodel_definition__` | `__vectorstoremodel_definition__` | Drop SK `kernel` prefix | +| `search()` + `hybrid_search()` | `search(search_type=...)` | Single method with `Literal` parameter | +| `SearchType` enum | `Literal["vector", "keyword_hybrid"]` | No enum, just a literal | +| `KernelSearchResults` | `SearchResults` | Drop SK `Kernel` prefix (plural — container of `SearchResponse` items) | + +### Names that stay the same + +| Name | Location | +|------|----------| +| `@vectorstoremodel` | `_vectors.py` | +| `VectorStoreField` | `_vectors.py` | +| `VectorStoreCollectionDefinition` | `_vectors.py` | +| `VectorStoreRecordHandler` | `_vectors.py` | +| `FieldTypes` | `_vectors.py` | +| `IndexKind` | `_vectors.py` | +| `DistanceFunction` | `_vectors.py` | +| `DISTANCE_FUNCTION_DIRECTION_HELPER` | `_vectors.py` | +| `Embedding` | `_types.py` | +| `GeneratedEmbeddings` | `_types.py` | +| `EmbeddingGenerationOptions` | `_types.py` | +| `SupportsGetEmbeddings` | `_clients.py` | + +### New AF-only names (no SK equivalent) + +| Name | Location | Purpose | +|------|----------|---------| +| `BaseEmbeddingClient` | `_clients.py` | ABC base for embedding implementations | +| `EmbeddingInputT` | `_types.py` | TypeVar for generic embedding input (default `str`) | +| `EmbeddingTelemetryLayer` | `observability.py` | MRO-based OTel tracing for embeddings | +| `SupportsVectorUpsert` | `_vectors.py` | Protocol for collection CRUD | +| `SupportsVectorSearch` | `_vectors.py` | Protocol for vector search | +| `create_search_tool` | `_vectors.py` | Creates AF `FunctionTool` from vector search | + +## Source Files Reference (SK → AF mapping) + +### SK Source Files +| SK File | Lines | Content | 
+|---------|-------|---------| +| `data/vector.py` | 2369 | All vector store abstractions, enums, decorator, search | +| `data/_shared.py` | 184 | SearchOptions, KernelSearchResults, shared search types | +| `data/text_search.py` | 349 | TextSearch base, TextSearchResult | +| `connectors/ai/embedding_generator_base.py` | 50 | EmbeddingGeneratorBase ABC | +| `connectors/in_memory.py` | 520 | InMemoryCollection, InMemoryStore | +| `connectors/azure_ai_search.py` | 793 | Azure AI Search collection + store | +| `connectors/azure_cosmos_db.py` | 1104 | Cosmos DB (Mongo + NoSQL) | +| `connectors/redis.py` | 845 | Redis (Hashset + JSON) | +| `connectors/qdrant.py` | 653 | Qdrant collection + store | +| `connectors/postgres.py` | 987 | PostgreSQL collection + store | +| `connectors/mongodb.py` | 633 | MongoDB Atlas collection + store | +| `connectors/pinecone.py` | 691 | Pinecone collection + store | +| `connectors/chroma.py` | 484 | Chroma collection + store | +| `connectors/faiss.py` | 278 | FAISS (extends InMemory) | +| `connectors/weaviate.py` | 804 | Weaviate collection + store | +| `connectors/oracle.py` | 1267 | Oracle collection + store | +| `connectors/sql_server.py` | 1132 | SQL Server collection + store | +| `connectors/ai/open_ai/services/open_ai_text_embedding.py` | 91 | OpenAI embedding impl | +| `connectors/ai/open_ai/services/open_ai_text_embedding_base.py` | 78 | OpenAI embedding base | +| `connectors/brave.py` | ~200 | Brave TextSearch impl | +| `connectors/google_search.py` | ~200 | Google TextSearch impl | + +--- + +## Implementation Phases + +### Phase 1: Core Embedding Abstractions & OpenAI Implementation +**Goal:** Establish the embedding generator abstraction and ship one working implementation. +**Mergeable:** Yes — adds new types/protocols, no breaking changes. 
+ +#### 1.1 — Embedding types in `_types.py` +- `EmbeddingInputT` TypeVar (default `str`) — generic input type for embedding generation +- `EmbeddingT` TypeVar (default `list[float]`) — generic output embedding vector type +- `Embedding[EmbeddingT]` generic class: `vector: EmbeddingT`, `model_id: str | None`, `dimensions: int | None` (explicit param or computed from vector length), `created_at: datetime | None`, `additional_properties: dict[str, Any]` +- `GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT]` generic class: list-like container of `Embedding[EmbeddingT]` objects with `options: EmbeddingOptionsT | None` (the options used to generate), `usage: dict[str, Any] | None`, `additional_properties: dict[str, Any]` +- `EmbeddingGenerationOptions` TypedDict (`total=False`): `dimensions: int`, `model_id: str` — follows the same pattern as `ChatOptions`. No `additional_properties` needed since it's a TypedDict and each implementation can extend with its own fields. + +#### 1.2 — Embedding generator protocol + base class in `_clients.py` +- `SupportsGetEmbeddings(Protocol[EmbeddingInputT, EmbeddingT, OptionsContraT])`: generic over input, output, and options (all with defaults), `get_embeddings(values: Sequence[EmbeddingInputT], *, options: OptionsContraT | None = None) -> Awaitable[GeneratedEmbeddings[EmbeddingT]]` +- `BaseEmbeddingClient(ABC, Generic[EmbeddingInputT, EmbeddingT, OptionsCoT])`: ABC base class mirroring `BaseChatClient` pattern + - `__init__` with `additional_properties`, etc. 
+ - Abstract `get_embeddings(...)` for subclasses to implement directly (no `_inner_*` indirection — simpler than chat, no middleware needed) +- `EmbeddingTelemetryLayer` in `observability.py` — MRO-based telemetry (no closure), `gen_ai.operation.name = "embeddings"` + +#### 1.3 — OpenAI embedding generator in `agent_framework/openai/` and `agent_framework/azure/` +- `RawOpenAIEmbeddingClient` — implements `get_embeddings` via `_ensure_client()` factory +- `OpenAIEmbeddingClient(OpenAIConfigMixin, EmbeddingTelemetryLayer[str, list[float], OptionsT], RawOpenAIEmbeddingClient[OptionsT])` — full client with config + telemetry layers +- `OpenAIEmbeddingOptions(EmbeddingGenerationOptions)` — extends with `encoding_format`, `user` +- `AzureOpenAIEmbeddingClient` in `agent_framework/azure/` — follows `AzureOpenAIChatClient` pattern with `AzureOpenAIConfigMixin`, `load_settings`, Entra ID credential support +- `AzureOpenAISettings` extended with `embedding_deployment_name` (env var: `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME`) + +#### 1.4 — Tests and samples +- Unit tests for types, protocol, base class, OpenAI client, Azure OpenAI client +- Integration tests for OpenAI and Azure OpenAI (gated behind `RUN_INTEGRATION_TESTS` + credentials, `@pytest.mark.flaky`) +- Samples in `samples/02-agents/embeddings/` — `openai_embeddings.py`, `azure_openai_embeddings.py` + +--- + +### Phase 2: Embedding Generators for Existing Providers +**Goal:** Add embedding generators to all existing AF provider packages that have chat clients. +**Mergeable:** Yes — each is independent, added to existing provider packages. 
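Each provider package would subclass the base client and implement `get_embeddings` directly, with no `_inner_*` indirection. A hedged sketch under assumed names — `FakeProviderEmbeddingClient` is hypothetical, the ABC is a simplified stand-in for the 1.2 design, and result types are reduced to plain lists:

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from collections.abc import Sequence
from typing import Any


class BaseEmbeddingClient(ABC):
    """Simplified stand-in for the ABC described in 1.2."""

    def __init__(self, *, additional_properties: dict[str, Any] | None = None) -> None:
        self.additional_properties = additional_properties or {}

    @abstractmethod
    async def get_embeddings(
        self, values: Sequence[str], *, options: dict[str, Any] | None = None
    ) -> list[list[float]]: ...


class FakeProviderEmbeddingClient(BaseEmbeddingClient):
    """Hypothetical provider: implements get_embeddings directly."""

    async def get_embeddings(self, values, *, options=None):
        dims = (options or {}).get("dimensions", 4)
        # A real provider would call its SDK here; this returns deterministic
        # stub vectors so the control flow is visible.
        return [[float(len(v))] * dims for v in values]


vectors = asyncio.run(FakeProviderEmbeddingClient().get_embeddings(["hi", "hello"]))
assert vectors == [[2.0, 2.0, 2.0, 2.0], [5.0, 5.0, 5.0, 5.0]]
```

The same shape repeats for each provider below; only the SDK call inside `get_embeddings` changes.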
+ +#### 2.1 — Azure AI Inference embedding (in `packages/azure-ai/`) +#### 2.2 — Ollama embedding (in `packages/ollama/`) +#### 2.3 — Anthropic embedding (in `packages/anthropic/`) +#### 2.4 — Bedrock embedding (in `packages/bedrock/`) + +--- + +### Phase 3: Core Vector Store Abstractions +**Goal:** Establish all vector store types, enums, the decorator, collection definition, and base classes. +**Mergeable:** Yes — adds new abstractions, no breaking changes. + +#### 3.1 — Vector store enums and field types in `_vectors.py` +- `FieldTypes` enum: `KEY`, `VECTOR`, `DATA` +- `IndexKind` enum: `HNSW`, `FLAT`, `IVF_FLAT`, `DISK_ANN`, `QUANTIZED_FLAT`, `DYNAMIC`, `DEFAULT` +- `DistanceFunction` enum: `COSINE_SIMILARITY`, `COSINE_DISTANCE`, `DOT_PROD`, `EUCLIDEAN_DISTANCE`, `EUCLIDEAN_SQUARED_DISTANCE`, `MANHATTAN`, `HAMMING`, `DEFAULT` +- No `SearchType` enum — use `Literal["vector", "keyword_hybrid"]` instead, per AF convention of avoiding unnecessary imports +- `VectorStoreField` plain class (not Pydantic) +- `VectorStoreCollectionDefinition` class (not Pydantic internally, but supports Pydantic models as input) +- `SearchOptions` plain class +- `SearchResponse` generic class +- `RecordFilterOptions` plain class +- `DISTANCE_FUNCTION_DIRECTION_HELPER` dict + +#### 3.2 — `@vectorstoremodel` decorator +- Port from SK, works with dataclasses, Pydantic models, plain classes, and dicts +- Sets `__vectorstoremodel__` and `__vectorstoremodel_definition__` on the class +- Remove SK-specific `kernel` prefix (`__kernel_vectorstoremodel__` → `__vectorstoremodel__`) + +#### 3.3 — Serialization/deserialization protocols +- `SerializeMethodProtocol`, `ToDictFunctionProtocol`, `FromDictFunctionProtocol`, etc. 
+- Port the record handler logic but without Pydantic base class — use plain class or ABC + +#### 3.4 — Vector store base classes in `_vectors.py` +- `VectorStoreRecordHandler` — internal base class that handles serialization/deserialization between user data models and store-specific formats, plus embedding generation for vector fields. Both `BaseVectorCollection` and `BaseVectorSearch` extend this. +- `BaseVectorCollection(VectorStoreRecordHandler)` — base for collections + - Uses `SupportsGetEmbeddings` instead of `EmbeddingGeneratorBase` + - Not a Pydantic model — use `__init__` with explicit params + - `upsert`, `get`, `delete`, `ensure_collection_exists`, `collection_exists`, `ensure_collection_deleted` + - Async context manager support +- `BaseVectorStore` — base for stores + - `get_collection`, `list_collection_names`, `collection_exists`, `ensure_collection_deleted` + - Async context manager support + +#### 3.5 — Vector search base class +- `BaseVectorSearch(VectorStoreRecordHandler)` — base for vector search + - Single `search(search_type=...)` method with `search_type: Literal["vector", "keyword_hybrid"]` parameter — no enum, just a literal + - `_inner_search` abstract method for implementations + - Filter building with lambda parser (AST-based) + - Vector generation from values using embedding generator + +#### 3.6 — Protocols for type checking +- `SupportsVectorUpsert` — Protocol for upsert/get/delete operations +- `SupportsVectorSearch` — Protocol for vector search (single `search()` with `search_type` parameter) +- No separate `SupportsVectorHybridSearch` — search type is a parameter, not a separate capability +- No protocol for `VectorStore` — it's a factory for collections, not a capability to duck-type against + +#### 3.7 — Exception types +- Add vector store exceptions under `IntegrationException` or create new branch +- `VectorStoreException`, `VectorStoreOperationException`, `VectorSearchException`, `VectorStoreModelException`, etc. 
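The decorator contract from 3.2 can be shown with a toy stand-in that models only the dunder attributes; the real decorator would build a full `VectorStoreCollectionDefinition` from field metadata, and the `Hotel` model and its field names are hypothetical:

```python
from __future__ import annotations

from dataclasses import dataclass, fields


def vectorstoremodel(cls):
    """Toy stand-in: mark the class and record a (simplified) definition."""
    cls.__vectorstoremodel__ = True
    # The real implementation stores VectorStoreField objects, not bare names.
    cls.__vectorstoremodel_definition__ = [f.name for f in fields(cls)]
    return cls


@vectorstoremodel
@dataclass
class Hotel:
    hotel_id: str           # would be a KEY field
    description: str        # would be a DATA field
    embedding: list[float]  # would be a VECTOR field


assert Hotel.__vectorstoremodel__ is True
assert Hotel.__vectorstoremodel_definition__ == ["hotel_id", "description", "embedding"]
```

Because the decorator only attaches attributes, it composes with dataclasses, Pydantic models, and plain classes alike — the property the design above relies on.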
+ +#### 3.8 — `create_search_tool` on `BaseVectorSearch` +- Method on `BaseVectorSearch` that creates an AF `FunctionTool` from the vector search +- Wraps the single `search()` method, passing `search_type` parameter +- Accepts: `name`, `description`, `search_type`, `top`, `skip`, `filter`, `string_mapper` +- The tool takes a query string, vectorizes it, searches, and returns results as strings +- Can also be a standalone factory function in `_vectors.py` + +#### 3.9 — Tests for all vector store abstractions +- Unit tests for enums, field types, collection definition +- Unit tests for decorator +- Unit tests for serialization/deserialization +- Unit tests for record handler + +--- + +### Phase 4: In-Memory Vector Store +**Goal:** Provide a zero-dependency vector store for testing and development. +**Mergeable:** Yes — first usable vector store. + +#### 4.1 — Port `InMemoryCollection` and `InMemoryStore` into core +- Place in `agent_framework/_vectors.py` (alongside the abstractions) +- Supports vector search (cosine similarity, etc.) +- No external dependencies + +#### 4.2 — Port FAISS extension (optional, can be separate package) +- Extends InMemory with FAISS indexing + +#### 4.3 — Tests and sample code + +--- + +### Phase 5: Vector Store Connectors — Tier 1 (High Priority) +**Goal:** Ship the most commonly used vector store connectors. +**Mergeable:** Yes — each connector is independent. 
+ +Each connector follows the AF package structure: +- New package under `packages/` +- Own `pyproject.toml`, `tests/`, lazy loading in core + +#### 5.1 — Azure AI Search (`packages/azure-ai-search/`) +- May extend existing package or be new +- `AzureAISearchCollection`, `AzureAISearchStore` + +#### 5.2 — Qdrant (`packages/qdrant/`) +- New package +- `QdrantCollection`, `QdrantStore` + +#### 5.3 — Redis (`packages/redis/`) +- May extend existing redis package +- `RedisCollection` (JSON + Hashset variants), `RedisStore` + +#### 5.4 — PostgreSQL/pgvector (`packages/postgres/`) +- New package +- `PostgresCollection`, `PostgresStore` + +--- + +### Phase 6: Vector Store Connectors — Tier 2 +**Goal:** Ship remaining vector store connectors. +**Mergeable:** Yes — each connector is independent. + +#### 6.1 — MongoDB Atlas (`packages/mongodb/`) +#### 6.2 — Azure Cosmos DB (`packages/azure-cosmos-db/`) +- Cosmos Mongo + Cosmos NoSQL +#### 6.3 — Pinecone (`packages/pinecone/`) +#### 6.4 — Chroma (`packages/chroma/`) +#### 6.5 — Weaviate (`packages/weaviate/`) + +--- + +### Phase 7: Vector Store Connectors — Tier 3 +**Goal:** Ship niche or less common connectors. +**Mergeable:** Yes — each connector is independent. + +#### 7.1 — Oracle (`packages/oracle/`) +#### 7.2 — SQL Server (`packages/sql-server/`) +#### 7.3 — FAISS (`packages/faiss/` or in core extending InMemory) + +--- + +### Phase 8: Vector Store CRUD Tools +**Goal:** Provide a full set of agent-usable tools for CRUD operations on vector store collections. +**Mergeable:** Yes — adds tools without changing existing APIs. 
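The agent-usable CRUD tools this phase describes could reduce to thin async wrappers over collection methods. A hypothetical sketch of the key-based get tool — the factory signature, `FakeCollection`, and the string rendering are all assumptions for illustration:

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable


def create_get_tool(collection) -> Callable[[str], Awaitable[str]]:
    """Hypothetical factory: wrap key-based lookup as an agent-callable tool."""

    async def get_record(key: str) -> str:
        record = await collection.get(key)
        return "not found" if record is None else str(record)

    return get_record


class FakeCollection:
    """Stand-in collection with an async key-based get()."""

    def __init__(self, data: dict[str, dict]) -> None:
        self._data = data

    async def get(self, key: str):
        return self._data.get(key)


tool = create_get_tool(FakeCollection({"h1": {"name": "Grand Hotel"}}))
assert asyncio.run(tool("h1")) == "{'name': 'Grand Hotel'}"
```

In AF proper the factory would return a `FunctionTool` rather than a bare coroutine function, so the agent runtime can expose name, description, and schema.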
+ +#### 8.1 — `create_upsert_tool` — tool for upserting records into a collection +#### 8.2 — `create_get_tool` — tool for retrieving records by key +- Key-based lookup only (by primary key), not a search tool +- Documentation must clearly distinguish this from `create_search_tool`: get_tool retrieves specific records by their known key, while search_tool performs similarity/filtered search across the collection +- Consider if this overlaps with filtered search and document when to use which +#### 8.3 — `create_delete_tool` — tool for deleting records by key +#### 8.4 — Tests and samples for CRUD tools + +--- + +### Phase 9: Additional Embedding Implementations (New Providers) +**Goal:** Provide embedding generators for providers that don't yet have AF packages. +**Mergeable:** Yes — each is independent, new packages. + +#### 9.1 — HuggingFace/ONNX embedding (new package or lab) +#### 9.2 — Mistral AI embedding (new package) +#### 9.3 — Google AI / Vertex AI embedding (new package) +#### 9.4 — Nvidia embedding (new package) + +--- + +### Phase 10: TextSearch Abstractions & Implementations (Separate Work) +**Goal:** Port text search (non-vector) abstractions and implementations. +**Mergeable:** Yes — independent of vector stores. + +#### 10.1 — TextSearch base class and types +- `SearchOptions`, `SearchResponse`, `TextSearchResult` +- `TextSearch` base class with `search()` method +- `create_search_function()` for kernel integration (may need AF equivalent) + +#### 10.2 — Brave Search implementation +#### 10.3 — Google Search implementation +#### 10.4 — Vector store text search bridge (connecting VectorSearch to TextSearch interface) + +--- + +## Key Considerations + +1. **No Pydantic for internal classes**: All AF internal classes should use plain classes. Pydantic is only used for user-facing input validation (e.g., vector store data models). + +2. 
**Protocol + Base class**: Follow AF's pattern of both a `Protocol` for duck-typing and a `Base` ABC for implementation, matching how `SupportsChatGetResponse` + `BaseChatClient` works. + +3. **Exception hierarchy**: Use AF's `IntegrationException` branch for vector store operations, since vector stores are external dependencies. + +4. **`from __future__ import annotations`**: Required in all files per AF coding standard. + +5. **No `**kwargs` escape hatches in public APIs**: For user-facing interfaces, use explicit named parameters per AF coding standard. Internal implementation details (e.g., cooperative multiple inheritance / MRO patterns) may use `**kwargs` where necessary, as long as they are not exposed in public signatures. + +6. **Lazy loading**: Connector packages use `__getattr__` lazy loading in core provider folders. + +7. **Reusable data models**: The `@vectorstoremodel` decorator and `VectorStoreCollectionDefinition` should be agnostic enough to work with both SK and AF. The core types (`FieldTypes`, `IndexKind`, `DistanceFunction`, `VectorStoreField`) should be identical or easily mapped. + +8. **`create_search_tool`**: The AF-native equivalent of SK's `create_search_function`. Instead of creating a `KernelFunction`, this creates an AF `FunctionTool` (via the `@tool` decorator pattern) from a vector search. This allows agents to use vector search as a tool during conversations. Design: + - `create_search_tool(name, description, search_type, ...)` → returns a `FunctionTool` that wraps `VectorSearch.search(search_type=...)` + - The tool accepts a query string, performs embedding + vector search, and returns results as strings + - Supports configurable string mappers, filter functions, top/skip defaults + - Lives in `_vectors.py` as a method on `BaseVectorSearch` and/or as a standalone factory function + +9. **CRUD tools**: A full set of create/read/update/delete tools for vector store collections, allowing agents to manage data in vector stores. 
Design: + - `create_upsert_tool(...)` → tool for upserting records + - `create_get_tool(...)` → tool for retrieving records by key + - `create_delete_tool(...)` → tool for deleting records + - These are separate from search and are placed in a later phase diff --git a/python/packages/core/agent_framework/__init__.py b/python/packages/core/agent_framework/__init__.py index eaa149d749..bfa684b469 100644 --- a/python/packages/core/agent_framework/__init__.py +++ b/python/packages/core/agent_framework/__init__.py @@ -20,9 +20,11 @@ from ._agents import Agent, BaseAgent, RawAgent, SupportsAgentRun from ._clients import ( BaseChatClient, + BaseEmbeddingClient, SupportsChatGetResponse, SupportsCodeInterpreterTool, SupportsFileSearchTool, + SupportsGetEmbeddings, SupportsImageGenerationTool, SupportsMCPTool, SupportsWebSearchTool, @@ -82,9 +84,14 @@ ChatResponseUpdate, Content, ContinuationToken, + Embedding, + EmbeddingGenerationOptions, + EmbeddingInputT, + EmbeddingT, FinalT, FinishReason, FinishReasonLiteral, + GeneratedEmbeddings, Message, OuterFinalT, OuterUpdateT, @@ -201,6 +208,7 @@ "BaseAgent", "BaseChatClient", "BaseContextProvider", + "BaseEmbeddingClient", "BaseHistoryProvider", "Case", "ChatAndFunctionMiddlewareTypes", @@ -218,6 +226,10 @@ "Edge", "EdgeCondition", "EdgeDuplicationError", + "Embedding", + "EmbeddingGenerationOptions", + "EmbeddingInputT", + "EmbeddingT", "Executor", "FanInEdgeGroup", "FanOutEdgeGroup", @@ -232,6 +244,7 @@ "FunctionMiddleware", "FunctionMiddlewareTypes", "FunctionTool", + "GeneratedEmbeddings", "GraphConnectivityError", "InMemoryCheckpointStorage", "InMemoryHistoryProvider", @@ -261,6 +274,7 @@ "SupportsChatGetResponse", "SupportsCodeInterpreterTool", "SupportsFileSearchTool", + "SupportsGetEmbeddings", "SupportsImageGenerationTool", "SupportsMCPTool", "SupportsWebSearchTool", diff --git a/python/packages/core/agent_framework/_clients.py b/python/packages/core/agent_framework/_clients.py index b407be11cf..96e8dc0b6a 100644 --- 
a/python/packages/core/agent_framework/_clients.py +++ b/python/packages/core/agent_framework/_clients.py @@ -35,6 +35,10 @@ from ._types import ( ChatResponse, ChatResponseUpdate, + EmbeddingGenerationOptions, + EmbeddingInputT, + EmbeddingT, + GeneratedEmbeddings, Message, ResponseStream, validate_chat_options, @@ -56,7 +60,6 @@ InputT = TypeVar("InputT", contravariant=True) -EmbeddingT = TypeVar("EmbeddingT") BaseChatClientT = TypeVar("BaseChatClientT", bound="BaseChatClient") logger = logging.getLogger("agent_framework") @@ -660,3 +663,141 @@ def get_file_search_tool(**kwargs: Any) -> Any: # endregion + + +# region SupportsGetEmbeddings Protocol + +# Contravariant/covariant TypeVars for the Protocol +EmbeddingInputContraT = TypeVar( + "EmbeddingInputContraT", + default="str", + contravariant=True, +) +EmbeddingCoT = TypeVar( + "EmbeddingCoT", + default="list[float]", + covariant=True, +) +EmbeddingOptionsContraT = TypeVar( + "EmbeddingOptionsContraT", + bound=TypedDict, # type: ignore[valid-type] + default="EmbeddingGenerationOptions", + contravariant=True, +) + + +@runtime_checkable +class SupportsGetEmbeddings(Protocol[EmbeddingInputContraT, EmbeddingCoT, EmbeddingOptionsContraT]): + """Protocol for an embedding client that can generate embeddings. + + This protocol enables duck-typing for embedding generation. Any class that + implements ``get_embeddings`` with a compatible signature satisfies this protocol. + + Generic over the input type (defaults to ``str``), output embedding type + (defaults to ``list[float]``), and options type. + + Examples: + ..
code-block:: python + + from agent_framework import SupportsGetEmbeddings + + + async def use_embeddings(client: SupportsGetEmbeddings) -> None: + result = await client.get_embeddings(["Hello, world!"]) + for embedding in result: + print(embedding.vector) + """ + + additional_properties: dict[str, Any] + + def get_embeddings( + self, + values: Sequence[EmbeddingInputContraT], + *, + options: EmbeddingOptionsContraT | None = None, + ) -> Awaitable[GeneratedEmbeddings[EmbeddingCoT]]: + """Generate embeddings for the given values. + + Args: + values: The values to generate embeddings for. + options: Optional embedding generation options. + + Returns: + Generated embeddings with metadata. + """ + ... + + +# endregion + + +# region BaseEmbeddingClient + +# Covariant for the BaseEmbeddingClient +EmbeddingOptionsCoT = TypeVar( + "EmbeddingOptionsCoT", + bound=TypedDict, # type: ignore[valid-type] + default="EmbeddingGenerationOptions", + covariant=True, +) + + +class BaseEmbeddingClient(SerializationMixin, ABC, Generic[EmbeddingInputT, EmbeddingT, EmbeddingOptionsCoT]): + """Abstract base class for embedding clients. + + Subclasses implement ``get_embeddings`` to provide the actual + embedding generation logic. + + Generic over the input type (defaults to ``str``), output embedding type + (defaults to ``list[float]``), and options type. + + Examples: + .. code-block:: python + + from agent_framework import BaseEmbeddingClient, Embedding, GeneratedEmbeddings + from collections.abc import Sequence + + + class CustomEmbeddingClient(BaseEmbeddingClient): + async def get_embeddings(self, values, *, options=None): + return GeneratedEmbeddings([Embedding(vector=[0.1, 0.2, 0.3]) for _ in values]) + """ + + OTEL_PROVIDER_NAME: ClassVar[str] = "unknown" + DEFAULT_EXCLUDE: ClassVar[set[str]] = {"additional_properties"} + + def __init__( + self, + *, + additional_properties: dict[str, Any] | None = None, + **kwargs: Any, + ) -> None: + """Initialize a BaseEmbeddingClient instance. 
+
+        Args:
+            additional_properties: Additional properties to pass to the client.
+            **kwargs: Additional keyword arguments passed to parent classes (for MRO).
+        """
+        self.additional_properties = additional_properties or {}
+        super().__init__(**kwargs)
+
+    @abstractmethod
+    async def get_embeddings(
+        self,
+        values: Sequence[EmbeddingInputT],
+        *,
+        options: EmbeddingOptionsCoT | None = None,
+    ) -> GeneratedEmbeddings[EmbeddingT]:
+        """Generate embeddings for the given values.
+
+        Args:
+            values: The values to generate embeddings for.
+            options: Optional embedding generation options.
+
+        Returns:
+            Generated embeddings with metadata.
+        """
+        ...
+
+
+# endregion
diff --git a/python/packages/core/agent_framework/_types.py b/python/packages/core/agent_framework/_types.py
index a699a30f5f..77063902d2 100644
--- a/python/packages/core/agent_framework/_types.py
+++ b/python/packages/core/agent_framework/_types.py
@@ -8,8 +8,18 @@
 import re
 import sys
 from asyncio import iscoroutine
-from collections.abc import AsyncIterable, AsyncIterator, Awaitable, Callable, Mapping, MutableMapping, Sequence
+from collections.abc import (
+    AsyncIterable,
+    AsyncIterator,
+    Awaitable,
+    Callable,
+    Iterable,
+    Mapping,
+    MutableMapping,
+    Sequence,
+)
 from copy import deepcopy
+from datetime import datetime
 from typing import TYPE_CHECKING, Any, ClassVar, Final, Generic, Literal, NewType, cast, overload
 
 from pydantic import BaseModel
@@ -272,7 +282,8 @@ def _serialize_value(value: Any, exclude_none: bool) -> Any:
 # region Constants and types
 
 _T = TypeVar("_T")
-EmbeddingT = TypeVar("EmbeddingT")
+EmbeddingT = TypeVar("EmbeddingT", default="list[float]")
+EmbeddingInputT = TypeVar("EmbeddingInputT", default="str")
 ChatResponseT = TypeVar("ChatResponseT", bound="ChatResponse")
 ToolModeT = TypeVar("ToolModeT", bound="ToolMode")
 AgentResponseT = TypeVar("AgentResponseT", bound="AgentResponse")
@@ -3158,3 +3169,129 @@ def merge_chat_options(
             result[key] = value
 
     return result
+
+
+# region Embedding Types
+
+
+class EmbeddingGenerationOptions(TypedDict, total=False):
+    """Common request settings for embedding generation.
+
+    All fields are optional (total=False) to allow partial specification.
+    Provider-specific TypedDicts extend this with additional options.
+
+    Examples:
+        .. code-block:: python
+
+            from agent_framework import EmbeddingGenerationOptions
+
+            options: EmbeddingGenerationOptions = {
+                "model_id": "text-embedding-3-small",
+                "dimensions": 1536,
+            }
+    """
+
+    model_id: str
+    dimensions: int
+
+
+class Embedding(Generic[EmbeddingT]):
+    """A single embedding vector with metadata.
+
+    Generic over the embedding vector type, e.g. ``Embedding[list[float]]``,
+    ``Embedding[list[int]]``, or ``Embedding[bytes]``.
+
+    Args:
+        vector: The embedding vector data.
+        model_id: The model used to generate this embedding.
+        dimensions: Explicit dimension count (computed from vector length if omitted).
+        created_at: Timestamp of when the embedding was generated.
+        additional_properties: Additional metadata.
+
+    Examples:
+        .. code-block:: python
+
+            from agent_framework import Embedding
+
+            embedding = Embedding(
+                vector=[0.1, 0.2, 0.3],
+                model_id="text-embedding-3-small",
+            )
+            assert embedding.dimensions == 3
+    """
+
+    def __init__(
+        self,
+        vector: EmbeddingT,
+        *,
+        model_id: str | None = None,
+        dimensions: int | None = None,
+        created_at: datetime | None = None,
+        additional_properties: dict[str, Any] | None = None,
+    ) -> None:
+        self.vector = vector
+        self._dimensions = dimensions
+        self.model_id = model_id
+        self.created_at = created_at
+        self.additional_properties = additional_properties or {}
+
+    @property
+    def dimensions(self) -> int | None:
+        """Return the number of dimensions in the embedding vector.
+
+        Uses the explicitly provided value if set, otherwise computes from vector length.
+        """
+        if self._dimensions is not None:
+            return self._dimensions
+        if isinstance(self.vector, (list, tuple, bytes)):
+            return len(self.vector)
+        return None
+
+
+EmbeddingOptionsT = TypeVar(
+    "EmbeddingOptionsT",
+    bound=TypedDict,  # type: ignore[valid-type]
+    default="EmbeddingGenerationOptions",
+)
+
+
+class GeneratedEmbeddings(list[Embedding[EmbeddingT]], Generic[EmbeddingT, EmbeddingOptionsT]):
+    """A list of generated embeddings with usage metadata.
+
+    Extends list for direct iteration and indexing.
+    Generic over both the embedding vector type and the options type used for generation.
+
+    Args:
+        embeddings: Sequence of Embedding objects.
+        options: The options used to generate these embeddings.
+        usage: Token usage information (e.g. prompt_tokens, total_tokens).
+        additional_properties: Additional metadata.
+
+    Examples:
+        .. code-block:: python
+
+            from agent_framework import Embedding, GeneratedEmbeddings
+
+            embeddings = GeneratedEmbeddings(
+                [Embedding(vector=[0.1, 0.2]), Embedding(vector=[0.3, 0.4])],
+                usage={"prompt_tokens": 10, "total_tokens": 10},
+            )
+            assert len(embeddings) == 2
+            assert embeddings.usage["prompt_tokens"] == 10
+    """
+
+    def __init__(
+        self,
+        embeddings: Iterable[Embedding[EmbeddingT]] | None = None,
+        *,
+        options: EmbeddingOptionsT | None = None,
+        usage: dict[str, Any] | None = None,
+        additional_properties: dict[str, Any] | None = None,
+    ) -> None:
+        super().__init__(embeddings or [])
+        self.options = options
+        self.usage = usage
+        self.additional_properties = additional_properties or {}
+
+
+# endregion
diff --git a/python/packages/core/agent_framework/azure/__init__.py b/python/packages/core/agent_framework/azure/__init__.py
index a485ee7aa7..f525e7a33e 100644
--- a/python/packages/core/agent_framework/azure/__init__.py
+++ b/python/packages/core/agent_framework/azure/__init__.py
@@ -36,6 +36,7 @@
     "AzureOpenAIAssistantsOptions": ("agent_framework.azure._assistants_client", "agent-framework-core"),
     "AzureOpenAIChatClient": ("agent_framework.azure._chat_client", "agent-framework-core"),
     "AzureOpenAIChatOptions": ("agent_framework.azure._chat_client", "agent-framework-core"),
+    "AzureOpenAIEmbeddingClient": ("agent_framework.azure._embedding_client", "agent-framework-core"),
     "AzureOpenAIResponsesClient": ("agent_framework.azure._responses_client", "agent-framework-core"),
     "AzureOpenAIResponsesOptions": ("agent_framework.azure._responses_client", "agent-framework-core"),
     "AzureOpenAISettings": ("agent_framework.azure._shared", "agent-framework-core"),
diff --git a/python/packages/core/agent_framework/azure/__init__.pyi b/python/packages/core/agent_framework/azure/__init__.pyi
index 4d6e3b914c..238411f8d7 100644
--- a/python/packages/core/agent_framework/azure/__init__.pyi
+++ b/python/packages/core/agent_framework/azure/__init__.pyi
@@ -21,6 +21,7 @@ from agent_framework_durabletask import (
 )
 from agent_framework.azure._assistants_client import AzureOpenAIAssistantsClient
 from agent_framework.azure._chat_client import AzureOpenAIChatClient
+from agent_framework.azure._embedding_client import AzureOpenAIEmbeddingClient
 from agent_framework.azure._entra_id_authentication import AzureCredentialTypes, AzureTokenProvider
 from agent_framework.azure._responses_client import AzureOpenAIResponsesClient
 from agent_framework.azure._shared import AzureOpenAISettings
@@ -40,6 +41,7 @@ __all__ = [
     "AzureCredentialTypes",
     "AzureOpenAIAssistantsClient",
     "AzureOpenAIChatClient",
+    "AzureOpenAIEmbeddingClient",
     "AzureOpenAIResponsesClient",
     "AzureOpenAISettings",
     "AzureTokenProvider",
diff --git a/python/packages/core/agent_framework/azure/_embedding_client.py b/python/packages/core/agent_framework/azure/_embedding_client.py
new file mode 100644
index 0000000000..05d6b5d603
--- /dev/null
+++ b/python/packages/core/agent_framework/azure/_embedding_client.py
@@ -0,0 +1,136 @@
+# Copyright (c) Microsoft. All rights reserved.
+
+from __future__ import annotations
+
+import sys
+from collections.abc import Mapping
+from typing import Generic
+
+from openai.lib.azure import AsyncAzureOpenAI
+
+from agent_framework.observability import EmbeddingTelemetryLayer
+from agent_framework.openai import OpenAIEmbeddingOptions
+from agent_framework.openai._embedding_client import RawOpenAIEmbeddingClient
+
+from .._settings import load_settings
+from ._entra_id_authentication import AzureCredentialTypes, AzureTokenProvider
+from ._shared import (
+    AzureOpenAIConfigMixin,
+    AzureOpenAISettings,
+    _apply_azure_defaults,
+)
+
+if sys.version_info >= (3, 13):
+    from typing import TypeVar  # type: ignore # pragma: no cover
+else:
+    from typing_extensions import TypeVar  # type: ignore # pragma: no cover
+if sys.version_info >= (3, 11):
+    from typing import TypedDict  # type: ignore # pragma: no cover
+else:
+    from typing_extensions import TypedDict  # type: ignore # pragma: no cover
+
+
+AzureOpenAIEmbeddingOptionsT = TypeVar(
+    "AzureOpenAIEmbeddingOptionsT",
+    bound=TypedDict,  # type: ignore[valid-type]
+    default="OpenAIEmbeddingOptions",
+    covariant=True,
+)
+
+
+class AzureOpenAIEmbeddingClient(
+    AzureOpenAIConfigMixin,
+    EmbeddingTelemetryLayer[str, list[float], AzureOpenAIEmbeddingOptionsT],
+    RawOpenAIEmbeddingClient[AzureOpenAIEmbeddingOptionsT],
+    Generic[AzureOpenAIEmbeddingOptionsT],
+):
+    """Azure OpenAI embedding client with telemetry support.
+
+    Keyword Args:
+        api_key: The API key. If provided, will override the value in the env vars or .env file.
+            Can also be set via environment variable AZURE_OPENAI_API_KEY.
+        deployment_name: The deployment name. If provided, will override the value
+            (embedding_deployment_name) in the env vars or .env file.
+            Can also be set via environment variable AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME.
+        endpoint: The deployment endpoint.
+            Can also be set via environment variable AZURE_OPENAI_ENDPOINT.
+        base_url: The deployment base URL.
+            Can also be set via environment variable AZURE_OPENAI_BASE_URL.
+        api_version: The deployment API version.
+            Can also be set via environment variable AZURE_OPENAI_API_VERSION.
+        token_endpoint: The token endpoint to request an Azure token.
+            Can also be set via environment variable AZURE_OPENAI_TOKEN_ENDPOINT.
+        credential: Azure credential or token provider for authentication.
+        default_headers: Default headers for HTTP requests.
+        async_client: An existing client to use.
+        env_file_path: Path to .env file for settings.
+        env_file_encoding: Encoding for .env file.
+
+    Examples:
+        .. code-block:: python
+
+            from agent_framework.azure import AzureOpenAIEmbeddingClient
+
+            # Using environment variables
+            # Set AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com
+            # Set AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME=text-embedding-3-small
+            # Set AZURE_OPENAI_API_KEY=your-key
+            client = AzureOpenAIEmbeddingClient()
+
+            # Or passing parameters directly
+            client = AzureOpenAIEmbeddingClient(
+                endpoint="https://your-endpoint.openai.azure.com",
+                deployment_name="text-embedding-3-small",
+                api_key="your-key",
+            )
+
+            result = await client.get_embeddings(["Hello, world!"])
+    """
+
+    def __init__(
+        self,
+        *,
+        api_key: str | None = None,
+        deployment_name: str | None = None,
+        endpoint: str | None = None,
+        base_url: str | None = None,
+        api_version: str | None = None,
+        token_endpoint: str | None = None,
+        credential: AzureCredentialTypes | AzureTokenProvider | None = None,
+        default_headers: Mapping[str, str] | None = None,
+        async_client: AsyncAzureOpenAI | None = None,
+        env_file_path: str | None = None,
+        env_file_encoding: str | None = None,
+    ) -> None:
+        """Initialize an Azure OpenAI embedding client."""
+        azure_openai_settings = load_settings(
+            AzureOpenAISettings,
+            env_prefix="AZURE_OPENAI_",
+            api_key=api_key,
+            base_url=base_url,
+            endpoint=endpoint,
+            embedding_deployment_name=deployment_name,
+            api_version=api_version,
+            env_file_path=env_file_path,
+            env_file_encoding=env_file_encoding,
+            token_endpoint=token_endpoint,
+        )
+        _apply_azure_defaults(azure_openai_settings)
+
+        if not azure_openai_settings.get("embedding_deployment_name"):
+            raise ValueError(
+                "Azure OpenAI embedding deployment name is required. Set via 'deployment_name' parameter "
+                "or 'AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME' environment variable."
+            )
+
+        super().__init__(
+            deployment_name=azure_openai_settings["embedding_deployment_name"],  # type: ignore[arg-type]
+            endpoint=azure_openai_settings["endpoint"],
+            base_url=azure_openai_settings["base_url"],
+            api_version=azure_openai_settings["api_version"],  # type: ignore
+            api_key=azure_openai_settings["api_key"].get_secret_value() if azure_openai_settings["api_key"] else None,
+            token_endpoint=azure_openai_settings["token_endpoint"],
+            credential=credential,
+            default_headers=default_headers,
+            client=async_client,
+        )
diff --git a/python/packages/core/agent_framework/azure/_shared.py b/python/packages/core/agent_framework/azure/_shared.py
index 732de8281e..dce116a242 100644
--- a/python/packages/core/agent_framework/azure/_shared.py
+++ b/python/packages/core/agent_framework/azure/_shared.py
@@ -53,6 +53,8 @@ class AzureOpenAISettings(TypedDict, total=False):
             Resource Management > Deployments in the Azure portal or, alternatively,
             under Management > Deployments in Azure AI Foundry.
             Can be set via environment variable AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME.
+        embedding_deployment_name: The name of the Azure Embedding deployment.
+            Can be set via environment variable AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME.
         api_key: The API key for the Azure deployment. This value can be found in the
             Keys & Endpoint section when examining your resource in the Azure portal.
             You can use either KEY1 or KEY2.
@@ -95,6 +97,7 @@ class AzureOpenAISettings(TypedDict, total=False):
     chat_deployment_name: str | None
     responses_deployment_name: str | None
+    embedding_deployment_name: str | None
     endpoint: str | None
     base_url: str | None
     api_key: SecretString | None
diff --git a/python/packages/core/agent_framework/observability.py b/python/packages/core/agent_framework/observability.py
index 8f581a605d..701970d9d5 100644
--- a/python/packages/core/agent_framework/observability.py
+++ b/python/packages/core/agent_framework/observability.py
@@ -38,6 +38,10 @@
 else:
     from typing_extensions import TypeVar  # type: ignore # pragma: no cover
 
+# Defined here to avoid circular import with _types.py
+EmbeddingInputT = TypeVar("EmbeddingInputT", default="str")
+EmbeddingT = TypeVar("EmbeddingT", default="list[float]")
+
 if TYPE_CHECKING:  # pragma: no cover
     from opentelemetry.sdk._logs.export import LogRecordExporter
     from opentelemetry.sdk.metrics.export import MetricExporter
@@ -59,7 +63,9 @@
     ChatResponse,
     ChatResponseUpdate,
     Content,
+    EmbeddingGenerationOptions,
     FinishReason,
+    GeneratedEmbeddings,
     Message,
     ResponseStream,
 )
@@ -70,6 +76,7 @@
     "OBSERVABILITY_SETTINGS",
     "AgentTelemetryLayer",
     "ChatTelemetryLayer",
+    "EmbeddingTelemetryLayer",
     "OtelAttr",
    "configure_otel_providers",
    "create_metric_views",
@@ -251,6 +258,7 @@ class OtelAttr(str, Enum):
     # Operation names
     CHAT_COMPLETION_OPERATION = "chat"
+    EMBEDDING_OPERATION = "embeddings"
     TOOL_EXECUTION_OPERATION = "execute_tool"
     # Describes GenAI agent creation and is usually applicable when working with remote agent services.
     AGENT_CREATE_OPERATION = "create_agent"
@@ -1265,6 +1273,70 @@
     return _get_response()
 
 
+EmbeddingOptionsCoT = TypeVar(
+    "EmbeddingOptionsCoT",
+    bound=TypedDict,  # type: ignore[valid-type]
+    default="EmbeddingGenerationOptions",
+    covariant=True,
+)
+
+
+class EmbeddingTelemetryLayer(Generic[EmbeddingInputT, EmbeddingT, EmbeddingOptionsCoT]):
+    """Layer that wraps embedding client get_embeddings with OpenTelemetry tracing."""
+
+    def __init__(self, *args: Any, otel_provider_name: str | None = None, **kwargs: Any) -> None:
+        """Initialize telemetry attributes and histograms."""
+        super().__init__(*args, **kwargs)
+        self.token_usage_histogram = _get_token_usage_histogram()
+        self.duration_histogram = _get_duration_histogram()
+        self.otel_provider_name = otel_provider_name or getattr(self, "OTEL_PROVIDER_NAME", "unknown")
+
+    async def get_embeddings(
+        self,
+        values: Sequence[EmbeddingInputT],
+        *,
+        options: EmbeddingOptionsCoT | None = None,
+    ) -> GeneratedEmbeddings[EmbeddingT]:
+        """Trace embedding generation with OpenTelemetry spans and metrics."""
+        global OBSERVABILITY_SETTINGS
+        super_get_embeddings = super().get_embeddings  # type: ignore[misc]
+
+        if not OBSERVABILITY_SETTINGS.ENABLED:
+            return await super_get_embeddings(values, options=options)  # type: ignore[no-any-return]
+
+        opts: dict[str, Any] = options or {}  # type: ignore[assignment]
+        provider_name = str(self.otel_provider_name)
+        model_id = opts.get("model_id") or getattr(self, "model_id", None) or "unknown"
+        service_url_func = getattr(self, "service_url", None)
+        service_url = str(service_url_func() if callable(service_url_func) else "unknown")
+        attributes = _get_span_attributes(
+            operation_name=OtelAttr.EMBEDDING_OPERATION,
+            provider_name=provider_name,
+            model=model_id,
+            service_url=service_url,
+        )
+
+        with _get_span(attributes=attributes, span_name_attribute=SpanAttributes.LLM_REQUEST_MODEL) as span:
+            start_time_stamp = perf_counter()
+            try:
+                result = await super_get_embeddings(values, options=options)
+            except Exception as exception:
+                capture_exception(span=span, exception=exception, timestamp=time_ns())
+                raise
+            duration = perf_counter() - start_time_stamp
+            response_attributes: dict[str, Any] = {**attributes}
+            if result.usage and "prompt_tokens" in result.usage:
+                response_attributes[OtelAttr.INPUT_TOKENS] = result.usage["prompt_tokens"]
+            _capture_response(
+                span=span,
+                attributes=response_attributes,
+                token_usage_histogram=self.token_usage_histogram,
+                operation_duration_histogram=self.duration_histogram,
+                duration=duration,
+            )
+            return result  # type: ignore[no-any-return]
+
+
 class AgentTelemetryLayer:
     """Layer that wraps agent run with OpenTelemetry tracing."""
diff --git a/python/packages/core/agent_framework/openai/__init__.py b/python/packages/core/agent_framework/openai/__init__.py
index 2d9cf09648..a3fe1fe8f6 100644
--- a/python/packages/core/agent_framework/openai/__init__.py
+++ b/python/packages/core/agent_framework/openai/__init__.py
@@ -19,6 +19,7 @@
     OpenAIAssistantsOptions,
 )
 from ._chat_client import OpenAIChatClient, OpenAIChatOptions
+from ._embedding_client import OpenAIEmbeddingClient, OpenAIEmbeddingOptions
 from ._exceptions import ContentFilterResultSeverity, OpenAIContentFilterException
 from ._responses_client import (
     OpenAIContinuationToken,
@@ -38,6 +39,8 @@
     "OpenAIChatOptions",
     "OpenAIContentFilterException",
     "OpenAIContinuationToken",
+    "OpenAIEmbeddingClient",
+    "OpenAIEmbeddingOptions",
     "OpenAIResponsesClient",
     "OpenAIResponsesOptions",
     "OpenAISettings",
diff --git a/python/packages/core/agent_framework/openai/_embedding_client.py b/python/packages/core/agent_framework/openai/_embedding_client.py
new file mode 100644
index 0000000000..a11557774e
--- /dev/null
+++ b/python/packages/core/agent_framework/openai/_embedding_client.py
@@ -0,0 +1,204 @@
+# Copyright (c) Microsoft. All rights reserved.
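When `encoding_format="base64"` is requested, the client in this new file decodes the response payload as packed little-endian IEEE 754 floats via `struct`. A self-contained sketch of that decode step, outside the patch (values chosen to be exactly representable in float32 so the round-trip is exact):

```python
import base64
import struct


def decode_base64_embedding(payload: str) -> list[float]:
    # The payload packs each component as a 4-byte little-endian float32.
    raw = base64.b64decode(payload)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Round-trip three exactly-representable floats through the encoding.
encoded = base64.b64encode(struct.pack("<3f", 0.5, -1.25, 2.0)).decode("ascii")
print(decode_base64_embedding(encoded))  # [0.5, -1.25, 2.0]
```

In `RawOpenAIEmbeddingClient.get_embeddings` below, the same `struct.unpack` call converts the payload whenever `isinstance(item.embedding, str)`.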
+
+from __future__ import annotations
+
+import base64
+import struct
+import sys
+from collections.abc import Awaitable, Callable, Mapping, Sequence
+from typing import Any, Generic, Literal, TypedDict
+
+from openai import AsyncOpenAI
+
+from .._clients import BaseEmbeddingClient
+from .._settings import load_settings
+from .._types import Embedding, EmbeddingGenerationOptions, GeneratedEmbeddings
+from ..observability import EmbeddingTelemetryLayer
+from ._shared import OpenAIBase, OpenAIConfigMixin, OpenAISettings
+
+if sys.version_info >= (3, 13):
+    from typing import TypeVar  # type: ignore # pragma: no cover
+else:
+    from typing_extensions import TypeVar  # type: ignore # pragma: no cover
+
+
+class OpenAIEmbeddingOptions(EmbeddingGenerationOptions, total=False):
+    """OpenAI-specific embedding options.
+
+    Extends EmbeddingGenerationOptions with OpenAI-specific fields.
+
+    Examples:
+        .. code-block:: python
+
+            from agent_framework.openai import OpenAIEmbeddingOptions
+
+            options: OpenAIEmbeddingOptions = {
+                "model_id": "text-embedding-3-small",
+                "dimensions": 1536,
+                "encoding_format": "float",
+            }
+    """
+
+    encoding_format: Literal["float", "base64"]
+    user: str
+
+
+OpenAIEmbeddingOptionsT = TypeVar(
+    "OpenAIEmbeddingOptionsT",
+    bound=TypedDict,  # type: ignore[valid-type]
+    default="OpenAIEmbeddingOptions",
+    covariant=True,
+)
+
+
+class RawOpenAIEmbeddingClient(
+    OpenAIBase,
+    BaseEmbeddingClient[str, list[float], OpenAIEmbeddingOptionsT],
+    Generic[OpenAIEmbeddingOptionsT],
+):
+    """Raw OpenAI embedding client without telemetry."""
+
+    async def get_embeddings(
+        self,
+        values: Sequence[str],
+        *,
+        options: OpenAIEmbeddingOptionsT | None = None,
+    ) -> GeneratedEmbeddings[list[float]]:
+        """Call the OpenAI embeddings API.
+
+        Args:
+            values: The text values to generate embeddings for.
+            options: Optional embedding generation options.
+
+        Returns:
+            Generated embeddings with usage metadata.
+
+        Raises:
+            ValueError: If model_id is not provided.
+        """
+        opts: dict[str, Any] = dict(options) if options else {}
+        model = opts.get("model_id") or self.model_id
+        if not model:
+            raise ValueError("model_id is required")
+
+        kwargs: dict[str, Any] = {"input": list(values), "model": model}
+        if dimensions := opts.get("dimensions"):
+            kwargs["dimensions"] = dimensions
+        if encoding_format := opts.get("encoding_format"):
+            kwargs["encoding_format"] = encoding_format
+        if user := opts.get("user"):
+            kwargs["user"] = user
+
+        response = await (await self._ensure_client()).embeddings.create(**kwargs)
+
+        encoding = kwargs.get("encoding_format", "float")
+        embeddings: list[Embedding[list[float]]] = []
+        for item in response.data:
+            vector: list[float]
+            if encoding == "base64" and isinstance(item.embedding, str):
+                # Decode base64-encoded floats (little-endian IEEE 754)
+                raw = base64.b64decode(item.embedding)
+                vector = list(struct.unpack(f"<{len(raw) // 4}f", raw))
+            else:
+                vector = item.embedding  # type: ignore[assignment]
+            embeddings.append(
+                Embedding(
+                    vector=vector,
+                    dimensions=len(vector),
+                    model_id=response.model,
+                )
+            )
+
+        usage_dict: dict[str, Any] | None = None
+        if response.usage:
+            usage_dict = {
+                "prompt_tokens": response.usage.prompt_tokens,
+                "total_tokens": response.usage.total_tokens,
+            }
+
+        return GeneratedEmbeddings(embeddings, options=options, usage=usage_dict)
+
+
+class OpenAIEmbeddingClient(
+    OpenAIConfigMixin,
+    EmbeddingTelemetryLayer[str, list[float], OpenAIEmbeddingOptionsT],
+    RawOpenAIEmbeddingClient[OpenAIEmbeddingOptionsT],
+    Generic[OpenAIEmbeddingOptionsT],
+):
+    """OpenAI embedding client with telemetry support.
+
+    Keyword Args:
+        model_id: The embedding model ID (e.g. "text-embedding-3-small").
+            Can also be set via environment variable OPENAI_EMBEDDING_MODEL_ID.
+        api_key: OpenAI API key.
+            Can also be set via environment variable OPENAI_API_KEY.
+        org_id: OpenAI organization ID.
+        default_headers: Additional HTTP headers.
+        async_client: Pre-configured AsyncOpenAI client.
+        base_url: Custom API base URL.
+        env_file_path: Path to .env file for settings.
+        env_file_encoding: Encoding for .env file.
+
+    Examples:
+        .. code-block:: python
+
+            from agent_framework.openai import OpenAIEmbeddingClient
+
+            # Using environment variables
+            # Set OPENAI_API_KEY=sk-...
+            # Set OPENAI_EMBEDDING_MODEL_ID=text-embedding-3-small
+            client = OpenAIEmbeddingClient()
+
+            # Or passing parameters directly
+            client = OpenAIEmbeddingClient(
+                model_id="text-embedding-3-small",
+                api_key="sk-...",
+            )
+
+            # Generate embeddings
+            result = await client.get_embeddings(["Hello, world!"])
+            print(result[0].vector)
+    """
+
+    def __init__(
+        self,
+        *,
+        model_id: str | None = None,
+        api_key: str | Callable[[], str | Awaitable[str]] | None = None,
+        org_id: str | None = None,
+        default_headers: Mapping[str, str] | None = None,
+        async_client: AsyncOpenAI | None = None,
+        base_url: str | None = None,
+        env_file_path: str | None = None,
+        env_file_encoding: str | None = None,
+    ) -> None:
+        """Initialize an OpenAI embedding client."""
+        openai_settings = load_settings(
+            OpenAISettings,
+            env_prefix="OPENAI_",
+            api_key=api_key,
+            base_url=base_url,
+            org_id=org_id,
+            embedding_model_id=model_id,
+            env_file_path=env_file_path,
+            env_file_encoding=env_file_encoding,
+        )
+
+        if not async_client and not openai_settings["api_key"]:
+            raise ValueError(
+                "OpenAI API key is required. Set via 'api_key' parameter or 'OPENAI_API_KEY' environment variable."
+            )
+        if not openai_settings["embedding_model_id"]:
+            raise ValueError(
+                "OpenAI embedding model ID is required. "
+                "Set via 'model_id' parameter or 'OPENAI_EMBEDDING_MODEL_ID' environment variable."
+            )
+
+        super().__init__(
+            model_id=openai_settings["embedding_model_id"],
+            api_key=self._get_api_key(openai_settings["api_key"]),
+            base_url=openai_settings["base_url"] if openai_settings["base_url"] else None,
+            org_id=openai_settings["org_id"],
+            default_headers=default_headers,
+            client=async_client,
+        )
diff --git a/python/packages/core/agent_framework/openai/_shared.py b/python/packages/core/agent_framework/openai/_shared.py
index ed4de17378..67f0e91818 100644
--- a/python/packages/core/agent_framework/openai/_shared.py
+++ b/python/packages/core/agent_framework/openai/_shared.py
@@ -92,6 +92,8 @@ class OpenAISettings(TypedDict, total=False):
             Can be set via environment variable OPENAI_CHAT_MODEL_ID.
         responses_model_id: The OpenAI responses model ID to use, for example, gpt-4o or o1.
             Can be set via environment variable OPENAI_RESPONSES_MODEL_ID.
+        embedding_model_id: The OpenAI embedding model ID to use, for example, text-embedding-3-small.
+            Can be set via environment variable OPENAI_EMBEDDING_MODEL_ID.
 
     Examples:
         .. code-block:: python
@@ -115,6 +117,7 @@ class OpenAISettings(TypedDict, total=False):
     org_id: str | None
     chat_model_id: str | None
     responses_model_id: str | None
+    embedding_model_id: str | None
 
 
 class OpenAIBase(SerializationMixin):
diff --git a/python/packages/core/tests/core/test_embedding_client.py b/python/packages/core/tests/core/test_embedding_client.py
new file mode 100644
index 0000000000..71d2bcfd70
--- /dev/null
+++ b/python/packages/core/tests/core/test_embedding_client.py
@@ -0,0 +1,97 @@
+# Copyright (c) Microsoft. All rights reserved.
+
+from __future__ import annotations
+
+from collections.abc import Sequence
+
+from agent_framework import (
+    BaseEmbeddingClient,
+    Embedding,
+    EmbeddingGenerationOptions,
+    GeneratedEmbeddings,
+    SupportsGetEmbeddings,
+)
+
+
+class MockEmbeddingClient(BaseEmbeddingClient):
+    """A simple mock embedding client for testing."""
+
+    async def get_embeddings(
+        self,
+        values: Sequence[str],
+        *,
+        options: EmbeddingGenerationOptions | None = None,
+    ) -> GeneratedEmbeddings[list[float]]:
+        return GeneratedEmbeddings(
+            [Embedding(vector=[0.1, 0.2, 0.3], model_id="mock-model") for _ in values],
+            usage={"prompt_tokens": len(values), "total_tokens": len(values)},
+        )
+
+
+# --- BaseEmbeddingClient tests ---
+
+
+async def test_base_get_embeddings() -> None:
+    client = MockEmbeddingClient()
+    result = await client.get_embeddings(["hello", "world"])
+    assert len(result) == 2
+    assert result[0].vector == [0.1, 0.2, 0.3]
+    assert result[0].model_id == "mock-model"
+
+
+async def test_base_get_embeddings_with_options() -> None:
+    client = MockEmbeddingClient()
+    options: EmbeddingGenerationOptions = {"model_id": "test", "dimensions": 3}
+    result = await client.get_embeddings(["hello"], options=options)
+    assert len(result) == 1
+
+
+async def test_base_get_embeddings_usage() -> None:
+    client = MockEmbeddingClient()
+    result = await client.get_embeddings(["a", "b", "c"])
+    assert result.usage is not None
+    assert result.usage["prompt_tokens"] == 3
+
+
+def test_base_additional_properties_default() -> None:
+    client = MockEmbeddingClient()
+    assert client.additional_properties == {}
+
+
+def test_base_additional_properties_custom() -> None:
+    client = MockEmbeddingClient(additional_properties={"key": "value"})
+    assert client.additional_properties == {"key": "value"}
+
+
+# --- SupportsGetEmbeddings protocol tests ---
+
+
+def test_mock_client_satisfies_protocol() -> None:
+    client = MockEmbeddingClient()
+    assert isinstance(client, SupportsGetEmbeddings)
+
+
+def test_plain_class_satisfies_protocol() -> None:
+    """A plain class with the right signature should satisfy the protocol."""
+
+    class PlainEmbeddingClient:
+        additional_properties: dict = {}
+
+        async def get_embeddings(self, values, *, options=None):
+            return GeneratedEmbeddings()
+
+    client = PlainEmbeddingClient()
+    assert isinstance(client, SupportsGetEmbeddings)
+
+
+def test_wrong_class_does_not_satisfy_protocol() -> None:
+    """A class without get_embeddings should not satisfy the protocol."""
+
+    class NotAnEmbeddingClient:
+        additional_properties: dict = {}
+
+        async def generate(self, values):
+            pass
+
+    client = NotAnEmbeddingClient()
+    assert not isinstance(client, SupportsGetEmbeddings)
diff --git a/python/packages/core/tests/core/test_embedding_types.py b/python/packages/core/tests/core/test_embedding_types.py
new file mode 100644
index 0000000000..0d6db6b27e
--- /dev/null
+++ b/python/packages/core/tests/core/test_embedding_types.py
@@ -0,0 +1,182 @@
+# Copyright (c) Microsoft. All rights reserved.
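Several tests in this new file pin down the `Embedding.dimensions` resolution order: an explicit value wins, sized vector types fall back to `len()`, and anything else yields `None`. That logic as a standalone function, outside the patch (a hypothetical helper mirroring the property in `_types.py`):

```python
def dimensions(vector, explicit=None):
    # Explicit count wins; otherwise use len() for sized vector types.
    if explicit is not None:
        return explicit
    if isinstance(vector, (list, tuple, bytes)):
        return len(vector)
    return None


print(dimensions([0.1, 0.2, 0.3]))       # 3
print(dimensions(b"\x00\x01\x02"))       # 3
print(dimensions([0.1], explicit=1536))  # 1536
print(dimensions("not a vector"))        # None
```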
+
+from __future__ import annotations
+
+from datetime import datetime
+
+from agent_framework import Embedding, EmbeddingGenerationOptions, GeneratedEmbeddings
+
+# --- Embedding tests ---
+
+
+def test_embedding_basic_construction() -> None:
+    embedding = Embedding(vector=[0.1, 0.2, 0.3])
+    assert embedding.vector == [0.1, 0.2, 0.3]
+    assert embedding.model_id is None
+    assert embedding.created_at is None
+    assert embedding.additional_properties == {}
+
+
+def test_embedding_construction_with_metadata() -> None:
+    now = datetime.now()
+    embedding = Embedding(
+        vector=[0.1, 0.2],
+        model_id="text-embedding-3-small",
+        created_at=now,
+        additional_properties={"key": "value"},
+    )
+    assert embedding.model_id == "text-embedding-3-small"
+    assert embedding.created_at == now
+    assert embedding.additional_properties == {"key": "value"}
+
+
+def test_embedding_dimensions_computed_from_list() -> None:
+    embedding = Embedding(vector=[0.1, 0.2, 0.3])
+    assert embedding.dimensions == 3
+
+
+def test_embedding_dimensions_computed_from_tuple() -> None:
+    embedding = Embedding(vector=(0.1, 0.2, 0.3, 0.4))
+    assert embedding.dimensions == 4
+
+
+def test_embedding_dimensions_computed_from_bytes() -> None:
+    embedding = Embedding(vector=b"\x00\x01\x02")
+    assert embedding.dimensions == 3
+
+
+def test_embedding_dimensions_explicit_overrides_computed() -> None:
+    embedding = Embedding(vector=[0.1, 0.2, 0.3], dimensions=1536)
+    assert embedding.dimensions == 1536
+
+
+def test_embedding_dimensions_none_for_unknown_type() -> None:
+    embedding = Embedding(vector="not a list")  # type: ignore[arg-type]
+    assert embedding.dimensions is None
+
+
+def test_embedding_dimensions_explicit_with_unknown_type() -> None:
+    embedding = Embedding(vector="not a list", dimensions=100)  # type: ignore[arg-type]
+    assert embedding.dimensions == 100
+
+
+def test_embedding_empty_vector() -> None:
+    embedding = Embedding(vector=[])
+    assert embedding.dimensions == 0
+
+
+def test_embedding_int_vector() -> None:
+    embedding = Embedding(vector=[1, 2, 3])
+    assert embedding.vector == [1, 2, 3]
+    assert embedding.dimensions == 3
+
+
+# --- GeneratedEmbeddings tests ---
+
+
+def test_generated_basic_construction() -> None:
+    embeddings = GeneratedEmbeddings()
+    assert len(embeddings) == 0
+    assert embeddings.options is None
+    assert embeddings.usage is None
+    assert embeddings.additional_properties == {}
+
+
+def test_generated_construction_with_embeddings() -> None:
+    items = [Embedding(vector=[0.1, 0.2]), Embedding(vector=[0.3, 0.4])]
+    embeddings = GeneratedEmbeddings(items)
+    assert len(embeddings) == 2
+    assert embeddings[0].vector == [0.1, 0.2]
+    assert embeddings[1].vector == [0.3, 0.4]
+
+
+def test_generated_construction_with_usage() -> None:
+    usage = {"prompt_tokens": 10, "total_tokens": 10}
+    embeddings = GeneratedEmbeddings(
+        [
+            Embedding(
+                vector=[0.1],
+                model_id="test-model",
+            )
+        ],
+        usage=usage,
+    )
+    assert embeddings.usage == usage
+    assert embeddings.usage["prompt_tokens"] == 10
+
+
+def test_generated_construction_with_additional_properties() -> None:
+    embeddings = GeneratedEmbeddings(
+        additional_properties={"model": "test"},
+    )
+    assert embeddings.additional_properties == {"model": "test"}
+
+
+def test_generated_construction_with_options() -> None:
+    opts: EmbeddingGenerationOptions = {"model_id": "text-embedding-3-small", "dimensions": 256}
+    embeddings = GeneratedEmbeddings(
+        [Embedding(vector=[0.1])],
+        options=opts,
+    )
+    assert embeddings.options is not None
+    assert embeddings.options["model_id"] == "text-embedding-3-small"
+    assert embeddings.options["dimensions"] == 256
+
+
+def test_generated_list_behavior_iteration() -> None:
+    items = [Embedding(vector=[float(i)]) for i in range(5)]
+    embeddings = GeneratedEmbeddings(items)
+    vectors = [e.vector for e in embeddings]
+    assert vectors == [[0.0], [1.0], [2.0], [3.0], [4.0]]
+
+
+def test_generated_list_behavior_indexing() -> None:
+    items = [Embedding(vector=[0.1]), Embedding(vector=[0.2])]
+    embeddings = GeneratedEmbeddings(items)
+    assert embeddings[0].vector == [0.1]
+    assert embeddings[-1].vector == [0.2]
+
+
+def test_generated_list_behavior_slicing() -> None:
+    items = [Embedding(vector=[float(i)]) for i in range(5)]
+    embeddings = GeneratedEmbeddings(items)
+    sliced = embeddings[1:3]
+    assert len(sliced) == 2
+
+
+def test_generated_list_behavior_append() -> None:
+    embeddings = GeneratedEmbeddings()
+    embeddings.append(Embedding(vector=[0.1]))
+    assert len(embeddings) == 1
+
+
+def test_generated_none_embeddings_creates_empty_list() -> None:
+    embeddings = GeneratedEmbeddings(None)
+    assert len(embeddings) == 0
+
+
+# --- EmbeddingGenerationOptions tests ---
+
+
+def test_options_empty() -> None:
+    options: EmbeddingGenerationOptions = {}
+    assert "model_id" not in options
+
+
+def test_options_with_model_id() -> None:
+    options: EmbeddingGenerationOptions = {"model_id": "text-embedding-3-small"}
+    assert options["model_id"] == "text-embedding-3-small"
+
+
+def test_options_with_dimensions() -> None:
+    options: EmbeddingGenerationOptions = {"dimensions": 1536}
+    assert options["dimensions"] == 1536
+
+
+def test_options_with_all_fields() -> None:
+    options: EmbeddingGenerationOptions = {
+        "model_id": "text-embedding-3-small",
+        "dimensions": 1536,
+    }
+    assert options["model_id"] == "text-embedding-3-small"
+    assert options["dimensions"] == 1536
diff --git a/python/packages/core/tests/openai/test_openai_embedding_client.py b/python/packages/core/tests/openai/test_openai_embedding_client.py
new file mode 100644
index 0000000000..9cb3de20f0
--- /dev/null
+++ b/python/packages/core/tests/openai/test_openai_embedding_client.py
@@ -0,0 +1,347 @@
+# Copyright (c) Microsoft. All rights reserved.
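The tests in this new file never hit the network: they replace the SDK's `embeddings.create` coroutine with an `AsyncMock` returning a canned response. The pattern in isolation, outside the patch (the `sdk` object and dict-shaped response are illustrative; the real tests return typed `CreateEmbeddingResponse` objects):

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

# Stub only the attribute path the code under test touches.
sdk = MagicMock()
sdk.embeddings.create = AsyncMock(return_value={"data": [[0.1, 0.2]]})


async def fetch() -> list[float]:
    response = await sdk.embeddings.create(input=["hello"], model="text-embedding-3-small")
    return response["data"][0]


result = asyncio.run(fetch())
print(result)  # [0.1, 0.2]
sdk.embeddings.create.assert_awaited_once()
```

`call_args` on the mock then lets the tests below assert that options such as `dimensions` were forwarded to the API call.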
+ +from __future__ import annotations + +import os +from unittest.mock import AsyncMock, MagicMock + +import pytest +from openai.types import CreateEmbeddingResponse +from openai.types import Embedding as OpenAIEmbedding +from openai.types.create_embedding_response import Usage + +from agent_framework.azure import AzureOpenAIEmbeddingClient +from agent_framework.openai import ( + OpenAIEmbeddingClient, + OpenAIEmbeddingOptions, +) + + +def _make_openai_response( + embeddings: list[list[float]], + model: str = "text-embedding-3-small", + prompt_tokens: int = 5, + total_tokens: int = 5, +) -> CreateEmbeddingResponse: + """Helper to create a mock OpenAI embeddings response.""" + data = [OpenAIEmbedding(embedding=emb, index=i, object="embedding") for i, emb in enumerate(embeddings)] + return CreateEmbeddingResponse( + data=data, + model=model, + object="list", + usage=Usage(prompt_tokens=prompt_tokens, total_tokens=total_tokens), + ) + + +@pytest.fixture +def openai_unit_test_env(monkeypatch: pytest.MonkeyPatch) -> None: + """Set up environment variables for OpenAI embedding client.""" + monkeypatch.setenv("OPENAI_API_KEY", "test-api-key") + monkeypatch.setenv("OPENAI_EMBEDDING_MODEL_ID", "text-embedding-3-small") + + +# --- OpenAI unit tests --- + + +def test_openai_construction_with_explicit_params() -> None: + client = OpenAIEmbeddingClient( + model_id="text-embedding-3-small", + api_key="test-key", + ) + assert client.model_id == "text-embedding-3-small" + + +def test_openai_construction_from_env(openai_unit_test_env: None) -> None: + client = OpenAIEmbeddingClient() + assert client.model_id == "text-embedding-3-small" + + +def test_openai_construction_missing_api_key_raises() -> None: + with pytest.raises(ValueError, match="API key is required"): + OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + +def test_openai_construction_missing_model_raises() -> None: + with pytest.raises(ValueError, match="model ID is required"): + 
OpenAIEmbeddingClient(api_key="test-key") + + +async def test_openai_get_embeddings(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response( + embeddings=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], + ) + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + result = await client.get_embeddings(["hello", "world"]) + + assert len(result) == 2 + assert result[0].vector == [0.1, 0.2, 0.3] + assert result[1].vector == [0.4, 0.5, 0.6] + assert result[0].model_id == "text-embedding-3-small" + assert result[0].dimensions == 3 + + +async def test_openai_get_embeddings_usage(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response( + embeddings=[[0.1]], + prompt_tokens=10, + total_tokens=10, + ) + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + result = await client.get_embeddings(["test"]) + + assert result.usage is not None + assert result.usage["prompt_tokens"] == 10 + assert result.usage["total_tokens"] == 10 + + +async def test_openai_options_passthrough_dimensions(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response(embeddings=[[0.1]]) + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + options: OpenAIEmbeddingOptions = {"dimensions": 256} + result = await client.get_embeddings(["test"], options=options) + + call_kwargs = client.client.embeddings.create.call_args[1] + assert call_kwargs["dimensions"] == 256 + assert result.options is options + + +async def test_openai_options_passthrough_encoding_format(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response(embeddings=[[0.1]]) + client = 
OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + options: OpenAIEmbeddingOptions = {"encoding_format": "base64"} + await client.get_embeddings(["test"], options=options) + + call_kwargs = client.client.embeddings.create.call_args[1] + assert call_kwargs["encoding_format"] == "base64" + + +async def test_openai_base64_decoding(openai_unit_test_env: None) -> None: + import base64 + import struct + + # Encode [0.1, 0.2, 0.3] as base64 little-endian floats + raw_floats = [0.1, 0.2, 0.3] + b64_str = base64.b64encode(struct.pack(f"<{len(raw_floats)}f", *raw_floats)).decode() + + # Mock the embedding item to return a base64 string (as the API does with encoding_format=base64) + mock_item = MagicMock() + mock_item.embedding = b64_str + mock_item.index = 0 + + mock_response = MagicMock() + mock_response.data = [mock_item] + mock_response.model = "text-embedding-3-small" + mock_response.usage = MagicMock(prompt_tokens=3, total_tokens=3) + + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + options: OpenAIEmbeddingOptions = {"encoding_format": "base64"} + result = await client.get_embeddings(["test"], options=options) + + assert len(result) == 1 + assert len(result[0].vector) == 3 + assert result[0].dimensions == 3 + for expected, actual in zip(raw_floats, result[0].vector): + assert abs(expected - actual) < 1e-6 + + +async def test_openai_error_when_no_model_id() -> None: + client = OpenAIEmbeddingClient.__new__(OpenAIEmbeddingClient) + client.model_id = None + client.client = MagicMock() + client.additional_properties = {} + client.otel_provider_name = "openai" + + with pytest.raises(ValueError, match="model_id is required"): + await client.get_embeddings(["test"]) + + +# --- Azure OpenAI unit tests --- + + +def 
test_azure_construction_with_deployment_name() -> None: + client = AzureOpenAIEmbeddingClient( + deployment_name="text-embedding-3-small", + api_key="test-key", + endpoint="https://test.openai.azure.com/", + ) + assert client.model_id == "text-embedding-3-small" + + +def test_azure_construction_with_existing_client() -> None: + mock_client = MagicMock() + client = AzureOpenAIEmbeddingClient( + deployment_name="my-deployment", + async_client=mock_client, + ) + assert client.model_id == "my-deployment" + assert client.client is mock_client + + +def test_azure_construction_missing_deployment_name_raises() -> None: + with pytest.raises(ValueError, match="deployment name is required"): + AzureOpenAIEmbeddingClient( + api_key="test-key", + endpoint="https://test.openai.azure.com/", + ) + + +def test_azure_construction_missing_credentials_raises() -> None: + with pytest.raises(ValueError, match="api_key, credential, or a client"): + AzureOpenAIEmbeddingClient( + deployment_name="test", + endpoint="https://test.openai.azure.com/", + ) + + +async def test_azure_get_embeddings() -> None: + mock_response = _make_openai_response( + embeddings=[[0.1, 0.2]], + ) + mock_async_client = MagicMock() + mock_async_client.embeddings = MagicMock() + mock_async_client.embeddings.create = AsyncMock(return_value=mock_response) + + client = AzureOpenAIEmbeddingClient( + deployment_name="text-embedding-3-small", + async_client=mock_async_client, + ) + + result = await client.get_embeddings(["hello"]) + + assert len(result) == 1 + assert result[0].vector == [0.1, 0.2] + + +def test_azure_otel_provider_name() -> None: + mock_client = MagicMock() + client = AzureOpenAIEmbeddingClient( + deployment_name="test", + async_client=mock_client, + ) + assert client.OTEL_PROVIDER_NAME == "azure.ai.openai" + + +# --- Integration tests --- + +skip_if_openai_integration_tests_disabled = pytest.mark.skipif( + os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true" + or os.getenv("OPENAI_API_KEY", "") 
in ("", "test-dummy-key"), + reason="No real OPENAI_API_KEY provided; skipping integration tests." + if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true" + else "Integration tests are disabled.", +) + +skip_if_azure_openai_integration_tests_disabled = pytest.mark.skipif( + os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true" + or not os.getenv("AZURE_OPENAI_ENDPOINT") + or (not os.getenv("AZURE_OPENAI_API_KEY") and not os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")), + reason="No Azure OpenAI credentials provided; skipping integration tests." + if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true" + else "Integration tests are disabled.", +) + + +@skip_if_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_openai_get_embeddings() -> None: + """End-to-end test of OpenAI embedding generation.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + result = await client.get_embeddings(["hello world"]) + + assert len(result) == 1 + assert isinstance(result[0].vector, list) + assert len(result[0].vector) > 0 + assert all(isinstance(v, float) for v in result[0].vector) + assert result[0].model_id is not None + assert result.usage is not None + assert result.usage["prompt_tokens"] > 0 + + +@skip_if_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_openai_get_embeddings_multiple() -> None: + """Test embedding generation for multiple inputs.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + result = await client.get_embeddings(["hello", "world", "test"]) + + assert len(result) == 3 + dims = [len(e.vector) for e in result] + assert all(d == dims[0] for d in dims) + + +@skip_if_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_openai_get_embeddings_with_dimensions() -> None: + """Test embedding generation with custom dimensions.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + 
options: OpenAIEmbeddingOptions = {"dimensions": 256} + result = await client.get_embeddings(["hello world"], options=options) + + assert len(result) == 1 + assert len(result[0].vector) == 256 + + +@skip_if_azure_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_azure_openai_get_embeddings() -> None: + """End-to-end test of Azure OpenAI embedding generation.""" + client = AzureOpenAIEmbeddingClient() + + result = await client.get_embeddings(["hello world"]) + + assert len(result) == 1 + assert isinstance(result[0].vector, list) + assert len(result[0].vector) > 0 + assert all(isinstance(v, float) for v in result[0].vector) + assert result[0].model_id is not None + assert result.usage is not None + assert result.usage["prompt_tokens"] > 0 + + +@skip_if_azure_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_azure_openai_get_embeddings_multiple() -> None: + """Test Azure OpenAI embedding generation for multiple inputs.""" + client = AzureOpenAIEmbeddingClient() + + result = await client.get_embeddings(["hello", "world", "test"]) + + assert len(result) == 3 + dims = [len(e.vector) for e in result] + assert all(d == dims[0] for d in dims) + + +@skip_if_azure_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_azure_openai_get_embeddings_with_dimensions() -> None: + """Test Azure OpenAI embedding generation with custom dimensions.""" + client = AzureOpenAIEmbeddingClient() + + options: OpenAIEmbeddingOptions = {"dimensions": 256} + result = await client.get_embeddings(["hello world"], options=options) + + assert len(result) == 1 + assert len(result[0].vector) == 256 diff --git a/python/packages/core/tests/workflow/test_full_conversation.py b/python/packages/core/tests/workflow/test_full_conversation.py index 23861ecc69..20d9abd8c0 100644 --- a/python/packages/core/tests/workflow/test_full_conversation.py +++ b/python/packages/core/tests/workflow/test_full_conversation.py @@ 
-362,9 +362,7 @@ async def test_run_request_with_full_history_clears_service_session_id() -> None """Replaying a full conversation (including function calls) via AgentExecutorRequest must clear service_session_id so the API does not receive both previous_response_id and the same function-call items in input — which would cause a 'Duplicate item' API error.""" - tool_agent = _ToolHistoryAgent( - id="tool_agent", name="ToolAgent", summary_text="Done." - ) + tool_agent = _ToolHistoryAgent(id="tool_agent", name="ToolAgent", summary_text="Done.") tool_exec = AgentExecutor(tool_agent, id="tool_agent") spy_agent = _SessionIdCapturingAgent(id="spy_agent", name="SpyAgent") @@ -393,9 +391,7 @@ async def test_from_response_preserves_service_session_id() -> None: """from_response hands off a prior agent's full conversation to the next executor. The receiving executor's service_session_id is preserved so the API can continue the conversation using previous_response_id.""" - tool_agent = _ToolHistoryAgent( - id="tool_agent2", name="ToolAgent", summary_text="Done." - ) + tool_agent = _ToolHistoryAgent(id="tool_agent2", name="ToolAgent", summary_text="Done.") tool_exec = AgentExecutor(tool_agent, id="tool_agent2") spy_agent = _SessionIdCapturingAgent(id="spy_agent2", name="SpyAgent") @@ -403,11 +399,7 @@ async def test_from_response_preserves_service_session_id() -> None: # Simulate a prior run on the spy executor. 
spy_exec._session.service_session_id = "resp_PREVIOUS_RUN" # pyright: ignore[reportPrivateUsage] - wf = ( - WorkflowBuilder(start_executor=tool_exec, output_executors=[spy_exec]) - .add_edge(tool_exec, spy_exec) - .build() - ) + wf = WorkflowBuilder(start_executor=tool_exec, output_executors=[spy_exec]).add_edge(tool_exec, spy_exec).build() result = await wf.run("start") assert result.get_outputs() is not None diff --git a/python/samples/02-agents/embeddings/azure_openai_embeddings.py b/python/samples/02-agents/embeddings/azure_openai_embeddings.py new file mode 100644 index 0000000000..16669eb51f --- /dev/null +++ b/python/samples/02-agents/embeddings/azure_openai_embeddings.py @@ -0,0 +1,70 @@ +# Copyright (c) Microsoft. All rights reserved. + +# Run with: uv run samples/02-agents/embeddings/azure_openai_embeddings.py + + +import asyncio + +from agent_framework.azure import AzureOpenAIEmbeddingClient +from dotenv import load_dotenv + +load_dotenv() + +"""Azure OpenAI Embedding Client Example + +This sample demonstrates how to generate embeddings using the Azure OpenAI embedding client. +It supports both API key and Azure credential authentication. + +Prerequisites: + Set the following environment variables or add them to a .env file: + - AZURE_OPENAI_ENDPOINT: Your Azure OpenAI endpoint URL + - AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME: The embedding model deployment name + - AZURE_OPENAI_API_KEY: Your API key (or use Azure credential instead) +""" + + +async def main() -> None: + """Generate embeddings with Azure OpenAI.""" + # 1. Create a client using environment variables. + # Reads AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME, + # and AZURE_OPENAI_API_KEY from environment. + client = AzureOpenAIEmbeddingClient() + + # 2. Generate a single embedding. 
+ result = await client.get_embeddings(["Hello, world!"]) + print(f"Single embedding dimensions: {result[0].dimensions}") + print(f"First 5 values: {result[0].vector[:5]}") + print(f"Model: {result[0].model_id}") + print(f"Usage: {result.usage}") + print() + + # 3. Generate embeddings for multiple inputs. + texts = [ + "The weather is sunny today.", + "It is raining outside.", + "Machine learning is fascinating.", + ] + result = await client.get_embeddings(texts) + print(f"Batch of {len(result)} embeddings, each with {result[0].dimensions} dimensions") + print() + + # 4. Generate embeddings with custom dimensions. + result = await client.get_embeddings(["Custom dimensions example"], options={"dimensions": 256}) + print(f"Custom dimensions: {result[0].dimensions}") + + +if __name__ == "__main__": + asyncio.run(main()) + + +""" +Sample output: +Single embedding dimensions: 1536 +First 5 values: [0.012, -0.034, 0.056, -0.078, 0.090] +Model: text-embedding-3-small +Usage: {'prompt_tokens': 4, 'total_tokens': 4} + +Batch of 3 embeddings, each with 1536 dimensions + +Custom dimensions: 256 +""" diff --git a/python/samples/02-agents/embeddings/openai_embeddings.py b/python/samples/02-agents/embeddings/openai_embeddings.py new file mode 100644 index 0000000000..62d044fd72 --- /dev/null +++ b/python/samples/02-agents/embeddings/openai_embeddings.py @@ -0,0 +1,65 @@ +# Copyright (c) Microsoft. All rights reserved. + +# Run with: uv run samples/02-agents/embeddings/openai_embeddings.py + +import asyncio + +from agent_framework.openai import OpenAIEmbeddingClient +from dotenv import load_dotenv + +load_dotenv() + +"""OpenAI Embedding Client Example + +This sample demonstrates how to generate embeddings using the OpenAI embedding client. +It shows single and batch embedding generation, as well as custom dimensions. + +Prerequisites: + Set the OPENAI_API_KEY environment variable or add it to a .env file. 
+""" + + +async def main() -> None: + """Generate embeddings with OpenAI.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + # 1. Generate a single embedding. + result = await client.get_embeddings(["Hello, world!"]) + print(f"Single embedding dimensions: {result[0].dimensions}") + print(f"First 5 values: {result[0].vector[:5]}") + print(f"Model: {result[0].model_id}") + print(f"Usage: {result.usage}") + print() + + # 2. Generate embeddings for multiple inputs. + texts = [ + "The weather is sunny today.", + "It is raining outside.", + "Machine learning is fascinating.", + ] + result = await client.get_embeddings(texts) + print(f"Batch of {len(result)} embeddings, each with {result[0].dimensions} dimensions") + print(f"First embedding vector: {result[0].vector[:5]}") # Print first 5 values of the first embedding + print() + + # 3. Generate embeddings with custom dimensions. + result = await client.get_embeddings(["Custom dimensions example"], options={"dimensions": 256}) + print(f"Custom dimensions: {result[0].dimensions}") + print(f"First 5 values: {result[0].vector[:5]}") + + +if __name__ == "__main__": + asyncio.run(main()) + + +""" +Sample output: +Single embedding dimensions: 1536 +First 5 values: [0.012, -0.034, 0.056, -0.078, 0.090] +Model: text-embedding-3-small +Usage: {'prompt_tokens': 4, 'total_tokens': 4} + +Batch of 3 embeddings, each with 1536 dimensions +First embedding vector: [0.021, -0.045, 0.063, -0.017, 0.082] + +Custom dimensions: 256 +First 5 values: [0.034, -0.056, 0.011, -0.089, 0.072] +"""
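
The samples above generate embeddings but stop short of comparing them. The usual next step is ranking texts by cosine similarity between their vectors. A minimal, dependency-free sketch — the `cosine_similarity` helper below is illustrative and not part of `agent_framework`; with real output from `client.get_embeddings`, you would pass `result[i].vector` in place of the hand-written vectors:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Identical directions score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because embedding models place semantically similar texts close together in vector space, sorting candidate vectors by this score against a query vector gives a simple semantic search — the same operation the vector store connectors perform server-side via `search_type="vector"`.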