From f10afbfb98f5e5514840c05ad67aeae9f9d2c35d Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Sun, 22 Feb 2026 15:01:18 +0100 Subject: [PATCH 01/14] feat(python): Add embedding abstractions and OpenAI implementation (Phase 1) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This PR contains two parts: 1. **Overall migration plan** for porting vector stores and embeddings from Semantic Kernel to Agent Framework (docs/features/vector-stores-and-embeddings/README.md) covering all 10 phases from core abstractions through connectors and TextSearch. 2. **Phase 1 implementation** — core embedding abstractions and OpenAI/Azure OpenAI embedding clients: Core types (_types.py): - EmbeddingGenerationOptions TypedDict (total=False) - Embedding[EmbeddingT] generic class with model_id, dimensions, created_at - GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT] list container with options, usage - EmbeddingInputT (default str) and EmbeddingT (default list[float]) TypeVars Protocol + base class (_clients.py): - SupportsGetEmbeddings protocol — Generic[EmbeddingInputT, EmbeddingT, OptionsContraT] - BaseEmbeddingClient ABC — Generic[EmbeddingInputT, EmbeddingT, OptionsCoT] Telemetry (observability.py): - EmbeddingTelemetryLayer with gen_ai.operation.name = "embeddings" OpenAI implementation (openai/_embedding_client.py): - RawOpenAIEmbeddingClient, OpenAIEmbeddingClient, OpenAIEmbeddingOptions - Uses _ensure_client() factory pattern Azure OpenAI implementation (azure/_embedding_client.py): - AzureOpenAIEmbeddingClient following AzureOpenAIChatClient pattern - Supports API key, Entra ID credentials, env var configuration Tests: - 47 unit tests for types, protocol, base class, OpenAI, and Azure clients - 6 integration tests (gated behind RUN_INTEGRATION_TESTS + credentials) Samples: - samples/02-agents/embeddings/openai_embeddings.py - samples/02-agents/embeddings/azure_openai_embeddings.py Co-authored-by: Copilot 
<223556219+Copilot@users.noreply.github.com> --- .../vector-stores-and-embeddings/README.md | 385 ++++++++++++++++++ .../packages/core/agent_framework/__init__.py | 14 + .../packages/core/agent_framework/_clients.py | 133 +++++- .../packages/core/agent_framework/_types.py | 145 ++++++- .../core/agent_framework/azure/__init__.py | 1 + .../azure/_embedding_client.py | 136 +++++++ .../core/agent_framework/azure/_shared.py | 3 + .../core/agent_framework/observability.py | 77 ++++ .../core/agent_framework/openai/__init__.py | 3 + .../openai/_embedding_client.py | 193 +++++++++ .../core/agent_framework/openai/_shared.py | 3 + .../core/tests/core/test_embedding_client.py | 97 +++++ .../core/tests/core/test_embedding_types.py | 182 +++++++++ .../openai/test_openai_embedding_client.py | 314 ++++++++++++++ .../tests/workflow/test_full_conversation.py | 14 +- .../embeddings/azure_openai_embeddings.py | 70 ++++ .../02-agents/embeddings/openai_embeddings.py | 65 +++ 17 files changed, 1821 insertions(+), 14 deletions(-) create mode 100644 docs/features/vector-stores-and-embeddings/README.md create mode 100644 python/packages/core/agent_framework/azure/_embedding_client.py create mode 100644 python/packages/core/agent_framework/openai/_embedding_client.py create mode 100644 python/packages/core/tests/core/test_embedding_client.py create mode 100644 python/packages/core/tests/core/test_embedding_types.py create mode 100644 python/packages/core/tests/openai/test_openai_embedding_client.py create mode 100644 python/samples/02-agents/embeddings/azure_openai_embeddings.py create mode 100644 python/samples/02-agents/embeddings/openai_embeddings.py diff --git a/docs/features/vector-stores-and-embeddings/README.md b/docs/features/vector-stores-and-embeddings/README.md new file mode 100644 index 0000000000..8627a9d5ea --- /dev/null +++ b/docs/features/vector-stores-and-embeddings/README.md @@ -0,0 +1,385 @@ +# Vector Stores and Embeddings + +## Overview + +This feature ports the vector 
store abstractions, embedding generator abstractions, and their implementations from Semantic Kernel into Agent Framework. The ported code follows AF's coding standards, feels native to AF, and is structured to allow data models/schemas to be reusable across both frameworks. The embedding abstraction combines the best of SK's `EmbeddingGeneratorBase` and MEAI's `IEmbeddingGenerator`. + +| Capability | Description | +| --- | --- | +| Embedding generation | Generic embedding client abstraction supporting text, image, and audio inputs | +| Vector store collections | CRUD operations on vector store collections (upsert, get, delete) | +| Vector search | Unified search interface with `search_type` parameter (`"vector"`, `"keyword_hybrid"`) | +| Data model decorator | `@vectorstoremodel` decorator for defining vector store data models (supports Pydantic, dataclasses, plain classes, dicts) | +| Agent tools | `create_search_tool`, `create_upsert_tool`, `create_get_tool`, `create_delete_tool` for agent-usable vector store operations | +| In-memory store | Zero-dependency vector store for testing and development | +| 13+ connectors | Azure AI Search, Qdrant, Redis, PostgreSQL, MongoDB, Cosmos DB, Pinecone, Chroma, Weaviate, Oracle, SQL Server, FAISS | + +## Key Design Decisions + +### Embedding Abstractions (combining SK + MEAI) +- **Both Protocol and Base class** (matching AF's `SupportsChatGetResponse` + `BaseChatClient` pattern): + - `SupportsGetEmbeddings` — Protocol for duck-typing + - `BaseEmbeddingClient` — ABC base class for implementations (similar to `BaseChatClient`) +- **Generic input type** (`EmbeddingInputT`, default `str`) from MEAI — allows image/audio embeddings in the future +- **Generic output type** (`EmbeddingT`, default `list[float]`) from MEAI — supports `list[float]`, `list[int]`, `bytes`, etc. 
+- **Generic order**: `[EmbeddingInputT, EmbeddingT, EmbeddingOptionsT]` — options last, matching MEAI's `IEmbeddingGenerator` with options appended +- **TypeVar naming convention**: Use `SuffixT` per AF standard (e.g., `EmbeddingInputT`, `EmbeddingT`, `ModelT`, `KeyT`) +- `EmbeddingGenerationOptions` TypedDict (inspired by MEAI, matching AF's `ChatOptions` pattern) — `total=False`, includes `dimensions`, `model_id`. No `additional_properties` since each implementation extends with its own fields. +- Protocol and base class are generic over input, output, and options: `SupportsGetEmbeddings[EmbeddingInputT, EmbeddingT, OptionsContraT]`, `BaseEmbeddingClient[EmbeddingInputT, EmbeddingT, OptionsCoT]` +- **`Embedding[EmbeddingT]` type** in `_types.py` — a lightweight generic class (not Pydantic) with `vector: EmbeddingT`, `model_id: str | None`, `dimensions: int | None` (explicit or computed from vector), `created_at: datetime | None`, `additional_properties: dict[str, Any]` +- **`GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT]` type** — a list-like container of `Embedding[EmbeddingT]` objects with `options: EmbeddingOptionsT | None` (stores the options used to generate), `usage: dict[str, Any] | None`, `additional_properties: dict[str, Any]` +- **No numpy dependency** — return `list[float]` by default; users cast as needed + +### Vector Store Abstractions +- **Port core abstractions without Pydantic for internal classes** — use plain classes +- **Both Protocol and Base class** for vector store operations (matching AF pattern): + - `SupportsVectorUpsert` / `SupportsVectorSearch` — Protocols for duck-typing (follows `Supports` naming convention) + - `BaseVectorCollection` / `BaseVectorSearch` — ABC base classes for implementations + - `BaseVectorStore` — ABC base class for store operations (factory for collections, no protocol needed) +- **TypeVar naming convention**: `ModelT`, `KeyT`, `FilterT` (suffix T, per AF standard) +- **Support Pydantic for user-facing data 
models** — the `@vectorstoremodel` decorator and `VectorStoreCollectionDefinition` should work with Pydantic models, dataclasses, plain classes, and dicts +- **Remove SK-specific dependencies** — no `KernelBaseModel`, `KernelFunction`, `KernelParameterMetadata`, `kernel_function`, `PromptExecutionSettings` +- **Embedding types in `_types.py`**, embedding protocol/base class in `_clients.py` +- **All vector store specific types, enums, protocols, base classes** in `_vectors.py` +- **Error handling** uses AF's exception hierarchy (e.g., `IntegrationException` variants) + +### Package Structure +- **Embedding types** (`Embedding`, `GeneratedEmbeddings`, `EmbeddingGenerationOptions`) in `agent_framework/_types.py` +- **Embedding protocol + base class** (`SupportsGetEmbeddings`, `BaseEmbeddingClient`) in `agent_framework/_clients.py` +- **All vector store specific code** in a new `agent_framework/_vectors.py` module — this includes: + - Enums: `FieldTypes`, `IndexKind`, `DistanceFunction` + - `VectorStoreField`, `VectorStoreCollectionDefinition` + - `SearchOptions`, `SearchResponse`, `RecordFilterOptions` + - `@vectorstoremodel` decorator + - Serialization/deserialization protocols + - `VectorStoreRecordHandler`, `BaseVectorCollection`, `BaseVectorStore`, `BaseVectorSearch` + - `SupportsVectorUpsert`, `SupportsVectorSearch` protocols +- **OpenAI embeddings** in `agent_framework/openai/` (built into core, like OpenAI chat) +- **Azure OpenAI embeddings** in `agent_framework/azure/` (built into core, follows `AzureOpenAIChatClient` pattern) +- **Each vector store connector** in its own AF package under `packages/` +- **In-memory store** in core (no external deps) +- **TextSearch and its implementations** (Brave, Google) — last phase, separate work + +## Naming: SK → AF + +### Names that change + +| SK Name | AF Name | Rationale | +|---------|---------|-----------| +| `VectorStoreCollection` | `BaseVectorCollection` | Drop redundant `Store`, add `Base` prefix per AF pattern 
| +| `VectorStore` | `BaseVectorStore` | Add `Base` prefix per AF pattern | +| `VectorSearch` | `BaseVectorSearch` | Add `Base` prefix per AF pattern | +| `VectorSearchOptions` | `SearchOptions` | Shorter — context is already vector search | +| `VectorSearchResult` | `SearchResponse` | Align with `ChatResponse`/`AgentResponse` | +| `GetFilteredRecordOptions` | `RecordFilterOptions` | Shorter, more natural | +| `EmbeddingGeneratorBase` | `BaseEmbeddingClient` | Matches AF `BaseChatClient` pattern | +| `VectorStoreCollectionProtocol` | `SupportsVectorUpsert` | AF `Supports*` naming convention | +| `VectorSearchProtocol` | `SupportsVectorSearch` | AF `Supports*` naming convention | +| `__kernel_vectorstoremodel__` | `__vectorstoremodel__` | Drop SK `kernel` prefix | +| `__kernel_vectorstoremodel_definition__` | `__vectorstoremodel_definition__` | Drop SK `kernel` prefix | +| `search()` + `hybrid_search()` | `search(search_type=...)` | Single method with `Literal` parameter | +| `SearchType` enum | `Literal["vector", "keyword_hybrid"]` | No enum, just a literal | +| `KernelSearchResults` | `SearchResults` | Drop SK `Kernel` prefix (plural — container of `SearchResponse` items) | + +### Names that stay the same + +| Name | Location | +|------|----------| +| `@vectorstoremodel` | `_vectors.py` | +| `VectorStoreField` | `_vectors.py` | +| `VectorStoreCollectionDefinition` | `_vectors.py` | +| `VectorStoreRecordHandler` | `_vectors.py` | +| `FieldTypes` | `_vectors.py` | +| `IndexKind` | `_vectors.py` | +| `DistanceFunction` | `_vectors.py` | +| `DISTANCE_FUNCTION_DIRECTION_HELPER` | `_vectors.py` | +| `Embedding` | `_types.py` | +| `GeneratedEmbeddings` | `_types.py` | +| `EmbeddingGenerationOptions` | `_types.py` | +| `SupportsGetEmbeddings` | `_clients.py` | + +### New AF-only names (no SK equivalent) + +| Name | Location | Purpose | +|------|----------|---------| +| `BaseEmbeddingClient` | `_clients.py` | ABC base for embedding implementations | +| `EmbeddingInputT` | 
`_types.py` | TypeVar for generic embedding input (default `str`) | +| `EmbeddingTelemetryLayer` | `observability.py` | MRO-based OTel tracing for embeddings | +| `SupportsVectorUpsert` | `_vectors.py` | Protocol for collection CRUD | +| `SupportsVectorSearch` | `_vectors.py` | Protocol for vector search | +| `create_search_tool` | `_vectors.py` | Creates AF `FunctionTool` from vector search | + +## Source Files Reference (SK → AF mapping) + +### SK Source Files +| SK File | Lines | Content | +|---------|-------|---------| +| `data/vector.py` | 2369 | All vector store abstractions, enums, decorator, search | +| `data/_shared.py` | 184 | SearchOptions, KernelSearchResults, shared search types | +| `data/text_search.py` | 349 | TextSearch base, TextSearchResult | +| `connectors/ai/embedding_generator_base.py` | 50 | EmbeddingGeneratorBase ABC | +| `connectors/in_memory.py` | 520 | InMemoryCollection, InMemoryStore | +| `connectors/azure_ai_search.py` | 793 | Azure AI Search collection + store | +| `connectors/azure_cosmos_db.py` | 1104 | Cosmos DB (Mongo + NoSQL) | +| `connectors/redis.py` | 845 | Redis (Hashset + JSON) | +| `connectors/qdrant.py` | 653 | Qdrant collection + store | +| `connectors/postgres.py` | 987 | PostgreSQL collection + store | +| `connectors/mongodb.py` | 633 | MongoDB Atlas collection + store | +| `connectors/pinecone.py` | 691 | Pinecone collection + store | +| `connectors/chroma.py` | 484 | Chroma collection + store | +| `connectors/faiss.py` | 278 | FAISS (extends InMemory) | +| `connectors/weaviate.py` | 804 | Weaviate collection + store | +| `connectors/oracle.py` | 1267 | Oracle collection + store | +| `connectors/sql_server.py` | 1132 | SQL Server collection + store | +| `connectors/ai/open_ai/services/open_ai_text_embedding.py` | 91 | OpenAI embedding impl | +| `connectors/ai/open_ai/services/open_ai_text_embedding_base.py` | 78 | OpenAI embedding base | +| `connectors/brave.py` | ~200 | Brave TextSearch impl | +| 
`connectors/google_search.py` | ~200 | Google TextSearch impl | + +--- + +## Implementation Phases + +### Phase 1: Core Embedding Abstractions & OpenAI Implementation +**Goal:** Establish the embedding generator abstraction and ship one working implementation. +**Mergeable:** Yes — adds new types/protocols, no breaking changes. + +#### 1.1 — Embedding types in `_types.py` +- `EmbeddingInputT` TypeVar (default `str`) — generic input type for embedding generation +- `EmbeddingT` TypeVar (default `list[float]`) — generic output embedding vector type +- `Embedding[EmbeddingT]` generic class: `vector: EmbeddingT`, `model_id: str | None`, `dimensions: int | None` (explicit param or computed from vector length), `created_at: datetime | None`, `additional_properties: dict[str, Any]` +- `GeneratedEmbeddings[EmbeddingT, EmbeddingOptionsT]` generic class: list-like container of `Embedding[EmbeddingT]` objects with `options: EmbeddingOptionsT | None` (the options used to generate), `usage: dict[str, Any] | None`, `additional_properties: dict[str, Any]` +- `EmbeddingGenerationOptions` TypedDict (`total=False`): `dimensions: int`, `model_id: str` — follows the same pattern as `ChatOptions`. No `additional_properties` needed since it's a TypedDict and each implementation can extend with its own fields. + +#### 1.2 — Embedding generator protocol + base class in `_clients.py` +- `SupportsGetEmbeddings(Protocol[EmbeddingInputT, EmbeddingT, OptionsContraT])`: generic over input, output, and options (all with defaults), `get_embeddings(values: Sequence[EmbeddingInputT], *, options: OptionsContraT | None = None) -> Awaitable[GeneratedEmbeddings[EmbeddingT]]` +- `BaseEmbeddingClient(ABC, Generic[EmbeddingInputT, EmbeddingT, OptionsCoT])`: ABC base class mirroring `BaseChatClient` pattern + - `__init__` with `additional_properties`, etc. 
+ - Abstract `get_embeddings(...)` for subclasses to implement directly (no `_inner_*` indirection — simpler than chat, no middleware needed) +- `EmbeddingTelemetryLayer` in `observability.py` — MRO-based telemetry (no closure), `gen_ai.operation.name = "embeddings"` + +#### 1.3 — OpenAI embedding generator in `agent_framework/openai/` and `agent_framework/azure/` +- `RawOpenAIEmbeddingClient` — implements `get_embeddings` via `_ensure_client()` factory +- `OpenAIEmbeddingClient(OpenAIConfigMixin, EmbeddingTelemetryLayer[str, list[float], OptionsT], RawOpenAIEmbeddingClient[OptionsT])` — full client with config + telemetry layers +- `OpenAIEmbeddingOptions(EmbeddingGenerationOptions)` — extends with `encoding_format`, `user` +- `AzureOpenAIEmbeddingClient` in `agent_framework/azure/` — follows `AzureOpenAIChatClient` pattern with `AzureOpenAIConfigMixin`, `load_settings`, Entra ID credential support +- `AzureOpenAISettings` extended with `embedding_deployment_name` (env var: `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME`) + +#### 1.4 — Tests and samples +- Unit tests for types, protocol, base class, OpenAI client, Azure OpenAI client +- Integration tests for OpenAI and Azure OpenAI (gated behind `RUN_INTEGRATION_TESTS` + credentials, `@pytest.mark.flaky`) +- Samples in `samples/02-agents/embeddings/` — `openai_embeddings.py`, `azure_openai_embeddings.py` + +--- + +### Phase 2: Embedding Generators for Existing Providers +**Goal:** Add embedding generators to all existing AF provider packages that have chat clients. +**Mergeable:** Yes — each is independent, added to existing provider packages. 
+ +#### 2.1 — Azure AI Inference embedding (in `packages/azure-ai/`) +#### 2.2 — Ollama embedding (in `packages/ollama/`) +#### 2.3 — Anthropic embedding (in `packages/anthropic/`) +#### 2.4 — Bedrock embedding (in `packages/bedrock/`) + +--- + +### Phase 3: Core Vector Store Abstractions +**Goal:** Establish all vector store types, enums, the decorator, collection definition, and base classes. +**Mergeable:** Yes — adds new abstractions, no breaking changes. + +#### 3.1 — Vector store enums and field types in `_vectors.py` +- `FieldTypes` enum: `KEY`, `VECTOR`, `DATA` +- `IndexKind` enum: `HNSW`, `FLAT`, `IVF_FLAT`, `DISK_ANN`, `QUANTIZED_FLAT`, `DYNAMIC`, `DEFAULT` +- `DistanceFunction` enum: `COSINE_SIMILARITY`, `COSINE_DISTANCE`, `DOT_PROD`, `EUCLIDEAN_DISTANCE`, `EUCLIDEAN_SQUARED_DISTANCE`, `MANHATTAN`, `HAMMING`, `DEFAULT` +- No `SearchType` enum — use `Literal["vector", "keyword_hybrid"]` instead, per AF convention of avoiding unnecessary imports +- `VectorStoreField` plain class (not Pydantic) +- `VectorStoreCollectionDefinition` class (not Pydantic internally, but supports Pydantic models as input) +- `SearchOptions` plain class +- `SearchResponse` generic class +- `RecordFilterOptions` plain class +- `DISTANCE_FUNCTION_DIRECTION_HELPER` dict + +#### 3.2 — `@vectorstoremodel` decorator +- Port from SK, works with dataclasses, Pydantic models, plain classes, and dicts +- Sets `__vectorstoremodel__` and `__vectorstoremodel_definition__` on the class +- Remove SK-specific `kernel` prefix (`__kernel_vectorstoremodel__` → `__vectorstoremodel__`) + +#### 3.3 — Serialization/deserialization protocols +- `SerializeMethodProtocol`, `ToDictFunctionProtocol`, `FromDictFunctionProtocol`, etc. 
+- Port the record handler logic but without Pydantic base class — use plain class or ABC + +#### 3.4 — Vector store base classes in `_vectors.py` +- `VectorStoreRecordHandler` — internal base class that handles serialization/deserialization between user data models and store-specific formats, plus embedding generation for vector fields. Both `BaseVectorCollection` and `BaseVectorSearch` extend this. +- `BaseVectorCollection(VectorStoreRecordHandler)` — base for collections + - Uses `SupportsGetEmbeddings` instead of `EmbeddingGeneratorBase` + - Not a Pydantic model — use `__init__` with explicit params + - `upsert`, `get`, `delete`, `ensure_collection_exists`, `collection_exists`, `ensure_collection_deleted` + - Async context manager support +- `BaseVectorStore` — base for stores + - `get_collection`, `list_collection_names`, `collection_exists`, `ensure_collection_deleted` + - Async context manager support + +#### 3.5 — Vector search base class +- `BaseVectorSearch(VectorStoreRecordHandler)` — base for vector search + - Single `search(search_type=...)` method with `search_type: Literal["vector", "keyword_hybrid"]` parameter — no enum, just a literal + - `_inner_search` abstract method for implementations + - Filter building with lambda parser (AST-based) + - Vector generation from values using embedding generator + +#### 3.6 — Protocols for type checking +- `SupportsVectorUpsert` — Protocol for upsert/get/delete operations +- `SupportsVectorSearch` — Protocol for vector search (single `search()` with `search_type` parameter) +- No separate `SupportsVectorHybridSearch` — search type is a parameter, not a separate capability +- No protocol for `VectorStore` — it's a factory for collections, not a capability to duck-type against + +#### 3.7 — Exception types +- Add vector store exceptions under `IntegrationException` or create new branch +- `VectorStoreException`, `VectorStoreOperationException`, `VectorSearchException`, `VectorStoreModelException`, etc. 
+ +#### 3.8 — `create_search_tool` on `BaseVectorSearch` +- Method on `BaseVectorSearch` that creates an AF `FunctionTool` from the vector search +- Wraps the single `search()` method, passing `search_type` parameter +- Accepts: `name`, `description`, `search_type`, `top`, `skip`, `filter`, `string_mapper` +- The tool takes a query string, vectorizes it, searches, and returns results as strings +- Can also be a standalone factory function in `_vectors.py` + +#### 3.9 — Tests for all vector store abstractions +- Unit tests for enums, field types, collection definition +- Unit tests for decorator +- Unit tests for serialization/deserialization +- Unit tests for record handler + +--- + +### Phase 4: In-Memory Vector Store +**Goal:** Provide a zero-dependency vector store for testing and development. +**Mergeable:** Yes — first usable vector store. + +#### 4.1 — Port `InMemoryCollection` and `InMemoryStore` into core +- Place in `agent_framework/_vectors.py` (alongside the abstractions) +- Supports vector search (cosine similarity, etc.) +- No external dependencies + +#### 4.2 — Port FAISS extension (optional, can be separate package) +- Extends InMemory with FAISS indexing + +#### 4.3 — Tests and sample code + +--- + +### Phase 5: Vector Store Connectors — Tier 1 (High Priority) +**Goal:** Ship the most commonly used vector store connectors. +**Mergeable:** Yes — each connector is independent. 
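Stripped of the AF `FunctionTool` machinery, the `create_search_tool` factory described in 3.8 above could reduce to a closure over a search callable. This is a hedged sketch under that assumption — only the factory name and the `top`/`string_mapper` parameters come from the plan; `fake_search` and the bare-function "tool" are stand-ins:

```python
from typing import Callable


def create_search_tool(
    search_fn: Callable[[str, int], list[dict]],
    *,
    top: int = 3,
    string_mapper: Callable[[dict], str] = str,
) -> Callable[[str], list[str]]:
    """Wrap a search callable as a query-string-in, strings-out tool."""

    def search_tool(query: str) -> list[str]:
        # In the real design the query would be embedded first; here the
        # injected search_fn is assumed to handle vectorization itself.
        return [string_mapper(record) for record in search_fn(query, top)]

    return search_tool


# Stub search function standing in for BaseVectorSearch.search(search_type=...).
def fake_search(query: str, top: int) -> list[dict]:
    data = [{"key": "a", "text": "cats"}, {"key": "b", "text": "dogs"}]
    return [r for r in data if query in r["text"]][:top]


tool = create_search_tool(fake_search, top=2, string_mapper=lambda r: r["text"])
results = tool("dog")
print(results)  # ['dogs']
```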
+ +Each connector follows the AF package structure: +- New package under `packages/` +- Own `pyproject.toml`, `tests/`, lazy loading in core + +#### 5.1 — Azure AI Search (`packages/azure-ai-search/`) +- May extend existing package or be new +- `AzureAISearchCollection`, `AzureAISearchStore` + +#### 5.2 — Qdrant (`packages/qdrant/`) +- New package +- `QdrantCollection`, `QdrantStore` + +#### 5.3 — Redis (`packages/redis/`) +- May extend existing redis package +- `RedisCollection` (JSON + Hashset variants), `RedisStore` + +#### 5.4 — PostgreSQL/pgvector (`packages/postgres/`) +- New package +- `PostgresCollection`, `PostgresStore` + +--- + +### Phase 6: Vector Store Connectors — Tier 2 +**Goal:** Ship remaining vector store connectors. +**Mergeable:** Yes — each connector is independent. + +#### 6.1 — MongoDB Atlas (`packages/mongodb/`) +#### 6.2 — Azure Cosmos DB (`packages/azure-cosmos-db/`) +- Cosmos Mongo + Cosmos NoSQL +#### 6.3 — Pinecone (`packages/pinecone/`) +#### 6.4 — Chroma (`packages/chroma/`) +#### 6.5 — Weaviate (`packages/weaviate/`) + +--- + +### Phase 7: Vector Store Connectors — Tier 3 +**Goal:** Ship niche or less common connectors. +**Mergeable:** Yes — each connector is independent. + +#### 7.1 — Oracle (`packages/oracle/`) +#### 7.2 — SQL Server (`packages/sql-server/`) +#### 7.3 — FAISS (`packages/faiss/` or in core extending InMemory) + +--- + +### Phase 8: Vector Store CRUD Tools +**Goal:** Provide a full set of agent-usable tools for CRUD operations on vector store collections. +**Mergeable:** Yes — adds tools without changing existing APIs. 
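Under the design above, each CRUD tool could plausibly reduce to a thin closure over a collection. A hypothetical sketch — the factory names follow the plan, while the dict-backed "collection" and bare-function tools stand in for the real collection and `FunctionTool` types:

```python
from typing import Any, Callable


def create_upsert_tool(collection: dict[str, Any]) -> Callable[[str, Any], None]:
    """Returns a tool that inserts or updates a record under its key."""

    def upsert(key: str, record: Any) -> None:
        collection[key] = record

    return upsert


def create_get_tool(collection: dict[str, Any]) -> Callable[[str], Any]:
    """Returns a tool for key-based lookup — deliberately not a search."""

    def get(key: str) -> Any:
        return collection.get(key)

    return get


def create_delete_tool(collection: dict[str, Any]) -> Callable[[str], bool]:
    """Returns a tool that deletes by key, reporting whether anything was removed."""

    def delete(key: str) -> bool:
        return collection.pop(key, None) is not None

    return delete


records: dict[str, Any] = {}
upsert = create_upsert_tool(records)
get = create_get_tool(records)
delete = create_delete_tool(records)

upsert("k1", {"text": "hello"})
found = get("k1")
removed = delete("k1")
print(found, removed, records)  # {'text': 'hello'} True {}
```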
+ +#### 8.1 — `create_upsert_tool` — tool for upserting records into a collection +#### 8.2 — `create_get_tool` — tool for retrieving records by key +- Key-based lookup only (by primary key), not a search tool +- Documentation must clearly distinguish this from `create_search_tool`: get_tool retrieves specific records by their known key, while search_tool performs similarity/filtered search across the collection +- Consider if this overlaps with filtered search and document when to use which +#### 8.3 — `create_delete_tool` — tool for deleting records by key +#### 8.4 — Tests and samples for CRUD tools + +--- + +### Phase 9: Additional Embedding Implementations (New Providers) +**Goal:** Provide embedding generators for providers that don't yet have AF packages. +**Mergeable:** Yes — each is independent, new packages. + +#### 9.1 — HuggingFace/ONNX embedding (new package or lab) +#### 9.2 — Mistral AI embedding (new package) +#### 9.3 — Google AI / Vertex AI embedding (new package) +#### 9.4 — Nvidia embedding (new package) + +--- + +### Phase 10: TextSearch Abstractions & Implementations (Separate Work) +**Goal:** Port text search (non-vector) abstractions and implementations. +**Mergeable:** Yes — independent of vector stores. + +#### 10.1 — TextSearch base class and types +- `SearchOptions`, `SearchResponse`, `TextSearchResult` +- `TextSearch` base class with `search()` method +- `create_search_function()` for kernel integration (may need AF equivalent) + +#### 10.2 — Brave Search implementation +#### 10.3 — Google Search implementation +#### 10.4 — Vector store text search bridge (connecting VectorSearch to TextSearch interface) + +--- + +## Key Considerations + +1. **No Pydantic for internal classes**: All AF internal classes should use plain classes. Pydantic is only used for user-facing input validation (e.g., vector store data models). + +2. 
**Protocol + Base class**: Follow AF's pattern of both a `Protocol` for duck-typing and a `Base` ABC for implementation, matching how `SupportsChatGetResponse` + `BaseChatClient` works. + +3. **Exception hierarchy**: Use AF's `IntegrationException` branch for vector store operations, since vector stores are external dependencies. + +4. **`from __future__ import annotations`**: Required in all files per AF coding standard. + +5. **No `**kwargs` escape hatches**: Use explicit named parameters per AF coding standard. + +6. **Lazy loading**: Connector packages use `__getattr__` lazy loading in core provider folders. + +7. **Reusable data models**: The `@vectorstoremodel` decorator and `VectorStoreCollectionDefinition` should be agnostic enough to work with both SK and AF. The core types (`FieldTypes`, `IndexKind`, `DistanceFunction`, `VectorStoreField`) should be identical or easily mapped. + +8. **`create_search_tool`**: The AF-native equivalent of SK's `create_search_function`. Instead of creating a `KernelFunction`, this creates an AF `FunctionTool` (via the `@tool` decorator pattern) from a vector search. This allows agents to use vector search as a tool during conversations. Design: + - `create_search_tool(name, description, search_type, ...)` → returns a `FunctionTool` that wraps `VectorSearch.search(search_type=...)` + - The tool accepts a query string, performs embedding + vector search, and returns results as strings + - Supports configurable string mappers, filter functions, top/skip defaults + - Lives in `_vectors.py` as a method on `BaseVectorSearch` and/or as a standalone factory function + +9. **CRUD tools**: A full set of create/read/update/delete tools for vector store collections, allowing agents to manage data in vector stores. 
Design: + - `create_upsert_tool(...)` → tool for upserting records + - `create_get_tool(...)` → tool for retrieving records by key + - `create_delete_tool(...)` → tool for deleting records + - These are separate from search and are placed in a later phase diff --git a/python/packages/core/agent_framework/__init__.py b/python/packages/core/agent_framework/__init__.py index eaa149d749..bfa684b469 100644 --- a/python/packages/core/agent_framework/__init__.py +++ b/python/packages/core/agent_framework/__init__.py @@ -20,9 +20,11 @@ from ._agents import Agent, BaseAgent, RawAgent, SupportsAgentRun from ._clients import ( BaseChatClient, + BaseEmbeddingClient, SupportsChatGetResponse, SupportsCodeInterpreterTool, SupportsFileSearchTool, + SupportsGetEmbeddings, SupportsImageGenerationTool, SupportsMCPTool, SupportsWebSearchTool, @@ -82,9 +84,14 @@ ChatResponseUpdate, Content, ContinuationToken, + Embedding, + EmbeddingGenerationOptions, + EmbeddingInputT, + EmbeddingT, FinalT, FinishReason, FinishReasonLiteral, + GeneratedEmbeddings, Message, OuterFinalT, OuterUpdateT, @@ -201,6 +208,7 @@ "BaseAgent", "BaseChatClient", "BaseContextProvider", + "BaseEmbeddingClient", "BaseHistoryProvider", "Case", "ChatAndFunctionMiddlewareTypes", @@ -218,6 +226,10 @@ "Edge", "EdgeCondition", "EdgeDuplicationError", + "Embedding", + "EmbeddingGenerationOptions", + "EmbeddingInputT", + "EmbeddingT", "Executor", "FanInEdgeGroup", "FanOutEdgeGroup", @@ -232,6 +244,7 @@ "FunctionMiddleware", "FunctionMiddlewareTypes", "FunctionTool", + "GeneratedEmbeddings", "GraphConnectivityError", "InMemoryCheckpointStorage", "InMemoryHistoryProvider", @@ -261,6 +274,7 @@ "SupportsChatGetResponse", "SupportsCodeInterpreterTool", "SupportsFileSearchTool", + "SupportsGetEmbeddings", "SupportsImageGenerationTool", "SupportsMCPTool", "SupportsWebSearchTool", diff --git a/python/packages/core/agent_framework/_clients.py b/python/packages/core/agent_framework/_clients.py index b407be11cf..33e049be54 100644 --- 
a/python/packages/core/agent_framework/_clients.py +++ b/python/packages/core/agent_framework/_clients.py @@ -35,6 +35,10 @@ from ._types import ( ChatResponse, ChatResponseUpdate, + EmbeddingGenerationOptions, + EmbeddingInputT, + EmbeddingT, + GeneratedEmbeddings, Message, ResponseStream, validate_chat_options, @@ -56,7 +60,6 @@ InputT = TypeVar("InputT", contravariant=True) -EmbeddingT = TypeVar("EmbeddingT") BaseChatClientT = TypeVar("BaseChatClientT", bound="BaseChatClient") logger = logging.getLogger("agent_framework") @@ -660,3 +663,131 @@ def get_file_search_tool(**kwargs: Any) -> Any: # endregion + + +# region SupportsGetEmbeddings Protocol + +# Contravariant for the Protocol +EmbeddingOptionsContraT = TypeVar( + "EmbeddingOptionsContraT", + bound=TypedDict, # type: ignore[valid-type] + default="EmbeddingGenerationOptions", + contravariant=True, +) + + +@runtime_checkable +class SupportsGetEmbeddings(Protocol[EmbeddingInputT, EmbeddingT, EmbeddingOptionsContraT]): + """Protocol for an embedding client that can generate embeddings. + + This protocol enables duck-typing for embedding generation. Any class that + implements ``get_embeddings`` with a compatible signature satisfies this protocol. + + Generic over the input type (defaults to ``str``), output embedding type + (defaults to ``list[float]``), and options type. + + Examples: + .. code-block:: python + + from agent_framework import SupportsGetEmbeddings + + + async def use_embeddings(client: SupportsGetEmbeddings) -> None: + result = await client.get_embeddings(["Hello, world!"]) + for embedding in result: + print(embedding.vector) + """ + + additional_properties: dict[str, Any] + + def get_embeddings( + self, + values: Sequence[EmbeddingInputT], + *, + options: EmbeddingOptionsContraT | None = None, + ) -> Awaitable[GeneratedEmbeddings[EmbeddingT]]: + """Generate embeddings for the given values. + + Args: + values: The values to generate embeddings for. + options: Optional embedding generation options. 
+ + Returns: + Generated embeddings with metadata. + """ + ... + + +# endregion + + +# region BaseEmbeddingClient + +# Covariant for the BaseEmbeddingClient +EmbeddingOptionsCoT = TypeVar( + "EmbeddingOptionsCoT", + bound=TypedDict, # type: ignore[valid-type] + default="EmbeddingGenerationOptions", + covariant=True, +) + + +class BaseEmbeddingClient(SerializationMixin, ABC, Generic[EmbeddingInputT, EmbeddingT, EmbeddingOptionsCoT]): + """Abstract base class for embedding clients. + + Subclasses implement ``get_embeddings`` to provide the actual + embedding generation logic. + + Generic over the input type (defaults to ``str``), output embedding type + (defaults to ``list[float]``), and options type. + + Examples: + .. code-block:: python + + from agent_framework import BaseEmbeddingClient, Embedding, GeneratedEmbeddings + from collections.abc import Sequence + + + class CustomEmbeddingClient(BaseEmbeddingClient): + async def get_embeddings(self, values, *, options=None): + return GeneratedEmbeddings([Embedding(vector=[0.1, 0.2, 0.3]) for _ in values]) + """ + + OTEL_PROVIDER_NAME: ClassVar[str] = "unknown" + DEFAULT_EXCLUDE: ClassVar[set[str]] = {"additional_properties"} + + def __init__( + self, + *, + additional_properties: dict[str, Any] | None = None, + **kwargs: Any, + ) -> None: + """Initialize a BaseEmbeddingClient instance. + + Args: + additional_properties: Additional properties to pass to the client. + **kwargs: Additional keyword arguments passed to parent classes (for MRO). + """ + self.additional_properties = additional_properties or {} + super().__init__(**kwargs) + + @abstractmethod + async def get_embeddings( + self, + values: Sequence[EmbeddingInputT], + *, + options: EmbeddingOptionsCoT | None = None, + ) -> GeneratedEmbeddings[EmbeddingT]: + """Generate embeddings for the given values. + + Args: + values: The values to generate embeddings for. + options: Optional embedding generation options. + + Returns: + Generated embeddings with metadata. 
+ """ + ... + + +# endregion diff --git a/python/packages/core/agent_framework/_types.py b/python/packages/core/agent_framework/_types.py index a699a30f5f..c03aeda07f 100644 --- a/python/packages/core/agent_framework/_types.py +++ b/python/packages/core/agent_framework/_types.py @@ -8,8 +8,18 @@ import re import sys from asyncio import iscoroutine -from collections.abc import AsyncIterable, AsyncIterator, Awaitable, Callable, Mapping, MutableMapping, Sequence +from collections.abc import ( + AsyncIterable, + AsyncIterator, + Awaitable, + Callable, + Iterable, + Mapping, + MutableMapping, + Sequence, +) from copy import deepcopy +from datetime import datetime from typing import TYPE_CHECKING, Any, ClassVar, Final, Generic, Literal, NewType, cast, overload from pydantic import BaseModel @@ -23,6 +33,10 @@ from typing import TypeVar # pragma: no cover else: from typing_extensions import TypeVar # pragma: no cover +if sys.version_info >= (3, 12): + pass # pragma: no cover +else: + pass # pragma: no cover if sys.version_info >= (3, 11): from typing import TypedDict # type: ignore # pragma: no cover else: @@ -272,7 +286,8 @@ def _serialize_value(value: Any, exclude_none: bool) -> Any: # region Constants and types _T = TypeVar("_T") -EmbeddingT = TypeVar("EmbeddingT") +EmbeddingT = TypeVar("EmbeddingT", default="list[float]") +EmbeddingInputT = TypeVar("EmbeddingInputT", default="str") ChatResponseT = TypeVar("ChatResponseT", bound="ChatResponse") ToolModeT = TypeVar("ToolModeT", bound="ToolMode") AgentResponseT = TypeVar("AgentResponseT", bound="AgentResponse") @@ -3158,3 +3173,129 @@ def merge_chat_options( result[key] = value return result + + +# region Embedding Types + + +class EmbeddingGenerationOptions(TypedDict, total=False): + """Common request settings for embedding generation. + + All fields are optional (total=False) to allow partial specification. + Provider-specific TypedDicts extend this with additional options. + + Examples: + .. 
code-block:: python + + from agent_framework import EmbeddingGenerationOptions + + options: EmbeddingGenerationOptions = { + "model_id": "text-embedding-3-small", + "dimensions": 1536, + } + """ + + model_id: str + dimensions: int + + +class Embedding(Generic[EmbeddingT]): + """A single embedding vector with metadata. + + Generic over the embedding vector type, e.g. ``Embedding[list[float]]``, + ``Embedding[list[int]]``, or ``Embedding[bytes]``. + + Args: + vector: The embedding vector data. + model_id: The model used to generate this embedding. + dimensions: Explicit dimension count (computed from vector length if omitted). + created_at: Timestamp of when the embedding was generated. + additional_properties: Additional metadata. + + Examples: + .. code-block:: python + + from agent_framework import Embedding + + embedding = Embedding( + vector=[0.1, 0.2, 0.3], + model_id="text-embedding-3-small", + ) + assert embedding.dimensions == 3 + """ + + def __init__( + self, + vector: EmbeddingT, + *, + model_id: str | None = None, + dimensions: int | None = None, + created_at: datetime | None = None, + additional_properties: dict[str, Any] | None = None, + ) -> None: + self.vector = vector + self._dimensions = dimensions + self.model_id = model_id + self.created_at = created_at + self.additional_properties = additional_properties or {} + + @property + def dimensions(self) -> int | None: + """Return the number of dimensions in the embedding vector. + + Uses the explicitly provided value if set, otherwise computes from vector length. 
+ """ + if self._dimensions is not None: + return self._dimensions + if isinstance(self.vector, (list, tuple, bytes)): + return len(self.vector) + return None + + +EmbeddingOptionsT = TypeVar( + "EmbeddingOptionsT", + bound=TypedDict, # type: ignore[valid-type] + default="EmbeddingGenerationOptions", +) + + +class GeneratedEmbeddings(list[Embedding[EmbeddingT]], Generic[EmbeddingT, EmbeddingOptionsT]): + """A list of generated embeddings with usage metadata. + + Extends list for direct iteration and indexing. + Generic over both the embedding vector type and the options type used for generation. + + Args: + embeddings: Sequence of Embedding objects. + options: The options used to generate these embeddings. + usage: Token usage information (e.g. prompt_tokens, total_tokens). + additional_properties: Additional metadata. + + Examples: + .. code-block:: python + + from agent_framework import Embedding, GeneratedEmbeddings + + embeddings = GeneratedEmbeddings( + [Embedding(vector=[0.1, 0.2]), Embedding(vector=[0.3, 0.4])], + usage={"prompt_tokens": 10, "total_tokens": 10}, + ) + assert len(embeddings) == 2 + assert embeddings.usage["prompt_tokens"] == 10 + """ + + def __init__( + self, + embeddings: Iterable[Embedding[EmbeddingT]] | None = None, + *, + options: EmbeddingOptionsT | None = None, + usage: dict[str, Any] | None = None, + additional_properties: dict[str, Any] | None = None, + ) -> None: + super().__init__(embeddings or []) + self.options = options + self.usage = usage + self.additional_properties = additional_properties or {} + + +# endregion diff --git a/python/packages/core/agent_framework/azure/__init__.py b/python/packages/core/agent_framework/azure/__init__.py index a485ee7aa7..f525e7a33e 100644 --- a/python/packages/core/agent_framework/azure/__init__.py +++ b/python/packages/core/agent_framework/azure/__init__.py @@ -36,6 +36,7 @@ "AzureOpenAIAssistantsOptions": ("agent_framework.azure._assistants_client", "agent-framework-core"), 
"AzureOpenAIChatClient": ("agent_framework.azure._chat_client", "agent-framework-core"), "AzureOpenAIChatOptions": ("agent_framework.azure._chat_client", "agent-framework-core"), + "AzureOpenAIEmbeddingClient": ("agent_framework.azure._embedding_client", "agent-framework-core"), "AzureOpenAIResponsesClient": ("agent_framework.azure._responses_client", "agent-framework-core"), "AzureOpenAIResponsesOptions": ("agent_framework.azure._responses_client", "agent-framework-core"), "AzureOpenAISettings": ("agent_framework.azure._shared", "agent-framework-core"), diff --git a/python/packages/core/agent_framework/azure/_embedding_client.py b/python/packages/core/agent_framework/azure/_embedding_client.py new file mode 100644 index 0000000000..05d6b5d603 --- /dev/null +++ b/python/packages/core/agent_framework/azure/_embedding_client.py @@ -0,0 +1,136 @@ +# Copyright (c) Microsoft. All rights reserved. + +from __future__ import annotations + +import sys +from collections.abc import Mapping +from typing import Generic + +from openai.lib.azure import AsyncAzureOpenAI + +from agent_framework.observability import EmbeddingTelemetryLayer +from agent_framework.openai import OpenAIEmbeddingOptions +from agent_framework.openai._embedding_client import RawOpenAIEmbeddingClient + +from .._settings import load_settings +from ._entra_id_authentication import AzureCredentialTypes, AzureTokenProvider +from ._shared import ( + AzureOpenAIConfigMixin, + AzureOpenAISettings, + _apply_azure_defaults, +) + +if sys.version_info >= (3, 13): + from typing import TypeVar # type: ignore # pragma: no cover +else: + from typing_extensions import TypeVar # type: ignore # pragma: no cover +if sys.version_info >= (3, 11): + from typing import TypedDict # type: ignore # pragma: no cover +else: + from typing_extensions import TypedDict # type: ignore # pragma: no cover + + +AzureOpenAIEmbeddingOptionsT = TypeVar( + "AzureOpenAIEmbeddingOptionsT", + bound=TypedDict, # type: ignore[valid-type] + 
default="OpenAIEmbeddingOptions", + covariant=True, +) + + +class AzureOpenAIEmbeddingClient( + AzureOpenAIConfigMixin, + EmbeddingTelemetryLayer[str, list[float], AzureOpenAIEmbeddingOptionsT], + RawOpenAIEmbeddingClient[AzureOpenAIEmbeddingOptionsT], + Generic[AzureOpenAIEmbeddingOptionsT], +): + """Azure OpenAI embedding client with telemetry support. + + Keyword Args: + api_key: The API key. If provided, will override the value in the env vars or .env file. + Can also be set via environment variable AZURE_OPENAI_API_KEY. + deployment_name: The deployment name. If provided, will override the value + (embedding_deployment_name) in the env vars or .env file. + Can also be set via environment variable AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME. + endpoint: The deployment endpoint. + Can also be set via environment variable AZURE_OPENAI_ENDPOINT. + base_url: The deployment base URL. + Can also be set via environment variable AZURE_OPENAI_BASE_URL. + api_version: The deployment API version. + Can also be set via environment variable AZURE_OPENAI_API_VERSION. + token_endpoint: The token endpoint to request an Azure token. + Can also be set via environment variable AZURE_OPENAI_TOKEN_ENDPOINT. + credential: Azure credential or token provider for authentication. + default_headers: Default headers for HTTP requests. + async_client: An existing client to use. + env_file_path: Path to .env file for settings. + env_file_encoding: Encoding for .env file. + + Examples: + .. 
code-block:: python + + from agent_framework.azure import AzureOpenAIEmbeddingClient + + # Using environment variables + # Set AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com + # Set AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME=text-embedding-3-small + # Set AZURE_OPENAI_API_KEY=your-key + client = AzureOpenAIEmbeddingClient() + + # Or passing parameters directly + client = AzureOpenAIEmbeddingClient( + endpoint="https://your-endpoint.openai.azure.com", + deployment_name="text-embedding-3-small", + api_key="your-key", + ) + + result = await client.get_embeddings(["Hello, world!"]) + """ + + def __init__( + self, + *, + api_key: str | None = None, + deployment_name: str | None = None, + endpoint: str | None = None, + base_url: str | None = None, + api_version: str | None = None, + token_endpoint: str | None = None, + credential: AzureCredentialTypes | AzureTokenProvider | None = None, + default_headers: Mapping[str, str] | None = None, + async_client: AsyncAzureOpenAI | None = None, + env_file_path: str | None = None, + env_file_encoding: str | None = None, + ) -> None: + """Initialize an Azure OpenAI embedding client.""" + azure_openai_settings = load_settings( + AzureOpenAISettings, + env_prefix="AZURE_OPENAI_", + api_key=api_key, + base_url=base_url, + endpoint=endpoint, + embedding_deployment_name=deployment_name, + api_version=api_version, + env_file_path=env_file_path, + env_file_encoding=env_file_encoding, + token_endpoint=token_endpoint, + ) + _apply_azure_defaults(azure_openai_settings) + + if not azure_openai_settings.get("embedding_deployment_name"): + raise ValueError( + "Azure OpenAI embedding deployment name is required. Set via 'deployment_name' parameter " + "or 'AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME' environment variable." 
+ ) + + super().__init__( + deployment_name=azure_openai_settings["embedding_deployment_name"], # type: ignore[arg-type] + endpoint=azure_openai_settings["endpoint"], + base_url=azure_openai_settings["base_url"], + api_version=azure_openai_settings["api_version"], # type: ignore + api_key=azure_openai_settings["api_key"].get_secret_value() if azure_openai_settings["api_key"] else None, + token_endpoint=azure_openai_settings["token_endpoint"], + credential=credential, + default_headers=default_headers, + client=async_client, + ) diff --git a/python/packages/core/agent_framework/azure/_shared.py b/python/packages/core/agent_framework/azure/_shared.py index 732de8281e..dce116a242 100644 --- a/python/packages/core/agent_framework/azure/_shared.py +++ b/python/packages/core/agent_framework/azure/_shared.py @@ -53,6 +53,8 @@ class AzureOpenAISettings(TypedDict, total=False): Resource Management > Deployments in the Azure portal or, alternatively, under Management > Deployments in Azure AI Foundry. Can be set via environment variable AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME. + embedding_deployment_name: The name of the Azure Embedding deployment. + Can be set via environment variable AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME. api_key: The API key for the Azure deployment. This value can be found in the Keys & Endpoint section when examining your resource in the Azure portal. You can use either KEY1 or KEY2. 
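The deployment-name resolution enforced in `AzureOpenAIEmbeddingClient.__init__` above (explicit parameter first, then the environment variable, else `ValueError`) can be sketched with the stdlib alone. `resolve_embedding_deployment` is a hypothetical helper for illustration, not the PR's `load_settings`:

```python
from __future__ import annotations

import os


def resolve_embedding_deployment(deployment_name: str | None = None) -> str:
    """Hypothetical helper mirroring the client's settings precedence."""
    # Explicit parameter wins, then the environment variable; a missing value
    # raises the same kind of ValueError the client raises in __init__ above.
    resolved = deployment_name or os.environ.get("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")
    if not resolved:
        raise ValueError(
            "Azure OpenAI embedding deployment name is required. Set via 'deployment_name' "
            "parameter or 'AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME' environment variable."
        )
    return resolved


os.environ["AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME"] = "text-embedding-3-small"
print(resolve_embedding_deployment())          # text-embedding-3-small
print(resolve_embedding_deployment("custom"))  # custom
```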
@@ -95,6 +97,7 @@ class AzureOpenAISettings(TypedDict, total=False): chat_deployment_name: str | None responses_deployment_name: str | None + embedding_deployment_name: str | None endpoint: str | None base_url: str | None api_key: SecretString | None diff --git a/python/packages/core/agent_framework/observability.py b/python/packages/core/agent_framework/observability.py index 8f581a605d..160a16af4d 100644 --- a/python/packages/core/agent_framework/observability.py +++ b/python/packages/core/agent_framework/observability.py @@ -38,6 +38,10 @@ else: from typing_extensions import TypeVar # type: ignore # pragma: no cover +# Defined here to avoid circular import with _types.py +EmbeddingInputT = TypeVar("EmbeddingInputT", default="str") +EmbeddingT = TypeVar("EmbeddingT", default="list[float]") + if TYPE_CHECKING: # pragma: no cover from opentelemetry.sdk._logs.export import LogRecordExporter from opentelemetry.sdk.metrics.export import MetricExporter @@ -59,7 +63,9 @@ ChatResponse, ChatResponseUpdate, Content, + EmbeddingGenerationOptions, FinishReason, + GeneratedEmbeddings, Message, ResponseStream, ) @@ -70,6 +76,7 @@ "OBSERVABILITY_SETTINGS", "AgentTelemetryLayer", "ChatTelemetryLayer", + "EmbeddingTelemetryLayer", "OtelAttr", "configure_otel_providers", "create_metric_views", @@ -251,6 +258,7 @@ class OtelAttr(str, Enum): # Operation names CHAT_COMPLETION_OPERATION = "chat" + EMBEDDING_OPERATION = "embeddings" TOOL_EXECUTION_OPERATION = "execute_tool" # Describes GenAI agent creation and is usually applicable when working with remote agent services. 
AGENT_CREATE_OPERATION = "create_agent" @@ -1265,6 +1273,75 @@ async def _get_response() -> ChatResponse: return _get_response() +EmbeddingOptionsCoT = TypeVar( + "EmbeddingOptionsCoT", + bound=TypedDict, # type: ignore[valid-type] + default="EmbeddingGenerationOptions", + covariant=True, +) + + +class EmbeddingTelemetryLayer(Generic[EmbeddingInputT, EmbeddingT, EmbeddingOptionsCoT]): + """Layer that wraps embedding client get_embeddings with OpenTelemetry tracing.""" + + def __init__(self, *args: Any, otel_provider_name: str | None = None, **kwargs: Any) -> None: + """Initialize telemetry attributes and histograms.""" + super().__init__(*args, **kwargs) + self.token_usage_histogram = _get_token_usage_histogram() + self.duration_histogram = _get_duration_histogram() + self.otel_provider_name = otel_provider_name or getattr(self, "OTEL_PROVIDER_NAME", "unknown") + + async def get_embeddings( + self, + values: Sequence[EmbeddingInputT], + *, + options: EmbeddingOptionsCoT | None = None, + ) -> GeneratedEmbeddings[EmbeddingT]: + """Trace embedding generation with OpenTelemetry spans and metrics.""" + global OBSERVABILITY_SETTINGS + super_get_embeddings = super().get_embeddings # type: ignore[misc] + + if not OBSERVABILITY_SETTINGS.ENABLED: + return await super_get_embeddings(values, options=options) # type: ignore[no-any-return] + + opts: dict[str, Any] = options or {} # type: ignore[assignment] + provider_name = str(self.otel_provider_name) + model_id = opts.get("model_id") or getattr(self, "model_id", None) or "unknown" + service_url_func = getattr(self, "service_url", None) + service_url = str(service_url_func() if callable(service_url_func) else "unknown") + attributes = _get_span_attributes( + operation_name=OtelAttr.EMBEDDING_OPERATION, + provider_name=provider_name, + model=model_id, + service_url=service_url, + ) + + with _get_span(attributes=attributes, span_name_attribute=SpanAttributes.LLM_REQUEST_MODEL) as span: + start_time_stamp = perf_counter() + try: + 
result = await super_get_embeddings(values, options=options) + except Exception as exception: + capture_exception(span=span, exception=exception, timestamp=time_ns()) + raise + duration = perf_counter() - start_time_stamp + response_attributes: dict[str, Any] = {**attributes} + if result.usage: + if "prompt_tokens" in result.usage: + response_attributes[OtelAttr.INPUT_TOKENS] = result.usage["prompt_tokens"] + if "total_tokens" in result.usage: + response_attributes[OtelAttr.OUTPUT_TOKENS] = result.usage.get( + "completion_tokens", result.usage["total_tokens"] + ) + _capture_response( + span=span, + attributes=response_attributes, + token_usage_histogram=self.token_usage_histogram, + operation_duration_histogram=self.duration_histogram, + duration=duration, + ) + return result # type: ignore[no-any-return] + + class AgentTelemetryLayer: """Layer that wraps agent run with OpenTelemetry tracing.""" diff --git a/python/packages/core/agent_framework/openai/__init__.py b/python/packages/core/agent_framework/openai/__init__.py index 2d9cf09648..a3fe1fe8f6 100644 --- a/python/packages/core/agent_framework/openai/__init__.py +++ b/python/packages/core/agent_framework/openai/__init__.py @@ -19,6 +19,7 @@ OpenAIAssistantsOptions, ) from ._chat_client import OpenAIChatClient, OpenAIChatOptions +from ._embedding_client import OpenAIEmbeddingClient, OpenAIEmbeddingOptions from ._exceptions import ContentFilterResultSeverity, OpenAIContentFilterException from ._responses_client import ( OpenAIContinuationToken, @@ -38,6 +39,8 @@ "OpenAIChatOptions", "OpenAIContentFilterException", "OpenAIContinuationToken", + "OpenAIEmbeddingClient", + "OpenAIEmbeddingOptions", "OpenAIResponsesClient", "OpenAIResponsesOptions", "OpenAISettings", diff --git a/python/packages/core/agent_framework/openai/_embedding_client.py b/python/packages/core/agent_framework/openai/_embedding_client.py new file mode 100644 index 0000000000..fb98774c40 --- /dev/null +++ 
b/python/packages/core/agent_framework/openai/_embedding_client.py @@ -0,0 +1,193 @@ +# Copyright (c) Microsoft. All rights reserved. + +from __future__ import annotations + +import sys +from collections.abc import Awaitable, Callable, Mapping, Sequence +from typing import Any, Generic, Literal, TypedDict + +from openai import AsyncOpenAI + +from .._clients import BaseEmbeddingClient +from .._settings import load_settings +from .._types import Embedding, EmbeddingGenerationOptions, GeneratedEmbeddings +from ..observability import EmbeddingTelemetryLayer +from ._shared import OpenAIBase, OpenAIConfigMixin, OpenAISettings + +if sys.version_info >= (3, 13): + from typing import TypeVar # type: ignore # pragma: no cover +else: + from typing_extensions import TypeVar # type: ignore # pragma: no cover + + +class OpenAIEmbeddingOptions(EmbeddingGenerationOptions, total=False): + """OpenAI-specific embedding options. + + Extends EmbeddingGenerationOptions with OpenAI-specific fields. + + Examples: + .. code-block:: python + + from agent_framework.openai import OpenAIEmbeddingOptions + + options: OpenAIEmbeddingOptions = { + "model_id": "text-embedding-3-small", + "dimensions": 1536, + "encoding_format": "float", + } + """ + + encoding_format: Literal["float", "base64"] + user: str + + +OpenAIEmbeddingOptionsT = TypeVar( + "OpenAIEmbeddingOptionsT", + bound=TypedDict, # type: ignore[valid-type] + default="OpenAIEmbeddingOptions", + covariant=True, +) + + +class RawOpenAIEmbeddingClient( + OpenAIBase, + BaseEmbeddingClient[str, list[float], OpenAIEmbeddingOptionsT], + Generic[OpenAIEmbeddingOptionsT], +): + """Raw OpenAI embedding client without telemetry.""" + + async def get_embeddings( + self, + values: Sequence[str], + *, + options: OpenAIEmbeddingOptionsT | None = None, + ) -> GeneratedEmbeddings[list[float]]: + """Call the OpenAI embeddings API. + + Args: + values: The text values to generate embeddings for. + options: Optional embedding generation options. 
+ + Returns: + Generated embeddings with usage metadata. + + Raises: + ValueError: If model_id is not provided. + """ + opts: dict[str, Any] = dict(options) if options else {} + model = opts.get("model_id") or self.model_id + if not model: + raise ValueError("model_id is required") + + kwargs: dict[str, Any] = {"input": list(values), "model": model} + if dimensions := opts.get("dimensions"): + kwargs["dimensions"] = dimensions + if encoding_format := opts.get("encoding_format"): + kwargs["encoding_format"] = encoding_format + if user := opts.get("user"): + kwargs["user"] = user + + response = await (await self._ensure_client()).embeddings.create(**kwargs) + + embeddings = [ + Embedding( + vector=item.embedding, + dimensions=len(item.embedding), + model_id=response.model, + ) + for item in response.data + ] + + usage_dict: dict[str, Any] | None = None + if response.usage: + usage_dict = { + "prompt_tokens": response.usage.prompt_tokens, + "total_tokens": response.usage.total_tokens, + } + + return GeneratedEmbeddings(embeddings, options=options, usage=usage_dict) + + +class OpenAIEmbeddingClient( + OpenAIConfigMixin, + EmbeddingTelemetryLayer[str, list[float], OpenAIEmbeddingOptionsT], + RawOpenAIEmbeddingClient[OpenAIEmbeddingOptionsT], + Generic[OpenAIEmbeddingOptionsT], +): + """OpenAI embedding client with telemetry support. + + Keyword Args: + model_id: The embedding model ID (e.g. "text-embedding-3-small"). + Can also be set via environment variable OPENAI_EMBEDDING_MODEL_ID. + api_key: OpenAI API key. + Can also be set via environment variable OPENAI_API_KEY. + org_id: OpenAI organization ID. + default_headers: Additional HTTP headers. + async_client: Pre-configured AsyncOpenAI client. + base_url: Custom API base URL. + env_file_path: Path to .env file for settings. + env_file_encoding: Encoding for .env file. + + Examples: + .. 
code-block:: python + + from agent_framework.openai import OpenAIEmbeddingClient + + # Using environment variables + # Set OPENAI_API_KEY=sk-... + # Set OPENAI_EMBEDDING_MODEL_ID=text-embedding-3-small + client = OpenAIEmbeddingClient() + + # Or passing parameters directly + client = OpenAIEmbeddingClient( + model_id="text-embedding-3-small", + api_key="sk-...", + ) + + # Generate embeddings + result = await client.get_embeddings(["Hello, world!"]) + print(result[0].vector) + """ + + def __init__( + self, + *, + model_id: str | None = None, + api_key: str | Callable[[], str | Awaitable[str]] | None = None, + org_id: str | None = None, + default_headers: Mapping[str, str] | None = None, + async_client: AsyncOpenAI | None = None, + base_url: str | None = None, + env_file_path: str | None = None, + env_file_encoding: str | None = None, + ) -> None: + """Initialize an OpenAI embedding client.""" + openai_settings = load_settings( + OpenAISettings, + env_prefix="OPENAI_", + api_key=api_key, + base_url=base_url, + org_id=org_id, + embedding_model_id=model_id, + env_file_path=env_file_path, + env_file_encoding=env_file_encoding, + ) + + if not async_client and not openai_settings["api_key"]: + raise ValueError( + "OpenAI API key is required. Set via 'api_key' parameter or 'OPENAI_API_KEY' environment variable." + ) + if not openai_settings["embedding_model_id"]: + raise ValueError( + "OpenAI embedding model ID is required. " + "Set via 'model_id' parameter or 'OPENAI_EMBEDDING_MODEL_ID' environment variable." 
+ ) + + super().__init__( + model_id=openai_settings["embedding_model_id"], + api_key=self._get_api_key(openai_settings["api_key"]), + base_url=openai_settings["base_url"] if openai_settings["base_url"] else None, + org_id=openai_settings["org_id"], + default_headers=default_headers, + client=async_client, + ) diff --git a/python/packages/core/agent_framework/openai/_shared.py b/python/packages/core/agent_framework/openai/_shared.py index ed4de17378..67f0e91818 100644 --- a/python/packages/core/agent_framework/openai/_shared.py +++ b/python/packages/core/agent_framework/openai/_shared.py @@ -92,6 +92,8 @@ class OpenAISettings(TypedDict, total=False): Can be set via environment variable OPENAI_CHAT_MODEL_ID. responses_model_id: The OpenAI responses model ID to use, for example, gpt-4o or o1. Can be set via environment variable OPENAI_RESPONSES_MODEL_ID. + embedding_model_id: The OpenAI embedding model ID to use, for example, text-embedding-3-small. + Can be set via environment variable OPENAI_EMBEDDING_MODEL_ID. Examples: .. code-block:: python @@ -115,6 +117,7 @@ class OpenAISettings(TypedDict, total=False): org_id: str | None chat_model_id: str | None responses_model_id: str | None + embedding_model_id: str | None class OpenAIBase(SerializationMixin): diff --git a/python/packages/core/tests/core/test_embedding_client.py b/python/packages/core/tests/core/test_embedding_client.py new file mode 100644 index 0000000000..71d2bcfd70 --- /dev/null +++ b/python/packages/core/tests/core/test_embedding_client.py @@ -0,0 +1,97 @@ +# Copyright (c) Microsoft. All rights reserved. 
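`OpenAIEmbeddingClient` above layers telemetry over the raw client through cooperative multiple inheritance, with `EmbeddingTelemetryLayer.get_embeddings` delegating down the MRO via `super()`. A minimal stand-alone sketch of that pattern, with hypothetical class names and no OpenAI calls:

```python
import asyncio


class RawClient:
    """Stand-in for RawOpenAIEmbeddingClient: performs the actual work."""

    async def get_embeddings(self, values):
        return [[0.1, 0.2, 0.3] for _ in values]


class TelemetryLayer:
    """Stand-in for EmbeddingTelemetryLayer: wraps whatever comes next in the MRO."""

    async def get_embeddings(self, values):
        # Record something around the call, then delegate via super() --
        # the same cooperative pattern the real telemetry layer uses.
        result = await super().get_embeddings(values)
        self.last_call_count = len(values)
        return result


class Client(TelemetryLayer, RawClient):
    """Mirrors OpenAIEmbeddingClient's base order: telemetry before the raw client."""


vectors = asyncio.run(Client().get_embeddings(["hello", "world"]))
print(len(vectors))  # 2
```

Because `TelemetryLayer` precedes `RawClient` in the bases, `Client().get_embeddings` enters the telemetry wrapper first, and its `super()` call resolves to the raw implementation.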
+ +from __future__ import annotations + +from collections.abc import Sequence + +from agent_framework import ( + BaseEmbeddingClient, + Embedding, + EmbeddingGenerationOptions, + GeneratedEmbeddings, + SupportsGetEmbeddings, +) + + +class MockEmbeddingClient(BaseEmbeddingClient): + """A simple mock embedding client for testing.""" + + async def get_embeddings( + self, + values: Sequence[str], + *, + options: EmbeddingGenerationOptions | None = None, + ) -> GeneratedEmbeddings[list[float]]: + return GeneratedEmbeddings( + [Embedding(vector=[0.1, 0.2, 0.3], model_id="mock-model") for _ in values], + usage={"prompt_tokens": len(values), "total_tokens": len(values)}, + ) + + +# --- BaseEmbeddingClient tests --- + + +async def test_base_get_embeddings() -> None: + client = MockEmbeddingClient() + result = await client.get_embeddings(["hello", "world"]) + assert len(result) == 2 + assert result[0].vector == [0.1, 0.2, 0.3] + assert result[0].model_id == "mock-model" + + +async def test_base_get_embeddings_with_options() -> None: + client = MockEmbeddingClient() + options: EmbeddingGenerationOptions = {"model_id": "test", "dimensions": 3} + result = await client.get_embeddings(["hello"], options=options) + assert len(result) == 1 + + +async def test_base_get_embeddings_usage() -> None: + client = MockEmbeddingClient() + result = await client.get_embeddings(["a", "b", "c"]) + assert result.usage is not None + assert result.usage["prompt_tokens"] == 3 + + +def test_base_additional_properties_default() -> None: + client = MockEmbeddingClient() + assert client.additional_properties == {} + + +def test_base_additional_properties_custom() -> None: + client = MockEmbeddingClient(additional_properties={"key": "value"}) + assert client.additional_properties == {"key": "value"} + + +# --- SupportsGetEmbeddings protocol tests --- + + +def test_mock_client_satisfies_protocol() -> None: + client = MockEmbeddingClient() + assert isinstance(client, SupportsGetEmbeddings) + + +def 
test_plain_class_satisfies_protocol() -> None: + """A plain class with the right signature should satisfy the protocol.""" + + class PlainEmbeddingClient: + additional_properties: dict = {} + + async def get_embeddings(self, values, *, options=None): + return GeneratedEmbeddings() + + client = PlainEmbeddingClient() + assert isinstance(client, SupportsGetEmbeddings) + + +def test_wrong_class_does_not_satisfy_protocol() -> None: + """A class without get_embeddings should not satisfy the protocol.""" + + class NotAnEmbeddingClient: + additional_properties: dict = {} + + async def generate(self, values): + pass + + client = NotAnEmbeddingClient() + assert not isinstance(client, SupportsGetEmbeddings) diff --git a/python/packages/core/tests/core/test_embedding_types.py b/python/packages/core/tests/core/test_embedding_types.py new file mode 100644 index 0000000000..0d6db6b27e --- /dev/null +++ b/python/packages/core/tests/core/test_embedding_types.py @@ -0,0 +1,182 @@ +# Copyright (c) Microsoft. All rights reserved. 
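The protocol tests above depend on `@runtime_checkable`, whose `isinstance` check is purely structural: it verifies that the attribute and method names exist, never their signatures. A self-contained sketch using a simplified, non-generic protocol (not the framework's):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsGetEmbeddings(Protocol):
    """Simplified, non-generic version of the framework protocol."""

    additional_properties: dict

    def get_embeddings(self, values, *, options=None): ...


class PlainClient:
    additional_properties: dict = {}

    async def get_embeddings(self, values, *, options=None):
        return []


class NotAClient:
    additional_properties: dict = {}


# isinstance only checks that the names exist -- it never validates signatures.
print(isinstance(PlainClient(), SupportsGetEmbeddings))  # True
print(isinstance(NotAClient(), SupportsGetEmbeddings))   # False
```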
+ +from __future__ import annotations + +from datetime import datetime + +from agent_framework import Embedding, EmbeddingGenerationOptions, GeneratedEmbeddings + +# --- Embedding tests --- + + +def test_embedding_basic_construction() -> None: + embedding = Embedding(vector=[0.1, 0.2, 0.3]) + assert embedding.vector == [0.1, 0.2, 0.3] + assert embedding.model_id is None + assert embedding.created_at is None + assert embedding.additional_properties == {} + + +def test_embedding_construction_with_metadata() -> None: + now = datetime.now() + embedding = Embedding( + vector=[0.1, 0.2], + model_id="text-embedding-3-small", + created_at=now, + additional_properties={"key": "value"}, + ) + assert embedding.model_id == "text-embedding-3-small" + assert embedding.created_at == now + assert embedding.additional_properties == {"key": "value"} + + +def test_embedding_dimensions_computed_from_list() -> None: + embedding = Embedding(vector=[0.1, 0.2, 0.3]) + assert embedding.dimensions == 3 + + +def test_embedding_dimensions_computed_from_tuple() -> None: + embedding = Embedding(vector=(0.1, 0.2, 0.3, 0.4)) + assert embedding.dimensions == 4 + + +def test_embedding_dimensions_computed_from_bytes() -> None: + embedding = Embedding(vector=b"\x00\x01\x02") + assert embedding.dimensions == 3 + + +def test_embedding_dimensions_explicit_overrides_computed() -> None: + embedding = Embedding(vector=[0.1, 0.2, 0.3], dimensions=1536) + assert embedding.dimensions == 1536 + + +def test_embedding_dimensions_none_for_unknown_type() -> None: + embedding = Embedding(vector="not a list") # type: ignore[arg-type] + assert embedding.dimensions is None + + +def test_embedding_dimensions_explicit_with_unknown_type() -> None: + embedding = Embedding(vector="not a list", dimensions=100) # type: ignore[arg-type] + assert embedding.dimensions == 100 + + +def test_embedding_empty_vector() -> None: + embedding = Embedding(vector=[]) + assert embedding.dimensions == 0 + + +def test_embedding_int_vector() 
-> None: + embedding = Embedding(vector=[1, 2, 3]) + assert embedding.vector == [1, 2, 3] + assert embedding.dimensions == 3 + + +# --- GeneratedEmbeddings tests --- + + +def test_generated_basic_construction() -> None: + embeddings = GeneratedEmbeddings() + assert len(embeddings) == 0 + assert embeddings.options is None + assert embeddings.usage is None + assert embeddings.additional_properties == {} + + +def test_generated_construction_with_embeddings() -> None: + items = [Embedding(vector=[0.1, 0.2]), Embedding(vector=[0.3, 0.4])] + embeddings = GeneratedEmbeddings(items) + assert len(embeddings) == 2 + assert embeddings[0].vector == [0.1, 0.2] + assert embeddings[1].vector == [0.3, 0.4] + + +def test_generated_construction_with_usage() -> None: + usage = {"prompt_tokens": 10, "total_tokens": 10} + embeddings = GeneratedEmbeddings( + [ + Embedding( + vector=[0.1], + model_id="test-model", + ) + ], + usage=usage, + ) + assert embeddings.usage == usage + assert embeddings.usage["prompt_tokens"] == 10 + + +def test_generated_construction_with_additional_properties() -> None: + embeddings = GeneratedEmbeddings( + additional_properties={"model": "test"}, + ) + assert embeddings.additional_properties == {"model": "test"} + + +def test_generated_construction_with_options() -> None: + opts: EmbeddingGenerationOptions = {"model_id": "text-embedding-3-small", "dimensions": 256} + embeddings = GeneratedEmbeddings( + [Embedding(vector=[0.1])], + options=opts, + ) + assert embeddings.options is not None + assert embeddings.options["model_id"] == "text-embedding-3-small" + assert embeddings.options["dimensions"] == 256 + + +def test_generated_list_behavior_iteration() -> None: + items = [Embedding(vector=[float(i)]) for i in range(5)] + embeddings = GeneratedEmbeddings(items) + vectors = [e.vector for e in embeddings] + assert vectors == [[0.0], [1.0], [2.0], [3.0], [4.0]] + + +def test_generated_list_behavior_indexing() -> None: + items = [Embedding(vector=[0.1]), 
Embedding(vector=[0.2])] + embeddings = GeneratedEmbeddings(items) + assert embeddings[0].vector == [0.1] + assert embeddings[-1].vector == [0.2] + + +def test_generated_list_behavior_slicing() -> None: + items = [Embedding(vector=[float(i)]) for i in range(5)] + embeddings = GeneratedEmbeddings(items) + sliced = embeddings[1:3] + assert len(sliced) == 2 + + +def test_generated_list_behavior_append() -> None: + embeddings = GeneratedEmbeddings() + embeddings.append(Embedding(vector=[0.1])) + assert len(embeddings) == 1 + + +def test_generated_none_embeddings_creates_empty_list() -> None: + embeddings = GeneratedEmbeddings(None) + assert len(embeddings) == 0 + + +# --- EmbeddingGenerationOptions tests --- + + +def test_options_empty() -> None: + options: EmbeddingGenerationOptions = {} + assert "model_id" not in options + + +def test_options_with_model_id() -> None: + options: EmbeddingGenerationOptions = {"model_id": "text-embedding-3-small"} + assert options["model_id"] == "text-embedding-3-small" + + +def test_options_with_dimensions() -> None: + options: EmbeddingGenerationOptions = {"dimensions": 1536} + assert options["dimensions"] == 1536 + + +def test_options_with_all_fields() -> None: + options: EmbeddingGenerationOptions = { + "model_id": "text-embedding-3-small", + "dimensions": 1536, + } + assert options["model_id"] == "text-embedding-3-small" + assert options["dimensions"] == 1536 diff --git a/python/packages/core/tests/openai/test_openai_embedding_client.py b/python/packages/core/tests/openai/test_openai_embedding_client.py new file mode 100644 index 0000000000..79bd94199f --- /dev/null +++ b/python/packages/core/tests/openai/test_openai_embedding_client.py @@ -0,0 +1,314 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +from __future__ import annotations + +import os +from unittest.mock import AsyncMock, MagicMock + +import pytest +from openai.types import CreateEmbeddingResponse +from openai.types import Embedding as OpenAIEmbedding +from openai.types.create_embedding_response import Usage + +from agent_framework.azure import AzureOpenAIEmbeddingClient +from agent_framework.openai import ( + OpenAIEmbeddingClient, + OpenAIEmbeddingOptions, +) + + +def _make_openai_response( + embeddings: list[list[float]], + model: str = "text-embedding-3-small", + prompt_tokens: int = 5, + total_tokens: int = 5, +) -> CreateEmbeddingResponse: + """Helper to create a mock OpenAI embeddings response.""" + data = [OpenAIEmbedding(embedding=emb, index=i, object="embedding") for i, emb in enumerate(embeddings)] + return CreateEmbeddingResponse( + data=data, + model=model, + object="list", + usage=Usage(prompt_tokens=prompt_tokens, total_tokens=total_tokens), + ) + + +@pytest.fixture +def openai_unit_test_env(monkeypatch: pytest.MonkeyPatch) -> None: + """Set up environment variables for OpenAI embedding client.""" + monkeypatch.setenv("OPENAI_API_KEY", "test-api-key") + monkeypatch.setenv("OPENAI_EMBEDDING_MODEL_ID", "text-embedding-3-small") + + +# --- OpenAI unit tests --- + + +def test_openai_construction_with_explicit_params() -> None: + client = OpenAIEmbeddingClient( + model_id="text-embedding-3-small", + api_key="test-key", + ) + assert client.model_id == "text-embedding-3-small" + + +def test_openai_construction_from_env(openai_unit_test_env: None) -> None: + client = OpenAIEmbeddingClient() + assert client.model_id == "text-embedding-3-small" + + +def test_openai_construction_missing_api_key_raises() -> None: + with pytest.raises(ValueError, match="API key is required"): + OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + +def test_openai_construction_missing_model_raises() -> None: + with pytest.raises(ValueError, match="model ID is required"): + 
OpenAIEmbeddingClient(api_key="test-key") + + +async def test_openai_get_embeddings(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response( + embeddings=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], + ) + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + result = await client.get_embeddings(["hello", "world"]) + + assert len(result) == 2 + assert result[0].vector == [0.1, 0.2, 0.3] + assert result[1].vector == [0.4, 0.5, 0.6] + assert result[0].model_id == "text-embedding-3-small" + assert result[0].dimensions == 3 + + +async def test_openai_get_embeddings_usage(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response( + embeddings=[[0.1]], + prompt_tokens=10, + total_tokens=10, + ) + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + result = await client.get_embeddings(["test"]) + + assert result.usage is not None + assert result.usage["prompt_tokens"] == 10 + assert result.usage["total_tokens"] == 10 + + +async def test_openai_options_passthrough_dimensions(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response(embeddings=[[0.1]]) + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + options: OpenAIEmbeddingOptions = {"dimensions": 256} + result = await client.get_embeddings(["test"], options=options) + + call_kwargs = client.client.embeddings.create.call_args[1] + assert call_kwargs["dimensions"] == 256 + assert result.options is options + + +async def test_openai_options_passthrough_encoding_format(openai_unit_test_env: None) -> None: + mock_response = _make_openai_response(embeddings=[[0.1]]) + client = 
OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + options: OpenAIEmbeddingOptions = {"encoding_format": "base64"} + await client.get_embeddings(["test"], options=options) + + call_kwargs = client.client.embeddings.create.call_args[1] + assert call_kwargs["encoding_format"] == "base64" + + +async def test_openai_error_when_no_model_id() -> None: + client = OpenAIEmbeddingClient.__new__(OpenAIEmbeddingClient) + client.model_id = None + client.client = MagicMock() + client.additional_properties = {} + client.otel_provider_name = "openai" + + with pytest.raises(ValueError, match="model_id is required"): + await client.get_embeddings(["test"]) + + +# --- Azure OpenAI unit tests --- + + +def test_azure_construction_with_deployment_name() -> None: + client = AzureOpenAIEmbeddingClient( + deployment_name="text-embedding-3-small", + api_key="test-key", + endpoint="https://test.openai.azure.com/", + ) + assert client.model_id == "text-embedding-3-small" + + +def test_azure_construction_with_existing_client() -> None: + mock_client = MagicMock() + client = AzureOpenAIEmbeddingClient( + deployment_name="my-deployment", + async_client=mock_client, + ) + assert client.model_id == "my-deployment" + assert client.client is mock_client + + +def test_azure_construction_missing_deployment_name_raises() -> None: + with pytest.raises(ValueError, match="deployment name is required"): + AzureOpenAIEmbeddingClient( + api_key="test-key", + endpoint="https://test.openai.azure.com/", + ) + + +def test_azure_construction_missing_credentials_raises() -> None: + with pytest.raises(ValueError, match="api_key, credential, or a client"): + AzureOpenAIEmbeddingClient( + deployment_name="test", + endpoint="https://test.openai.azure.com/", + ) + + +async def test_azure_get_embeddings() -> None: + mock_response = _make_openai_response( + embeddings=[[0.1, 0.2]], + ) + 
mock_async_client = MagicMock() + mock_async_client.embeddings = MagicMock() + mock_async_client.embeddings.create = AsyncMock(return_value=mock_response) + + client = AzureOpenAIEmbeddingClient( + deployment_name="text-embedding-3-small", + async_client=mock_async_client, + ) + + result = await client.get_embeddings(["hello"]) + + assert len(result) == 1 + assert result[0].vector == [0.1, 0.2] + + +def test_azure_otel_provider_name() -> None: + mock_client = MagicMock() + client = AzureOpenAIEmbeddingClient( + deployment_name="test", + async_client=mock_client, + ) + assert client.OTEL_PROVIDER_NAME == "azure.ai.openai" + + +# --- Integration tests --- + +skip_if_openai_integration_tests_disabled = pytest.mark.skipif( + os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true" + or os.getenv("OPENAI_API_KEY", "") in ("", "test-dummy-key"), + reason="No real OPENAI_API_KEY provided; skipping integration tests." + if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true" + else "Integration tests are disabled.", +) + +skip_if_azure_openai_integration_tests_disabled = pytest.mark.skipif( + os.getenv("RUN_INTEGRATION_TESTS", "false").lower() != "true" + or not os.getenv("AZURE_OPENAI_ENDPOINT") + or (not os.getenv("AZURE_OPENAI_API_KEY") and not os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")), + reason="No Azure OpenAI credentials provided; skipping integration tests." 
+ if os.getenv("RUN_INTEGRATION_TESTS", "false").lower() == "true" + else "Integration tests are disabled.", +) + + +@skip_if_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_openai_get_embeddings() -> None: + """End-to-end test of OpenAI embedding generation.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + result = await client.get_embeddings(["hello world"]) + + assert len(result) == 1 + assert isinstance(result[0].vector, list) + assert len(result[0].vector) > 0 + assert all(isinstance(v, float) for v in result[0].vector) + assert result[0].model_id is not None + assert result.usage is not None + assert result.usage["prompt_tokens"] > 0 + + +@skip_if_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_openai_get_embeddings_multiple() -> None: + """Test embedding generation for multiple inputs.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + result = await client.get_embeddings(["hello", "world", "test"]) + + assert len(result) == 3 + dims = [len(e.vector) for e in result] + assert all(d == dims[0] for d in dims) + + +@skip_if_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_openai_get_embeddings_with_dimensions() -> None: + """Test embedding generation with custom dimensions.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + options: OpenAIEmbeddingOptions = {"dimensions": 256} + result = await client.get_embeddings(["hello world"], options=options) + + assert len(result) == 1 + assert len(result[0].vector) == 256 + + +@skip_if_azure_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_azure_openai_get_embeddings() -> None: + """End-to-end test of Azure OpenAI embedding generation.""" + client = AzureOpenAIEmbeddingClient() + + result = await client.get_embeddings(["hello world"]) + + assert len(result) == 1 + assert isinstance(result[0].vector, list) + assert 
len(result[0].vector) > 0 + assert all(isinstance(v, float) for v in result[0].vector) + assert result[0].model_id is not None + assert result.usage is not None + assert result.usage["prompt_tokens"] > 0 + + +@skip_if_azure_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_azure_openai_get_embeddings_multiple() -> None: + """Test Azure OpenAI embedding generation for multiple inputs.""" + client = AzureOpenAIEmbeddingClient() + + result = await client.get_embeddings(["hello", "world", "test"]) + + assert len(result) == 3 + dims = [len(e.vector) for e in result] + assert all(d == dims[0] for d in dims) + + +@skip_if_azure_openai_integration_tests_disabled +@pytest.mark.flaky +async def test_integration_azure_openai_get_embeddings_with_dimensions() -> None: + """Test Azure OpenAI embedding generation with custom dimensions.""" + client = AzureOpenAIEmbeddingClient() + + options: OpenAIEmbeddingOptions = {"dimensions": 256} + result = await client.get_embeddings(["hello world"], options=options) + + assert len(result) == 1 + assert len(result[0].vector) == 256 diff --git a/python/packages/core/tests/workflow/test_full_conversation.py b/python/packages/core/tests/workflow/test_full_conversation.py index 23861ecc69..20d9abd8c0 100644 --- a/python/packages/core/tests/workflow/test_full_conversation.py +++ b/python/packages/core/tests/workflow/test_full_conversation.py @@ -362,9 +362,7 @@ async def test_run_request_with_full_history_clears_service_session_id() -> None """Replaying a full conversation (including function calls) via AgentExecutorRequest must clear service_session_id so the API does not receive both previous_response_id and the same function-call items in input — which would cause a 'Duplicate item' API error.""" - tool_agent = _ToolHistoryAgent( - id="tool_agent", name="ToolAgent", summary_text="Done." 
- ) + tool_agent = _ToolHistoryAgent(id="tool_agent", name="ToolAgent", summary_text="Done.") tool_exec = AgentExecutor(tool_agent, id="tool_agent") spy_agent = _SessionIdCapturingAgent(id="spy_agent", name="SpyAgent") @@ -393,9 +391,7 @@ async def test_from_response_preserves_service_session_id() -> None: """from_response hands off a prior agent's full conversation to the next executor. The receiving executor's service_session_id is preserved so the API can continue the conversation using previous_response_id.""" - tool_agent = _ToolHistoryAgent( - id="tool_agent2", name="ToolAgent", summary_text="Done." - ) + tool_agent = _ToolHistoryAgent(id="tool_agent2", name="ToolAgent", summary_text="Done.") tool_exec = AgentExecutor(tool_agent, id="tool_agent2") spy_agent = _SessionIdCapturingAgent(id="spy_agent2", name="SpyAgent") @@ -403,11 +399,7 @@ async def test_from_response_preserves_service_session_id() -> None: # Simulate a prior run on the spy executor. spy_exec._session.service_session_id = "resp_PREVIOUS_RUN" # pyright: ignore[reportPrivateUsage] - wf = ( - WorkflowBuilder(start_executor=tool_exec, output_executors=[spy_exec]) - .add_edge(tool_exec, spy_exec) - .build() - ) + wf = WorkflowBuilder(start_executor=tool_exec, output_executors=[spy_exec]).add_edge(tool_exec, spy_exec).build() result = await wf.run("start") assert result.get_outputs() is not None diff --git a/python/samples/02-agents/embeddings/azure_openai_embeddings.py b/python/samples/02-agents/embeddings/azure_openai_embeddings.py new file mode 100644 index 0000000000..16669eb51f --- /dev/null +++ b/python/samples/02-agents/embeddings/azure_openai_embeddings.py @@ -0,0 +1,70 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +# Run with: uv run samples/02-agents/embeddings/azure_openai_embeddings.py + + +import asyncio + +from agent_framework.azure import AzureOpenAIEmbeddingClient +from dotenv import load_dotenv + +load_dotenv() + +"""Azure OpenAI Embedding Client Example + +This sample demonstrates how to generate embeddings using the Azure OpenAI embedding client. +It supports both API key and Azure credential authentication. + +Prerequisites: + Set the following environment variables or add them to a .env file: + - AZURE_OPENAI_ENDPOINT: Your Azure OpenAI endpoint URL + - AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME: The embedding model deployment name + - AZURE_OPENAI_API_KEY: Your API key (or use Azure credential instead) +""" + + +async def main() -> None: + """Generate embeddings with Azure OpenAI.""" + # 1. Create a client using environment variables. + # Reads AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME, + # and AZURE_OPENAI_API_KEY from environment. + client = AzureOpenAIEmbeddingClient() + + # 2. Generate a single embedding. + result = await client.get_embeddings(["Hello, world!"]) + print(f"Single embedding dimensions: {result[0].dimensions}") + print(f"First 5 values: {result[0].vector[:5]}") + print(f"Model: {result[0].model_id}") + print(f"Usage: {result.usage}") + print() + + # 3. Generate embeddings for multiple inputs. + texts = [ + "The weather is sunny today.", + "It is raining outside.", + "Machine learning is fascinating.", + ] + result = await client.get_embeddings(texts) + print(f"Batch of {len(result)} embeddings, each with {result[0].dimensions} dimensions") + print() + + # 4. Generate embeddings with custom dimensions. 
+ result = await client.get_embeddings(["Custom dimensions example"], options={"dimensions": 256}) + print(f"Custom dimensions: {result[0].dimensions}") + + +if __name__ == "__main__": + asyncio.run(main()) + + +""" +Sample output: +Single embedding dimensions: 1536 +First 5 values: [0.012, -0.034, 0.056, -0.078, 0.090] +Model: text-embedding-3-small +Usage: {'prompt_tokens': 4, 'total_tokens': 4} + +Batch of 3 embeddings, each with 1536 dimensions + +Custom dimensions: 256 +""" diff --git a/python/samples/02-agents/embeddings/openai_embeddings.py b/python/samples/02-agents/embeddings/openai_embeddings.py new file mode 100644 index 0000000000..56fac52814 --- /dev/null +++ b/python/samples/02-agents/embeddings/openai_embeddings.py @@ -0,0 +1,65 @@ +# Copyright (c) Microsoft. All rights reserved. + +# Run with: uv run samples/02-agents/embeddings/openai_embeddings.py + +import asyncio + +from agent_framework.openai import OpenAIEmbeddingClient, OpenAIEmbeddingOptions +from dotenv import load_dotenv + +load_dotenv() + +"""OpenAI Embedding Client Example + +This sample demonstrates how to generate embeddings using the OpenAI embedding client. +It shows single and batch embedding generation, as well as custom dimensions. + +Prerequisites: + Set the OPENAI_API_KEY environment variable or add it to a .env file. +""" + + +async def main() -> None: + """Generate embeddings with OpenAI.""" + client = OpenAIEmbeddingClient(model_id="text-embedding-3-small") + + # 1. Generate a single embedding. + result = await client.get_embeddings(["Hello, world!"]) + print(f"Single embedding dimensions: {result[0].dimensions}") + print(f"First 5 values: {result[0].vector[:5]}") + print(f"Model: {result[0].model_id}") + print(f"Usage: {result.usage}") + print() + + # 2. Generate embeddings for multiple inputs. 
+ texts = [ + "The weather is sunny today.", + "It is raining outside.", + "Machine learning is fascinating.", + ] + result = await client.get_embeddings(texts) + print(f"Batch of {len(result)} embeddings, each with {result[0].dimensions} dimensions") + print(f"First embedding vector: {result[0].vector[:5]}") # Print first 5 values of the first embedding + print() + + # 3. Generate embeddings with custom dimensions. + result = await client.get_embeddings(["Custom dimensions example"], options={"dimensions": 256}) + print(f"Custom dimensions: {result[0].dimensions}") + print(f"First 5 values: {result[0].vector[:5]}") + + +if __name__ == "__main__": + asyncio.run(main()) + + +""" +Sample output: +Single embedding dimensions: 1536 +First 5 values: [0.012, -0.034, 0.056, -0.078, 0.090] +Model: text-embedding-3-small +Usage: {'prompt_tokens': 4, 'total_tokens': 4} + +Batch of 3 embeddings, each with 1536 dimensions + +Custom dimensions: 256 +""" From 000362a228265145a97a190d506a461a709f6498 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Sun, 22 Feb 2026 15:04:36 +0100 Subject: [PATCH 02/14] fix: Add AzureOpenAIEmbeddingClient to azure __init__.pyi stub Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/core/agent_framework/azure/__init__.pyi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/python/packages/core/agent_framework/azure/__init__.pyi b/python/packages/core/agent_framework/azure/__init__.pyi index 4d6e3b914c..238411f8d7 100644 --- a/python/packages/core/agent_framework/azure/__init__.pyi +++ b/python/packages/core/agent_framework/azure/__init__.pyi @@ -21,6 +21,7 @@ from agent_framework_durabletask import ( from agent_framework.azure._assistants_client import AzureOpenAIAssistantsClient from agent_framework.azure._chat_client import AzureOpenAIChatClient +from agent_framework.azure._embedding_client import AzureOpenAIEmbeddingClient from agent_framework.azure._entra_id_authentication import AzureCredentialTypes, 
AzureTokenProvider from agent_framework.azure._responses_client import AzureOpenAIResponsesClient from agent_framework.azure._shared import AzureOpenAISettings @@ -40,6 +41,7 @@ __all__ = [ "AzureCredentialTypes", "AzureOpenAIAssistantsClient", "AzureOpenAIChatClient", + "AzureOpenAIEmbeddingClient", "AzureOpenAIResponsesClient", "AzureOpenAISettings", "AzureTokenProvider", From 303f187d32bba417184ac12bbc55fc0e3c5d1c58 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Sun, 22 Feb 2026 15:07:10 +0100 Subject: [PATCH 03/14] ci: Add embedding env vars to Python integration tests Map OPENAI_EMBEDDING_MODEL_ID and AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME from GitHub vars to the integration test environment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/python-merge-tests.yml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.github/workflows/python-merge-tests.yml b/.github/workflows/python-merge-tests.yml index 8704ec56c1..966b5ad361 100644 --- a/.github/workflows/python-merge-tests.yml +++ b/.github/workflows/python-merge-tests.yml @@ -71,8 +71,10 @@ jobs: OPENAI_API_KEY: ${{ secrets.OPENAI__APIKEY }} ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} ANTHROPIC_CHAT_MODEL_ID: ${{ vars.ANTHROPIC_CHAT_MODEL_ID }} + OPENAI_EMBEDDING_MODEL_ID: ${{ vars.OPENAI__EMBEDDINGMODELID }} AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ vars.AZUREOPENAI__CHATDEPLOYMENTNAME }} AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME: ${{ vars.AZUREOPENAI__RESPONSESDEPLOYMENTNAME }} + AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME: ${{ vars.AZUREOPENAI__EMBEDDINGDEPLOYMENTNAME }} AZURE_OPENAI_ENDPOINT: ${{ vars.AZUREOPENAI__ENDPOINT }} LOCAL_MCP_URL: ${{ vars.LOCAL_MCP__URL }} # For Azure Functions integration tests From 82c67ac742f61660025fddee63106b1a94259713 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Sun, 22 Feb 2026 15:13:38 +0100 Subject: [PATCH 04/14] fix: Handle base64 encoding_format in OpenAI embedding client When encoding_format='base64' is used, the 
OpenAI API returns base64-encoded floats instead of a JSON array. Decode these automatically to list[float] so the return type stays consistent regardless of encoding format. Also adds a unit test for base64 decoding and fixes minor docstring/import issues. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../packages/core/agent_framework/_clients.py | 2 +- .../openai/_embedding_client.py | 25 ++++++++++---- .../openai/test_openai_embedding_client.py | 33 +++++++++++++++++++ .../02-agents/embeddings/openai_embeddings.py | 2 +- 4 files changed, 53 insertions(+), 9 deletions(-) diff --git a/python/packages/core/agent_framework/_clients.py b/python/packages/core/agent_framework/_clients.py index 33e049be54..d3cf818408 100644 --- a/python/packages/core/agent_framework/_clients.py +++ b/python/packages/core/agent_framework/_clients.py @@ -692,7 +692,7 @@ class SupportsGetEmbeddings(Protocol[EmbeddingInputT, EmbeddingT, EmbeddingOptio from agent_framework import SupportsGetEmbeddings - def use_embeddings(client: SupportsGetEmbeddings) -> None: + async def use_embeddings(client: SupportsGetEmbeddings) -> None: result = await client.get_embeddings(["Hello, world!"]) for embedding in result: print(embedding.vector) diff --git a/python/packages/core/agent_framework/openai/_embedding_client.py b/python/packages/core/agent_framework/openai/_embedding_client.py index fb98774c40..a11557774e 100644 --- a/python/packages/core/agent_framework/openai/_embedding_client.py +++ b/python/packages/core/agent_framework/openai/_embedding_client.py @@ -2,6 +2,8 @@ from __future__ import annotations +import base64 +import struct import sys from collections.abc import Awaitable, Callable, Mapping, Sequence from typing import Any, Generic, Literal, TypedDict @@ -89,14 +91,23 @@ async def get_embeddings( response = await (await self._ensure_client()).embeddings.create(**kwargs) - embeddings = [ - Embedding( - vector=item.embedding, - dimensions=len(item.embedding), - 
model_id=response.model, + encoding = kwargs.get("encoding_format", "float") + embeddings: list[Embedding[list[float]]] = [] + for item in response.data: + vector: list[float] + if encoding == "base64" and isinstance(item.embedding, str): + # Decode base64-encoded floats (little-endian IEEE 754) + raw = base64.b64decode(item.embedding) + vector = list(struct.unpack(f"<{len(raw) // 4}f", raw)) + else: + vector = item.embedding # type: ignore[assignment] + embeddings.append( + Embedding( + vector=vector, + dimensions=len(vector), + model_id=response.model, + ) ) - for item in response.data - ] usage_dict: dict[str, Any] | None = None if response.usage: diff --git a/python/packages/core/tests/openai/test_openai_embedding_client.py b/python/packages/core/tests/openai/test_openai_embedding_client.py index 79bd94199f..9cb3de20f0 100644 --- a/python/packages/core/tests/openai/test_openai_embedding_client.py +++ b/python/packages/core/tests/openai/test_openai_embedding_client.py @@ -131,6 +131,39 @@ async def test_openai_options_passthrough_encoding_format(openai_unit_test_env: assert call_kwargs["encoding_format"] == "base64" +async def test_openai_base64_decoding(openai_unit_test_env: None) -> None: + import base64 + import struct + + # Encode [0.1, 0.2, 0.3] as base64 little-endian floats + raw_floats = [0.1, 0.2, 0.3] + b64_str = base64.b64encode(struct.pack(f"<{len(raw_floats)}f", *raw_floats)).decode() + + # Mock the embedding item to return a base64 string (as the API does with encoding_format=base64) + mock_item = MagicMock() + mock_item.embedding = b64_str + mock_item.index = 0 + + mock_response = MagicMock() + mock_response.data = [mock_item] + mock_response.model = "text-embedding-3-small" + mock_response.usage = MagicMock(prompt_tokens=3, total_tokens=3) + + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock(return_value=mock_response) + + options: 
OpenAIEmbeddingOptions = {"encoding_format": "base64"} + result = await client.get_embeddings(["test"], options=options) + + assert len(result) == 1 + assert len(result[0].vector) == 3 + assert result[0].dimensions == 3 + for expected, actual in zip(raw_floats, result[0].vector): + assert abs(expected - actual) < 1e-6 + + async def test_openai_error_when_no_model_id() -> None: client = OpenAIEmbeddingClient.__new__(OpenAIEmbeddingClient) client.model_id = None diff --git a/python/samples/02-agents/embeddings/openai_embeddings.py b/python/samples/02-agents/embeddings/openai_embeddings.py index 56fac52814..62d044fd72 100644 --- a/python/samples/02-agents/embeddings/openai_embeddings.py +++ b/python/samples/02-agents/embeddings/openai_embeddings.py @@ -4,7 +4,7 @@ import asyncio -from agent_framework.openai import OpenAIEmbeddingClient, OpenAIEmbeddingOptions +from agent_framework.openai import OpenAIEmbeddingClient from dotenv import load_dotenv load_dotenv() From 1288080cdb3eeb4c960c627b53da574a2dc622d8 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Sun, 22 Feb 2026 15:14:13 +0100 Subject: [PATCH 05/14] fix: Only record INPUT_TOKENS for embedding telemetry Embeddings have no output/completion tokens. Remove OUTPUT_TOKENS recording which was double-counting prompt_tokens via the total_tokens fallback. 
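As context for the base64 handling added in patch 04 above (and exercised by `test_openai_base64_decoding`), the float32 round-trip can be sketched standalone. This is an illustrative helper pair, not the framework's API; it mirrors the `struct.pack`/`struct.unpack` little-endian format the patch uses:

```python
import base64
import struct


def encode_floats_b64(values: list[float]) -> str:
    # Pack as little-endian IEEE 754 float32 (4 bytes each), then base64-encode,
    # matching what the OpenAI API returns for encoding_format="base64".
    return base64.b64encode(struct.pack(f"<{len(values)}f", *values)).decode("ascii")


def decode_floats_b64(payload: str) -> list[float]:
    # Inverse: base64-decode, then unpack the raw bytes 4 at a time.
    raw = base64.b64decode(payload)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


decoded = decode_floats_b64(encode_floats_b64([0.1, 0.2, 0.3]))
# Round-tripping through float32 loses a little precision relative to
# Python's float64, hence the tolerance (the patch's test uses 1e-6 too).
assert all(abs(a - b) < 1e-6 for a, b in zip([0.1, 0.2, 0.3], decoded))
```

The `< 1e-6` tolerance is why the unit test compares element-wise with `abs(expected - actual)` rather than asserting exact equality.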
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/core/agent_framework/observability.py | 4 ---- 1 file changed, 4 deletions(-) diff --git a/python/packages/core/agent_framework/observability.py b/python/packages/core/agent_framework/observability.py index 160a16af4d..81de7c4686 100644 --- a/python/packages/core/agent_framework/observability.py +++ b/python/packages/core/agent_framework/observability.py @@ -1328,10 +1328,6 @@ async def get_embeddings( if result.usage: if "prompt_tokens" in result.usage: response_attributes[OtelAttr.INPUT_TOKENS] = result.usage["prompt_tokens"] - if "total_tokens" in result.usage: - response_attributes[OtelAttr.OUTPUT_TOKENS] = result.usage.get( - "completion_tokens", result.usage["total_tokens"] - ) _capture_response( span=span, attributes=response_attributes, From 231847e85110741525defaf58e8100c5f480210f Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Sun, 22 Feb 2026 15:15:29 +0100 Subject: [PATCH 06/14] fix: Resolve mypy variance error and lint warning Use contravariant/covariant TypeVars for SupportsGetEmbeddings Protocol. Combine nested if into single statement in telemetry layer. 
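The variance fix in this commit follows the standard rule for generic protocols: type variables used only in parameter position can be contravariant, and those used only in return position can be covariant. A minimal sketch with illustrative names (not the framework's `SupportsGetEmbeddings` itself):

```python
from typing import Protocol, TypeVar, runtime_checkable

InT = TypeVar("InT", contravariant=True)  # appears only as a parameter type
OutT = TypeVar("OutT", covariant=True)    # appears only as a return type


@runtime_checkable
class Transformer(Protocol[InT, OutT]):
    """Any object with a transform(InT) -> OutT method satisfies this protocol."""

    def transform(self, value: InT) -> OutT: ...


class StrLen:
    # Structurally a Transformer[str, int]; no inheritance needed.
    def transform(self, value: str) -> int:
        return len(value)


t: Transformer[str, int] = StrLen()
assert t.transform("hello") == 5
# runtime_checkable only verifies the method exists, not its signature.
assert isinstance(StrLen(), Transformer)
```

Note that covariance on the output only works if the return type itself is covariant in that variable; a mutable container like a `list` subclass is invariant, which is what forces the later adjustment in patch 07.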
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../packages/core/agent_framework/_clients.py | 18 ++++++++++++++---- .../core/agent_framework/observability.py | 5 ++--- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/python/packages/core/agent_framework/_clients.py b/python/packages/core/agent_framework/_clients.py index d3cf818408..c9a7af6f34 100644 --- a/python/packages/core/agent_framework/_clients.py +++ b/python/packages/core/agent_framework/_clients.py @@ -667,7 +667,17 @@ def get_file_search_tool(**kwargs: Any) -> Any: # region SupportsGetEmbeddings Protocol -# Contravariant for the Protocol +# Contravariant/covariant TypeVars for the Protocol +EmbeddingInputContraT = TypeVar( + "EmbeddingInputContraT", + default="str", + contravariant=True, +) +EmbeddingCoT = TypeVar( + "EmbeddingCoT", + default="list[float]", + covariant=True, +) EmbeddingOptionsContraT = TypeVar( "EmbeddingOptionsContraT", bound=TypedDict, # type: ignore[valid-type] @@ -677,7 +687,7 @@ def get_file_search_tool(**kwargs: Any) -> Any: @runtime_checkable -class SupportsGetEmbeddings(Protocol[EmbeddingInputT, EmbeddingT, EmbeddingOptionsContraT]): +class SupportsGetEmbeddings(Protocol[EmbeddingInputContraT, EmbeddingCoT, EmbeddingOptionsContraT]): """Protocol for an embedding client that can generate embeddings. This protocol enables duck-typing for embedding generation. Any class that @@ -702,10 +712,10 @@ async def use_embeddings(client: SupportsGetEmbeddings) -> None: def get_embeddings( self, - values: Sequence[EmbeddingInputT], + values: Sequence[EmbeddingInputContraT], *, options: EmbeddingOptionsContraT | None = None, - ) -> Awaitable[GeneratedEmbeddings[EmbeddingT]]: + ) -> Awaitable[GeneratedEmbeddings[EmbeddingCoT]]: """Generate embeddings for the given values. 
Args: diff --git a/python/packages/core/agent_framework/observability.py b/python/packages/core/agent_framework/observability.py index 81de7c4686..701970d9d5 100644 --- a/python/packages/core/agent_framework/observability.py +++ b/python/packages/core/agent_framework/observability.py @@ -1325,9 +1325,8 @@ async def get_embeddings( raise duration = perf_counter() - start_time_stamp response_attributes: dict[str, Any] = {**attributes} - if result.usage: - if "prompt_tokens" in result.usage: - response_attributes[OtelAttr.INPUT_TOKENS] = result.usage["prompt_tokens"] + if result.usage and "prompt_tokens" in result.usage: + response_attributes[OtelAttr.INPUT_TOKENS] = result.usage["prompt_tokens"] _capture_response( span=span, attributes=response_attributes, From 6b2cb6f7f484872449a4b36090f1367a480090c0 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Sun, 22 Feb 2026 15:23:30 +0100 Subject: [PATCH 07/14] fix: Make EmbeddingCoT invariant for mypy compatibility GeneratedEmbeddings is invariant in its type param, so the Protocol TypeVar cannot be covariant. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/features/vector-stores-and-embeddings/README.md | 2 +- python/packages/core/agent_framework/_clients.py | 1 - python/packages/core/agent_framework/_types.py | 4 ---- 3 files changed, 1 insertion(+), 6 deletions(-) diff --git a/docs/features/vector-stores-and-embeddings/README.md b/docs/features/vector-stores-and-embeddings/README.md index 8627a9d5ea..851e47afe1 100644 --- a/docs/features/vector-stores-and-embeddings/README.md +++ b/docs/features/vector-stores-and-embeddings/README.md @@ -366,7 +366,7 @@ Each connector follows the AF package structure: 4. **`from __future__ import annotations`**: Required in all files per AF coding standard. -5. **No `**kwargs` escape hatches**: Use explicit named parameters per AF coding standard. +5. 
**No `**kwargs` escape hatches in public APIs**: For user-facing interfaces, use explicit named parameters per AF coding standard. Internal implementation details (e.g., cooperative multiple inheritance / MRO patterns) may use `**kwargs` where necessary, as long as they are not exposed in public signatures. 6. **Lazy loading**: Connector packages use `__getattr__` lazy loading in core provider folders. diff --git a/python/packages/core/agent_framework/_clients.py b/python/packages/core/agent_framework/_clients.py index c9a7af6f34..96e8dc0b6a 100644 --- a/python/packages/core/agent_framework/_clients.py +++ b/python/packages/core/agent_framework/_clients.py @@ -676,7 +676,6 @@ def get_file_search_tool(**kwargs: Any) -> Any: EmbeddingCoT = TypeVar( "EmbeddingCoT", default="list[float]", - covariant=True, ) EmbeddingOptionsContraT = TypeVar( "EmbeddingOptionsContraT", diff --git a/python/packages/core/agent_framework/_types.py b/python/packages/core/agent_framework/_types.py index c03aeda07f..77063902d2 100644 --- a/python/packages/core/agent_framework/_types.py +++ b/python/packages/core/agent_framework/_types.py @@ -33,10 +33,6 @@ from typing import TypeVar # pragma: no cover else: from typing_extensions import TypeVar # pragma: no cover -if sys.version_info >= (3, 12): - pass # pragma: no cover -else: - pass # pragma: no cover if sys.version_info >= (3, 11): from typing import TypedDict # type: ignore # pragma: no cover else: From 2b1c5ab996a3f22564e964115c7cfb767b2d13a6 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 23 Feb 2026 10:55:31 +0100 Subject: [PATCH 08/14] fix: Address PR review - empty values guard, service_url for telemetry - Add early return for empty values in get_embeddings to avoid unnecessary API calls - Add service_url() method to RawOpenAIEmbeddingClient for proper telemetry endpoint reporting - Add test for empty values behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- 
.../agent_framework/openai/_embedding_client.py | 9 ++++++++- .../tests/openai/test_openai_embedding_client.py | 13 +++++++++++++ 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/python/packages/core/agent_framework/openai/_embedding_client.py b/python/packages/core/agent_framework/openai/_embedding_client.py index a11557774e..0b59f9bf45 100644 --- a/python/packages/core/agent_framework/openai/_embedding_client.py +++ b/python/packages/core/agent_framework/openai/_embedding_client.py @@ -58,6 +58,10 @@ class RawOpenAIEmbeddingClient( ): """Raw OpenAI embedding client without telemetry.""" + def service_url(self) -> str: + """Get the URL of the service.""" + return str(self.client.base_url) if self.client else "Unknown" + async def get_embeddings( self, values: Sequence[str], @@ -74,8 +78,11 @@ async def get_embeddings( Generated embeddings with usage metadata. Raises: - ValueError: If model_id is not provided. + ValueError: If model_id is not provided (skipped for empty values, which return an empty result instead of raising). """ + if not values: + return GeneratedEmbeddings([], options=options) + opts: dict[str, Any] = dict(options) if options else {} model = opts.get("model_id") or self.model_id if not model: diff --git a/python/packages/core/tests/openai/test_openai_embedding_client.py b/python/packages/core/tests/openai/test_openai_embedding_client.py index 9cb3de20f0..bf35aca6a4 100644 --- a/python/packages/core/tests/openai/test_openai_embedding_client.py +++ b/python/packages/core/tests/openai/test_openai_embedding_client.py @@ -175,6 +175,19 @@ async def test_openai_error_when_no_model_id() -> None: await client.get_embeddings(["test"]) +async def test_openai_empty_values_returns_empty(openai_unit_test_env: None) -> None: + client = OpenAIEmbeddingClient() + client.client = MagicMock() + client.client.embeddings = MagicMock() + client.client.embeddings.create = AsyncMock() + + result = await client.get_embeddings([]) + + assert len(result) == 0 + assert result.usage is None +
client.client.embeddings.create.assert_not_called() + + # --- Azure OpenAI unit tests --- From aa1dc5345c9fc3b98393b8a903eca54f0519d664 Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Mon, 23 Feb 2026 11:05:36 +0100 Subject: [PATCH 09/14] Python: Fix OpenAI chat client compatibility with third-party endpoints and OTel 0.4.14 (#4161) * Fix system message content sent as list instead of string Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages when content is a list of content parts. This change flattens system and developer message content to a plain string in the Chat Completions client. Fixes https://github.com/microsoft/agent-framework/issues/1407 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix compatibility with opentelemetry-semantic-conventions-ai 0.4.14 Version 0.4.14 removed several LLM_* attributes from SpanAttributes (LLM_SYSTEM, LLM_REQUEST_MODEL, LLM_RESPONSE_MODEL, LLM_REQUEST_MAX_TOKENS, LLM_REQUEST_TEMPERATURE, LLM_REQUEST_TOP_P, LLM_TOKEN_TYPE). Move these to the OtelAttr enum with their well-known gen_ai.* string values and update all references in observability.py and tests. Fixes https://github.com/microsoft/agent-framework/issues/4160 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Flatten text-only message content to string for all roles Extend the system/developer fix to all message roles. Text-only content lists are now post-processed into plain strings, while multimodal content (text + images/audio) remains as a list. This fixes compatibility with OpenAI-like endpoints that cannot deserialize list content (e.g. Foundry Local's Neutron backend). Partially fixes https://github.com/microsoft/agent-framework/issues/4084 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix streaming text lost when usage data in same chunk Some providers (e.g. Gemini) include both usage data and text content in the same streaming chunk. 
The early return on chunk.usage caused text and tool call parsing to be skipped entirely. Remove the early return and process usage alongside text/tool calls. Fixes https://github.com/microsoft/agent-framework/issues/3434 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix mypy errors in _chat_client.py Rename shadowed variable 'args' in system/developer branch to 'sys_args' and rename loop variable 'content' to 'msg_content' to avoid type conflict. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../core/agent_framework/observability.py | 32 ++-- .../agent_framework/openai/_chat_client.py | 56 +++++-- .../core/tests/core/test_observability.py | 21 ++- .../tests/openai/test_openai_chat_client.py | 150 +++++++++++++++++- 4 files changed, 214 insertions(+), 45 deletions(-) diff --git a/python/packages/core/agent_framework/observability.py b/python/packages/core/agent_framework/observability.py index 701970d9d5..8f4ecbd708 100644 --- a/python/packages/core/agent_framework/observability.py +++ b/python/packages/core/agent_framework/observability.py @@ -210,6 +210,14 @@ class OtelAttr(str, Enum): INPUT_MESSAGES = "gen_ai.input.messages" OUTPUT_MESSAGES = "gen_ai.output.messages" SYSTEM_INSTRUCTIONS = "gen_ai.system_instructions" + # Attributes previously from opentelemetry-semantic-conventions-ai SpanAttributes, + # removed in v0.4.14. Defined here for forward compatibility. + SYSTEM = "gen_ai.system" + REQUEST_MAX_TOKENS = "gen_ai.request.max_tokens" + REQUEST_TEMPERATURE = "gen_ai.request.temperature" + REQUEST_TOP_P = "gen_ai.request.top_p" + REQUEST_MODEL = "gen_ai.request.model" + RESPONSE_MODEL = "gen_ai.response.model" # Workflow attributes WORKFLOW_ID = "workflow.id" @@ -1175,7 +1183,7 @@ def get_response( # in a different async context than creation — using use_span() would # cause "Failed to detach context" errors from OpenTelemetry. 
operation = attributes.get(OtelAttr.OPERATION, "operation") - span_name = attributes.get(SpanAttributes.LLM_REQUEST_MODEL, "unknown") + span_name = attributes.get(OtelAttr.REQUEST_MODEL, "unknown") span = get_tracer().start_span(f"{operation} {span_name}") span.set_attributes(attributes) if OBSERVABILITY_SETTINGS.SENSITIVE_DATA_ENABLED and messages: @@ -1237,7 +1245,7 @@ async def _finalize_stream() -> None: return wrapped_stream async def _get_response() -> ChatResponse: - with _get_span(attributes=attributes, span_name_attribute=SpanAttributes.LLM_REQUEST_MODEL) as span: + with _get_span(attributes=attributes, span_name_attribute=OtelAttr.REQUEST_MODEL) as span: if OBSERVABILITY_SETTINGS.SENSITIVE_DATA_ENABLED and messages: _capture_messages( span=span, @@ -1611,16 +1619,16 @@ def _get_instructions_from_options(options: Any) -> str | None: OTEL_ATTR_MAP: dict[str | tuple[str, ...], tuple[str, Callable[[Any], Any] | None, bool, Any]] = { "choice_count": (OtelAttr.CHOICE_COUNT, None, False, 1), "operation_name": (OtelAttr.OPERATION, None, False, None), - "system_name": (SpanAttributes.LLM_SYSTEM, None, False, None), + "system_name": (OtelAttr.SYSTEM, None, False, None), "provider_name": (OtelAttr.PROVIDER_NAME, None, False, None), "service_url": (OtelAttr.ADDRESS, None, False, None), "conversation_id": (OtelAttr.CONVERSATION_ID, None, True, None), "seed": (OtelAttr.SEED, None, True, None), "frequency_penalty": (OtelAttr.FREQUENCY_PENALTY, None, True, None), - "max_tokens": (SpanAttributes.LLM_REQUEST_MAX_TOKENS, None, True, None), + "max_tokens": (OtelAttr.REQUEST_MAX_TOKENS, None, True, None), "stop": (OtelAttr.STOP_SEQUENCES, None, True, None), - "temperature": (SpanAttributes.LLM_REQUEST_TEMPERATURE, None, True, None), - "top_p": (SpanAttributes.LLM_REQUEST_TOP_P, None, True, None), + "temperature": (OtelAttr.REQUEST_TEMPERATURE, None, True, None), + "top_p": (OtelAttr.REQUEST_TOP_P, None, True, None), "presence_penalty": (OtelAttr.PRESENCE_PENALTY, None, True, 
None), "top_k": (OtelAttr.TOP_K, None, True, None), "encoding_formats": ( @@ -1633,7 +1641,7 @@ def _get_instructions_from_options(options: Any) -> str | None: "agent_name": (OtelAttr.AGENT_NAME, None, False, None), "agent_description": (OtelAttr.AGENT_DESCRIPTION, None, False, None), # Multiple source keys - checks model_id in options, then model in kwargs, then model_id in kwargs - ("model_id", "model"): (SpanAttributes.LLM_REQUEST_MODEL, None, True, None), + ("model_id", "model"): (OtelAttr.REQUEST_MODEL, None, True, None), # Tools with validation - returns None if no valid tools "tools": ( OtelAttr.TOOL_DEFINITIONS, @@ -1790,7 +1798,7 @@ def _get_response_attributes( if finish_reason: attributes[OtelAttr.FINISH_REASONS] = json.dumps([finish_reason]) if model_id := getattr(response, "model_id", None): - attributes[SpanAttributes.LLM_RESPONSE_MODEL] = model_id + attributes[OtelAttr.RESPONSE_MODEL] = model_id if capture_usage and (usage := response.usage_details): if usage.get("input_token_count"): attributes[OtelAttr.INPUT_TOKENS] = usage["input_token_count"] @@ -1802,8 +1810,8 @@ def _get_response_attributes( GEN_AI_METRIC_ATTRIBUTES = ( OtelAttr.OPERATION, OtelAttr.PROVIDER_NAME, - SpanAttributes.LLM_REQUEST_MODEL, - SpanAttributes.LLM_RESPONSE_MODEL, + OtelAttr.REQUEST_MODEL, + OtelAttr.RESPONSE_MODEL, OtelAttr.ADDRESS, OtelAttr.PORT, ) @@ -1821,10 +1829,10 @@ def _capture_response( attrs: dict[str, Any] = {k: v for k, v in attributes.items() if k in GEN_AI_METRIC_ATTRIBUTES} if token_usage_histogram and (input_tokens := attributes.get(OtelAttr.INPUT_TOKENS)): token_usage_histogram.record( - input_tokens, attributes={**attrs, SpanAttributes.LLM_TOKEN_TYPE: OtelAttr.T_TYPE_INPUT} + input_tokens, attributes={**attrs, OtelAttr.T_TYPE: OtelAttr.T_TYPE_INPUT} ) if token_usage_histogram and (output_tokens := attributes.get(OtelAttr.OUTPUT_TOKENS)): - token_usage_histogram.record(output_tokens, {**attrs, SpanAttributes.LLM_TOKEN_TYPE: OtelAttr.T_TYPE_OUTPUT}) + 
token_usage_histogram.record(output_tokens, {**attrs, OtelAttr.T_TYPE: OtelAttr.T_TYPE_OUTPUT}) if operation_duration_histogram and duration is not None: if OtelAttr.ERROR_TYPE in attributes: attrs[OtelAttr.ERROR_TYPE] = attributes[OtelAttr.ERROR_TYPE] diff --git a/python/packages/core/agent_framework/openai/_chat_client.py b/python/packages/core/agent_framework/openai/_chat_client.py index 60e0daaf2b..5d6f66491c 100644 --- a/python/packages/core/agent_framework/openai/_chat_client.py +++ b/python/packages/core/agent_framework/openai/_chat_client.py @@ -5,7 +5,14 @@ import json import logging import sys -from collections.abc import AsyncIterable, Awaitable, Callable, Mapping, MutableMapping, Sequence +from collections.abc import ( + AsyncIterable, + Awaitable, + Callable, + Mapping, + MutableMapping, + Sequence, +) from datetime import datetime, timezone from itertools import chain from typing import Any, Generic, Literal @@ -16,7 +23,9 @@ from openai.types.chat.chat_completion import ChatCompletion, Choice from openai.types.chat.chat_completion_chunk import ChatCompletionChunk from openai.types.chat.chat_completion_chunk import Choice as ChunkChoice -from openai.types.chat.chat_completion_message_custom_tool_call import ChatCompletionMessageCustomToolCall +from openai.types.chat.chat_completion_message_custom_tool_call import ( + ChatCompletionMessageCustomToolCall, +) from openai.types.chat.completion_create_params import WebSearchOptions from pydantic import BaseModel @@ -395,21 +404,18 @@ def _parse_response_update_from_openai( ) -> ChatResponseUpdate: """Parse a streaming response update from OpenAI.""" chunk_metadata = self._get_metadata_from_streaming_chat_response(chunk) - if chunk.usage: - return ChatResponseUpdate( - role="assistant", - contents=[ - Content.from_usage( - usage_details=self._parse_usage_from_openai(chunk.usage), raw_representation=chunk - ) - ], - model_id=chunk.model, - additional_properties=chunk_metadata, - response_id=chunk.id, - 
message_id=chunk.id, - ) contents: list[Content] = [] finish_reason: FinishReason | None = None + + # Process usage data (may coexist with text/tool content in providers like Gemini). + # See https://github.com/microsoft/agent-framework/issues/3434 + if chunk.usage: + contents.append( + Content.from_usage( + usage_details=self._parse_usage_from_openai(chunk.usage), raw_representation=chunk + ) + ) + for choice in chunk.choices: chunk_metadata.update(self._get_metadata_from_chat_choice(choice)) contents.extend(self._parse_tool_calls_from_openai(choice)) @@ -532,6 +538,17 @@ def _prepare_messages_for_openai( def _prepare_message_for_openai(self, message: Message) -> list[dict[str, Any]]: """Prepare a chat message for OpenAI.""" + # System/developer messages must use plain string content because some + # OpenAI-compatible endpoints reject list content for non-user roles. + if message.role in ("system", "developer"): + texts = [content.text for content in message.contents if content.type == "text" and content.text] + if texts: + sys_args: dict[str, Any] = {"role": message.role, "content": "\n".join(texts)} + if message.author_name: + sys_args["name"] = message.author_name + return [sys_args] + return [] + all_messages: list[dict[str, Any]] = [] for content in message.contents: # Skip approval content - it's internal framework state, not for the LLM @@ -568,6 +585,15 @@ def _prepare_message_for_openai(self, message: Message) -> list[dict[str, Any]]: args["content"].append(self._prepare_content_for_openai(content)) # type: ignore if "content" in args or "tool_calls" in args: all_messages.append(args) + + # Flatten text-only content lists to plain strings for broader + # compatibility with OpenAI-like endpoints (e.g. Foundry Local). 
+ # See https://github.com/microsoft/agent-framework/issues/4084 + for msg in all_messages: + msg_content: Any = msg.get("content") + if isinstance(msg_content, list) and all(isinstance(c, dict) and c.get("type") == "text" for c in msg_content): + msg["content"] = "\n".join(c.get("text", "") for c in msg_content) + return all_messages def _prepare_content_for_openai(self, content: Content) -> dict[str, Any]: diff --git a/python/packages/core/tests/core/test_observability.py b/python/packages/core/tests/core/test_observability.py index fccaf2f9f1..0e81b7580c 100644 --- a/python/packages/core/tests/core/test_observability.py +++ b/python/packages/core/tests/core/test_observability.py @@ -7,7 +7,6 @@ import pytest from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter -from opentelemetry.semconv_ai import SpanAttributes from opentelemetry.trace import StatusCode from agent_framework import ( @@ -48,8 +47,8 @@ def test_role_event_map(): def test_enum_values(): """Test that OtelAttr enum has expected values.""" assert OtelAttr.OPERATION == "gen_ai.operation.name" - assert SpanAttributes.LLM_SYSTEM == "gen_ai.system" - assert SpanAttributes.LLM_REQUEST_MODEL == "gen_ai.request.model" + assert OtelAttr.SYSTEM == "gen_ai.system" + assert OtelAttr.REQUEST_MODEL == "gen_ai.request.model" assert OtelAttr.CHAT_COMPLETION_OPERATION == "chat" assert OtelAttr.TOOL_EXECUTION_OPERATION == "execute_tool" assert OtelAttr.AGENT_INVOKE_OPERATION == "invoke_agent" @@ -213,7 +212,7 @@ async def test_chat_client_observability(mock_chat_client, span_exporter: InMemo span = spans[0] assert span.name == "chat Test" assert span.attributes[OtelAttr.OPERATION.value] == OtelAttr.CHAT_COMPLETION_OPERATION - assert span.attributes[SpanAttributes.LLM_REQUEST_MODEL] == "Test" + assert span.attributes[OtelAttr.REQUEST_MODEL] == "Test" assert span.attributes[OtelAttr.INPUT_TOKENS] == 10 assert span.attributes[OtelAttr.OUTPUT_TOKENS] == 20 if enable_sensitive_data: @@ 
-243,7 +242,7 @@ async def test_chat_client_streaming_observability( span = spans[0] assert span.name == "chat Test" assert span.attributes[OtelAttr.OPERATION.value] == OtelAttr.CHAT_COMPLETION_OPERATION - assert span.attributes[SpanAttributes.LLM_REQUEST_MODEL] == "Test" + assert span.attributes[OtelAttr.REQUEST_MODEL] == "Test" if enable_sensitive_data: assert span.attributes[OtelAttr.INPUT_MESSAGES] is not None assert span.attributes[OtelAttr.OUTPUT_MESSAGES] is not None @@ -392,7 +391,7 @@ async def test_chat_client_without_model_id_observability(mock_chat_client, span assert span.name == "chat unknown" assert span.attributes[OtelAttr.OPERATION.value] == OtelAttr.CHAT_COMPLETION_OPERATION - assert span.attributes[SpanAttributes.LLM_REQUEST_MODEL] == "unknown" + assert span.attributes[OtelAttr.REQUEST_MODEL] == "unknown" async def test_chat_client_streaming_without_model_id_observability( @@ -416,7 +415,7 @@ async def test_chat_client_streaming_without_model_id_observability( span = spans[0] assert span.name == "chat unknown" assert span.attributes[OtelAttr.OPERATION.value] == OtelAttr.CHAT_COMPLETION_OPERATION - assert span.attributes[SpanAttributes.LLM_REQUEST_MODEL] == "unknown" + assert span.attributes[OtelAttr.REQUEST_MODEL] == "unknown" def test_prepend_user_agent_with_none_value(): @@ -491,7 +490,7 @@ async def test_agent_instrumentation_enabled( assert span.attributes[OtelAttr.AGENT_ID] == "test_agent_id" assert span.attributes[OtelAttr.AGENT_NAME] == "test_agent" assert span.attributes[OtelAttr.AGENT_DESCRIPTION] == "Test agent description" - assert span.attributes[SpanAttributes.LLM_REQUEST_MODEL] == "TestModel" + assert span.attributes[OtelAttr.REQUEST_MODEL] == "TestModel" assert span.attributes[OtelAttr.INPUT_TOKENS] == 15 assert span.attributes[OtelAttr.OUTPUT_TOKENS] == 25 if enable_sensitive_data: @@ -521,7 +520,7 @@ async def test_agent_streaming_response_with_diagnostics_enabled( assert span.attributes[OtelAttr.AGENT_ID] == "test_agent_id" 
assert span.attributes[OtelAttr.AGENT_NAME] == "test_agent" assert span.attributes[OtelAttr.AGENT_DESCRIPTION] == "Test agent description" - assert span.attributes[SpanAttributes.LLM_REQUEST_MODEL] == "TestModel" + assert span.attributes[OtelAttr.REQUEST_MODEL] == "TestModel" if enable_sensitive_data: assert span.attributes.get(OtelAttr.OUTPUT_MESSAGES) is not None # Streaming, so no usage yet @@ -1381,8 +1380,6 @@ def test_get_response_attributes_with_model_id(): """Test _get_response_attributes includes model_id.""" from unittest.mock import Mock - from opentelemetry.semconv_ai import SpanAttributes - from agent_framework.observability import _get_response_attributes response = Mock() @@ -1395,7 +1392,7 @@ def test_get_response_attributes_with_model_id(): attrs = {} result = _get_response_attributes(attrs, response) - assert result[SpanAttributes.LLM_RESPONSE_MODEL] == "gpt-4" + assert result[OtelAttr.RESPONSE_MODEL] == "gpt-4" def test_get_response_attributes_with_usage(): diff --git a/python/packages/core/tests/openai/test_openai_chat_client.py b/python/packages/core/tests/openai/test_openai_chat_client.py index d2d027fcb1..8aa2c1f890 100644 --- a/python/packages/core/tests/openai/test_openai_chat_client.py +++ b/python/packages/core/tests/openai/test_openai_chat_client.py @@ -642,9 +642,8 @@ def test_prepare_message_with_text_reasoning_content(openai_unit_test_env: dict[ assert len(prepared) == 1 assert "reasoning_details" in prepared[0] assert prepared[0]["reasoning_details"] == mock_reasoning_data - # Should also have the text content - assert prepared[0]["content"][0]["type"] == "text" - assert prepared[0]["content"][0]["text"] == "The answer is 42." + # Should also have the text content (flattened to string for text-only) + assert prepared[0]["content"] == "The answer is 42." 
def test_function_approval_content_is_skipped_in_preparation(openai_unit_test_env: dict[str, str]) -> None: @@ -690,8 +689,7 @@ def test_function_approval_content_is_skipped_in_preparation(openai_unit_test_en ) prepared_mixed = client._prepare_message_for_openai(mixed_message) assert len(prepared_mixed) == 1 # Only text content should remain - assert prepared_mixed[0]["content"][0]["type"] == "text" - assert prepared_mixed[0]["content"][0]["text"] == "I need approval for this action." + assert prepared_mixed[0]["content"] == "I need approval for this action." def test_usage_content_in_streaming_response(openai_unit_test_env: dict[str, str]) -> None: @@ -730,6 +728,43 @@ def test_usage_content_in_streaming_response(openai_unit_test_env: dict[str, str assert usage_content.usage_details["total_token_count"] == 150 +def test_streaming_chunk_with_usage_and_text(openai_unit_test_env: dict[str, str]) -> None: + """Test that text content is not lost when usage data is in the same chunk. + + Some providers (e.g. Gemini) include both usage and text content in the + same streaming chunk. 
See https://github.com/microsoft/agent-framework/issues/3434 + """ + from openai.types.chat.chat_completion_chunk import ChatCompletionChunk, Choice, ChoiceDelta + from openai.types.completion_usage import CompletionUsage + + client = OpenAIChatClient() + + mock_chunk = ChatCompletionChunk( + id="test-chunk", + object="chat.completion.chunk", + created=1234567890, + model="gemini-2.0-flash-lite", + choices=[ + Choice( + index=0, + delta=ChoiceDelta(content="Hello world", role="assistant"), + finish_reason=None, + ) + ], + usage=CompletionUsage(prompt_tokens=18, completion_tokens=5, total_tokens=23), + ) + + update = client._parse_response_update_from_openai(mock_chunk) + + # Should have BOTH text and usage content + content_types = [c.type for c in update.contents] + assert "text" in content_types, "Text content should not be lost when usage is present" + assert "usage" in content_types, "Usage content should still be present" + + text_content = next(c for c in update.contents if c.type == "text") + assert text_content.text == "Hello world" + + def test_parse_text_with_refusal(openai_unit_test_env: dict[str, str]) -> None: """Test that refusal content is parsed correctly.""" from openai.types.chat.chat_completion import ChatCompletion, Choice @@ -814,7 +849,7 @@ def test_prepare_options_with_instructions(openai_unit_test_env: dict[str, str]) assert "messages" in prepared_options assert len(prepared_options["messages"]) == 2 assert prepared_options["messages"][0]["role"] == "system" - assert prepared_options["messages"][0]["content"][0]["text"] == "You are a helpful assistant." + assert prepared_options["messages"][0]["content"] == "You are a helpful assistant." 
def test_prepare_message_with_author_name(openai_unit_test_env: dict[str, str]) -> None: @@ -851,6 +886,109 @@ def test_prepare_message_with_tool_result_author_name(openai_unit_test_env: dict assert "name" not in prepared[0] +def test_prepare_system_message_content_is_string(openai_unit_test_env: dict[str, str]) -> None: + """Test that system message content is a plain string, not a list. + + Some OpenAI-compatible endpoints (e.g. NVIDIA NIM) reject system messages + with list content. See https://github.com/microsoft/agent-framework/issues/1407 + """ + client = OpenAIChatClient() + + message = Message(role="system", contents=[Content.from_text(text="You are a helpful assistant.")]) + + prepared = client._prepare_message_for_openai(message) + + assert len(prepared) == 1 + assert prepared[0]["role"] == "system" + assert isinstance(prepared[0]["content"], str) + assert prepared[0]["content"] == "You are a helpful assistant." + + +def test_prepare_developer_message_content_is_string(openai_unit_test_env: dict[str, str]) -> None: + """Test that developer message content is a plain string, not a list.""" + client = OpenAIChatClient() + + message = Message(role="developer", contents=[Content.from_text(text="Follow these rules.")]) + + prepared = client._prepare_message_for_openai(message) + + assert len(prepared) == 1 + assert prepared[0]["role"] == "developer" + assert isinstance(prepared[0]["content"], str) + assert prepared[0]["content"] == "Follow these rules." 
+ + +def test_prepare_system_message_multiple_text_contents_joined(openai_unit_test_env: dict[str, str]) -> None: + """Test that system messages with multiple text contents are joined into a single string.""" + client = OpenAIChatClient() + + message = Message( + role="system", + contents=[ + Content.from_text(text="You are a helpful assistant."), + Content.from_text(text="Be concise."), + ], + ) + + prepared = client._prepare_message_for_openai(message) + + assert len(prepared) == 1 + assert prepared[0]["role"] == "system" + assert isinstance(prepared[0]["content"], str) + assert prepared[0]["content"] == "You are a helpful assistant.\nBe concise." + + +def test_prepare_user_message_text_content_is_string(openai_unit_test_env: dict[str, str]) -> None: + """Test that text-only user message content is flattened to a plain string. + + Some OpenAI-compatible endpoints (e.g. Foundry Local) cannot deserialize + the list format. See https://github.com/microsoft/agent-framework/issues/4084 + """ + client = OpenAIChatClient() + + message = Message(role="user", contents=[Content.from_text(text="Hello")]) + + prepared = client._prepare_message_for_openai(message) + + assert len(prepared) == 1 + assert prepared[0]["role"] == "user" + assert isinstance(prepared[0]["content"], str) + assert prepared[0]["content"] == "Hello" + + +def test_prepare_user_message_multimodal_content_remains_list(openai_unit_test_env: dict[str, str]) -> None: + """Test that multimodal user message content remains a list.""" + client = OpenAIChatClient() + + message = Message( + role="user", + contents=[ + Content.from_text(text="What's in this image?"), + Content.from_uri(uri="https://example.com/image.png", media_type="image/png"), + ], + ) + + prepared = client._prepare_message_for_openai(message) + + # Multimodal content must stay as list for the API + has_list_content = any(isinstance(m.get("content"), list) for m in prepared) + assert has_list_content + + +def 
test_prepare_assistant_message_text_content_is_string(openai_unit_test_env: dict[str, str]) -> None: + """Test that text-only assistant message content is flattened to a plain string.""" + client = OpenAIChatClient() + + message = Message(role="assistant", contents=[Content.from_text(text="Sure, I can help.")]) + + prepared = client._prepare_message_for_openai(message) + + assert len(prepared) == 1 + assert prepared[0]["role"] == "assistant" + assert isinstance(prepared[0]["content"], str) + assert prepared[0]["content"] == "Sure, I can help." + + def test_tool_choice_required_with_function_name(openai_unit_test_env: dict[str, str]) -> None: """Test that tool_choice with required mode and function name is correctly prepared.""" client = OpenAIChatClient() From 59f048f8a57529ebf90460b58ca11ba4fa64a857 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 23 Feb 2026 11:36:46 +0100 Subject: [PATCH 10/14] reorder imports --- python/packages/core/agent_framework/observability.py | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/python/packages/core/agent_framework/observability.py b/python/packages/core/agent_framework/observability.py index 8f4ecbd708..659eb3e343 100644 --- a/python/packages/core/agent_framework/observability.py +++ b/python/packages/core/agent_framework/observability.py @@ -38,10 +38,6 @@ else: from typing_extensions import TypeVar # type: ignore # pragma: no cover -# Defined here to avoid circular import with _types.py -EmbeddingInputT = TypeVar("EmbeddingInputT", default="str") -EmbeddingT = TypeVar("EmbeddingT", default="list[float]") - if TYPE_CHECKING: # pragma: no cover from opentelemetry.sdk._logs.export import LogRecordExporter from opentelemetry.sdk.metrics.export import MetricExporter @@ -87,6 +83,8 @@ ] +EmbeddingInputT = TypeVar("EmbeddingInputT", default="str") +EmbeddingT = TypeVar("EmbeddingT", default="list[float]") AgentT = TypeVar("AgentT", bound="SupportsAgentRun") ChatClientT = TypeVar("ChatClientT", 
bound="SupportsChatGetResponse[Any]") From f9f6acaba147f10c3cd10f9c82f2b4fd906acc93 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 23 Feb 2026 11:46:37 +0100 Subject: [PATCH 11/14] fix: Use OtelAttr.REQUEST_MODEL instead of removed SpanAttributes.LLM_REQUEST_MODEL Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/core/agent_framework/observability.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/python/packages/core/agent_framework/observability.py b/python/packages/core/agent_framework/observability.py index 659eb3e343..fbc9cb0f53 100644 --- a/python/packages/core/agent_framework/observability.py +++ b/python/packages/core/agent_framework/observability.py @@ -1322,7 +1322,7 @@ async def get_embeddings( service_url=service_url, ) - with _get_span(attributes=attributes, span_name_attribute=SpanAttributes.LLM_REQUEST_MODEL) as span: + with _get_span(attributes=attributes, span_name_attribute=OtelAttr.REQUEST_MODEL) as span: start_time_stamp = perf_counter() try: result = await super_get_embeddings(values, options=options) From fd1ad8a0f5d0e050af48075dcf9891bad65906f4 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 23 Feb 2026 13:32:42 +0100 Subject: [PATCH 12/14] docs: Add score_threshold to vector store plan Reference SK .NET PR #13501 for score threshold filtering semantics. Include score_threshold in SearchOptions from Phase 3. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/features/vector-stores-and-embeddings/README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/features/vector-stores-and-embeddings/README.md b/docs/features/vector-stores-and-embeddings/README.md index 851e47afe1..be8e78e855 100644 --- a/docs/features/vector-stores-and-embeddings/README.md +++ b/docs/features/vector-stores-and-embeddings/README.md @@ -194,7 +194,7 @@ This feature ports the vector store abstractions, embedding generator abstractio - No `SearchType` enum — use `Literal["vector", "keyword_hybrid"]` instead, per AF convention of avoiding unnecessary imports - `VectorStoreField` plain class (not Pydantic) - `VectorStoreCollectionDefinition` class (not Pydantic internally, but supports Pydantic models as input) -- `SearchOptions` plain class +- `SearchOptions` plain class — includes `score_threshold: float | None` for filtering results by score (see note below) - `SearchResponse` generic class - `RecordFilterOptions` plain class - `DISTANCE_FUNCTION_DIRECTION_HELPER` dict @@ -383,3 +383,5 @@ Each connector follows the AF package structure: - `create_get_tool(...)` → tool for retrieving records by key - `create_delete_tool(...)` → tool for deleting records - These are separate from search and are placed in a later phase + +10. **Score threshold filtering**: `SearchOptions` includes `score_threshold: float | None` to filter search results by relevance score (ref: [SK .NET PR #13501](https://github.com/microsoft/semantic-kernel/pull/13501)). The semantics depend on the distance function: for similarity functions (cosine similarity, dot product), results *below* the threshold are filtered out; for distance functions (cosine distance, euclidean), results *above* the threshold are filtered out. Use `DISTANCE_FUNCTION_DIRECTION_HELPER` to determine direction. 
Connectors should implement this natively where the database supports it, falling back to client-side post-filtering otherwise. From de4484ad0fcbc73c83bc4bc4851cf6ccd7c7f944 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 23 Feb 2026 13:34:51 +0100 Subject: [PATCH 13/14] docs: Add reference to roji's SK .NET MEVD work for SQL connectors Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/features/vector-stores-and-embeddings/README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/features/vector-stores-and-embeddings/README.md b/docs/features/vector-stores-and-embeddings/README.md index be8e78e855..42ffc98041 100644 --- a/docs/features/vector-stores-and-embeddings/README.md +++ b/docs/features/vector-stores-and-embeddings/README.md @@ -314,6 +314,8 @@ Each connector follows the AF package structure: #### 7.2 — SQL Server (`packages/sql-server/`) #### 7.3 — FAISS (`packages/faiss/` or in core extending InMemory) +> **Note:** When implementing any SQL-based connector (PostgreSQL, SQL Server, SQLite, Cosmos DB), review the .NET MEVD changes made by @roji (Shay Rojansky) in SK for design patterns, query building, filter translation, and feature parity: https://github.com/microsoft/semantic-kernel/pulls?q=is%3Apr+author%3Aroji+is%3Aclosed + --- ### Phase 8: Vector Store CRUD Tools From 75ae3ac2321158b1d8881df37b41feedb7a82140 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 23 Feb 2026 17:00:48 +0100 Subject: [PATCH 14/14] fix: Clear env vars in construction tests to avoid CI leakage Tests for missing API key / model ID now use monkeypatch.delenv to ensure env vars from the integration test environment don't prevent the expected ValueError from being raised. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../core/tests/openai/test_openai_embedding_client.py | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/python/packages/core/tests/openai/test_openai_embedding_client.py b/python/packages/core/tests/openai/test_openai_embedding_client.py index bf35aca6a4..f4a9d6052b 100644 --- a/python/packages/core/tests/openai/test_openai_embedding_client.py +++ b/python/packages/core/tests/openai/test_openai_embedding_client.py @@ -56,12 +56,14 @@ def test_openai_construction_from_env(openai_unit_test_env: None) -> None: assert client.model_id == "text-embedding-3-small" -def test_openai_construction_missing_api_key_raises() -> None: +def test_openai_construction_missing_api_key_raises(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.delenv("OPENAI_API_KEY", raising=False) with pytest.raises(ValueError, match="API key is required"): OpenAIEmbeddingClient(model_id="text-embedding-3-small") -def test_openai_construction_missing_model_raises() -> None: +def test_openai_construction_missing_model_raises(monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.delenv("OPENAI_EMBEDDING_MODEL_ID", raising=False) with pytest.raises(ValueError, match="model ID is required"): OpenAIEmbeddingClient(api_key="test-key")
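
The score-threshold semantics documented in patch 12's README note (keep results *at or above* the threshold for similarity functions, *at or below* for distance functions, with direction looked up via `DISTANCE_FUNCTION_DIRECTION_HELPER`) can be sketched as a client-side post-filter. This is a sketch only: `SearchOptions` and `DISTANCE_FUNCTION_DIRECTION_HELPER` are Phase 3 abstractions that do not exist yet in this PR, so the mapping shape, the function names, and the `(record, score)` result tuples below are assumptions for illustration.

```python
# Hypothetical sketch of direction-aware score-threshold filtering.
# The helper maps a distance-function name to the comparison that means
# "this result passes": similarity scores pass when >= threshold,
# distance scores pass when <= threshold.
from operator import ge, le

DISTANCE_FUNCTION_DIRECTION_HELPER = {
    "cosine_similarity": ge,  # higher is better
    "dot_product": ge,        # higher is better
    "cosine_distance": le,    # lower is better
    "euclidean": le,          # lower is better
}


def apply_score_threshold(results, distance_function, score_threshold):
    """Drop results on the wrong side of the threshold for this distance function.

    `results` is assumed to be an iterable of (record, score) tuples; a
    connector would only use this as a fallback when the database cannot
    filter by score natively.
    """
    if score_threshold is None:
        return list(results)
    passes = DISTANCE_FUNCTION_DIRECTION_HELPER[distance_function]
    return [(record, score) for record, score in results if passes(score, score_threshold)]


hits = [("doc-1", 0.92), ("doc-2", 0.40), ("doc-3", 0.75)]
# Similarity: keep scores >= 0.7 -> doc-1 and doc-3
print(apply_score_threshold(hits, "cosine_similarity", 0.7))
# Distance: keep scores <= 0.7 -> doc-2
print(apply_score_threshold(hits, "cosine_distance", 0.7))
```

The same lookup would drive native query building where a database supports score filtering, so client-side post-filtering stays a fallback rather than the default path.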