[Python] Add agent-framework-azure-ai-contentunderstanding package #4829
Open: yungshinlintw wants to merge 75 commits into microsoft:main from yungshinlintw:yslin/contentunderstanding-context-provider
Commits
64f8472 feat: add agent-framework-azure-contentunderstanding package (yungshinlintw)
1058963 fix: update CU fixtures with real API data, fix test assertions (yungshinlintw)
ac552d2 chore: add connector .gitignore, update uv.lock (yungshinlintw)
e654aad refactor: rename to azure-ai-contentunderstanding, fix CI issues (yungshinlintw)
eed2fce feat: add samples (document_qa, invoice_processing, multimodal_chat) (yungshinlintw)
d94f08a feat: add remaining samples (devui_multimodal_agent, large_doc_file_s… (yungshinlintw)
bb53a0f feat: add file_search integration for large document RAG (yungshinlintw)
e5e4137 fix: add key-based auth support to all samples (yungshinlintw)
ee86b61 FEATURE(python): add analyzer auto-detection, file_search RAG, and la… (yungshinlin)
34c1939 feat(cu): MIME sniffing, media-aware formatting, unified timeout, vec… (yungshinlin)
ba4bfa3 fix: merge all CU content segments for video/audio analysis (yungshinlin)
e276f76 refactor: improve CU context provider docs and remove ContentLimits (yungshinlintw)
90ace98 feat: support user-provided vector store in FileSearchConfig (yungshinlintw)
38b3aba fix: remove ContentLimits from README code block (yungshinlintw)
c2b9b32 refactor: create CU client in __init__ instead of __aenter__ (yungshinlintw)
2e6f1b4 docs: add file_search param to class docstring (yungshinlintw)
1e4b889 feat: introduce FileSearchBackend abstraction for cross-client support (yungshinlintw)
7c39752 refactor: FileSearchBackend abstraction + caller-owned vector store (yungshinlintw)
7fce742 fix: file_search reliability and sample improvements (yungshinlintw)
8ea7e13 perf: set max_num_results=10 for file_search to reduce token usage (yungshinlintw)
41a070c fix: move import to top of file (E402 lint) (yungshinlintw)
44cfcbc chore: remove unused imports (yungshinlintw)
628ad1c fix: align azure-ai-contentunderstanding with MAF coding conventions (yungshinlin)
6076cb2 refactor: improve CU context provider API surface and fix CI (yungshinlin)
5cff2f7 fix: improve file_search samples and move tool guidelines to context … (yungshinlin)
3607c85 feat: improve source_id, integration tests, and content assertions (yungshinlin)
aaf97c2 feat: reject duplicate filenames, add integration tests and sample co… (yungshinlin)
0437e43 chore: improve doc key derivation, comments, and README (yungshinlin)
2081ed5 test: strengthen _format_result assertions with exact expected strings (yungshinlin)
f377b2c refactor: move invoice.pdf to shared sample_assets directory (yungshinlin)
5e1c2e9 refactor: reorganize samples into numbered dirs and simplify auth (yungshinlin)
2cde5fc fix: resolve CI lint errors (D205, RUF001, E501) (yungshinlin)
4981b35 refactor: overhaul samples — FoundryChatClient, sessions, remove get_… (yungshinlin)
f2cbc45 feat: add 05_background_analysis sample and fix 04 session/max_wait (yungshinlin)
9eb35c2 docs: update README and fix sample 06 (yungshinlin)
c1eb370 docs: rewrite README — concise format, prerequisites, CU link (yungshinlin)
7e8e62a fix: resolve pyright errors in _format_result segment cast (yungshinlin)
9fc9e4e docs: add numbered section comments and fresh sample output to all sa… (yungshinlin)
4097328 feat: add load_settings support for env var configuration (yungshinlin)
2682bfc docs: polish README — fix duplicate env var, add Next steps, service … (yungshinlin)
39b79c3 chore: trim invoice fixture from 199K to 33 lines (yungshinlin)
bdb4617 feat: per-file analyzer_id override via additional_properties (yungshinlin)
4a9196d Trim PDF test fixture and clarify unique filename requirement (yungshinlin)
beed5cc Update python/packages/azure-ai-contentunderstanding/agent_framework_… (yungshinlintw)
759d29c Update python/packages/azure-ai-contentunderstanding/agent_framework_… (yungshinlintw)
dc01991 Update python/packages/azure-ai-contentunderstanding/samples/02-devui… (yungshinlintw)
7222b6c Update python/packages/azure-ai-contentunderstanding/samples/02-devui… (yungshinlintw)
53ab967 Update python/packages/azure-ai-contentunderstanding/samples/01-get-s… (yungshinlintw)
d5bb27d Fix AGENTS.md to match implementation; remove unused variable in test… (yungshinlin)
618e4fe Fix premature file_search instruction for background-completed docs (yungshinlin)
ad08891 fix: wrap long line in devui agent instructions (E501) (yungshinlin)
51a3be5 Fix Copilot review: unused logger, stray code in README, await cancel… (yungshinlin)
8c9777c Sanitize doc keys and fix duplicate filename re-injection (yungshinlin)
a169efd fix: add type annotation to tasks_to_cancel for pyright (yungshinlin)
5c06dfa Move per-session mutable state to state dict for session isolation (yungshinlin)
2895202 Remove unused AnalysisSection enum values (yungshinlin)
958568b Recursively flatten object/array field values for cleaner LLM output (yungshinlin)
f483b81 Preserve sub-field confidence; compare full expected JSON in tests (yungshinlin)
2e1dd6a Remove incorrect MIME aliases (audio/mp4, video/x-matroska) (yungshinlin)
3ae3e66 feat: add AnalysisInput, content_range, warnings, and category support (yungshinlin)
b09f39c fix: falsy-0 bug in duration calc; improve test coverage (yungshinlin)
c2e9cb4 refactor: split _context_provider.py into focused modules (yungshinlin)
5698d66 docs: update AGENTS.md with DocumentStatus, FileSearchBackend, and _f… (yungshinlin)
0cc0cde refactor: replace AnalysisSection enum with Literal type for simpler DX (yungshinlin)
e3f684c refactor: replace asyncio.Task with continuation tokens for serializa… (yungshinlin)
06d46b4 fix: resolve CI lint (RUF052) and mypy (call-overload) errors (yungshinlin)
d1b858c feat: add structured output (Pydantic model) to invoice processing sa… (yungshinlin)
a39a0e6 fix: use FOUNDRY_PROJECT_ENDPOINT and FOUNDRY_MODEL env vars in all s… (yungshinlin)
331b3c1 refactor: remove background_analysis sample, use FoundryChatClient in… (yungshinlin)
c57c814 fix: vector_stores API moved from beta namespace in OpenAI SDK (yungshinlin)
493583d docs: add comments about multi-file support and CU service limits in … (yungshinlin)
0b830ba fix: broken markdown links after sample removal and renumbering (yungshinlin)
2c79c7a fix: migrate BaseContextProvider to ContextProvider (non-deprecated) (yungshinlintw)
0c3db4d fix: Message(text=) -> Message(contents=[]) for API compatibility (yungshinlin)
2c22cae Merge branch 'main' into yslin/contentunderstanding-context-provider (yungshinlintw)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # Local-only files (not committed) | ||
| _local_only/ | ||
| *_local_only* |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| # AGENTS.md — azure-ai-contentunderstanding | ||
|
|
||
| ## Package Overview | ||
|
|
||
| `agent-framework-azure-ai-contentunderstanding` integrates Azure Content Understanding (CU) | ||
| into the Agent Framework as a context provider. It automatically analyzes file attachments | ||
| (documents, images, audio, video) and injects structured results into the LLM context. | ||
|
|
||
| ## Public API | ||
|
|
||
| | Symbol | Type | Description | | ||
| |--------|------|-------------| | ||
| | `ContentUnderstandingContextProvider` | class | Main context provider — extends `ContextProvider` | | ||
| | `AnalysisSection` | enum | Output section selector (MARKDOWN, FIELDS, etc.) | | ||
| | `DocumentStatus` | enum | Document lifecycle state (ANALYZING, UPLOADING, READY, FAILED) | | ||
| | `FileSearchBackend` | ABC | Abstract vector store file operations interface | | ||
| | `FileSearchConfig` | dataclass | Configuration for CU + vector store RAG mode | | ||
|
|
||
| ## Architecture | ||
|
|
||
| - **`_context_provider.py`** — Main provider implementation. Overrides `before_run()` to detect | ||
| file attachments, call the CU API, manage session state with multi-document tracking, | ||
| and auto-register retrieval tools for follow-up turns. | ||
| - **Analyzer auto-detection** — When `analyzer_id=None` (default), `_resolve_analyzer_id()` | ||
| selects the CU analyzer based on media type prefix: `audio/` → `prebuilt-audioSearch`, | ||
| `video/` → `prebuilt-videoSearch`, everything else → `prebuilt-documentSearch`. | ||
| - **Multi-segment output** — CU splits long video/audio into multiple scene segments | ||
| (each a separate `contents[]` entry with its own `startTimeMs`, `endTimeMs`, `markdown`, | ||
| and `fields`). `_extract_sections()` produces: | ||
| - `segments`: list of per-segment dicts, each with `markdown`, `fields`, `start_time_s`, `end_time_s` | ||
| - `markdown`: concatenated at top level with `---` separators (for file_search uploads) | ||
| - `duration_seconds`: computed from global `min(startTimeMs)` → `max(endTimeMs)` | ||
| - Metadata (`kind`, `resolution`): taken from the first segment | ||
| - **Speaker diarization (not identification)** — CU transcripts label speakers as | ||
| `<Speaker 1>`, `<Speaker 2>`, etc. CU does **not** identify speakers by name. | ||
| - **file_search RAG** — When `FileSearchConfig` is provided, CU-extracted markdown is | ||
| uploaded to an OpenAI vector store and a `file_search` tool is registered on the context | ||
| instead of injecting the full document content. This enables token-efficient retrieval | ||
| for large documents. | ||
| - **`_models.py`** — `AnalysisSection` enum, `DocumentStatus` enum, `DocumentEntry` TypedDict, | ||
| `FileSearchConfig` dataclass. | ||
| - **`_file_search.py`** — `FileSearchBackend` ABC, `OpenAIFileSearchBackend`, | ||
| `FoundryFileSearchBackend`. | ||
|
|
||
| ## Key Patterns | ||
|
|
||
| - Follows the Azure AI Search context provider pattern (same lifecycle, config style). | ||
| - Uses provider-scoped `state` dict for multi-document tracking across turns. | ||
| - Auto-registers `list_documents()` tool via `context.extend_tools()`. | ||
| - Configurable timeout (`max_wait`) with `asyncio.create_task()` background fallback. | ||
| - Strips supported binary attachments from `input_messages` to prevent LLM API errors. | ||
| - Explicit `analyzer_id` always overrides auto-detection (user preference wins). | ||
| - Vector store resources are cleaned up in `close()` / `__aexit__`. | ||
|
|
||
| ## Samples | ||
|
|
||
| | Sample | Description | | ||
| |--------|-------------| | ||
| | `01_document_qa.py` | Upload a PDF via URL, ask questions about it | | ||
| | `02_multi_turn_session.py` | AgentSession persistence across turns | | ||
| | `03_multimodal_chat.py` | PDF + audio + video parallel analysis | | ||
| | `04_invoice_processing.py` | Structured field extraction with `prebuilt-invoice` analyzer | | ||
| | `05_large_doc_file_search.py` | CU extraction + OpenAI vector store RAG | | ||
| | `02-devui/01-multimodal_agent/` | DevUI web UI for CU-powered chat | | ||
| | `02-devui/02-file_search_agent/` | DevUI web UI combining CU + file_search RAG | | ||
|
|
||
| ## Running Tests | ||
|
|
||
| ```bash | ||
| uv run poe test -P azure-ai-contentunderstanding | ||
| ``` |
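Review note on the "Multi-segment output" bullet in AGENTS.md: the merge it describes can be sketched roughly as below. This is not the package's `_extract_sections()`. The helper name `merge_segments`, the `"\n\n---\n\n"` separator, and the sample `kind` value are assumptions made for illustration; only the key names (`startTimeMs`, `endTimeMs`, `markdown`, `fields`, `kind`, `resolution`) and the output shape come from the bullet list.

```python
from __future__ import annotations

from typing import Any


def merge_segments(contents: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: merge CU `contents[]` entries into one result dict."""
    segments: list[dict[str, Any]] = []
    starts: list[int] = []
    ends: list[int] = []

    for entry in contents:
        start_ms = entry.get("startTimeMs")
        end_ms = entry.get("endTimeMs")
        # Compare against None, not truthiness: a start time of 0 ms is valid.
        if start_ms is not None:
            starts.append(start_ms)
        if end_ms is not None:
            ends.append(end_ms)
        segments.append({
            "markdown": entry.get("markdown", ""),
            "fields": entry.get("fields", {}),
            "start_time_s": start_ms / 1000 if start_ms is not None else None,
            "end_time_s": end_ms / 1000 if end_ms is not None else None,
        })

    result: dict[str, Any] = {
        "segments": segments,
        # Top-level markdown: all non-empty segments joined with a rule (assumed form).
        "markdown": "\n\n---\n\n".join(s["markdown"] for s in segments if s["markdown"]),
    }
    if starts and ends:
        # Duration spans the earliest start to the latest end across all segments.
        result["duration_seconds"] = (max(ends) - min(starts)) / 1000
    if contents:
        # Metadata is taken from the first segment only.
        for key in ("kind", "resolution"):
            if key in contents[0]:
                result[key] = contents[0][key]
    return result


example = merge_segments([
    {"kind": "audioVisual", "startTimeMs": 0, "endTimeMs": 5000, "markdown": "Scene 1", "fields": {}},
    {"kind": "audioVisual", "startTimeMs": 5000, "endTimeMs": 12500, "markdown": "Scene 2", "fields": {}},
])
print(example["duration_seconds"])
```

The explicit `is not None` checks matter because the first segment normally starts at 0 ms; commit b09f39c in this PR mentions a falsy-0 fix in the duration calculation, which is the class of bug this guards against.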
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| MIT License | ||
|
|
||
| Copyright (c) Microsoft Corporation. | ||
|
|
||
| Permission is hereby granted, free of charge, to any person obtaining a copy | ||
| of this software and associated documentation files (the "Software"), to deal | ||
| in the Software without restriction, including without limitation the rights | ||
| to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
| copies of the Software, and to permit persons to whom the Software is | ||
| furnished to do so, subject to the following conditions: | ||
|
|
||
| The above copyright notice and this permission notice shall be included in all | ||
| copies or substantial portions of the Software. | ||
|
|
||
| THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
| IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
| FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
| AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
| LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
| OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
| SOFTWARE |
python/packages/azure-ai-contentunderstanding/README.md
127 changes: 127 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,127 @@ | ||
| # Get Started with Azure Content Understanding in Microsoft Agent Framework | ||
|
|
||
| Please install this package via pip: | ||
|
|
||
| ```bash | ||
| pip install agent-framework-azure-ai-contentunderstanding --pre | ||
| ``` | ||
|
|
||
| ## Azure Content Understanding Integration | ||
|
|
||
| ### Prerequisites | ||
|
|
||
| Before using this package, you need an Azure Content Understanding resource: | ||
|
|
||
| 1. An active **Azure subscription** ([create one for free](https://azure.microsoft.com/pricing/purchase-options/azure-account)) | ||
| 2. A **Microsoft Foundry resource** created in a [supported region](https://learn.microsoft.com/azure/ai-services/content-understanding/language-region-support) | ||
| 3. **Default model deployments** configured for your resource (GPT-4.1, GPT-4.1-mini, text-embedding-3-large) | ||
|
|
||
| Follow the [prerequisites section](https://learn.microsoft.com/azure/ai-services/content-understanding/quickstart/use-rest-api?tabs=portal%2Cdocument&pivots=programming-language-rest#prerequisites) in the Azure Content Understanding quickstart for setup instructions. | ||
|
|
||
| ### Introduction | ||
|
|
||
| The Azure Content Understanding integration provides a context provider that automatically analyzes file attachments (documents, images, audio, video) using [Azure Content Understanding](https://learn.microsoft.com/azure/ai-services/content-understanding/) and injects structured results into the LLM context. | ||
|
|
||
| - **Document & image analysis**: State-of-the-art OCR with markdown extraction, table preservation, and structured field extraction — handles scanned PDFs, handwritten content, and complex layouts | ||
| - **Audio & video analysis**: Transcription, speaker diarization, and per-segment summaries | ||
| - **Background processing**: Configurable timeout with async background fallback for large files | ||
| - **file_search integration**: Optional vector store upload for token-efficient RAG on large documents | ||
|
|
||
| > Learn more about Azure Content Understanding capabilities at [https://learn.microsoft.com/azure/ai-services/content-understanding/](https://learn.microsoft.com/azure/ai-services/content-understanding/) | ||
|
|
||
| ### Basic Usage Example | ||
|
|
||
| See the [samples directory](samples/) which demonstrates: | ||
|
|
||
| - Single PDF upload and Q&A ([01_document_qa](samples/01-get-started/01_document_qa.py)) | ||
| - Multi-turn sessions with cached results ([02_multi_turn_session](samples/01-get-started/02_multi_turn_session.py)) | ||
| - PDF + audio + video parallel analysis ([03_multimodal_chat](samples/01-get-started/03_multimodal_chat.py)) | ||
| - Structured field extraction with prebuilt-invoice ([04_invoice_processing](samples/01-get-started/04_invoice_processing.py)) | ||
| - CU extraction + OpenAI vector store RAG ([05_large_doc_file_search](samples/01-get-started/05_large_doc_file_search.py)) | ||
| - Interactive web UI with DevUI ([02-devui](samples/02-devui/)) | ||
|
|
||
| ```python | ||
| import asyncio | ||
| from agent_framework import Agent, AgentSession, Message, Content | ||
| from agent_framework.foundry import FoundryChatClient | ||
| from agent_framework_azure_ai_contentunderstanding import ContentUnderstandingContextProvider | ||
| from azure.identity import AzureCliCredential | ||
|
|
||
| credential = AzureCliCredential() | ||
|
|
||
| cu = ContentUnderstandingContextProvider( | ||
| endpoint="https://my-resource.cognitiveservices.azure.com/", | ||
| credential=credential, | ||
| max_wait=None, # block until CU extraction completes before sending to LLM | ||
| ) | ||
|
|
||
| client = FoundryChatClient( | ||
| project_endpoint="https://your-project.services.ai.azure.com", | ||
| model="gpt-4.1", | ||
| credential=credential, | ||
| ) | ||
|
|
||
| async def main(): | ||
| async with cu: | ||
| agent = Agent( | ||
| client=client, | ||
| name="DocumentQA", | ||
| instructions="You are a helpful document analyst.", | ||
| context_providers=[cu], | ||
| ) | ||
| session = AgentSession() | ||
|
|
||
| response = await agent.run( | ||
| Message(role="user", contents=[ | ||
| Content.from_text("What's on this invoice?"), | ||
| Content.from_uri( | ||
| "https://raw.githubusercontent.com/Azure-Samples/" | ||
| "azure-ai-content-understanding-assets/main/document/invoice.pdf", | ||
| media_type="application/pdf", | ||
| additional_properties={"filename": "invoice.pdf"}, | ||
| ), | ||
| ]), | ||
| session=session, | ||
| ) | ||
| print(response.text) | ||
|
|
||
| asyncio.run(main()) | ||
| ``` | ||
|
|
||
| ### Supported File Types | ||
|
|
||
| | Category | Types | | ||
| |----------|-------| | ||
| | Documents | PDF, DOCX, XLSX, PPTX, HTML, TXT, Markdown | | ||
| | Images | JPEG, PNG, TIFF, BMP | | ||
| | Audio | WAV, MP3, M4A, FLAC, OGG | | ||
| | Video | MP4, MOV, AVI, WebM | | ||
|
|
||
| For the complete list of supported file types and size limits, see [Azure Content Understanding service limits](https://learn.microsoft.com/azure/ai-services/content-understanding/service-limits#input-file-limits). | ||
|
|
||
| ### Environment Variables | ||
|
|
||
| The provider supports automatic endpoint resolution from environment variables. | ||
| When ``endpoint`` is not passed to the constructor, it is loaded from | ||
| ``AZURE_CONTENTUNDERSTANDING_ENDPOINT``: | ||
|
|
||
| ```python | ||
| # Endpoint auto-loaded from AZURE_CONTENTUNDERSTANDING_ENDPOINT env var | ||
| cu = ContentUnderstandingContextProvider(credential=credential) | ||
| ``` | ||
|
|
||
| Set these in your shell or in a `.env` file: | ||
|
|
||
| ```bash | ||
| AZURE_CONTENTUNDERSTANDING_ENDPOINT=https://your-cu-resource.cognitiveservices.azure.com/ | ||
| AZURE_AI_PROJECT_ENDPOINT=https://your-project.services.ai.azure.com | ||
| AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4.1 | ||
| ``` | ||
|
|
||
| You also need to be logged in with `az login` (for `AzureCliCredential`). | ||
|
|
||
| ### Next steps | ||
|
|
||
| - Explore the [samples directory](samples/) for complete code examples | ||
| - Read the [Azure Content Understanding documentation](https://learn.microsoft.com/azure/ai-services/content-understanding/) for detailed service information | ||
| - Learn more about the [Microsoft Agent Framework](https://aka.ms/agent-framework) | ||
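Review note on the README's "Environment Variables" section: the fallback it describes (an explicit `endpoint` argument wins, otherwise `AZURE_CONTENTUNDERSTANDING_ENDPOINT` is read) behaves roughly like the sketch below. The helper name `resolve_endpoint` and the error message are hypothetical. The package uses its own settings loading (commit 4097328 mentions `load_settings`), which is not reproduced here.

```python
from __future__ import annotations

import os

ENV_VAR = "AZURE_CONTENTUNDERSTANDING_ENDPOINT"


def resolve_endpoint(endpoint: str | None = None) -> str:
    """Hypothetical helper: explicit argument wins, else fall back to the env var."""
    resolved = endpoint or os.environ.get(ENV_VAR)
    if not resolved:
        raise ValueError(f"Pass endpoint= or set {ENV_VAR}.")
    return resolved


# Placeholder value copied from the README's .env example.
os.environ[ENV_VAR] = "https://your-cu-resource.cognitiveservices.azure.com/"
print(resolve_endpoint())                             # falls back to the environment variable
print(resolve_endpoint("https://explicit.example/"))  # explicit argument takes precedence
```

Failing fast with a clear message when neither source is set is a design choice of this sketch; it keeps a misconfigured environment from surfacing later as an opaque HTTP error.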
...s/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/__init__.py
28 changes: 28 additions & 0 deletions
yungshinlintw marked this conversation as resolved.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # Copyright (c) Microsoft. All rights reserved. | ||
|
|
||
| """Azure Content Understanding integration for Microsoft Agent Framework. | ||
| Provides a context provider that analyzes file attachments (documents, images, | ||
| audio, video) using Azure Content Understanding and injects structured results | ||
| into the LLM context. | ||
| """ | ||
|
|
||
| import importlib.metadata | ||
|
|
||
| from ._context_provider import ContentUnderstandingContextProvider | ||
| from ._file_search import FileSearchBackend | ||
| from ._models import AnalysisSection, DocumentStatus, FileSearchConfig | ||
|
|
||
| try: | ||
| __version__ = importlib.metadata.version(__name__) | ||
| except importlib.metadata.PackageNotFoundError: | ||
| __version__ = "0.0.0" | ||
|
|
||
| __all__ = [ | ||
| "AnalysisSection", | ||
| "ContentUnderstandingContextProvider", | ||
| "DocumentStatus", | ||
| "FileSearchBackend", | ||
| "FileSearchConfig", | ||
| "__version__", | ||
| ] |
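Review note on the `__version__` block in `__init__.py`: the try/except falls back to `"0.0.0"` when no installed distribution metadata is found, for example when the code runs from a source checkout. A standalone sketch of the same pattern follows; the helper name `get_version` and the distribution name passed to it are made up for the demo.

```python
import importlib.metadata


def get_version(dist_name: str, default: str = "0.0.0") -> str:
    """Return the installed version of dist_name, or default if it is not installed."""
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return default


# A name that is assumed not to be installed, so the fallback path is exercised.
print(get_version("surely-not-installed-distribution-xyz"))
```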
...azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_constants.py
78 changes: 78 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| # Copyright (c) Microsoft. All rights reserved. | ||
|
|
||
| """Constants for Azure Content Understanding context provider. | ||
|
|
||
| Supported media types, MIME aliases, and analyzer mappings used by | ||
| the file detection and analysis pipeline. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| # MIME types used to match against the resolved media type for routing files to CU analysis. | ||
| # The media type may be provided via Content.media_type or inferred (e.g., via sniffing or filename) | ||
| # when missing or generic (such as application/octet-stream). Only files whose resolved media type is | ||
| # in this set will be processed; others are skipped. | ||
| # | ||
| # Supported input file types: | ||
| # https://learn.microsoft.com/azure/ai-services/content-understanding/service-limits#input-file-limits | ||
| SUPPORTED_MEDIA_TYPES: frozenset[str] = frozenset({ | ||
| # Documents and images | ||
| "application/pdf", | ||
| "image/jpeg", | ||
| "image/png", | ||
| "image/tiff", | ||
| "image/bmp", | ||
| "image/heif", | ||
| "image/heic", | ||
| "application/vnd.openxmlformats-officedocument.wordprocessingml.document", | ||
| "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", | ||
| "application/vnd.openxmlformats-officedocument.presentationml.presentation", | ||
| # Text | ||
| "text/plain", | ||
| "text/html", | ||
| "text/markdown", | ||
| "text/rtf", | ||
| "text/xml", | ||
| "application/xml", | ||
| "message/rfc822", | ||
| "application/vnd.ms-outlook", | ||
| # Audio | ||
| "audio/wav", | ||
| "audio/mpeg", | ||
| "audio/mp3", | ||
| "audio/mp4", | ||
| "audio/m4a", | ||
| "audio/flac", | ||
| "audio/ogg", | ||
| "audio/opus", | ||
| "audio/webm", | ||
| "audio/x-ms-wma", | ||
| "audio/aac", | ||
| "audio/amr", | ||
| "audio/3gpp", | ||
| # Video | ||
| "video/mp4", | ||
| "video/quicktime", | ||
| "video/x-msvideo", | ||
| "video/webm", | ||
| "video/x-flv", | ||
| "video/x-ms-wmv", | ||
| "video/x-ms-asf", | ||
| "video/x-matroska", | ||
| }) | ||
|
|
||
| # Mapping from filetype's MIME output to our canonical SUPPORTED_MEDIA_TYPES values. | ||
| # filetype uses some x-prefixed variants that differ from our set. | ||
| MIME_ALIASES: dict[str, str] = { | ||
| "audio/x-wav": "audio/wav", | ||
| "audio/x-flac": "audio/flac", | ||
| "video/x-m4v": "video/mp4", | ||
| } | ||
|
|
||
| # Mapping from media type prefix to the appropriate prebuilt CU analyzer. | ||
| # Used when analyzer_id is None (auto-detect mode). | ||
| MEDIA_TYPE_ANALYZER_MAP: dict[str, str] = { | ||
| "audio/": "prebuilt-audioSearch", | ||
| "video/": "prebuilt-videoSearch", | ||
| } | ||
| DEFAULT_ANALYZER: str = "prebuilt-documentSearch" |
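Review note on `_constants.py`: the lookup these constants support can be sketched as below. The constants are copied from the diff; the function `pick_analyzer` is a hypothetical stand-in for the provider's `_resolve_analyzer_id()`, whose real signature is not shown in this PR excerpt, and lowercasing the media type is an assumption of the sketch.

```python
from __future__ import annotations

# Copied from the diff above so the example is self-contained.
MIME_ALIASES: dict[str, str] = {
    "audio/x-wav": "audio/wav",
    "audio/x-flac": "audio/flac",
    "video/x-m4v": "video/mp4",
}
MEDIA_TYPE_ANALYZER_MAP: dict[str, str] = {
    "audio/": "prebuilt-audioSearch",
    "video/": "prebuilt-videoSearch",
}
DEFAULT_ANALYZER: str = "prebuilt-documentSearch"


def pick_analyzer(media_type: str, analyzer_id: str | None = None) -> str:
    """Hypothetical sketch: an explicit analyzer_id wins, else choose by media type prefix."""
    if analyzer_id is not None:
        return analyzer_id
    # Normalize case (assumption) and map x-prefixed variants to canonical types.
    lowered = media_type.lower()
    canonical = MIME_ALIASES.get(lowered, lowered)
    for prefix, analyzer in MEDIA_TYPE_ANALYZER_MAP.items():
        if canonical.startswith(prefix):
            return analyzer
    # Documents, images, and text all fall through to the document analyzer.
    return DEFAULT_ANALYZER


for mt in ("audio/x-wav", "video/mp4", "application/pdf", "image/png"):
    print(mt, "->", pick_analyzer(mt))
print(pick_analyzer("application/pdf", analyzer_id="prebuilt-invoice"))
```

Keeping the prefix map small and letting everything else fall through to `DEFAULT_ANALYZER` matches the AGENTS.md description ("everything else → `prebuilt-documentSearch`"), so new document or image MIME types need no map changes.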