Skip to content

Python: Allow @tool functions to return rich content (images, audio)#4331

Open
giles17 wants to merge 3 commits intomicrosoft:mainfrom
giles17:giles/tool-rich-content-results
Open

Python: Allow @tool functions to return rich content (images, audio)#4331
giles17 wants to merge 3 commits intomicrosoft:mainfrom
giles17:giles/tool-rich-content-results

Conversation

@giles17
Copy link
Contributor

@giles17 giles17 commented Feb 26, 2026

Description

Closes #4272

When a @tool function returns a Content object (e.g. Content.from_data(image_bytes, "image/png")), the framework now preserves it as rich content that the model can perceive natively — instead of serializing it to a JSON string.

Problem

Previously, FunctionTool.parse_result() serialized any Content return to JSON text via _make_dumpable(). The model received {"type": "function_call_output", "output": "{...}"} — a text blob, not the actual image. The same issue existed in MCP tool results where ImageContent was JSON-serialized.

Solution

Added an items field to function_result Content that carries rich Content objects (images, audio, files) alongside the text result. Providers format these items using their existing multi-modal content handling.

User API — no decorator changes needed:

@tool
async def capture_screenshot(url: str) -> Content:
    image_bytes = await take_screenshot(url)
    return Content.from_data(data=image_bytes, media_type="image/png")

@tool
async def render_chart(data: str) -> list[Content]:
    image_bytes = render(data)
    return [
        Content.from_text("Chart rendered."),
        Content.from_data(data=image_bytes, media_type="image/png"),
    ]

Changes

Core framework:

  • _types.py: Added items field to Content and from_function_result()
  • _tools.py: Updated parse_result() to preserve Content returns instead of JSON-serializing. Added _build_function_result() helper. Updated invoke() return type.
  • _mcp.py: Updated _parse_tool_result_from_mcp() to return list[Content] for image/audio instead of JSON strings

All 6 providers updated:

  • OpenAI Responses: Injects rich items as user message with input_image after function_call_output
  • OpenAI Chat Completions: Formats tool message content as multi-part array with image_url
  • Anthropic: Formats rich items as native image blocks in tool_result content array
  • Bedrock/Ollama/Azure-AI: Logs warning when rich items present (unsupported by these APIs)

Tests: 8 new tests + 2 updated existing tests, all passing.

…udio)

Add support for tool functions to return Content objects that the model can perceive natively. Closes microsoft#4272

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 26, 2026 19:48
@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Feb 26, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/anthropic/agent_framework_anthropic
   _chat_client.py3974688%457, 544, 546, 664–669, 677–678, 683, 687, 721–722, 785, 806–807, 850–852, 854, 867–868, 875–877, 881–883, 887–890, 1003, 1013, 1047, 1069, 1190, 1217–1218, 1235, 1248, 1261, 1286–1287
packages/azure-ai/agent_framework_azure_ai
   _chat_client.py4807684%391–392, 394, 578, 583–584, 586–587, 590, 593, 595, 600, 861–862, 864, 867, 870, 873–878, 881, 883, 891, 903–905, 909, 912–913, 921–924, 934, 942–945, 947–948, 950–951, 958, 966–967, 975–976, 981–982, 986–993, 998, 1001, 1009, 1015, 1023–1025, 1028, 1050–1051, 1184, 1212, 1227, 1343, 1395, 1470
packages/core/agent_framework
   _mcp.py4256484%97–98, 108–113, 124, 129, 181–182, 192–197, 207–208, 222, 269, 278, 341, 349, 500, 567, 602, 604, 608–609, 611–612, 666, 681, 699, 740, 845, 858–863, 885, 934–935, 941–943, 962, 987–988, 992–996, 1013–1017, 1161
   _tools.py8929289%166–167, 322, 324, 342–344, 351, 369, 383, 390, 397, 413, 415, 422, 459, 484, 488, 505–507, 554–556, 579, 603, 646, 668, 731–737, 773, 784–795, 817–819, 824, 828, 842–844, 883, 952, 962, 972, 1028, 1059, 1078, 1356, 1440, 1460, 1531–1535, 1657, 1661, 1685, 1711, 1713, 1729, 1731, 1816, 1846, 1866, 1868, 1921, 1984, 2175–2176, 2224, 2292–2293, 2351, 2356, 2363
   _types.py10258591%59, 68–69, 123, 128, 147, 149, 153, 157, 159, 161, 163, 181, 185, 211, 233, 238, 243, 247, 277, 634–635, 1020, 1083, 1100, 1118, 1123, 1141, 1151, 1168–1169, 1171, 1189–1190, 1192, 1199–1200, 1202, 1237, 1248–1249, 1251, 1289, 1516, 1568, 1659–1664, 1686, 1691, 1857, 1869, 2121, 2142, 2237, 2466, 2673, 2743, 2755, 2773, 2971–2973, 2976–2978, 2982, 2987, 2991, 3075–3077, 3106, 3160, 3179–3180, 3183–3187, 3193
packages/core/agent_framework/openai
   _chat_client.py2913488%210, 240–241, 245, 363, 370, 446–453, 455–458, 468, 546, 548, 564, 576–584, 622, 638, 678
   _responses_client.py6449185%290–293, 297–298, 301–302, 308–309, 314, 327–333, 354, 362, 385, 548, 551, 606, 610, 612, 614, 616, 692, 702, 707, 750, 829, 846, 859, 920, 931, 935–937, 1024, 1029, 1033–1035, 1039–1040, 1063, 1132, 1154–1155, 1170–1171, 1189–1190, 1231–1234, 1343–1344, 1360, 1362, 1441–1449, 1568, 1623, 1638, 1681–1684, 1692–1693, 1695–1697, 1711–1713, 1723–1724, 1730, 1745
TOTAL22238278687% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
4695 247 💤 0 ❌ 0 🔥 1m 19s ⏱️

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enables @tool-decorated functions to return rich content (images, audio, files) that models can perceive natively, rather than having them serialized to JSON strings. This addresses issue #4272 by allowing vision-in-the-loop workflows where tools like capture_screenshot() or render_chart() can feed image content back into the model for analysis.

Changes:

  • Core framework now preserves Content objects with rich media instead of JSON-serializing them
  • Added items field to function_result Content to carry rich media alongside text results
  • Updated all 6 provider implementations to handle rich content (OpenAI Responses, OpenAI Chat, Anthropic support it natively; Bedrock, Ollama, Azure-AI log warnings)

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
python/packages/core/agent_framework/_types.py Added items parameter to Content.init and from_function_result() to store rich media items; updated to_dict() to serialize items
python/packages/core/agent_framework/_tools.py Updated parse_result() to return str or list[Content] instead of always serializing; added _build_function_result() helper to separate text and rich items; updated invoke() return type
python/packages/core/agent_framework/_mcp.py Updated _parse_tool_result_from_mcp() to return list[Content] for results containing images/audio instead of JSON strings
python/packages/core/agent_framework/openai/_responses_client.py Injects rich items as separate user message with input_image content after function_call_output
python/packages/core/agent_framework/openai/_chat_client.py Formats tool message content as multi-part array with text and image_url/input_audio/file parts when items present
python/packages/anthropic/agent_framework_anthropic/_chat_client.py Formats rich items as native image blocks in tool_result content array; handles both data and uri image types
python/packages/bedrock/agent_framework_bedrock/_chat_client.py Logs warning when rich items present (Bedrock doesn't support them); omits items from tool result
python/packages/ollama/agent_framework_ollama/_chat_client.py Logs warning when rich items present (Ollama doesn't support them); omits items from tool result
python/packages/azure-ai/agent_framework_azure_ai/_chat_client.py Logs warning when rich items present (Azure AI Agents doesn't support them); omits items from tool output
python/packages/core/tests/core/test_types.py Added 8 new tests for parse_result(), _build_function_result(), and Content.from_function_result() with items; updated 2 existing tests to expect list[Content] instead of JSON
python/packages/core/tests/core/test_mcp.py Updated test_parse_tool_result_from_mcp to expect list[Content] for results with images; added test_parse_tool_result_from_mcp_audio_content

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: [Feature]: Allow @tool functions to return image content that the model can analyze

3 participants