# Python: Adds NVIDIA NIM example #1198
**Status:** Closed

## Commits (6)
- `137fb8b` python: Adds NVIDIA NIM with OpenAI Chat Client example (thegovind)
- `c33bbcc` Updates environment variable names (thegovind)
- `b54fa35` Simplifies NVIDIA NIM agent example (thegovind)
- `9ce3e7e` Add function calling and nemotron agent example (thegovind)
- `eb3309e` Merge branch 'main' into gok/feat/nvidia (thegovind)
- `ca57d85` fix minor lint issue (thegovind)
# NVIDIA NIM Examples

This folder contains examples demonstrating how to use NVIDIA NIM (NVIDIA Inference Microservices) models with the Agent Framework through Azure AI Foundry.

## Prerequisites

Before running these examples, you need to set up NVIDIA NIM models on Azure AI Foundry. Follow the detailed instructions in the [NVIDIA Developer Blog](https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#deploy_a_nim_on_azure_ai_foundry).

### Quick Setup Steps

1. **Access Azure AI Foundry Portal**
   - Navigate to [ai.azure.com](https://ai.azure.com)
   - Ensure you have a Hub and Project available

2. **Deploy NVIDIA NIM Model**
   - Select **Model Catalog** from the left sidebar
   - In the **Collections** filter, select **NVIDIA**
   - Choose a NIM microservice (e.g., Llama 3.1 8B Instruct NIM)
   - Click **Deploy**
   - Choose a deployment name and VM type
   - Review pricing and terms of use
   - Click **Deploy** to launch the deployment

3. **Get API Credentials**
   - Once deployed, note your endpoint URL and API key
   - The endpoint URL should include `/v1` (e.g., `https://<endpoint>.<region>.inference.ml.azure.com/v1`)

## Examples

| File | Description |
|------|-------------|
| [`nvidia_nim_agent_example.py`](nvidia_nim_agent_example.py) | Complete example demonstrating how to use NVIDIA NIM models with the Agent Framework. Shows both streaming and non-streaming responses with tool-calling capabilities. |
| [`nvidia_nim_chat_client.py`](nvidia_nim_chat_client.py) | Custom chat client implementation that handles NVIDIA NIM's specific message format requirements. |

## Environment Variables

Set the following environment variables before running the examples:

- `OPENAI_BASE_URL`: Your Azure AI Foundry endpoint URL (e.g., `https://<endpoint>.<region>.inference.ml.azure.com/v1`)
- `OPENAI_API_KEY`: Your Azure AI Foundry API key
- `OPENAI_CHAT_MODEL_ID`: The NVIDIA NIM model to use (e.g., `nvidia/llama-3.1-8b-instruct`)

## Running the Example

After setting up your NVIDIA NIM deployment and environment variables, you can run the example:

```bash
# Navigate to the examples directory
cd python/samples/getting_started/agents/nvidia

# Activate the virtual environment (if using one)
source ../../../.venv/bin/activate

# Set your environment variables
export OPENAI_BASE_URL="https://your-endpoint.region.inference.ml.azure.com/v1"
export OPENAI_API_KEY="your-api-key"
export OPENAI_CHAT_MODEL_ID="nvidia/llama-3.1-8b-instruct"

# Run the example
python nvidia_nim_agent_example.py
```

The example will demonstrate:

- Chat completion with NVIDIA NIM models
- Function calling capabilities
- Tool integration

## API Compatibility

NVIDIA NIM models deployed on Azure AI Foundry expose an OpenAI-compatible API, making them easy to integrate with existing OpenAI-based applications and frameworks. The models support:

- Standard OpenAI Chat Completion API
- Streaming and non-streaming responses
- Tool-calling capabilities
- System and user messages

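Because the endpoint is OpenAI-compatible, a chat-completion request body follows the familiar shape. A minimal sketch of assembling one (the `build_chat_request` helper, the system prompt, and the model name are illustrative):

```python
def build_chat_request(model_id: str, user_text: str, stream: bool = False) -> dict:
    """Assemble a minimal OpenAI-style chat-completion request body."""
    return {
        "model": model_id,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            # NVIDIA NIM expects `content` to be a plain string, not a list of parts.
            {"role": "user", "content": user_text},
        ],
        "stream": stream,
    }
```

The resulting dict is what would typically be POSTed to the endpoint's `chat/completions` route with the API key as a bearer token, or sent through the official OpenAI SDK.
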
### Message Format Differences

NVIDIA NIM models expect the `content` field in messages to be a simple string, not an array of content objects like the standard OpenAI API. The example uses a custom `NVIDIANIMChatClient` that handles this conversion automatically.

**Standard OpenAI format:**

```json
{
  "role": "user",
  "content": [{"type": "text", "text": "Hello"}]
}
```

**NVIDIA NIM format:**

```json
{
  "role": "user",
  "content": "Hello"
}
```

## Available Models

NVIDIA NIM microservices support a wide range of models, including:

- **Meta Llama 3.1 8B Instruct NIM**
- **Meta Llama 3.3 70B NIM**
- **NVIDIA Nemotron models**
- **Community models**
- **Custom AI models**

For the complete list of available models, check the Model Catalog in Azure AI Foundry.

## Additional Resources

- [NVIDIA NIM on Azure AI Foundry Documentation](https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/)
- [NVIDIA NIM Microservices](https://developer.nvidia.com/nim)
- [Azure AI Foundry Portal](https://ai.azure.com)
- [OpenAI SDK with NIM on Azure AI Foundry](https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#openai_sdk_with_nim_on_azure_ai_foundry)

## `python/samples/getting_started/agents/nvidia/nvidia_nim_agent_example.py` (155 additions, 0 deletions)

````python
# Copyright (c) Microsoft. All rights reserved.

import asyncio
import os
from random import randint
from typing import Annotated

from nvidia_nim_chat_client import NVIDIANIMChatClient

"""
NVIDIA NIM Agent Example

This sample demonstrates using NVIDIA NIM (NVIDIA Inference Microservices) models
deployed on Azure AI Foundry with the Agent Framework. It uses a custom chat client
that handles NVIDIA NIM's specific message format requirements.

## Prerequisites - Deploy NVIDIA NIM on Azure AI Foundry

Before running this example, you must first deploy an NVIDIA NIM model on Azure AI Foundry.
Follow the detailed instructions at:
https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#deploy_a_nim_on_azure_ai_foundry

### Quick Setup Steps:

1. **Access Azure AI Foundry Portal**
   - Navigate to https://ai.azure.com
   - Ensure you have a Hub and Project available

2. **Deploy NVIDIA NIM Model**
   - Select "Model Catalog" from the left sidebar
   - In the "Collections" filter, select "NVIDIA"
   - Choose a NIM microservice (e.g., Llama 3.1 8B Instruct NIM)
   - Click "Deploy"
   - Choose a deployment name and VM type
   - Review pricing and terms of use
   - Click "Deploy" to launch the deployment

3. **Get API Credentials**
   - Once deployed, note your endpoint URL and API key
   - The endpoint URL should include '/v1' (e.g., 'https://<endpoint>.<region>.inference.ml.azure.com/v1')

## Environment Variables

Set the following environment variables before running this example:

- OPENAI_BASE_URL: Your Azure AI Foundry endpoint URL (e.g., 'https://<endpoint>.<region>.inference.ml.azure.com/v1')
- OPENAI_API_KEY: Your Azure AI Foundry API key
- OPENAI_CHAT_MODEL_ID: The NVIDIA NIM model to use (e.g., 'nvidia/llama-3.1-8b-instruct')

## Running the Example

After setting up your NVIDIA NIM deployment and environment variables:

```bash
# Set your environment variables
export OPENAI_BASE_URL="https://your-endpoint.region.inference.ml.azure.com/v1"
export OPENAI_API_KEY="your-api-key"
export OPENAI_CHAT_MODEL_ID="nvidia/llama-3.1-8b-instruct"

# Run the example
python nvidia_nim_agent_example.py
```

The example will demonstrate:
- Chat completion with NVIDIA NIM models
- Function calling capabilities
- Tool integration

## API Compatibility

NVIDIA NIM models deployed on Azure AI Foundry expose an OpenAI-compatible API, making them
easy to integrate with existing OpenAI-based applications and frameworks. The models support:

- Standard OpenAI Chat Completion API
- Streaming and non-streaming responses
- Tool calling capabilities
- System and user messages

For more information, see:
https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#openai_sdk_with_nim_on_azure_ai_foundry
"""


def get_weather(
    location: Annotated[str, "The location to get the weather for."],
) -> str:
    """Get the weather for a given location."""
    conditions = ["sunny", "cloudy", "rainy", "stormy"]
    return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."


def get_ai_insights(
    topic: Annotated[str, "The AI topic to get insights about."],
) -> str:
    """Get AI insights about a specific topic."""
    insights = [
        f"AI is revolutionizing {topic} through advanced machine learning techniques.",
        f"The future of {topic} lies in AI-powered automation and intelligent systems.",
        f"Recent breakthroughs in AI have significantly impacted {topic} development.",
        f"AI applications in {topic} are becoming more sophisticated and widespread.",
    ]
    return insights[randint(0, len(insights) - 1)]


async def first_example() -> None:
    """First example with function calling."""
    print("=== Response ===")

    agent = NVIDIANIMChatClient(
        api_key=os.environ["OPENAI_API_KEY"],
        base_url=os.environ["OPENAI_BASE_URL"],
        model_id=os.environ["OPENAI_CHAT_MODEL_ID"],
    ).create_agent(
        name="NVIDIAAIAgent",
        instructions="You are a helpful AI assistant powered by NVIDIA NIM models. You can provide weather information and AI insights.",
        tools=[get_weather, get_ai_insights],
    )

    query = "What's the weather like in Seattle and tell me about AI in healthcare?"
    print(f"User: {query}")
    result = await agent.run(query)
    print(f"Agent: {result}\n")


async def second_example() -> None:
    """Second example with function calling."""
    print("=== Response ===")

    agent = NVIDIANIMChatClient(
        api_key=os.environ["OPENAI_API_KEY"],
        base_url=os.environ["OPENAI_BASE_URL"],
        model_id=os.environ["OPENAI_CHAT_MODEL_ID"],
    ).create_agent(
        name="NVIDIAAIAgent",
        instructions="You are a helpful AI assistant powered by NVIDIA NIM models. You can provide weather information and AI insights.",
        tools=[get_weather, get_ai_insights],
    )

    query = "What's the weather like in Portland and give me insights about AI in autonomous vehicles?"
    print(f"User: {query}")
    result = await agent.run(query)
    print(f"Agent: {result}\n")


async def main() -> None:
    print("=== NVIDIA NIM Agent Example ===")
    print("This example demonstrates using NVIDIA NIM models deployed on Azure AI Foundry")
    print("with the Agent Framework using a custom chat client.\n")

    await first_example()
    await second_example()


if __name__ == "__main__":
    asyncio.run(main())
````

## `python/samples/getting_started/agents/nvidia/nvidia_nim_chat_client.py` (79 additions, 0 deletions)

```python
# Copyright (c) Microsoft. All rights reserved.

from collections.abc import Sequence
from typing import Any

from agent_framework.openai import OpenAIChatClient
from agent_framework._types import (
    ChatMessage,
    Contents,
    FunctionApprovalRequestContent,
    FunctionApprovalResponseContent,
    FunctionCallContent,
    FunctionResultContent,
    Role,
    TextContent,
    prepare_function_call_results,
)


class NVIDIANIMChatClient(OpenAIChatClient):
    """Custom OpenAI Chat Client for NVIDIA NIM models.

    NVIDIA NIM models expect the 'content' field in messages to be a simple string,
    not an array of content objects like the standard OpenAI API. This client handles
    the conversion from the agent framework's content format to NVIDIA NIM's expected format.
    """

    def _openai_chat_message_parser(self, message: ChatMessage) -> list[dict[str, Any]]:
        """Parse a chat message into the NVIDIA NIM format.

        NVIDIA NIM expects:
        - content: string (not an array of objects)
        - role: string
        """
        all_messages: list[dict[str, Any]] = []

        for content in message.contents:
            # Skip approval content - it's internal framework state, not for the LLM
            if isinstance(content, (FunctionApprovalRequestContent, FunctionApprovalResponseContent)):
                continue

            args: dict[str, Any] = {
                "role": message.role.value if isinstance(message.role, Role) else message.role,
            }

            if message.additional_properties:
                args["metadata"] = message.additional_properties

            if isinstance(content, FunctionCallContent):
                if all_messages and "tool_calls" in all_messages[-1]:
                    # If the last message already has tool calls, append to it
                    all_messages[-1]["tool_calls"].append(self._openai_content_parser(content))
                else:
                    args["tool_calls"] = [self._openai_content_parser(content)]  # type: ignore
            elif isinstance(content, FunctionResultContent):
                args["tool_call_id"] = content.call_id
                if content.result is not None:
                    args["content"] = prepare_function_call_results(content.result)
                elif content.exception is not None:
                    # Send the exception message to the model
                    args["content"] = "Error: " + str(content.exception)
            elif isinstance(content, TextContent):
                # For NVIDIA NIM, content should be a simple string, not an array
                if "content" not in args:
                    args["content"] = content.text
                else:
                    # If there's already content, append to it
                    args["content"] += content.text
            else:
                # For other content types, convert to string representation
                if "content" not in args:
                    args["content"] = str(content)
                else:
                    args["content"] += str(content)

            if "content" in args or "tool_calls" in args:
                all_messages.append(args)

        return all_messages
```

## Review Comments

**moonbox3:** Directed to @eavanvalkenburg: we have other clients based on the `OpenAIChatClient`; why wouldn't we want to bring this `NVIDIANIMChatClient` into its own package, instead of it being purely in the samples?

**Reply:** Good question @moonbox3, I've asked Shawn for his opinion.