python/samples/getting_started/agents/nvidia/README.md

# NVIDIA NIM Examples

This folder contains examples demonstrating how to use NVIDIA NIM (NVIDIA Inference Microservices) models with the Agent Framework through Azure AI Foundry.

## Prerequisites

Before running these examples, you need to set up NVIDIA NIM models on Azure AI Foundry. Follow the detailed instructions in the [NVIDIA Developer Blog](https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#deploy_a_nim_on_azure_ai_foundry).

### Quick Setup Steps

1. **Access Azure AI Foundry Portal**
   - Navigate to [ai.azure.com](https://ai.azure.com)
   - Ensure you have a Hub and Project available

2. **Deploy NVIDIA NIM Model**
   - Select **Model Catalog** from the left sidebar
   - In the **Collections** filter, select **NVIDIA**
   - Choose a NIM microservice (e.g., Llama 3.1 8B Instruct NIM)
   - Click **Deploy**
   - Choose a deployment name and VM type
   - Review pricing and terms of use
   - Click **Deploy** to launch the deployment

3. **Get API Credentials**
   - Once deployed, note your endpoint URL and API key
   - The endpoint URL should include `/v1` (e.g., `https://<endpoint>.<region>.inference.ml.azure.com/v1`)
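
A quick way to catch a malformed endpoint before running the samples is a small format check. This is a hypothetical helper, not part of the samples; it only verifies the shape of the URL described above:

```python
from urllib.parse import urlparse


def is_valid_nim_endpoint(url: str) -> bool:
    """Check that an Azure AI Foundry NIM endpoint looks usable.

    The OpenAI-compatible routes are served under the /v1 path,
    so the URL must be HTTPS and end in /v1.
    """
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.path.rstrip("/").endswith("/v1")


# A URL without the /v1 suffix will fail the check
assert is_valid_nim_endpoint("https://my-nim.eastus.inference.ml.azure.com/v1")
assert not is_valid_nim_endpoint("https://my-nim.eastus.inference.ml.azure.com")
```

A forgotten `/v1` is the most common cause of 404 responses from an otherwise healthy deployment, so a check like this fails fast with a clearer signal.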

## Examples

| File | Description |
|------|-------------|
| [`nvidia_nim_agent_example.py`](nvidia_nim_agent_example.py) | Complete example demonstrating how to use NVIDIA NIM models with the Agent Framework, running two agent queries that exercise tool calling. |
| [`nvidia_nim_chat_client.py`](nvidia_nim_chat_client.py) | Custom chat client implementation that handles NVIDIA NIM's specific message format requirements. |

## Environment Variables

Set the following environment variables before running the examples:

- `OPENAI_BASE_URL`: Your Azure AI Foundry endpoint URL (e.g., `https://<endpoint>.<region>.inference.ml.azure.com/v1`)
- `OPENAI_API_KEY`: Your Azure AI Foundry API key
- `OPENAI_CHAT_MODEL_ID`: The NVIDIA NIM model to use (e.g., `nvidia/llama-3.1-8b-instruct`)

## Running the Example

After setting up your NVIDIA NIM deployment and environment variables, you can run the example:

```bash
# Navigate to the examples directory
cd python/samples/getting_started/agents/nvidia

# Activate the virtual environment (if using one)
source ../../../.venv/bin/activate

# Set your environment variables
export OPENAI_BASE_URL="https://your-endpoint.region.inference.ml.azure.com/v1"
export OPENAI_API_KEY="your-api-key"
export OPENAI_CHAT_MODEL_ID="nvidia/llama-3.1-8b-instruct"

# Run the example
python nvidia_nim_agent_example.py
```

The example will demonstrate:
- Chat completion with NVIDIA NIM models
- Function calling capabilities
- Tool integration

## API Compatibility

NVIDIA NIM models deployed on Azure AI Foundry expose an OpenAI-compatible API, making them easy to integrate with existing OpenAI-based applications and frameworks. The models support:

- Standard OpenAI Chat Completion API
- Streaming and non-streaming responses
- Tool calling capabilities
- System and user messages
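
Because the API is OpenAI-compatible, a tool-enabled request body has the familiar Chat Completions shape. A sketch of such a payload (the model id and tool schema are illustrative, mirroring the sample's `get_weather` tool):

```python
# Minimal OpenAI-style chat completion payload for a NIM deployment.
payload = {
    "model": "nvidia/llama-3.1-8b-instruct",  # illustrative model id
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the weather in Seattle?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the weather for a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"],
                },
            },
        }
    ],
    "stream": False,  # set to True for incremental (server-sent event) responses
}
```

The Agent Framework builds this request for you; the sketch is only to show that no NIM-specific request fields are needed beyond the message format difference described below.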

### Message Format Differences

NVIDIA NIM models expect the `content` field in messages to be a simple string, not an array of content objects like the standard OpenAI API. The example uses a custom `NVIDIANIMChatClient` that handles this conversion automatically.

**Standard OpenAI format:**
```json
{
"role": "user",
"content": [{"type": "text", "text": "Hello"}]
}
```

**NVIDIA NIM format:**
```json
{
"role": "user",
"content": "Hello"
}
```
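
The conversion between the two forms can be sketched as a small function (a simplified stand-in for what `NVIDIANIMChatClient` does; the real client also handles tool calls and tool results):

```python
def flatten_content(message: dict) -> dict:
    """Collapse an OpenAI-style content array into a single string.

    Text parts are concatenated; non-text parts are dropped in this sketch.
    """
    content = message.get("content")
    if isinstance(content, list):
        text = "".join(part.get("text", "") for part in content if part.get("type") == "text")
        return {**message, "content": text}
    return message


msg = {"role": "user", "content": [{"type": "text", "text": "Hello"}]}
# flatten_content(msg) -> {"role": "user", "content": "Hello"}
```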

## Available Models

NVIDIA NIM microservices support a wide range of models including:

- **Meta Llama 3.1 8B Instruct NIM**
- **Meta Llama 3.3 70B NIM**
- **NVIDIA Nemotron models**
- **Community models**
- **Custom AI models**

For the complete list of available models, check the Model Catalog in Azure AI Foundry.

## Additional Resources

- [NVIDIA NIM on Azure AI Foundry Documentation](https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/)
- [NVIDIA NIM Microservices](https://developer.nvidia.com/nim)
- [Azure AI Foundry Portal](https://ai.azure.com)
- [OpenAI SDK with NIM on Azure AI Foundry](https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#openai_sdk_with_nim_on_azure_ai_foundry)
python/samples/getting_started/agents/nvidia/nvidia_nim_agent_example.py

# Copyright (c) Microsoft. All rights reserved.

import asyncio
import os
from random import randint
from typing import Annotated

from nvidia_nim_chat_client import NVIDIANIMChatClient

"""
NVIDIA NIM Agent Example

This sample demonstrates using NVIDIA NIM (NVIDIA Inference Microservices) models
deployed on Azure AI Foundry with the Agent Framework. It uses a custom chat client
that handles NVIDIA NIM's specific message format requirements.

## Prerequisites - Deploy NVIDIA NIM on Azure AI Foundry

Before running this example, you must first deploy a NVIDIA NIM model on Azure AI Foundry.
Follow the detailed instructions at:
https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#deploy_a_nim_on_azure_ai_foundry

### Quick Setup Steps:

1. **Access Azure AI Foundry Portal**
   - Navigate to https://ai.azure.com
   - Ensure you have a Hub and Project available

2. **Deploy NVIDIA NIM Model**
   - Select "Model Catalog" from the left sidebar
   - In the "Collections" filter, select "NVIDIA"
   - Choose a NIM microservice (e.g., Llama 3.1 8B Instruct NIM)
   - Click "Deploy"
   - Choose a deployment name and VM type
   - Review pricing and terms of use
   - Click "Deploy" to launch the deployment

3. **Get API Credentials**
   - Once deployed, note your endpoint URL and API key
   - The endpoint URL should include '/v1' (e.g., 'https://<endpoint>.<region>.inference.ml.azure.com/v1')

## Environment Variables

Set the following environment variables before running this example:

- OPENAI_BASE_URL: Your Azure AI Foundry endpoint URL (e.g., 'https://<endpoint>.<region>.inference.ml.azure.com/v1')
- OPENAI_API_KEY: Your Azure AI Foundry API key
- OPENAI_CHAT_MODEL_ID: The NVIDIA NIM model to use (e.g., 'nvidia/llama-3.1-8b-instruct')

## Running the Example

After setting up your NVIDIA NIM deployment and environment variables:

```bash
# Set your environment variables
export OPENAI_BASE_URL="https://your-endpoint.region.inference.ml.azure.com/v1"
export OPENAI_API_KEY="your-api-key"
export OPENAI_CHAT_MODEL_ID="nvidia/llama-3.1-8b-instruct"

# Run the example
python nvidia_nim_agent_example.py
```

The example will demonstrate:
- Chat completion with NVIDIA NIM models
- Function calling capabilities
- Tool integration

## API Compatibility

NVIDIA NIM models deployed on Azure AI Foundry expose an OpenAI-compatible API, making them
easy to integrate with existing OpenAI-based applications and frameworks. The models support:

- Standard OpenAI Chat Completion API
- Streaming and non-streaming responses
- Tool calling capabilities
- System and user messages

For more information, see:
https://developer.nvidia.com/blog/accelerated-ai-inference-with-nvidia-nim-on-azure-ai-foundry/#openai_sdk_with_nim_on_azure_ai_foundry
"""


def get_weather(
    location: Annotated[str, "The location to get the weather for."],
) -> str:
    """Get the weather for a given location."""
    conditions = ["sunny", "cloudy", "rainy", "stormy"]
    return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."


def get_ai_insights(
    topic: Annotated[str, "The AI topic to get insights about."],
) -> str:
    """Get AI insights about a specific topic."""
    insights = [
        f"AI is revolutionizing {topic} through advanced machine learning techniques.",
        f"The future of {topic} lies in AI-powered automation and intelligent systems.",
        f"Recent breakthroughs in AI have significantly impacted {topic} development.",
        f"AI applications in {topic} are becoming more sophisticated and widespread.",
    ]
    return insights[randint(0, len(insights) - 1)]


def create_agent():
    """Create an agent backed by the NVIDIA NIM chat client."""
    return NVIDIANIMChatClient(
        api_key=os.environ["OPENAI_API_KEY"],
        base_url=os.environ["OPENAI_BASE_URL"],
        model_id=os.environ["OPENAI_CHAT_MODEL_ID"],
    ).create_agent(
        name="NVIDIAAIAgent",
        instructions=(
            "You are a helpful AI assistant powered by NVIDIA NIM models. "
            "You can provide weather information and AI insights."
        ),
        tools=[get_weather, get_ai_insights],
    )


async def first_example() -> None:
    """First example: weather plus AI in healthcare, exercising both tools."""
    print("=== Response ===")

    agent = create_agent()

    query = "What's the weather like in Seattle and tell me about AI in healthcare?"
    print(f"User: {query}")
    result = await agent.run(query)
    print(f"Agent: {result}\n")


async def second_example() -> None:
    """Second example: weather plus AI in autonomous vehicles."""
    print("=== Response ===")

    agent = create_agent()

    query = "What's the weather like in Portland and give me insights about AI in autonomous vehicles?"
    print(f"User: {query}")
    result = await agent.run(query)
    print(f"Agent: {result}\n")


async def main() -> None:
    print("=== NVIDIA NIM Agent Example ===")
    print("This example demonstrates using NVIDIA NIM models deployed on Azure AI Foundry")
    print("with the Agent Framework using a custom chat client.\n")

    await first_example()
    await second_example()


if __name__ == "__main__":
    asyncio.run(main())
python/samples/getting_started/agents/nvidia/nvidia_nim_chat_client.py

# Copyright (c) Microsoft. All rights reserved.

from typing import Any

from agent_framework.openai import OpenAIChatClient
from agent_framework._types import (
    ChatMessage,
    FunctionApprovalRequestContent,
    FunctionApprovalResponseContent,
    FunctionCallContent,
    FunctionResultContent,
    Role,
    TextContent,
    prepare_function_call_results,
)


class NVIDIANIMChatClient(OpenAIChatClient):

"""Custom OpenAI Chat Client for NVIDIA NIM models.

NVIDIA NIM models expect the 'content' field in messages to be a simple string,
not an array of content objects like standard OpenAI API. This client handles
the conversion from the agent framework's content format to NVIDIA NIM's expected format.
"""

def _openai_chat_message_parser(self, message: ChatMessage) -> list[dict[str, Any]]:
"""Parse a chat message into the NVIDIA NIM format.

NVIDIA NIM expects:
- content: string (not array of objects)
- role: string
"""
all_messages: list[dict[str, Any]] = []

for content in message.contents:
# Skip approval content - it's internal framework state, not for the LLM
if isinstance(content, (FunctionApprovalRequestContent, FunctionApprovalResponseContent)):
continue

args: dict[str, Any] = {
"role": message.role.value if isinstance(message.role, Role) else message.role,
}

if message.additional_properties:
args["metadata"] = message.additional_properties

if isinstance(content, FunctionCallContent):
if all_messages and "tool_calls" in all_messages[-1]:
# If the last message already has tool calls, append to it
all_messages[-1]["tool_calls"].append(self._openai_content_parser(content))
else:
args["tool_calls"] = [self._openai_content_parser(content)] # type: ignore
elif isinstance(content, FunctionResultContent):
args["tool_call_id"] = content.call_id
if content.result is not None:
args["content"] = prepare_function_call_results(content.result)
elif content.exception is not None:
# Send the exception message to the model
args["content"] = "Error: " + str(content.exception)
elif isinstance(content, TextContent):
# For NVIDIA NIM, content should be a simple string, not an array
if "content" not in args:
args["content"] = content.text
else:
# If there's already content, append to it
args["content"] += content.text
else:
# For other content types, convert to string representation
if "content" not in args:
args["content"] = str(content)
else:
args["content"] += str(content)

if "content" in args or "tool_calls" in args:
all_messages.append(args)

return all_messages