1 change: 1 addition & 0 deletions python/samples/README.md
@@ -187,6 +187,7 @@ This directory contains samples demonstrating the capabilities of Microsoft Agen
|------|-------------|
| [`getting_started/evaluation/red_teaming/red_team_agent_sample.py`](./getting_started/evaluation/red_teaming/red_team_agent_sample.py) | Red team agent evaluation sample for Azure AI Foundry |
| [`getting_started/evaluation/self_reflection/self_reflection.py`](./getting_started/evaluation/self_reflection/self_reflection.py) | LLM self-reflection with AI Foundry graders example |
| [`demos/workflow_evaluation/run_evaluation.py`](./demos/workflow_evaluation/run_evaluation.py) | Multi-agent workflow evaluation demo with travel planning agents evaluated using Azure AI Foundry evaluators |

## MCP (Model Context Protocol)

2 changes: 2 additions & 0 deletions python/samples/demos/workflow_evaluation/.env.example
@@ -0,0 +1,2 @@
AZURE_AI_PROJECT_ENDPOINT="<your-project-endpoint>"
AZURE_AI_MODEL_DEPLOYMENT_NAME="<your-model-deployment>"
30 changes: 30 additions & 0 deletions python/samples/demos/workflow_evaluation/README.md
@@ -0,0 +1,30 @@
# Multi-Agent Travel Planning Workflow Evaluation

This sample demonstrates how to evaluate a multi-agent workflow using Azure AI's built-in evaluators. The workflow processes travel planning requests through seven specialized agents in a fan-out/fan-in pattern: a travel request handler, hotel/flight/activity search agents, a booking aggregator, booking confirmation, and payment processing.

## Evaluation Metrics

The evaluation uses four Azure AI built-in evaluators:

- **Relevance** - How well responses address the user query
- **Groundedness** - Whether responses are grounded in available context
- **Tool Call Accuracy** - Correct tool selection and parameter usage
- **Tool Output Utilization** - Effective use of tool outputs in responses

## Setup

Create a `.env` file based on the `.env.example` file in this folder.
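As a rough sketch of how a script might pick up this configuration (the variable names come from `.env.example`; the loading helper itself is an assumption, and the actual sample may instead use `python-dotenv` or a settings library):

```python
import os

def load_config() -> dict:
    """Read the two required settings from the environment.

    Fails with a clear error when either variable is missing, since the
    evaluation cannot run without a project endpoint and a model deployment.
    """
    required = ["AZURE_AI_PROJECT_ENDPOINT", "AZURE_AI_MODEL_DEPLOYMENT_NAME"]
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {name: os.environ[name] for name in required}
```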

## Running the Evaluation

Execute the complete workflow and evaluation:

```bash
python run_evaluation.py
```

The script will:
1. Execute the multi-agent travel planning workflow
2. Display a response summary for each agent
3. Create and run an evaluation of the hotel, flight, and activity search agents
4. Monitor progress and display the evaluation report URL
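The per-agent reporting in the steps above could be summarized along these lines. This is a sketch only: the 1-5 score scale with a pass mark of 3 follows the usual Azure AI evaluator convention, but the helper, metric names, and data shapes here are hypothetical and not the actual script's API:

```python
from statistics import mean

PASS_THRESHOLD = 3.0  # assumed pass mark on the 1-5 evaluator scale

def summarize(results: dict[str, dict[str, float]]) -> dict[str, str]:
    """Collapse per-metric scores into a pass/fail verdict per agent.

    An agent passes only if every metric meets the threshold; the mean
    score is included for quick comparison across agents.
    """
    summary = {}
    for agent, scores in results.items():
        avg = mean(scores.values())
        verdict = "pass" if all(s >= PASS_THRESHOLD for s in scores.values()) else "fail"
        summary[agent] = f"{verdict} (mean {avg:.1f})"
    return summary

# Hypothetical scores for the three search agents the script evaluates.
example = {
    "hotel_search":    {"relevance": 4.0, "groundedness": 5.0, "tool_call_accuracy": 4.0},
    "flight_search":   {"relevance": 5.0, "groundedness": 4.0, "tool_call_accuracy": 5.0},
    "activity_search": {"relevance": 3.0, "groundedness": 2.0, "tool_call_accuracy": 4.0},
}
print(summarize(example))
```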