microsoft · YusakuNo1 · Nov 19, 2025 · Nov 15, 2025
diff --git a/python/samples/README.md b/python/samples/README.md
@@ -185,6 +185,7 @@ This directory contains samples demonstrating the capabilities of Microsoft Agen
 | File | Description |
 |------|-------------|
 | [`getting_started/evaluation/azure_ai_foundry/red_team_agent_sample.py`](./getting_started/evaluation/azure_ai_foundry/red_team_agent_sample.py) | Red team agent evaluation sample for Azure AI Foundry |
+| [`getting_started/evaluation/azure_ai_foundry/evaluation/self_reflection.py`](./getting_started/evaluation/azure_ai_foundry/evaluation/self_reflection.py) | LLM self-reflection sample using Azure AI Foundry graders |
 
 ## MCP (Model Context Protocol)
 

diff --git a/python/samples/getting_started/evaluation/azure_ai_foundry/evaluation/.env.example b/python/samples/getting_started/evaluation/azure_ai_foundry/evaluation/.env.example
@@ -0,0 +1,2 @@
+AZURE_OPENAI_ENDPOINT="..."
+AZURE_OPENAI_API_KEY="..."
diff --git a/python/samples/getting_started/evaluation/azure_ai_foundry/evaluation/README.md b/python/samples/getting_started/evaluation/azure_ai_foundry/evaluation/README.md
@@ -0,0 +1,75 @@
+# Self-Reflection Evaluation Sample
+
+This sample demonstrates the self-reflection pattern using Agent Framework and Azure AI Foundry's Groundedness Evaluator. For details, see [Reflexion: Language Agents with Verbal Reinforcement Learning](https://arxiv.org/abs/2303.11366) (NeurIPS 2023).
+
+## Overview
+
+**What it demonstrates:**
+- Iterative self-reflection loop that automatically improves responses based on groundedness evaluation
+- Batch processing of prompts from Parquet files with progress tracking
+- Using `AzureOpenAIChatClient` with Azure CLI authentication
+- Comprehensive summary statistics and detailed result tracking
+
+## Prerequisites
+
+### Azure Resources
+- **Azure OpenAI**: Deploy models (default: gpt-4.1 for both agent and judge)
+- **Azure CLI**: Run `az login` to authenticate
+
+### Python Environment
+```bash
+pip install agent-framework-core azure-ai-evaluation pandas --pre
+```
+
+### Environment Variables
+```bash
+# .env file
+AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
+AZURE_OPENAI_API_KEY=your-api-key  # Optional with Azure CLI
+```
+
+## Running the Sample
+
+```bash
+# Basic usage
+python self_reflection.py
+
+# With options
+python self_reflection.py --input my_prompts.parquet \
+                          --output results.parquet \
+                          --max-reflections 5 \
+                          -n 10
+```
+
+**CLI Options:**
+- `--input`, `-i`: Input Parquet file
+- `--output`, `-o`: Output Parquet file
+- `--agent-model`, `-m`: Agent model name (default: gpt-4.1)
+- `--judge-model`, `-e`: Evaluator model name (default: gpt-4.1)
+- `--max-reflections`: Max iterations (default: 3)
+- `--limit`, `-n`: Process only first N prompts
+
+## Understanding Results
+
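+The agent iteratively improves each response:
+1. Generate an initial response
+2. Evaluate its groundedness (1-5 scale)
+3. If the score is below 5, feed the evaluation back to the agent and retry
+4. Stop at the maximum number of iterations or on a perfect score (5/5), and report the best-scoring iteration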
+
+**Example output:**
+```
+[1/31] Processing prompt 0...
+  Self-reflection iteration 1/3...
+  Groundedness score: 3/5
+  Self-reflection iteration 2/3...
+  Groundedness score: 5/5
+  ✓ Perfect groundedness score achieved!
+  ✓ Completed with score: 5/5 (best at iteration 2/3)
+```
+
+## Related Resources
+
+- [Reflexion Paper](https://arxiv.org/abs/2303.11366)
+- [Azure AI Evaluation SDK](https://learn.microsoft.com/azure/ai-studio/how-to/develop/evaluate-sdk)
+- [Agent Framework](https://github.com/microsoft/agent-framework)
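
The loop described under "Understanding Results" in the new README can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `self_reflection.py`: `self_reflect`, `ReflectionResult`, `fake_generate`, and `fake_judge` are hypothetical names, and the two callables stand in for the agent call and the groundedness evaluator so the example runs without Azure credentials.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReflectionResult:
    response: str        # best response seen so far
    score: int           # its groundedness score (1-5)
    best_iteration: int  # iteration that produced the best response
    iterations_run: int  # how many iterations actually ran


def self_reflect(
    generate: Callable[[str, Optional[str]], str],  # hypothetical stand-in for the agent
    judge: Callable[[str], int],                    # hypothetical stand-in for the evaluator
    prompt: str,
    max_reflections: int = 3,
    perfect_score: int = 5,
) -> ReflectionResult:
    best_response, best_score, best_iteration = "", 0, 0
    feedback: Optional[str] = None
    iterations_run = 0

    for iteration in range(1, max_reflections + 1):
        iterations_run = iteration
        response = generate(prompt, feedback)
        score = judge(response)

        # Keep the best-scoring response, not merely the latest one.
        if score > best_score:
            best_response, best_score, best_iteration = response, score, iteration

        # Stop early on a perfect score.
        if score >= perfect_score:
            break

        # Otherwise turn the evaluation into feedback for the next attempt.
        feedback = (
            f"Groundedness score was {score}/{perfect_score}. "
            "Revise the answer so every claim is supported by the context."
        )

    return ReflectionResult(best_response, best_score, best_iteration, iterations_run)


# Stub agent and judge: the first draft scores 3, the revision scores 5.
def fake_generate(prompt: str, feedback: Optional[str]) -> str:
    return "draft" if feedback is None else "revised"


def fake_judge(response: str) -> int:
    return 5 if response == "revised" else 3


result = self_reflect(fake_generate, fake_judge, "What is X?")
print(f"score {result.score}/5, best at iteration {result.best_iteration}/3")
```

With these stubs the run mirrors the README's example output: the first iteration scores 3/5, the second reaches 5/5, and the loop stops before the third. Tracking the best response separately matters when a later revision scores lower than an earlier one.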