.Net: Improve OpenAIResponseAgent exception handling (rate limit, auth, content filter, model not found) by Yusuftmle · Pull Request #13011 · microsoft/semantic-kernel

Yusuftmle · 2025-08-26T12:34:58Z

Summary

This PR improves exception handling for Microsoft.SemanticKernel.Agents.OpenAI.OpenAIResponseAgent.

Previously, calls to InvokeAsync(...) and InvokeStreamingAsync(...) could collapse provider failures into a generic NullReferenceException, losing context such as rate limiting, content filter violations, authentication failures, or missing models. This made it difficult to implement robust retry logic and provide meaningful error messages.

A key root cause was that IAsyncEnumerable uses lazy evaluation — the actual network call doesn't happen at the method call site, but inside the await foreach loop. This meant exceptions thrown during enumeration were escaping the original try-catch blocks entirely.
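To illustrate the lazy-evaluation point above, here is a minimal, self-contained sketch. It does not use any Semantic Kernel types; `StreamAsync` and `LazyEnumerationDemo` are hypothetical names standing in for a provider call that fails, and the thrown `InvalidOperationException` is only a placeholder for whatever the real SDK would throw.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LazyEnumerationDemo
{
    // Hypothetical stand-in for a streaming provider call.
    // The body of an async iterator does not run until MoveNextAsync is called.
    public static async IAsyncEnumerable<string> StreamAsync(bool fail)
    {
        await Task.Yield(); // simulate asynchronous work such as a network call
        if (fail)
        {
            throw new InvalidOperationException("simulated provider failure");
        }

        yield return "chunk";
    }

    public static async Task<string> RunAsync()
    {
        IAsyncEnumerable<string> results;
        try
        {
            // Only creates the enumerable; no code inside StreamAsync has executed yet.
            results = StreamAsync(fail: true);
        }
        catch (Exception)
        {
            return "caught at call site";
        }

        try
        {
            // The iterator body runs here, so this is where the exception surfaces.
            await foreach (var item in results)
            {
            }

            return "no exception";
        }
        catch (InvalidOperationException)
        {
            return "caught during enumeration";
        }
    }
}
```

Calling `LazyEnumerationDemo.RunAsync()` returns `"caught during enumeration"`: the first try block completes without error because it only builds the enumerable.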

Changes

Added HandleProviderExceptionsAsync — a thin async wrapper that intercepts exceptions during both the initialization phase and MoveNextAsync iteration, wrapping them in KernelException with descriptive messages.
OperationCanceledException is preserved and never wrapped.
Added a finally block to guarantee DisposeAsync is called on the enumerator even if the stream fails midway.
Replaced synthetic unit tests with integration-style tests using a custom ThrowingHttpMessageHandler that simulates real HTTP failures (429, 500) end-to-end against a real OpenAIResponseAgent instance.
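The test handler mentioned in the last item is not shown in this conversation. As a rough idea of how such a handler can simulate HTTP failures without touching the network, here is a sketch under stated assumptions: the class name `FailingStatusHandler`, its constructor, and the JSON error body are all hypothetical, and the PR's `ThrowingHttpMessageHandler` may well behave differently (for example by throwing instead of returning a failing status).

```csharp
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical test double: answers every request with a fixed status code.
public sealed class FailingStatusHandler : HttpMessageHandler
{
    private readonly HttpStatusCode _statusCode;

    public FailingStatusHandler(HttpStatusCode statusCode)
    {
        _statusCode = statusCode;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        // No network I/O happens; the response is fabricated in memory.
        var response = new HttpResponseMessage(_statusCode)
        {
            RequestMessage = request,
            // Assumed error payload shape, for illustration only.
            Content = new StringContent("{\"error\":{\"message\":\"simulated failure\"}}")
        };

        return Task.FromResult(response);
    }
}
```

A test would pass `new HttpClient(new FailingStatusHandler(HttpStatusCode.TooManyRequests))` to whatever client the agent under test is built on, then assert on the exception that surfaces during enumeration.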

Why

Fixes [#12976] by ensuring that provider errors are surfaced with meaningful messages instead of NullReferenceException.

Notes

This is my first PR to Semantic Kernel.
Exception handling approach was updated based on feedback from @markwallace-microsoft to use a broad catch rather than per-status-code mapping.
Feedback is welcome — especially regarding exception type design and whether additional metadata should be attached.
Unit tests included and passing locally (dotnet test).

Environment Tested

.NET 8
Microsoft.SemanticKernel 1.61.0-preview
Microsoft.SemanticKernel.Agents.OpenAI 1.61.0-preview
Azure OpenAI (Responses API)

Yusuftmle · 2025-08-26T12:36:55Z

@microsoft-github-policy-service agree

…proofing
- Changed exception handling to catch all Exception types instead of only OpenAI-specific ones
- This approach is more robust and won't miss new exception types from future SDK updates
- Inner exceptions are preserved for detailed error analysis
- Updated unit tests accordingly

Copilot

Pull request overview

This PR aims to improve how Microsoft.SemanticKernel.Agents.OpenAI.OpenAIResponseAgent surfaces provider failures by adding additional exception handling and introducing new unit tests for error mapping.

Changes:

Added broad try/catch blocks in OpenAIResponseAgent.InvokeAsync(...) and InvokeStreamingAsync(...) that wrap initialization failures in KernelException.
Refactored local variable setup in both invoke paths (non-streaming and streaming) to support the new exception handling shape.
Added a new unit test class intended to validate exception mapping behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| dotnet/src/Agents/OpenAI/OpenAIResponseAgent.cs | Adds exception wrapping around invoke setup and adjusts streaming message notification flow. |
| dotnet/src/Agents/UnitTests/OpenAI/OpenAIResponseAgentExceptionTests.cs | Introduces new tests intended to validate error-to-message mapping for `OpenAIResponseAgent`. |

Comments suppressed due to low confidence (1)

dotnet/src/Agents/OpenAI/OpenAIResponseAgent.cs:90

Exceptions thrown while enumerating invokeResults (e.g., from ResponseThreadActions.InvokeAsync during the API call) are not covered by the try/catch, so provider failures can still escape without the intended mapping (and may still surface as the original problematic exception types). If the goal is to surface provider errors consistently, wrap the await foreach enumeration (or use a local iterator that applies try/catch around MoveNextAsync).

        // Yield results with additional error handling
        await foreach (var result in invokeResults.ConfigureAwait(false))
        {
            if (options?.OnIntermediateMessage is not null)
            {
                await options.OnIntermediateMessage(result).ConfigureAwait(false);
            }

            await this.NotifyThreadOfNewMessage(agentThread, result, cancellationToken).ConfigureAwait(false);
            yield return new(result, agentThread);
        }
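The suggestion above (a local iterator with try/catch around MoveNextAsync) can be sketched as follows. C# does not allow `yield return` inside a try block that has a catch clause, which is why the enumerator is driven by hand. This is an illustration only, not the PR's `HandleProviderExceptionsAsync`: `EnumerationGuard`, `WrapAsync`, and `ProviderCallException` (a stand-in for `KernelException`) are hypothetical names chosen so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for KernelException.
public sealed class ProviderCallException : Exception
{
    public ProviderCallException(string message, Exception inner) : base(message, inner) { }
}

public static class EnumerationGuard
{
    public static async IAsyncEnumerable<T> WrapAsync<T>(
        Func<IAsyncEnumerable<T>> sourceFactory,
        string agentName,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        IAsyncEnumerator<T> enumerator;
        try
        {
            // Initialization phase: creating the source may itself throw.
            enumerator = sourceFactory().GetAsyncEnumerator(cancellationToken);
        }
        catch (OperationCanceledException) { throw; } // cancellation is never wrapped
        catch (Exception ex)
        {
            throw new ProviderCallException($"Provider error for agent '{agentName}': {ex.Message}", ex);
        }

        try
        {
            while (true)
            {
                bool hasNext;
                try
                {
                    // Iteration phase: deferred failures surface here.
                    hasNext = await enumerator.MoveNextAsync().ConfigureAwait(false);
                }
                catch (OperationCanceledException) { throw; }
                catch (Exception ex)
                {
                    throw new ProviderCallException($"Provider error for agent '{agentName}': {ex.Message}", ex);
                }

                if (!hasNext)
                {
                    yield break;
                }

                // The yield sits outside the inner try/catch, which keeps the compiler happy.
                yield return enumerator.Current;
            }
        }
        finally
        {
            // Runs even if the stream fails partway through.
            await enumerator.DisposeAsync().ConfigureAwait(false);
        }
    }
}

public static class EnumerationGuardDemo
{
    // Yields one item, then fails like a provider call that breaks mid-stream.
    private static async IAsyncEnumerable<string> FaultyAsync()
    {
        yield return "first";
        await Task.Yield();
        throw new InvalidOperationException("simulated 429");
    }

    public static async Task<string> RunAsync()
    {
        var seen = new List<string>();
        try
        {
            await foreach (var item in EnumerationGuard.WrapAsync<string>(FaultyAsync, "demo"))
            {
                seen.Add(item);
            }
        }
        catch (ProviderCallException ex)
        {
            seen.Add(ex.InnerException?.GetType().Name ?? "none");
        }

        return string.Join(",", seen);
    }
}
```

In this sketch `EnumerationGuardDemo.RunAsync()` returns `"first,InvalidOperationException"`: the item produced before the failure is still delivered, and the original exception is preserved as the inner exception.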


Yusuftmle · 2026-05-11T12:49:34Z

+        try
+        {
+            agentThread = await this.EnsureThreadExistsWithMessagesAsync(messages, thread, cancellationToken).ConfigureAwait(false);
+            extensionsContextOptions = await this.FinalizeInvokeOptionsAsync(messages, options, agentThread, cancellationToken).ConfigureAwait(false);
+
+            ChatHistory chatHistory = [.. messages];
+            invokeResults = ResponseThreadActions.InvokeAsync(
+                this,
+                chatHistory,
+                agentThread,
+                extensionsContextOptions,
+                cancellationToken);
+        }
+        catch (Exception ex)
+        {
+            throw new KernelException($"OpenAI provider error for agent '{this.Name}': {ex.Message}", ex);
+        }


The broad catch was suggested by @markwallace-microsoft in a previous review. Happy to narrow it to ClientResultException if preferred. What's the recommended approach?

+        [Fact]
+        public void InvokeAsync_ShouldMapRateLimitCorrectly()
+        {
+            var ex = new HttpRequestException("HTTP 429 Rate limit exceeded");
+            KernelException? result = null;
+
+            try
+            {
+                if (ex.Message.Contains("429"))
+                    throw new KernelException($"Rate limit exceeded for agent '{AgentName}'. Check Retry-After header and implement backoff.", ex);
+            }
+            catch (KernelException ke)
+            {
+                result = ke;
+            }
+
+            Assert.NotNull(result);
+            Assert.Contains("Rate limit exceeded", result.Message);
+            Assert.Contains("Retry-After header", result.Message);
+            Assert.Equal(ex, result.InnerException);
+        }


+            var thrownException =  Assert.ThrowsAsync<InvalidOperationException>(() =>
+            {
+                throw ex;
+            });
+
+            Assert.Equal("Unknown streaming exception", thrownException.Result.Message);


+using System;
+using System.Net.Http;
+using System.Threading;
+using System.Threading.Tasks;
+using Xunit;
+using Microsoft.SemanticKernel;
+
+namespace SemanticKernel.Agents.UnitTests.OpenAI
+{
+    /// <summary>
+    /// Tests for the updated exception handling logic in OpenAIResponseAgent.
+    /// Verifies that KernelException messages are correct and unknown exceptions propagate.
+    /// </summary>


+        await foreach (var result in invokeResults.ConfigureAwait(false))
+        {
+            await NotifyMessagesAsync().ConfigureAwait(false);
+            yield return new(result, agentThread);
+        }


Yusuftmle · 2026-05-11T11:55:21Z

Hi @westey-m, just checking in after some time. Is there anything needed from my side to move this forward? Happy to help!

Yusuftmle requested a review from a team as a code owner August 26, 2025 12:34

moonbox3 added the .NET Issue or Pull requests regarding .NET code label Aug 26, 2025

github-actions Bot changed the title ~~.NET: Improve OpenAIResponseAgent exception handling (rate limit, auth, content filter, model not found)~~ .Net: Improve OpenAIResponseAgent exception handling (rate limit, auth, content filter, model not found) Aug 26, 2025

markwallace-microsoft reviewed Sep 10, 2025

Comment thread dotnet/src/Agents/OpenAI/OpenAIResponseAgent.cs Outdated

Yusuftmle and others added 2 commits September 10, 2025 18:39

Improve OpenAIResponseAgent exception handling and add unit tests

67a7b9c

Yusuftmle force-pushed the improve-openai-exception-handling branch 2 times, most recently from 55698c5 to 67a7b9c Compare September 10, 2025 15:49

Yusuftmle added 3 commits December 17, 2025 13:06

Merge branch 'main' into improve-openai-exception-handling

8d6f33c

Merge branch 'main' into improve-openai-exception-handling

f5fa5bc

Merge branch 'main' into improve-openai-exception-handling

fde21ff

westey-m self-assigned this Apr 29, 2026

westey-m added this to Agent Framework Apr 29, 2026

westey-m moved this to Community PR in Agent Framework Apr 29, 2026

Merge branch 'main' into improve-openai-exception-handling

01a848b

Copilot AI review requested due to automatic review settings May 11, 2026 11:48

Copilot started reviewing on behalf of Yusuftmle May 11, 2026 11:49

Copilot AI reviewed May 11, 2026

Yusuftmle added 3 commits May 11, 2026 15:22

Fix deferred exception handling on lazy enumeration and update unit tests

30cd7bc

Merge branch 'main' into improve-openai-exception-handling

19a6859

Merge branch 'main' into improve-openai-exception-handling

8abad24

Yusuftmle mentioned this pull request May 12, 2026

.Net: fix(connectors): Support request-level ModelId overrides for Google, Vertex AI, and OpenAI #13999

Open

Merge branch 'main' into improve-openai-exception-handling

ffd49f0

Yusuftmle wants to merge 10 commits into microsoft:main from Yusuftmle:improve-openai-exception-handling

5 participants
