.Net: Improve OpenAIResponseAgent exception handling (rate limit, auth, content filter, model not found)#13011
Conversation
|
@microsoft-github-policy-service agree |
…proofing - Changed exception handling to catch all Exception types instead of only OpenAI-specific ones - This approach is more robust and won't miss new exception types from future SDK updates - Inner exceptions are preserved for detailed error analysis - Updated unit tests accordingly
55698c5 to
67a7b9c
Compare
There was a problem hiding this comment.
Pull request overview
This PR aims to improve how Microsoft.SemanticKernel.Agents.OpenAI.OpenAIResponseAgent surfaces provider failures by adding additional exception handling and introducing new unit tests for error mapping.
Changes:
- Added broad
try/catchblocks inOpenAIResponseAgent.InvokeAsync(...)andInvokeStreamingAsync(...)that wrap initialization failures inKernelException. - Refactored local variable setup in both invoke paths (non-streaming and streaming) to support the new exception handling shape.
- Added a new unit test class intended to validate exception mapping behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| dotnet/src/Agents/OpenAI/OpenAIResponseAgent.cs | Adds exception wrapping around invoke setup and adjusts streaming message notification flow. |
| dotnet/src/Agents/UnitTests/OpenAI/OpenAIResponseAgentExceptionTests.cs | Introduces new tests intended to validate error-to-message mapping for OpenAIResponseAgent. |
Comments suppressed due to low confidence (1)
dotnet/src/Agents/OpenAI/OpenAIResponseAgent.cs:90
- Exceptions thrown while enumerating
invokeResults(e.g., fromResponseThreadActions.InvokeAsyncduring the API call) are not covered by the try/catch, so provider failures can still escape without the intended mapping (and may still surface as the original problematic exception types). If the goal is to surface provider errors consistently, wrap theawait foreachenumeration (or use a local iterator that applies try/catch aroundMoveNextAsync).
// Yield results with additional error handling
await foreach (var result in invokeResults.ConfigureAwait(false))
{
if (options?.OnIntermediateMessage is not null)
{
await options.OnIntermediateMessage(result).ConfigureAwait(false);
}
await this.NotifyThreadOfNewMessage(agentThread, result, cancellationToken).ConfigureAwait(false);
yield return new(result, agentThread);
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try | ||
| { | ||
| agentThread = await this.EnsureThreadExistsWithMessagesAsync(messages, thread, cancellationToken).ConfigureAwait(false); | ||
| extensionsContextOptions = await this.FinalizeInvokeOptionsAsync(messages, options, agentThread, cancellationToken).ConfigureAwait(false); | ||
|
|
||
| ChatHistory chatHistory = [.. messages]; | ||
| invokeResults = ResponseThreadActions.InvokeAsync( | ||
| this, | ||
| chatHistory, | ||
| agentThread, | ||
| extensionsContextOptions, | ||
| cancellationToken); | ||
| } | ||
| catch (Exception ex) | ||
| { | ||
| throw new KernelException($"OpenAI provider error for agent '{this.Name}': {ex.Message}", ex); | ||
| } |
There was a problem hiding this comment.
The broad catch was suggested by @markwallace-microsoft
in a previous review. Happy to narrow it to
ClientResultException if preferred.
What's the recommended approach?
| catch (Exception ex) | ||
| { | ||
| throw new KernelException($"OpenAI provider error for agent '{this.Name}': {ex.Message}", ex); | ||
| } |
| await foreach (var result in invokeResults.ConfigureAwait(false)) | ||
| { | ||
| await NotifyMessagesAsync().ConfigureAwait(false); | ||
| yield return new(result, agentThread); |
| [Fact] | ||
| public void InvokeAsync_ShouldMapRateLimitCorrectly() | ||
| { | ||
| var ex = new HttpRequestException("HTTP 429 Rate limit exceeded"); | ||
| KernelException? result = null; | ||
|
|
||
| try | ||
| { | ||
| if (ex.Message.Contains("429")) | ||
| throw new KernelException($"Rate limit exceeded for agent '{AgentName}'. Check Retry-After header and implement backoff.", ex); | ||
| } | ||
| catch (KernelException ke) | ||
| { | ||
| result = ke; | ||
| } | ||
|
|
||
| Assert.NotNull(result); | ||
| Assert.Contains("Rate limit exceeded", result.Message); | ||
| Assert.Contains("Retry-After header", result.Message); | ||
| Assert.Equal(ex, result.InnerException); | ||
| } |
| var thrownException = Assert.ThrowsAsync<InvalidOperationException>(() => | ||
| { | ||
| throw ex; | ||
| }); | ||
|
|
||
| Assert.Equal("Unknown streaming exception", thrownException.Result.Message); |
| using System; | ||
| using System.Net.Http; | ||
| using System.Threading; | ||
| using System.Threading.Tasks; | ||
| using Xunit; | ||
| using Microsoft.SemanticKernel; | ||
|
|
||
| namespace SemanticKernel.Agents.UnitTests.OpenAI | ||
| { | ||
| /// <summary> | ||
| /// Tests for the updated exception handling logic in OpenAIResponseAgent. | ||
| /// Verifies that KernelException messages are correct and unknown exceptions propagate. | ||
| /// </summary> |
| await foreach (var result in invokeResults.ConfigureAwait(false)) | ||
| { | ||
| await NotifyMessagesAsync().ConfigureAwait(false); | ||
| yield return new(result, agentThread); | ||
| } |
|
Hi @westey-m, just checking in after some time. |
Summary
This PR improves exception handling for
Microsoft.SemanticKernel.Agents.OpenAI.OpenAIResponseAgent.Previously, calls to
InvokeAsync(...)andInvokeStreamingAsync(...)could collapse provider failures into a genericNullReferenceException, losing context such as rate limiting, content filter violations, authentication failures, or missing models. This made it difficult to implement robust retry logic and provide meaningful error messages.A key root cause was that
IAsyncEnumerableuses lazy evaluation — the actual network call doesn't happen at the method call site, but inside theawait foreachloop. This meant exceptions thrown during enumeration were escaping the originaltry-catchblocks entirely.Changes
HandleProviderExceptionsAsync— a thin async wrapper that intercepts exceptions during both the initialization phase andMoveNextAsynciteration, wrapping them inKernelExceptionwith descriptive messages.OperationCanceledExceptionis preserved and never wrapped.finallyblock to guaranteeDisposeAsyncis called on the enumerator even if the stream fails mid-way.ThrowingHttpMessageHandlerthat simulates real HTTP failures (429, 500) end-to-end against a realOpenAIResponseAgentinstance.Why
Fixes [#12976] by ensuring that provider errors are surfaced with meaningful messages instead of
NullReferenceException.Notes
dotnet test).Environment Tested