fix: improve HTTP error messages with source attribution and stream context#918
fix: improve HTTP error messages with source attribution and stream context#918Patrick Nilan (pnilan) wants to merge 1 commit intomainfrom
Conversation
…and stream context Default HTTP error messages now clearly attribute errors to the source's API (not Airbyte platform), and HttpClient injects the stream name and HTTP status code for user-facing context. Example: "Stream 'users': HTTP 403. Source's API denied access. Configured credentials have insufficient permissions." Also fixes the default fallback connector error message to be more specific.
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. 💡 Show Tips and TricksTesting This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@pnilan/fix-error-messages#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch pnilan/fix-error-messagesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
There was a problem hiding this comment.
Pull request overview
This pull request improves HTTP error messages throughout the Airbyte CDK by clarifying that errors originate from the source's API (not Airbyte) and adding stream context to make debugging easier. The changes update default error mappings, introduce a new formatting method to prepend stream names and HTTP status codes, and update the generic connector error fallback message.
Changes:
- Updated all HTTP status code error messages (400-504) in the default error mapping to clearly attribute errors to "source's API" rather than being ambiguous about the error origin
- Added
_format_error_message()method toHttpClientthat prepends stream name and HTTP status code context to error messages - Changed the default connector error fallback from "Something went wrong in the connector. See the logs for more details." to "Unhandled connector error. See logs for details."
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py |
Updated all HTTP status code error messages to attribute errors to source's API instead of being ambiguous |
airbyte_cdk/sources/streams/http/http_client.py |
Added _format_error_message() method and integrated it into error handling to prepend stream name and HTTP status context |
airbyte_cdk/utils/traced_exception.py |
Updated default fallback error message to be more concise |
unit_tests/**/*.py |
Updated all test assertions to match new error message formats |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def _format_error_message(self, error_resolution: ErrorResolution, response: Optional[requests.Response]) -> Optional[str]: | ||
| """Prepend stream name and HTTP status code to the error resolution message for user-facing context.""" | ||
| if not error_resolution.error_message: | ||
| return None | ||
| status_prefix = f"HTTP {response.status_code}. " if response is not None else "" | ||
| return f"Stream '{self._name}': {status_prefix}{error_resolution.error_message}" |
There was a problem hiding this comment.
The new _format_error_message method lacks dedicated unit tests. While existing tests will implicitly verify that the formatted messages contain the expected substrings, there are no tests that explicitly verify the stream name and HTTP status code are correctly prepended to error messages. Consider adding unit tests specifically for this method to verify its behavior with and without a response object, and to ensure the formatting is correct.
📝 WalkthroughWalkthroughThe PR refines HTTP error messaging throughout the Airbyte CDK by introducing a helper function to enrich error messages with stream context and HTTP status codes, updating default error messages for better user clarity, and aligning test expectations with these changes. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py (1)
16-30: The exception-based mappings still use the old verbose style — wdyt about aligning them too?The three exception-type entries (
InvalidSchema,InvalidURL,RequestException) retain the legacy"Exception: requests.exceptions.XXX"suffix, while every HTTP-status entry was updated to the shorter, source/API-focused style. If the goal is consistent user-facing messaging, could these three be updated in the same pass?✏️ Suggested alignment
InvalidSchema: ErrorResolution( response_action=ResponseAction.FAIL, failure_type=FailureType.config_error, - error_message="Invalid Protocol Schema: The endpoint that data is being requested from is using an invalid or insecure. Exception: requests.exceptions.InvalidSchema", + error_message="Invalid or insecure protocol schema in the source's endpoint URL.", ), InvalidURL: ErrorResolution( response_action=ResponseAction.RETRY, failure_type=FailureType.transient_error, - error_message="Invalid URL specified or DNS error occurred: The endpoint that data is being requested from is not a valid URL. Exception: requests.exceptions.InvalidURL", + error_message="Invalid URL or DNS error for the source's endpoint.", ), RequestException: ErrorResolution( response_action=ResponseAction.RETRY, failure_type=FailureType.transient_error, - error_message="An exception occurred when making the request. Exception: requests.exceptions.RequestException", + error_message="An exception occurred when making the request to source's API.", ),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py` around lines 16 - 30, Update the three exception mappings (InvalidSchema, InvalidURL, RequestException) in default_error_mapping.py to use the shorter, API/source-focused error_message style like the HTTP-status entries: edit the ErrorResolution instances for InvalidSchema, InvalidURL, and RequestException so their error_message strings remove the legacy "Exception: requests.exceptions.XXX" suffix and instead give a concise, user-facing description (keeping response_action and failure_type as-is) so messages are consistent across exception- and status-based mappings.unit_tests/sources/streams/http/test_availability_strategy.py (1)
108-110: Could we assert the new stream/status prefix here too, wdyt?Line 108-Line 110 currently validates the mapped message, but not the newly introduced formatted context (
Stream ...+HTTP 404.). Adding that check would better guard_format_error_message()behavior.🧪 Possible assertion hardening
- assert "Requested resource not found on source's API." in reason + assert reason is not None + assert reason.startswith("Stream '") + assert "HTTP 404." in reason + assert "Requested resource not found on source's API." in reason🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unit_tests/sources/streams/http/test_availability_strategy.py` around lines 108 - 110, The test currently asserts only the mapped message; update the assertion after calling HttpAvailabilityStrategy().check_availability(http_stream, logger) to also verify the formatted context produced by _format_error_message — specifically assert that reason includes the "Stream <stream.name> reported HTTP 404." (or the literal "Stream" prefix plus "HTTP 404.") alongside "Requested resource not found on source's API."; locate the test using is_available, reason, http_stream, logger and add the extra assertion checking the new stream/status prefix to harden the expectation.unit_tests/sources/declarative/checks/test_check_stream.py (1)
524-615: Would you consider extracting these repeated expected messages into shared constants, wdyt?It could reduce duplication and make future wording changes less error-prone in this parametrized block.
♻️ Small DRY refactor idea
+NOT_FOUND_MSG = "Requested resource not found on source's API." +FORBIDDEN_MSG = "Source's API denied access. Configured credentials have insufficient permissions." +UNAUTHORIZED_MSG = "Authentication failed on source's API. Credentials may be invalid, expired, or lack required access." ... - ["Requested resource not found on source's API."], + [NOT_FOUND_MSG], ... - ["Source's API denied access. Configured credentials have insufficient permissions."], + [FORBIDDEN_MSG], ... - ["Authentication failed on source's API. Credentials may be invalid, expired, or lack required access."], + [UNAUTHORIZED_MSG],🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unit_tests/sources/declarative/checks/test_check_stream.py` around lines 524 - 615, Extract the repeated expected message strings into module-level constants (e.g., EXPECTED_MSG_RESOURCE_NOT_FOUND, EXPECTED_MSG_ACCESS_DENIED, EXPECTED_MSG_AUTH_FAILED) in unit_tests/sources/declarative/checks/test_check_stream.py and replace the literal strings inside the pytest.param entries (the repeated messages in the parametrized CheckStream cases) with those constants; ensure the constants' names are clear and reuse them for both single-stream and dynamic_streams_check_configs cases so future wording changes require only one edit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.py`:
- Around line 16-30: Update the three exception mappings (InvalidSchema,
InvalidURL, RequestException) in default_error_mapping.py to use the shorter,
API/source-focused error_message style like the HTTP-status entries: edit the
ErrorResolution instances for InvalidSchema, InvalidURL, and RequestException so
their error_message strings remove the legacy "Exception:
requests.exceptions.XXX" suffix and instead give a concise, user-facing
description (keeping response_action and failure_type as-is) so messages are
consistent across exception- and status-based mappings.
In `@unit_tests/sources/declarative/checks/test_check_stream.py`:
- Around line 524-615: Extract the repeated expected message strings into
module-level constants (e.g., EXPECTED_MSG_RESOURCE_NOT_FOUND,
EXPECTED_MSG_ACCESS_DENIED, EXPECTED_MSG_AUTH_FAILED) in
unit_tests/sources/declarative/checks/test_check_stream.py and replace the
literal strings inside the pytest.param entries (the repeated messages in the
parametrized CheckStream cases) with those constants; ensure the constants'
names are clear and reuse them for both single-stream and
dynamic_streams_check_configs cases so future wording changes require only one
edit.
In `@unit_tests/sources/streams/http/test_availability_strategy.py`:
- Around line 108-110: The test currently asserts only the mapped message;
update the assertion after calling
HttpAvailabilityStrategy().check_availability(http_stream, logger) to also
verify the formatted context produced by _format_error_message — specifically
assert that reason includes the "Stream <stream.name> reported HTTP 404." (or
the literal "Stream" prefix plus "HTTP 404.") alongside "Requested resource not
found on source's API."; locate the test using is_available, reason,
http_stream, logger and add the extra assertion checking the new stream/status
prefix to harden the expectation.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
airbyte_cdk/sources/streams/http/error_handlers/default_error_mapping.pyairbyte_cdk/sources/streams/http/http_client.pyairbyte_cdk/utils/traced_exception.pyunit_tests/sources/declarative/checks/test_check_dynamic_stream.pyunit_tests/sources/declarative/checks/test_check_stream.pyunit_tests/sources/declarative/requesters/error_handlers/test_default_error_handler.pyunit_tests/sources/declarative/requesters/error_handlers/test_http_response_filter.pyunit_tests/sources/streams/http/error_handlers/test_http_status_error_handler.pyunit_tests/sources/streams/http/test_availability_strategy.pyunit_tests/sources/streams/http/test_http.pyunit_tests/sources/test_abstract_source.pyunit_tests/utils/test_traced_exception.py
PyTest Results (Fast)3 114 tests - 755 3 101 ✅ - 756 6m 27s ⏱️ -24s For more details on these failures, see this check. Results for commit cd9aa33. ± Comparison against base commit 7f41401. This pull request removes 757 and adds 2 tests. Note that renamed tests count towards both. |
PyTest Results (Full)3 872 tests 3 853 ✅ 10m 59s ⏱️ For more details on these failures, see this check. Results for commit cd9aa33. |
|
Closing for now -- don't feel just rewording errors is useful. |
Summary
_format_error_message()toHttpClientthat injects stream name and HTTP status code into user-facing messages (e.g.Stream 'users': HTTP 403. Source's API denied access.)Error message changes
HTTP Status Code: 400. Bad request.Bad request response from source's API.HTTP Status Code: 401. Unauthorized...Authentication failed on source's API. Credentials may be invalid, expired, or lack required access.HTTP Status Code: 403. Forbidden...Source's API denied access. Configured credentials have insufficient permissions.HTTP Status Code: 404. Requested resource not found.Requested resource not found on source's API.HTTP Status Code: 405. Method not allowed.Method not allowed by source's API.HTTP Status Code: 408. Request timeout.Request to source's API timed out.HTTP Status Code: 429. Rate limit exceeded.Rate limit exceeded on source's API.HTTP Status Code: 500. Internal server error.Internal server error from source's API.HTTP Status Code: 502. Bad gateway.Bad gateway response from source's API.HTTP Status Code: 503. Service unavailable.Source's API is temporarily unavailable.HTTP Status Code: 504. Gateway timeout.Gateway timeout from source's API.Test plan
_format_error_messagecorrectly prepends stream name and HTTP status🤖 Generated with Claude Code
Summary by CodeRabbit