Skip to content

fix: handle malformed api_constraints without crashing endpoint (#273)#285

Open
Gokul-social wants to merge 6 commits intoopenml:mainfrom
Gokul-social:fix/273-graceful-api-constraints
Open

fix: handle malformed api_constraints without crashing endpoint (#273)#285
Gokul-social wants to merge 6 commits intoopenml:mainfrom
Gokul-social:fix/273-graceful-api-constraints

Conversation

@Gokul-social
Copy link

Summary

Hardens the /tasktype/{task_type_id} endpoint against inconsistent api_constraints data. Fixes #273.

The Problem

The existing implementation was fragile, relying on strict json.loads() and direct key access. This led to unnecessary 500 errors when encountering:

  • Malformed JSON strings or unexpected dict types in the persistence layer.
  • Missing data_type keys, triggering unhandled KeyErrors.
  • Type mismatches that caused the parser to bail instead of degrading gracefully.

The Solution

I've introduced a defensive parse_api_constraints helper to act as a buffer between the DB and the response.

  • Type Agnostic Parsing: Gracefully handles both JSON strings and pre-parsed dictionaries.
  • Non-Breaking Validation: Switched to .get() with strict type-checking for data_type. If a value is invalid, it is omitted from the output rather than crashing the request.
  • Observability: Added structured logging (WARNING for structural failures, DEBUG for missing data) prefixed with api_constraints: to help us audit bad data in production.

Testing

Comprehensive unit tests added to cover:

  • Valid/Malformed JSON and raw dict inputs.
  • Boundary cases: Empty values, nulls, and unsupported schema types.
  • Verification that response structure remains backward compatible.

Copilot AI review requested due to automatic review settings March 21, 2026 15:11
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 02939bc3-0108-4e88-af35-3c81d87801ee

📥 Commits

Reviewing files that changed from the base of the PR and between ea00485 and 4fca9fb.

📒 Files selected for processing (2)
  • src/routers/openml/tasktype.py
  • tests/routers/openml/test_parse_api_constraints.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/routers/openml/test_parse_api_constraints.py

Walkthrough

Adds module-level logging and a new helper parse_api_constraints(api_constraints: str | Mapping[str, object] | None, *, task_type_id: int, input_name: str) -> str | None in src/routers/openml/tasktype.py that defensively handles None, dict, and JSON-string inputs and returns a non-empty data_type string or None. Updates get_task_type so input[].data_type is included only when the helper returns a valid data_type; the function docstring documents this field as optional. Adds tests/routers/openml/test_parse_api_constraints.py covering parsing edge cases and expected logging.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: handling malformed api_constraints to prevent endpoint crashes, with a direct reference to the issue.
Description check ✅ Passed The description comprehensively explains the problem, solution, and testing approach, directly relating to the changeset's defensive parsing and logging improvements.
Linked Issues check ✅ Passed The PR fully addresses all coding requirements from issue #273: defensive parsing of both JSON strings and dicts, graceful error handling, conditional data_type inclusion, and structured logging with proper observability.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing the api_constraints parsing issue: new parse_api_constraints helper, updated get_task_type behavior, comprehensive unit tests, and logging additions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • In parse_api_constraints, you currently only accept plain dict instances; if upstream ever passes a Mapping (e.g. dict-like) object it will be treated as unsupported, so consider loosening the type check to collections.abc.Mapping to make the helper more robust to different ORM or serialization layers.
  • The test_valid_constraint_no_warning assertion len(caplog.records) == 0 can be brittle if any other logger emits during the test run; it would be safer to assert that there are no records from the specific logger or containing the api_constraints: prefix instead of enforcing an absolutely empty log.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `parse_api_constraints`, you currently only accept plain `dict` instances; if upstream ever passes a `Mapping` (e.g. `dict`-like) object it will be treated as unsupported, so consider loosening the type check to `collections.abc.Mapping` to make the helper more robust to different ORM or serialization layers.
- The `test_valid_constraint_no_warning` assertion `len(caplog.records) == 0` can be brittle if any other logger emits during the test run; it would be safer to assert that there are no records from the specific logger or containing the `api_constraints:` prefix instead of enforcing an absolutely empty log.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Hardens the /tasktype/{task_type_id} endpoint so malformed or inconsistent api_constraints data no longer causes 500s, aligning behavior with issue #273.

Changes:

  • Added parse_api_constraints helper to defensively parse api_constraints and safely extract data_type.
  • Updated task type response construction to include input[].data_type only when valid.
  • Added unit tests covering valid/malformed inputs and logging behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/routers/openml/tasktype.py Introduces defensive parsing + logging and uses it when building the /tasktype/{id} response.
tests/routers/openml/test_parse_api_constraints.py Adds unit tests for parsing outcomes and emitted log levels/messages.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tasktype endpoint should gracefully handle malformed api_constraints values

2 participants