
Add GET /runs/trace/{run_id} endpoint#267

Open
saathviksheerla wants to merge 4 commits into openml:main from saathviksheerla:feat/get-run-trace-endpoint

Conversation

@saathviksheerla
Contributor

Closes #40

Implements the GET /runs/trace/{run_id} endpoint, migrating behavior from the PHP API.

Changes

  • src/database/runs.py — SQL queries for run existence check and trace fetch
  • src/schemas/runs.py — Pydantic models matching PHP response structure
  • src/routers/openml/runs.py — FastAPI endpoint
  • src/main.py — router registration
  • tests/routers/openml/runs_test.py — 6 tests covering happy path, ordering, and error cases

Behavior

  • Returns error code 571 if the run does not exist
  • Returns error code 572 if the run exists but has no trace data
  • Response matches the PHP API structure exactly, including string types for all fields

@coderabbitai
Contributor

coderabbitai bot commented Mar 8, 2026

Walkthrough

Adds database helpers to fetch a run and its ordered trace iterations in src/database/runs.py. Adds Pydantic schemas for trace responses in src/schemas/runs.py. Introduces a FastAPI router at /runs/trace/{run_id} in src/routers/openml/runs.py that checks run existence, returns structured trace data, or responds with 412 codes for missing run or missing trace. Registers the new router in src/main.py. Adds tests in tests/routers/openml/runs_test.py covering successful retrieval, ordering, nonexistent run, missing trace, and invalid run_id cases.
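The ordered-fetch helper the walkthrough mentions can be sketched with stdlib sqlite3 (table and column names here are assumptions for illustration; the real queries live in src/database/runs.py and run against MySQL):

```python
import sqlite3

# In-memory stand-in for the expdb trace table (hypothetical schema).
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE trace ("run_id" INT, "repeat" INT, "fold" INT, "iteration" INT)')
con.executemany(
    "INSERT INTO trace VALUES (?, ?, ?, ?)",
    [(1, 0, 1, 0), (1, 0, 0, 1), (1, 0, 0, 0), (2, 0, 0, 0)],
)

def get_run_trace(run_id: int) -> list[tuple[int, int, int]]:
    """Fetch a run's trace iterations in deterministic (repeat, fold, iteration) order."""
    return con.execute(
        'SELECT "repeat", "fold", "iteration" FROM trace'
        ' WHERE "run_id" = ? ORDER BY "repeat", "fold", "iteration"',
        (run_id,),
    ).fetchall()
```

The explicit ORDER BY is what the "ordering" test in the PR exercises: without it, row order is an accident of storage.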

🚥 Pre-merge checks: ✅ 5 passed
  • Title check (✅ Passed): The PR title clearly and concisely describes the primary change: adding a new GET endpoint for retrieving run traces.
  • Description check (✅ Passed): The description comprehensively explains the implementation, lists all changed files, and details the endpoint behavior including error codes.
  • Linked Issues check (✅ Passed): The PR implements the GET /runs/trace/{run_id} endpoint specified in issue #40 with complete implementation, testing, and PHP API compatibility.
  • Out of Scope Changes check (✅ Passed): All changes are focused on implementing the run trace endpoint: database queries, API schemas, router registration, and test coverage; no extraneous modifications detected.
  • Docstring Coverage (✅ Passed): Docstring coverage is 80.00%, which meets the required threshold of 80.00%.



@sourcery-ai sourcery-ai bot left a comment


Hey - I've left some high level feedback:

  • The expdb dependency in get_run_trace doesn't need a default of None—dropping the default makes the intent clearer that it must always be provided by FastAPI's dependency injection.
  • In TraceIteration, consider constraining selected to a Literal["true", "false"] to encode the PHP-compatible value space directly in the schema and catch accidental deviations early.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `expdb` dependency in `get_run_trace` doesn't need a default of `None`—dropping the default makes the intent clearer that it must always be provided by FastAPI's dependency injection.
- In `TraceIteration`, consider constraining `selected` to a `Literal["true", "false"]` to encode the PHP-compatible value space directly in the schema and catch accidental deviations early.
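Sourcery's second suggestion can be illustrated with a stdlib dataclass standing in for the Pydantic model (field names follow the PR; the `__post_init__` check approximates what `Literal["true", "false"]` would enforce at validation time):

```python
from dataclasses import dataclass

@dataclass
class TraceIteration:
    repeat: str
    fold: str
    iteration: str
    setup_string: str
    evaluation: str
    selected: str  # stand-in for Literal["true", "false"]

    def __post_init__(self) -> None:
        # Encode the PHP-compatible value space directly in the schema,
        # so accidental deviations ("True", "1", "") fail at construction.
        if self.selected not in ("true", "false"):
            raise ValueError(f"selected must be 'true' or 'false', got {self.selected!r}")
```

With Pydantic, the same effect comes from annotating `selected: Literal["true", "false"]` and needs no custom validator.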

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)

35-47: Normalize setup_string, evaluation, and selected to match the explicit type-casting of other fields.

The code explicitly normalizes run_id, repeat, fold, and iteration with str(), but forwards setup_string, evaluation, and selected directly from the SQL row into a schema requiring str. While tests validate these fields are already strings and selected is "true" or "false", this inconsistency creates a fragility point: if the database ever stores a NULL value or non-string type for these fields, Pydantic model construction will fail without a clear error message.

Add str() calls to these three fields at lines 44-46 to match the pattern used for the other iteration fields and ensure consistent defensive programming.
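A minimal sketch of the suggested normalization (dict-style row access is an assumption for illustration; the PR reads SQLAlchemy row attributes):

```python
def normalize_trace_row(row: dict) -> dict[str, str]:
    """Cast every trace field to str so all values reach the schema as strings,
    matching the explicit casts already applied to run_id/repeat/fold/iteration."""
    fields = ("run_id", "repeat", "fold", "iteration",
              "setup_string", "evaluation", "selected")
    return {field: str(row[field]) for field in fields}
```

This makes the casting policy uniform rather than split across some fields and not others.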

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 35 - 47, The TraceIteration
construction in RunTraceResponse currently casts run_id/repeat/fold/iteration to
str but passes setup_string, evaluation, and selected directly; update the
TraceIteration creation (inside RunTraceResponse -> RunTrace -> trace_iteration)
to wrap setup_string, evaluation, and selected with str() as well so all
iteration fields are normalized (e.g., use str(row.setup_string),
str(row.evaluation), str(row.selected)) to avoid Pydantic errors from
NULL/non-string DB values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7a66e163-c68e-4081-935d-0ad5753181e2

📥 Commits

Reviewing files that changed from the base of the PR and between 72989df and 64ee6c2.

📒 Files selected for processing (5)
  • src/database/runs.py
  • src/main.py
  • src/routers/openml/runs.py
  • src/schemas/runs.py
  • tests/routers/openml/runs_test.py

@codecov

codecov bot commented Mar 8, 2026

Codecov Report

❌ Patch coverage is 70.00000% with 9 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@72989df). Learn more about missing BASE report.

Files with missing lines   | Patch % | Missing lines
src/routers/openml/runs.py | 62.50%  | 6 ⚠️
src/database/runs.py       | 71.42%  | 2 ⚠️
src/main.py                | 50.00%  | 1 ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #267   +/-   ##
=======================================
  Coverage        ?   54.76%           
=======================================
  Files           ?       35           
  Lines           ?     1165           
  Branches        ?      103           
=======================================
  Hits            ?      638           
  Misses          ?      525           
  Partials        ?        2           



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Around line 40-47: Convert the `selected` value to a string when constructing
the TraceIteration so it matches the other iteration fields and the Pydantic
schema: change the construction in the TraceIteration call that currently passes
selected=row.selected to use selected=str(row.selected) (locate the
TraceIteration instantiation that sets repeat=str(row.repeat),
fold=str(row.fold), iteration=str(row.iteration)) so `selected` becomes a string
literal "true"/"false" as expected by the model.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b3a49664-e599-42cd-b1f4-f36558498629

📥 Commits

Reviewing files that changed from the base of the PR and between 64ee6c2 and 257c444.

📒 Files selected for processing (2)
  • src/routers/openml/runs.py
  • src/schemas/runs.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/schemas/runs.py

@saathviksheerla
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Mar 8, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Line 51: Normalize the selected value explicitly instead of relying on
truthiness: replace the ternary that uses row.selected (the selected key in the
serialized output) with an explicit normalization check that handles booleans,
integers, enums, and strings (for example, treat values as true only if they
match True, 1, "1", "true", "t", or "yes" case-insensitively); update the
serialization that sets selected="true"/"false" to use that normalization logic
(or a small helper like normalize_selected(value)) so string values like "false"
no longer serialize as "true".
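The suggested helper might look like this (the accepted truthy spellings are taken from the comment above; `normalize_selected` is a hypothetical name):

```python
_TRUTHY = {"true", "t", "1", "yes"}

def normalize_selected(value: object) -> str:
    """Normalize booleans, integers, enums, and strings to the literal 'true'/'false'.

    Avoids the truthiness bug where the non-empty string "false" serializes as "true".
    """
    if isinstance(value, bool):
        return "true" if value else "false"
    return "true" if str(value).strip().lower() in _TRUTHY else "false"
```

The explicit bool branch matters because `str(True)` is `"True"`, which the lowercase comparison would still accept, but handling it first keeps the intent unambiguous.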

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1415a658-924d-4dbd-a450-2656165973f4

📥 Commits

Reviewing files that changed from the base of the PR and between 257c444 and ae1f456.

📒 Files selected for processing (1)
  • src/routers/openml/runs.py

@saathviksheerla
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Mar 8, 2026

✅ Actions performed

Review triggered.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Around line 45-51: When constructing the TraceIteration instance, explicitly
cast the database value for evaluation to a string instead of forwarding the raw
value; update the TraceIteration(...) call (the evaluation argument alongside
repeat, fold, iteration) to use str(row.evaluation) so TraceIteration.evaluation
is always a string and matches the model/tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0101722f-6ed4-4983-a863-5d2df0d9f22d

📥 Commits

Reviewing files that changed from the base of the PR and between ae1f456 and 6b876f3.

📒 Files selected for processing (1)
  • src/routers/openml/runs.py

Comment on lines +45 to +51:

    TraceIteration(
        repeat=str(row.repeat),
        fold=str(row.fold),
        iteration=str(row.iteration),
        setup_string=row.setup_string,
        evaluation=row.evaluation,
        selected=row.selected,

@coderabbitai coderabbitai bot Mar 8, 2026


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the TraceIteration schema definition
fd -t f "runs.py" -path "*/schemas/*" | head -5

Repository: openml/server-api

Length of output: 229


🏁 Script executed:

# Also search for TraceIteration class definition
rg "class TraceIteration" --type py -A 10

Repository: openml/server-api

Length of output: 488


🏁 Script executed:

# Check the test file for evaluation assertions
fd -t f "runs_test.py" -path "*/tests/*"

Repository: openml/server-api

Length of output: 229


🏁 Script executed:

# Look for database schema/migrations related to trace tables
fd -t f -e "py" -path "*migration*" | grep -i trace | head -10

Repository: openml/server-api

Length of output: 229


🏁 Script executed:

# Examine the actual code at lines 45-51 in the file under review
cat -n src/routers/openml/runs.py | sed -n '40,55p'

Repository: openml/server-api

Length of output: 759


🏁 Script executed:

# Find and examine the test file
find . -path "*/tests/*" -name "*runs*test*" -type f

Repository: openml/server-api

Length of output: 96


🏁 Script executed:

# Check for database schema definitions (migrations or fixtures)
find . -path "*migration*" -o -path "*fixture*" -o -path "*schema*" | grep -i "\.py$\|\.sql$" | head -20

Repository: openml/server-api

Length of output: 688


🏁 Script executed:

# Search for evaluation field references in tests
rg "evaluation" tests/ -A 2 -B 2 | head -50

Repository: openml/server-api

Length of output: 3343


🏁 Script executed:

# Look for the database model/ORM definition for the runs trace table
rg "class.*Trace\|run.*trace\|trace.*iteration" src/ --type py -i | grep -i "class\|table" | head -10

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Look for SQLAlchemy ORM models
find src -name "*.py" | xargs rg "class.*Trace|Table.*trace" -i | head -20

Repository: openml/server-api

Length of output: 215


🏁 Script executed:

# Search for database models or declarative base
rg "declarative_base\|@dataclass\|class.*Base\|from sqlalchemy" src/ | head -20

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Look in src/db or similar for model definitions
find src -type d -name "*db*" -o -name "*model*" -o -name "*orm*"

Repository: openml/server-api

Length of output: 83


🏁 Script executed:

# Check if there's any initialization SQL or schema file
find . -name "*.sql" -o -name "*init*.py" -path "*/db/*" | head -10

Repository: openml/server-api

Length of output: 128


🏁 Script executed:

# Search for trace table definition
rg "trace_iteration\|selected.*ENUM\|evaluation.*DECIMAL\|evaluation.*Float" -i

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Search for trace table definition in the SQL schema files
rg "trace_iteration|CREATE TABLE.*trace" docker/mysql/data/ -i -A 15

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Look for the specific evaluation column definition
rg "evaluation" docker/mysql/data/ -B 5 -A 2 | head -60

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Check the size of the SQL files first
wc -l docker/mysql/data/openml.sql docker/mysql/data/openml_expdb.sql

Repository: openml/server-api

Length of output: 139


🏁 Script executed:

# Search for trace in the SQL files
grep -n -i "trace" docker/mysql/data/openml.sql docker/mysql/data/openml_expdb.sql | head -20

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Look for run related tables
grep -n -i "run" docker/mysql/data/openml.sql docker/mysql/data/openml_expdb.sql | grep -i "create table" | head -10

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Read the SQL files directly
cat docker/mysql/data/openml.sql
cat docker/mysql/data/openml_expdb.sql

Repository: openml/server-api

Length of output: 158


🏁 Script executed:

# Search for where the trace rows are queried - this will show us the table structure
rg "trace.*row\|select.*trace\|from.*trace" src/ -i -B 2 -A 2 | head -40

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Show more context from runs.py to find where trace_rows comes from
cat -n src/routers/openml/runs.py | head -60

Repository: openml/server-api

Length of output: 2277


🏁 Script executed:

# Search for the query that selects trace_rows
rg "trace_rows\|trace.*select" src/routers/openml/runs.py -B 5 -A 5

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Find the database.runs module
find src -path "*database*runs*" -name "*.py"

Repository: openml/server-api

Length of output: 81


🏁 Script executed:

# If not found, check the structure of database directory
find src -type d -name "database" -o -name "db"

Repository: openml/server-api

Length of output: 73


🏁 Script executed:

# Read the database/runs.py file
cat src/database/runs.py

Repository: openml/server-api

Length of output: 1128


🏁 Script executed:

# Look for any migrations or table definitions mentioning trace table
find . -name "*.py" -path "*migration*" | head -20

Repository: openml/server-api

Length of output: 401


🏁 Script executed:

# Search for trace table in all Python files
rg "trace" src/ -t py -B 2 -A 2 | grep -i "decimal\|float\|varchar\|numeric" | head -20

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Check if there are any comments or docstrings about the trace table
rg "trace.*table\|evaluation.*column" src/ -i -B 3 -A 3

Repository: openml/server-api

Length of output: 43


🏁 Script executed:

# Let's check the test file to see if there's any fixture or test data that shows the type
cat -n tests/routers/openml/runs_test.py | head -100

Repository: openml/server-api

Length of output: 3383


Stringify evaluation before building TraceIteration.

TraceIteration.evaluation is declared as str, and tests explicitly assert it must be a string, but line 50 forwards the raw DB value. Lines 46–48 cast repeat, fold, and iteration with str() for PHP compatibility; evaluation should follow the same pattern. If that column is numeric in the DB, the current code relies on implicit Pydantic coercion instead of explicit serialization.

🛠️ Proposed fix
                 TraceIteration(
                     repeat=str(row.repeat),
                     fold=str(row.fold),
                     iteration=str(row.iteration),
                     setup_string=row.setup_string,
-                    evaluation=row.evaluation,
+                    evaluation=str(row.evaluation),
                     selected=row.selected,
                 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 45 - 51, When constructing the
TraceIteration instance, explicitly cast the database value for evaluation to a
string instead of forwarding the raw value; update the TraceIteration(...) call
(the evaluation argument alongside repeat, fold, iteration) to use
str(row.evaluation) so TraceIteration.evaluation is always a string and matches
the model/tests.

@saathviksheerla
Contributor Author

evaluation is declared as varchar(265) in the database, and SQLAlchemy returns it as a plain string (verified via HEX(LEFT(evaluation,4))). An explicit str() cast is unnecessary here, the same as for selected.

@coderabbitai
Contributor

@saathviksheerla, thanks for verifying that directly — if evaluation is varchar(265) and SQLAlchemy already returns it as a plain string, the explicit str() cast is indeed redundant. My suggestion was precautionary without confirmed knowledge of the column type; your hex verification settles it. No change needed here.


✏️ Learnings added
Learnt from: saathviksheerla
Repo: openml/server-api PR: 267
File: src/routers/openml/runs.py:45-51
Timestamp: 2026-03-08T07:59:51.515Z
Learning: In the openml/server-api Python codebase (src/routers/openml/runs.py), the `evaluation` column on the run trace table is declared as `varchar(265)` in the database. SQLAlchemy returns it as a plain Python string, so no explicit `str()` cast is required when constructing `TraceIteration(evaluation=row.evaluation, ...)`. This is analogous to `selected` (MySQL ENUM) which is also passed through as-is.



🧠 Learnings used
Learnt from: saathviksheerla
Repo: openml/server-api PR: 267
File: src/routers/openml/runs.py:0-0
Timestamp: 2026-03-08T07:45:07.969Z
Learning: In the openml/server-api Python codebase, non-boolean ENUM values stored as strings coming from the database should be treated as their literal string values, not coerced via truthiness checks. Ensure that fields like row.selected (with MySQL ENUM('true','false')) are passed through as the exact string (e.g., 'true' or 'false') and validated by Pydantic using Literal['true', 'false'] at the schema boundary. Do not apply a ternary like 'true' if row.selected else 'false'. Review places where the value is transformed before response: rely on the string value and let Pydantic enforce the literal type, and adjust any serialization if necessary to avoid implicit boolean conversion.

@saathviksheerla
Contributor Author

@PGijsbers ready for your review.
All local tests pass (144 passed).
The failing CI check is test_estimation_procedure_list in evaluationmeasures_test.py — it's a pre-existing failure on main unrelated to this PR (extra 'percentage': None field in the response diff).


Development

Successfully merging this pull request may close these issues.

GET /run/trace/{id}

1 participant