Add GET /runs/trace/{run_id} endpoint #267
saathviksheerla wants to merge 4 commits into openml:main
Conversation
Walkthrough
Adds database helpers to fetch a run and its ordered trace iterations in src/database/runs.py. Adds Pydantic schemas for trace responses in src/schemas/runs.py. Introduces a FastAPI router at /runs/trace/{run_id} in src/routers/openml/runs.py that checks run existence, returns structured trace data, or responds with 412 codes for missing run or missing trace. Registers the new router in src/main.py. Adds tests in tests/routers/openml/runs_test.py covering successful retrieval, ordering, nonexistent run, missing trace, and invalid run_id cases.
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
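The flow the walkthrough describes (check run existence, return structured trace data, or respond with 412 for a missing run or missing trace) can be sketched in plain Python. This is an illustrative sketch only: the real PR uses Pydantic models and a FastAPI router, which are replaced here with dicts so the control flow is visible; field names are inferred from this summary and the review comments below, not copied from the PR.

```python
from dataclasses import dataclass


@dataclass
class TraceRow:
    """One row of the run trace table, as returned by the DB helper."""
    repeat: int
    fold: int
    iteration: int
    setup_string: str
    evaluation: str
    selected: str  # MySQL ENUM('true', 'false') comes back as a string


def build_trace_response(run_exists: bool, rows: list[TraceRow], run_id: int) -> dict:
    # Mirrors the PHP API behavior: a 412-style error for a run that does
    # not exist, another 412 for a run without a trace, otherwise the
    # structured trace with PHP-compatible stringified numeric fields.
    if not run_exists:
        return {"status": 412, "detail": "Run does not exist"}
    if not rows:
        return {"status": 412, "detail": "Run has no trace"}
    return {
        "trace": {
            "run_id": str(run_id),
            "trace_iterations": [
                {
                    "repeat": str(r.repeat),
                    "fold": str(r.fold),
                    "iteration": str(r.iteration),
                    "setup_string": r.setup_string,
                    "evaluation": r.evaluation,
                    "selected": r.selected,
                }
                for r in rows
            ],
        }
    }
```

In the actual endpoint the two error branches would raise `HTTPException` and the success branch would construct the Pydantic response model instead of a dict.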
Hey - I've left some high level feedback:
- The `expdb` dependency in `get_run_trace` doesn't need a default of `None`—dropping the default makes the intent clearer that it must always be provided by FastAPI's dependency injection.
- In `TraceIteration`, consider constraining `selected` to a `Literal["true", "false"]` to encode the PHP-compatible value space directly in the schema and catch accidental deviations early.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `expdb` dependency in `get_run_trace` doesn't need a default of `None`—dropping the default makes the intent clearer that it must always be provided by FastAPI's dependency injection.
- In `TraceIteration`, consider constraining `selected` to a `Literal["true", "false"]` to encode the PHP-compatible value space directly in the schema and catch accidental deviations early.
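The `Literal` suggestion can be illustrated without Pydantic: `typing.Literal` declares the allowed value space, and an explicit check stands in for the validation Pydantic would perform automatically at the schema boundary. The function name below is illustrative, not from the PR.

```python
from typing import Literal, get_args

# The PHP-compatible value space for the `selected` field.
Selected = Literal["true", "false"]
ALLOWED_SELECTED: tuple[str, ...] = get_args(Selected)  # ("true", "false")


def validate_selected(value: str) -> str:
    # With a Pydantic field typed `Literal["true", "false"]`, model
    # construction performs this check itself; here it is explicit so
    # accidental deviations (e.g. "True", "1") fail fast with a clear message.
    if value not in ALLOWED_SELECTED:
        raise ValueError(
            f"selected must be one of {ALLOWED_SELECTED}, got {value!r}"
        )
    return value
```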
🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)
35-47: Normalize `setup_string`, `evaluation`, and `selected` to match the explicit type-casting of other fields.
The code explicitly normalizes `run_id`, `repeat`, `fold`, and `iteration` with `str()`, but forwards `setup_string`, `evaluation`, and `selected` directly from the SQL row into a schema requiring `str`. While tests validate these fields are already strings and `selected` is "true" or "false", this inconsistency creates a fragility point: if the database ever stores a `NULL` value or non-string type for these fields, Pydantic model construction will fail without a clear error message.
Add `str()` calls to these three fields at lines 44-46 to match the pattern used for the other iteration fields and ensure consistent defensive programming.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/runs.py` around lines 35 - 47, The TraceIteration construction in RunTraceResponse currently casts run_id/repeat/fold/iteration to str but passes setup_string, evaluation, and selected directly; update the TraceIteration creation (inside RunTraceResponse -> RunTrace -> trace_iteration) to wrap setup_string, evaluation, and selected with str() as well so all iteration fields are normalized (e.g., use str(row.setup_string), str(row.evaluation), str(row.selected)) to avoid Pydantic errors from NULL/non-string DB values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/routers/openml/runs.py`:
- Around line 35-47: The TraceIteration construction in RunTraceResponse
currently casts run_id/repeat/fold/iteration to str but passes setup_string,
evaluation, and selected directly; update the TraceIteration creation (inside
RunTraceResponse -> RunTrace -> trace_iteration) to wrap setup_string,
evaluation, and selected with str() as well so all iteration fields are
normalized (e.g., use str(row.setup_string), str(row.evaluation),
str(row.selected)) to avoid Pydantic errors from NULL/non-string DB values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7a66e163-c68e-4081-935d-0ad5753181e2
📒 Files selected for processing (5)
src/database/runs.py, src/main.py, src/routers/openml/runs.py, src/schemas/runs.py, tests/routers/openml/runs_test.py
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
##             main     #267   +/-   ##
=======================================
  Coverage        ?   54.76%
=======================================
  Files           ?       35
  Lines           ?     1165
  Branches        ?      103
=======================================
  Hits            ?      638
  Misses          ?      525
  Partials        ?        2
☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/routers/openml/runs.py`:
- Around line 40-47: Convert the `selected` value to a string when constructing
the TraceIteration so it matches the other iteration fields and the Pydantic
schema: change the construction in the TraceIteration call that currently passes
selected=row.selected to use selected=str(row.selected) (locate the
TraceIteration instantiation that sets repeat=str(row.repeat),
fold=str(row.fold), iteration=str(row.iteration)) so `selected` becomes a string
literal "true"/"false" as expected by the model.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b3a49664-e599-42cd-b1f4-f36558498629
📒 Files selected for processing (2)
src/routers/openml/runs.py, src/schemas/runs.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/schemas/runs.py
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/routers/openml/runs.py`:
- Line 51: Normalize the selected value explicitly instead of relying on
truthiness: replace the ternary that uses row.selected (the selected key in the
serialized output) with an explicit normalization check that handles booleans,
integers, enums, and strings (for example, treat values as true only if they
match True, 1, "1", "true", "t", or "yes" case-insensitively); update the
serialization that sets selected="true"/"false" to use that normalization logic
(or a small helper like normalize_selected(value)) so string values like "false"
no longer serialize as "true".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1415a658-924d-4dbd-a450-2656165973f4
📒 Files selected for processing (1)
src/routers/openml/runs.py
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/routers/openml/runs.py`:
- Around line 45-51: When constructing the TraceIteration instance, explicitly
cast the database value for evaluation to a string instead of forwarding the raw
value; update the TraceIteration(...) call (the evaluation argument alongside
repeat, fold, iteration) to use str(row.evaluation) so TraceIteration.evaluation
is always a string and matches the model/tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 0101722f-6ed4-4983-a863-5d2df0d9f22d
📒 Files selected for processing (1)
src/routers/openml/runs.py
TraceIteration(
    repeat=str(row.repeat),
    fold=str(row.fold),
    iteration=str(row.iteration),
    setup_string=row.setup_string,
    evaluation=row.evaluation,
    selected=row.selected,
🧩 Analysis chain
🏁 Scripts executed (repository: openml/server-api):

# First, locate and examine the TraceIteration schema definition
fd -t f "runs.py" -path "*/schemas/*" | head -5

# Also search for TraceIteration class definition
rg "class TraceIteration" --type py -A 10

# Check the test file for evaluation assertions
fd -t f "runs_test.py" -path "*/tests/*"

# Look for database schema/migrations related to trace tables
fd -t f -e "py" -path "*migration*" | grep -i trace | head -10

# Examine the actual code at lines 45-51 in the file under review
cat -n src/routers/openml/runs.py | sed -n '40,55p'

# Find and examine the test file
find . -path "*/tests/*" -name "*runs*test*" -type f

# Check for database schema definitions (migrations or fixtures)
find . -path "*migration*" -o -path "*fixture*" -o -path "*schema*" | grep -i "\.py$\|\.sql$" | head -20

# Search for evaluation field references in tests
rg "evaluation" tests/ -A 2 -B 2 | head -50

# Look for the database model/ORM definition for the runs trace table
rg "class.*Trace\|run.*trace\|trace.*iteration" src/ --type py -i | grep -i "class\|table" | head -10

# Look for SQLAlchemy ORM models
find src -name "*.py" | xargs rg "class.*Trace|Table.*trace" -i | head -20

# Search for database models or declarative base
rg "declarative_base\|@dataclass\|class.*Base\|from sqlalchemy" src/ | head -20

# Look in src/db or similar for model definitions
find src -type d -name "*db*" -o -name "*model*" -o -name "*orm*"

# Check if there's any initialization SQL or schema file
find . -name "*.sql" -o -name "*init*.py" -path "*/db/*" | head -10

# Search for trace table definition
rg "trace_iteration\|selected.*ENUM\|evaluation.*DECIMAL\|evaluation.*Float" -i

# Search for trace table definition in the SQL schema files
rg "trace_iteration|CREATE TABLE.*trace" docker/mysql/data/ -i -A 15

# Look for the specific evaluation column definition
rg "evaluation" docker/mysql/data/ -B 5 -A 2 | head -60

# Check the size of the SQL files first
wc -l docker/mysql/data/openml.sql docker/mysql/data/openml_expdb.sql

# Search for trace in the SQL files
grep -n -i "trace" docker/mysql/data/openml.sql docker/mysql/data/openml_expdb.sql | head -20

# Look for run related tables
grep -n -i "run" docker/mysql/data/openml.sql docker/mysql/data/openml_expdb.sql | grep -i "create table" | head -10

# Read the SQL files directly
cat docker/mysql/data/openml.sql
cat docker/mysql/data/openml_expdb.sql

# Search for where the trace rows are queried - this will show us the table structure
rg "trace.*row\|select.*trace\|from.*trace" src/ -i -B 2 -A 2 | head -40

# Show more context from runs.py to find where trace_rows comes from
cat -n src/routers/openml/runs.py | head -60

# Search for the query that selects trace_rows
rg "trace_rows\|trace.*select" src/routers/openml/runs.py -B 5 -A 5

# Find the database.runs module
find src -path "*database*runs*" -name "*.py"

# If not found, check the structure of database directory
find src -type d -name "database" -o -name "db"

# Read the database/runs.py file
cat src/database/runs.py

# Look for any migrations or table definitions mentioning trace table
find . -name "*.py" -path "*migration*" | head -20

# Search for trace table in all Python files
rg "trace" src/ -t py -B 2 -A 2 | grep -i "decimal\|float\|varchar\|numeric" | head -20

# Check if there are any comments or docstrings about the trace table
rg "trace.*table\|evaluation.*column" src/ -i -B 3 -A 3

# Let's check the test file to see if there's any fixture or test data that shows the type
cat -n tests/routers/openml/runs_test.py | head -100
Stringify evaluation before building TraceIteration.
TraceIteration.evaluation is declared as str, and tests explicitly assert it must be a string, but line 50 forwards the raw DB value. Lines 46–48 cast repeat, fold, and iteration with str() for PHP compatibility; evaluation should follow the same pattern. If that column is numeric in the DB, the current code relies on implicit Pydantic coercion instead of explicit serialization.
🛠️ Proposed fix
  TraceIteration(
      repeat=str(row.repeat),
      fold=str(row.fold),
      iteration=str(row.iteration),
      setup_string=row.setup_string,
-     evaluation=row.evaluation,
+     evaluation=str(row.evaluation),
      selected=row.selected,
  )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/routers/openml/runs.py` around lines 45 - 51, When constructing the
TraceIteration instance, explicitly cast the database value for evaluation to a
string instead of forwarding the raw value; update the TraceIteration(...) call
(the evaluation argument alongside repeat, fold, iteration) to use
str(row.evaluation) so TraceIteration.evaluation is always a string and matches
the model/tests.
evaluation is declared as varchar(265) in the database and SQLAlchemy returns it as a plain string — verified via HEX(LEFT(evaluation,4)). Explicit str() cast is unnecessary here, same as selected
@saathviksheerla, thanks for verifying that directly — if evaluation is varchar(265) and SQLAlchemy already returns it as a plain string, the explicit str() cast is indeed redundant. My suggestion was precautionary without confirmed knowledge of the column type; your hex verification settles it. No change needed here.
✏️ Learnings added
Learnt from: saathviksheerla
Repo: openml/server-api PR: 267
File: src/routers/openml/runs.py:45-51
Timestamp: 2026-03-08T07:59:51.515Z
Learning: In the openml/server-api Python codebase (src/routers/openml/runs.py), the `evaluation` column on the run trace table is declared as `varchar(265)` in the database. SQLAlchemy returns it as a plain Python string, so no explicit `str()` cast is required when constructing `TraceIteration(evaluation=row.evaluation, ...)`. This is analogous to `selected` (MySQL ENUM) which is also passed through as-is.
🧠 Learnings used
Learnt from: saathviksheerla
Repo: openml/server-api PR: 267
File: src/routers/openml/runs.py:0-0
Timestamp: 2026-03-08T07:45:07.969Z
Learning: In the openml/server-api Python codebase, non-boolean ENUM values stored as strings coming from the database should be treated as their literal string values, not coerced via truthiness checks. Ensure that fields like row.selected (with MySQL ENUM('true','false')) are passed through as the exact string (e.g., 'true' or 'false') and validated by Pydantic using Literal['true', 'false'] at the schema boundary. Do not apply a ternary like 'true' if row.selected else 'false'. Review places where the value is transformed before response: rely on the string value and let Pydantic enforce the literal type, and adjust any serialization if necessary to avoid implicit boolean conversion.
@PGijsbers ready for your review.
Closes #40
Implements the GET /runs/trace/{run_id} endpoint, migrating behavior from the PHP API.
Changes
- src/database/runs.py — SQL queries for run existence check and trace fetch
- src/schemas/runs.py — Pydantic models matching PHP response structure
- src/routers/openml/runs.py — FastAPI endpoint
- src/main.py — router registration
- tests/routers/openml/runs_test.py — 6 tests covering happy path, ordering, and error cases
Behavior