evals: add benchmark evals for cuopt-developer skill by rgsl888prabhu · Pull Request #1399 · NVIDIA/cuopt

rgsl888prabhu · 2026-06-05T20:09:32Z

Adds skills/cuopt-developer/evals/evals.json with 3 eval cases covering the benchmarks/ folder.

coderabbitai · 2026-06-05T20:14:00Z

📝 Walkthrough

A new evaluation registry file is added containing three benchmark evaluation entries for cuOpt developer skill assessment: Mittelmann LP benchmark with LP-specific build configuration, MIPLIB benchmark with per-instance logging and time limits, and multi-GPU MIPLIB batching scenario with machine-level instance distribution.

Changes

Benchmark Evaluation Configurations

Layer / File(s)	Summary
New benchmark evaluation entries `skills/cuopt-developer/evals/evals.json`	Three structured evaluation items define benchmark questions, required build flags and environment setup in `ground_truth`, and expected response checklist items in `expected_behavior` for Mittelmann LP, MIPLIB, and multi-GPU MIPLIB batching scenarios.
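As a rough illustration of the entry shape summarized above, the sketch below checks that each eval entry carries the fields a reviewer would expect. Only `ground_truth` and `expected_behavior` are named in this PR; the `id` and `question` keys, the top-level list layout, and the sample values are assumptions made for the example, not the actual contents of `evals.json`.

```python
import json

# Only ground_truth and expected_behavior come from the PR summary;
# "id" and "question" are assumed field names for illustration.
REQUIRED_FIELDS = ("id", "question", "ground_truth", "expected_behavior")


def missing_fields(entry, required=REQUIRED_FIELDS):
    """Return the required keys that are absent or empty in one eval entry."""
    return [key for key in required if not entry.get(key)]


def check_evals(entries):
    """Map each incomplete entry's id (or its index) to its missing fields."""
    problems = {}
    for index, entry in enumerate(entries):
        missing = missing_fields(entry)
        if missing:
            problems[entry.get("id", f"#{index}")] = missing
    return problems


# Hypothetical sample data; the real file's wording is not shown in this PR page.
sample = json.loads("""[
  {"id": "dev-eval-001", "question": "How do I run the Mittelmann LP benchmark?",
   "ground_truth": "placeholder", "expected_behavior": ["placeholder"]},
  {"id": "dev-eval-002", "question": "How do I run MIPLIB?", "ground_truth": ""}
]""")

print(check_evals(sample))
```

A check like this could run in CI before the eval harness, so that an entry with an empty `ground_truth` is caught early rather than producing a meaningless score.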

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested labels

non-breaking, improvement

Suggested reviewers

Iroy30

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title 'evals: add benchmark evals for cuopt-developer skill' clearly and concisely summarizes the main change—adding evaluation cases for the cuopt-developer skill focused on benchmarks.
Description check	✅ Passed	The description 'Adds skills/cuopt-developer/evals/evals.json with 3 eval cases covering the benchmarks/ folder' directly relates to the changeset by specifying the file added and its purpose.

Adds skills/cuopt-developer/evals/evals.json with 3 eval cases covering the benchmarks/ folder (not covered by any existing eval):

- dev-eval-001: Mittelmann LP benchmark (solve_LP binary, BUILD_LP_BENCHMARKS flag, CUDA_MODULE_LOADING=EAGER, get_datasets.py)
- dev-eval-002: MIPLIB setup and run (download/gunzip, run_mps_files.sh, --write-log-file, --presolve t, --log-to-console false)
- dev-eval-003: multi-GPU + batch splitting (--gpus-per-instance 2, --batch-num / --n-batches, CUDA_VISIBLE_DEVICES, --cut-mode)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
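To make the dev-eval-003 scenario more concrete, here is a minimal sketch of how a list of instances might be divided across machines and how GPUs might be grouped per solve. It assumes a round-robin split and a 0-based batch index; the functions `select_batch` and `gpu_groups` are hypothetical helpers, and the actual behavior of `--batch-num`, `--n-batches`, and `--gpus-per-instance` in the benchmark script may differ.

```python
def select_batch(instances, batch_num, n_batches):
    """Return the instances for one batch using a round-robin split (assumed)."""
    if not 0 <= batch_num < n_batches:
        raise ValueError("batch_num must be in [0, n_batches)")
    return instances[batch_num::n_batches]


def gpu_groups(gpu_ids, gpus_per_instance):
    """Chunk GPU ids into CUDA_VISIBLE_DEVICES strings, one per concurrent solve.

    GPUs left over after the last full group are not used.
    """
    usable = len(gpu_ids) - len(gpu_ids) % gpus_per_instance
    return [
        ",".join(str(g) for g in gpu_ids[i:i + gpus_per_instance])
        for i in range(0, usable, gpus_per_instance)
    ]


# Hypothetical instance names, used only to show the split.
instances = [f"instance_{i}.mps" for i in range(7)]

print(select_batch(instances, 0, 2))  # batch for the first of two machines
print(select_batch(instances, 1, 2))  # batch for the second machine
print(gpu_groups([0, 1, 2, 3], 2))    # two GPUs per concurrent solve
```

Under these assumptions, every instance lands in exactly one batch, and a 4-GPU machine with two GPUs per instance can run two solves at a time.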

rgsl888prabhu · 2026-06-05T20:22:08Z

/nvskills-ci

rgsl888prabhu requested a review from a team as a code owner June 5, 2026 20:09

rgsl888prabhu requested a review from tmckayus June 5, 2026 20:09

rgsl888prabhu force-pushed the add-benchmark-evals-cuopt-developer branch from de5bdb5 to e5f1ae0 Compare June 5, 2026 20:21

rgsl888prabhu self-assigned this Jun 5, 2026

rgsl888prabhu added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels Jun 5, 2026

rgsl888prabhu changed the title ~~evals: add 3 benchmark evals for cuopt-developer skill~~ evals: add benchmark evals for cuopt-developer skill Jun 5, 2026

rgsl888prabhu wants to merge 1 commit into main from add-benchmark-evals-cuopt-developer