Redesign simulation data model around cases and executions by Copilot · Pull Request #107 · E3SM-Project/simboard

Copilot · 2026-02-24T22:10:52Z

Description

Introduces a Case model to group simulations by case name (from timing files), replaces per-case canonical folding with per-execution Simulation records, and restructures the API so Case is the aggregate root. Each archive subdirectory (<jobID>.<timestamp>) becomes its own Simulation row linked to a Case via case_id. The first successful run per case is the canonical baseline; subsequent runs store a config diff dict against it.

Data Model

New Case table: name (unique), case_group (nullable, indexed — from CASE_GROUP in env_case.xml), canonical_simulation_id FK
Simulation changes: adds case_id FK, execution_id (unique), run_config_deltas (JSONB dict); drops case_name, name, group_name
CASE_HASH evaluated but intentionally not used — case grouping is directory-name based
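
As a rough illustration of the shape described above, the two entities can be sketched with plain dataclasses. This is not the project's SQLAlchemy code: the class layout, the `id` fields, and the type choices are assumptions, and only the column names come from the list above.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Simulation:
    """One archived execution (<jobID>.<timestamp>) of a case."""

    case_id: uuid.UUID
    execution_id: str  # unique across all simulations
    # None for the canonical baseline; otherwise field -> {"canonical", "current"}
    run_config_deltas: Optional[dict[str, dict[str, Any]]] = None
    id: uuid.UUID = field(default_factory=uuid.uuid4)  # assumed primary key


@dataclass
class Case:
    """Aggregate root grouping every execution that shares a case name."""

    name: str  # unique
    case_group: Optional[str] = None  # from CASE_GROUP in env_case.xml
    canonical_simulation_id: Optional[uuid.UUID] = None
    id: uuid.UUID = field(default_factory=uuid.uuid4)  # assumed primary key


case = Case(name="v3.LR.historical_0121")
baseline = Simulation(case_id=case.id, execution_id="1081156.251218-200923")
case.canonical_simulation_id = baseline.id
rerun = Simulation(
    case_id=case.id,
    execution_id="1081290.251218-211543",
    run_config_deltas={"compiler": {"canonical": "gcc-11", "current": "gcc-12"}},
)
```

The Case points at its baseline through `canonical_simulation_id`, while every Simulation points back through `case_id`, so both directions of the relationship are explicit.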

Ingestion

Every successful run creates its own Simulation; first per case → canonical (run_config_deltas = NULL)
Non-canonical runs store dict[field, {"canonical": ..., "current": ...}] deltas
Deduplication by execution_id; incomplete runs skipped with warning
_get_or_create_case() resolves by case_name; CASE_GROUP stored on Case with conflict-safe upsert
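
The rules above can be sketched as a single per-run function. This is a minimal illustration, not the project's `ingest_archive`: `ArchiveState`, `ingest_run`, `REQUIRED_FIELDS`, and `COMPARED_FIELDS` are hypothetical names, and storing an empty dict when a non-canonical run is identical to the baseline is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical required and compared fields; the real lists live in SimBoard.
REQUIRED_FIELDS = ("case_name", "execution_id", "compiler")
COMPARED_FIELDS = ("compiler", "grid_resolution")


@dataclass
class ArchiveState:
    """In-memory stand-in for what the database already holds."""

    seen_execution_ids: set[str] = field(default_factory=set)
    canonical_by_case: dict[str, dict[str, Any]] = field(default_factory=dict)
    skipped_count: int = 0


def ingest_run(state: ArchiveState, run: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the new simulation record, or None if the run is skipped."""
    if any(run.get(name) is None for name in REQUIRED_FIELDS):
        state.skipped_count += 1  # incomplete run: skipped and counted
        return None
    if run["execution_id"] in state.seen_execution_ids:
        return None  # already ingested: deduplicate by execution_id
    state.seen_execution_ids.add(run["execution_id"])

    baseline = state.canonical_by_case.get(run["case_name"])
    if baseline is None:
        state.canonical_by_case[run["case_name"]] = run
        deltas = None  # the canonical baseline stores no deltas
    else:
        deltas = {
            name: {"canonical": baseline.get(name), "current": run.get(name)}
            for name in COMPARED_FIELDS
            if baseline.get(name) != run.get(name)
        }
    return {
        "case_name": run["case_name"],
        "execution_id": run["execution_id"],
        "run_config_deltas": deltas,
    }
```

Because the state carries both the seen execution IDs and the per-case baseline, calling `ingest_run` again with a run that was already processed returns `None`, and a run arriving later for a known case is compared against the stored baseline rather than replacing it.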

API

GET /cases, GET /cases/{id} → CaseOut with nested SimulationSummaryOut[] via selectinload
GET /simulations → flat browser with Case metadata joined (case_name, case_group)
GET /simulations/{id} → full SimulationOut with derived is_canonical, change_count
SimulationCreate/SimulationOut drop name, groupName; use caseId + executionId
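
One plausible way to derive the two computed fields is shown below. The PR text does not spell out the rule, so both definitions are assumptions: `is_canonical` compares the simulation's ID with the Case's `canonical_simulation_id`, and `change_count` is the number of keys in `run_config_deltas`. The function name is hypothetical.

```python
from __future__ import annotations

import uuid
from typing import Any, Optional


def derive_summary_fields(
    simulation_id: uuid.UUID,
    canonical_simulation_id: Optional[uuid.UUID],
    run_config_deltas: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Compute the derived response fields from stored columns (assumed rule)."""
    return {
        "is_canonical": simulation_id == canonical_simulation_id,
        # A NULL delta column (the canonical baseline) counts as zero changes.
        "change_count": len(run_config_deltas or {}),
    }


baseline_id = uuid.uuid4()
rerun_id = uuid.uuid4()
baseline_fields = derive_summary_fields(baseline_id, baseline_id, None)
rerun_fields = derive_summary_fields(
    rerun_id,
    baseline_id,
    {"compiler": {"canonical": "gcc-11", "current": "gcc-12"}},
)
```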

Frontend

Case name/group filter dropdowns on SimulationsPage
All name/groupName references replaced with caseName/executionId/caseGroup
Upload form uses caseId + executionId

Config delta format

{
  "compiler": { "canonical": "gcc-11", "current": "gcc-12" },
  "grid_resolution": { "canonical": "0.9x1.25", "current": "1.9x2.5" }
}
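
A dict in this format can be produced by comparing two flat config mappings key by key. The sketch below is illustrative only and is not the PR's `_compute_config_delta`; the function name is hypothetical, and treating a key missing on one side as `None` is an assumption.

```python
from __future__ import annotations

from typing import Any


def compute_config_delta(
    canonical: dict[str, Any], current: dict[str, Any]
) -> dict[str, dict[str, Any]]:
    """Return field -> {"canonical", "current"} for every field that differs."""
    delta: dict[str, dict[str, Any]] = {}
    for key in sorted(canonical.keys() | current.keys()):
        before = canonical.get(key)  # None when the key is absent (assumption)
        after = current.get(key)
        if before != after:
            delta[key] = {"canonical": before, "current": after}
    return delta


delta = compute_config_delta(
    {"compiler": "gcc-11", "grid_resolution": "0.9x1.25", "machine": "chrysalis"},
    {"compiler": "gcc-12", "grid_resolution": "1.9x2.5", "machine": "chrysalis"},
)
```

With these inputs `delta` holds the two entries shown in the example above, and `machine` is omitted because it did not change.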

Checklist

Code follows project style guidelines
Self-reviewed code
No new warnings
Tests added or updated (if needed)
All tests pass (locally and CI/CD)
Documentation/comments updated (if needed)
Breaking change noted (if applicable)

Breaking changes: SimulationCreate replaces caseName with caseId (UUID) + executionId (string), drops name/groupName. SimulationOut adds caseName, caseGroup, isCanonical, changeCount; drops name/groupName. CaseOut includes caseGroup and nested simulations: SimulationSummaryOut[]. run_config_deltas type changes from list[dict] to dict | None. env_case.xml required for ingestion.

Deployment Notes (if any)

Single squashed Alembic migration 20260304_400000 (revises 20260219_000000):

Creates cases table, backfills from existing case_name values
Adds case_id, execution_id to simulations; derives execution_id from UUID for existing rows
Sets canonical per case (earliest simulation)
Converts legacy list-format run_config_deltas to dict
Backfills case_group from existing group_name
Drops case_name, name, group_name columns and old constraints

Downgrade is blocked because reversing the migration would require destructive changes to data.
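
The "earliest simulation per case" step of the migration can be illustrated in plain Python. The migration itself presumably does this in SQL; here "earliest" is assumed to mean the smallest `created_at` timestamp with the simulation ID as a tie-breaker, and `pick_canonical` is a hypothetical helper.

```python
from __future__ import annotations

from datetime import datetime


def pick_canonical(rows: list[tuple[str, str, datetime]]) -> dict[str, str]:
    """Map each case_id to its earliest simulation_id.

    rows holds (simulation_id, case_id, created_at) tuples.
    """
    earliest: dict[str, tuple[datetime, str]] = {}
    for simulation_id, case_id, created_at in rows:
        candidate = (created_at, simulation_id)
        # Keep the smallest (timestamp, id) pair seen so far for this case.
        if case_id not in earliest or candidate < earliest[case_id]:
            earliest[case_id] = candidate
    return {case_id: sim_id for case_id, (_, sim_id) in earliest.items()}


canonical = pick_canonical(
    [
        ("sim-2", "case-1", datetime(2026, 1, 2)),
        ("sim-1", "case-1", datetime(2026, 1, 1)),
        ("sim-3", "case-2", datetime(2026, 1, 3)),
    ]
)
```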

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

astral.sh
- Triggering command: /usr/bin/curl curl -LsSf REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Original prompt

This section details the original issue you should resolve

<issue_title>[Enhancement]: Formalize performance_archive semantics and implement canonical run ingestion</issue_title>
<issue_description>## Is your feature request related to a problem?

There is ambiguity in how directories under performance_archive and case subdirectories (<jobID>.<timestamp>) should be interpreted.

This impacts:

SimBoard’s data model (Case vs. Run)
Canonical run selection
Incremental update behavior
Ingestion logic (e.g., main_parser())

Current assumptions may incorrectly treat all archive entries as completed production runs or treat job subdirectories as independent experiments rather than successive executions of the same case.

Describe the solution you'd like

Document and formalize conventions for how performance_archive should be interpreted and ingested, and update the data model and ingestion logic accordingly.

1. `performance_archive` Semantics

The archive can include incomplete or failed runs.
Some directories may contain build provenance only, with no run provenance (e.g., job never executed or crashed).
Ingestion must not assume all entries are completed production runs.
A run is considered successful only if all required run metadata is present.
Incomplete runs should be detected and either skipped or classified explicitly.

2. Case Directory Structure

Example:

ac.golaz/v3.LR.historical_0121
├── 1081156.251218-200923
├── 1081290.251218-211543
└── 1129347.260120-133246

Subdirectories (`<jobID>.<timestamp>`)

These represent multiple executions of the same case, not separate experiments.
Long simulations (e.g., 100-year runs) require multiple job submissions.
Each submission appears separately in performance_archive.
Timing may vary across runs due to machine load.
All subdirectories are logically grouped under the same casename.

3. Canonical Run Strategy

Recommended ingestion semantics:

Treat the casename (experiment name) as the primary entity.
Identify and store the first successful run as the canonical baseline.
- “Successful” = all required metadata fields are present.
For subsequent successful runs:
- Compare against the canonical baseline.
- Record only configuration differences (e.g., changed namelist values).
- Do not ingest timing data (handled by PACE).
Expect certain values to differ across runs (e.g., total days run, starting day).

4. Handling New Runs (Incremental Updates)

When a new <jobID>.<timestamp> directory appears under an existing casename:

Do not replace existing data.
Read the new run.
If successful:
- Compare it to the stored canonical baseline.
- Record any configuration differences.
Maintain grouping under the casename.

Runs are successive executions of the same case whose configuration deltas are tracked over time.

Required Changes

Data Model

Explicitly model Case (casename) as the primary entity.
Model Run as a child entity representing individual executions.
Preserve the existing unique constraint on simulations:
- (case_name, machine_id, simulation_start_date)
Ensure ingestion respects this constraint to maintain idempotency.

Ingestion (`main_parser()`)

Update logic to:

Detect and classify incomplete/failed runs.
Group <jobID>.<timestamp> directories under a single casename.
Identify the first successful run (all required metadata present) as canonical.
Compare subsequent runs against the canonical baseline.
Support idempotent ingestion using the existing uniqueness constraint.
Support incremental updates without overwriting prior data.

Initial DB Population Rules

Define and document whether:

All successful runs are stored (baseline + deltas), or
Only canonical baseline runs are fully stored, with others stored as diffs.

Deliverables

Update data model definitions (if needed).
Modify main_parser() to implement the semantics described above.
Add or update tests covering:
- Incomplete runs
- Multiple runs under one casename
- Canonical selection behavior
- Incremental ingestion
Update relevant documentation.

Describe alternatives you've considered

No response

Additional context

Follow up to:
#88 (comment)

Rob answer:
#88 (comment)</issue_description>

<agent_instructions>You are implementing the semantics defined in this issue.

Scope constraints:

Do not introduce unrelated architectural refactors.
Limit changes to:
- Data model (if required)
- main_parser() and directly related ingestion utilities
- Tests
- Documentation updates tied to this behavior

Implementation requirements:

Canonical Selection
- A run is "successful" only if all required metadata fields are pre...

Fixes [Enhancement]: Formalize performance_archive semantics and implement canonical run ingestion #104


- main_parser: skip incomplete runs gracefully, sort dirs deterministically for reproducible canonical selection
- ingest_archive: group runs by case_name, ingest first successful run as canonical, store config deltas for subsequent runs in extra JSONB field
- Add _compute_config_delta and _attach_config_delta helpers
- Add skipped_count to IngestArchiveResult
- Update test for new incomplete-run skip behavior

Co-authored-by: tomvothecoder <25624127+tomvothecoder@users.noreply.github.com>
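
Deterministic ordering of the `<jobID>.<timestamp>` directories can be sketched as follows. The commit does not state the sort key, so ordering by parsed timestamp and then job ID is an assumption, as is reading the timestamp as `YYMMDD-HHMMSS`; `Execution`, `parse_execution_dir`, and `sort_executions` are hypothetical names.

```python
from __future__ import annotations

from datetime import datetime
from typing import Iterable, NamedTuple


class Execution(NamedTuple):
    job_id: str
    started: datetime
    dirname: str


def parse_execution_dir(dirname: str) -> Execution:
    """Split '<jobID>.<timestamp>' and parse the timestamp (assumed YYMMDD-HHMMSS)."""
    job_id, _, stamp = dirname.partition(".")
    if not job_id or not stamp:
        raise ValueError(f"not a <jobID>.<timestamp> directory: {dirname!r}")
    return Execution(job_id, datetime.strptime(stamp, "%y%m%d-%H%M%S"), dirname)


def sort_executions(dirnames: Iterable[str]) -> list[Execution]:
    """Order executions by start time, then job ID, independent of listing order."""
    return sorted(
        (parse_execution_dir(name) for name in dirnames),
        key=lambda e: (e.started, e.job_id),
    )


ordered = sort_executions(
    ["1129347.260120-133246", "1081156.251218-200923", "1081290.251218-211543"]
)
```

Sorting on parsed values rather than on filesystem listing order means the same directory is chosen as the first candidate for the canonical run on every ingestion pass.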

idempotent/incremental ingestion

Tests cover:
- Incomplete runs skipped alongside valid runs
- Multiple runs under same casename (performance_archive)
- Deterministic sort order for canonical selection
- Config delta stored for non-canonical runs
- No delta when configs are identical
- Different case_names create separate simulations
- Idempotent re-ingestion
- Incremental ingestion of new runs

Co-authored-by: tomvothecoder <25624127+tomvothecoder@users.noreply.github.com>

- Add run_config_deltas column to SQLAlchemy Simulation model
- Add field to SimulationCreate and SimulationOut schemas
- Create Alembic migration with data migration from extra
- Update _attach_config_delta to write to new column
- Update tests to assert on new column instead of extra
- Remove all legacy extra["run_config_deltas"] references

Co-authored-by: tomvothecoder <25624127+tomvothecoder@users.noreply.github.com>

- Change type from dict[str, Any] | None to list[dict[str, Any]] | None to match actual list-append usage in _attach_config_delta
- Improve field description to clarify that it stores deltas between canonical and non-canonical runs, not just canonical simulations

Co-authored-by: tomvothecoder <25624127+tomvothecoder@users.noreply.github.com>

tomvothecoder · 2026-03-03T18:59:09Z

@copilot Fix recent CI/CD issues

- Import Case, _derive_execution_id, _get_or_create_case
- Replace case_name on Simulation with case_id + execution_id
- Use path-like exp_dir keys for _derive_execution_id
- Update canonical run tests: every run gets its own SimulationCreate
- Change run_config_deltas from list[dict] to dict
- Deduplication now by execution_id instead of composite key
- Update assertions for new model structure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
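
The PR description says `_get_or_create_case()` resolves a Case by name with a conflict-safe upsert. The pattern can be sketched with the standard-library `sqlite3` module standing in for the project's database. The table layout, the integer key, and the rule that a stored `case_group` is only filled in when it is still NULL are all assumptions made for this illustration.

```python
from __future__ import annotations

import sqlite3
from typing import Optional


def get_or_create_case(
    conn: sqlite3.Connection, name: str, case_group: Optional[str] = None
) -> int:
    """Return the id of the case named `name`, creating it if needed."""
    # INSERT OR IGNORE leaves an existing row untouched when the unique
    # name already exists, so repeated or racing inserts do not fail.
    conn.execute(
        "INSERT OR IGNORE INTO cases (name, case_group) VALUES (?, ?)",
        (name, case_group),
    )
    if case_group is not None:
        # Assumed policy: fill in a missing group, never overwrite a stored one.
        conn.execute(
            "UPDATE cases SET case_group = ? WHERE name = ? AND case_group IS NULL",
            (case_group, name),
        )
    row = conn.execute("SELECT id FROM cases WHERE name = ?", (name,)).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE, case_group TEXT)"
)
first_id = get_or_create_case(conn, "v3.LR.historical_0121")
second_id = get_or_create_case(conn, "v3.LR.historical_0121", "v3.LR")
```

Both calls return the same ID and the table holds a single row, which is the property ingestion relies on when many executions of one case arrive in the same pass.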

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Replace caseName with caseId (UUID) and executionId (string) in all SimulationCreate mock objects
- Import Case model and uuid4
- Create Case records in DB before each test that persists simulations
- Replace Simulation.case_name queries with Simulation.case_id queries using pre-created Case objects

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

rljacob · 2026-03-05T00:05:25Z

No the CASE_HASH should be the same in different submissions. Where are those simulation directories?

tomvothecoder · 2026-03-05T18:36:53Z

> No the CASE_HASH should be the same in different submissions. Where are those simulation directories?

/lcrc/group/e3sm/PERF_Chrysalis/performance_archive/ac.golaz/v3.LR.historical_0121

- Fix UUID and PG_UUID mypy type annotation issues

Copilot

Pull request overview

Copilot reviewed 31 out of 32 changed files in this pull request and generated 4 comments.

Initial plan

ecdf59d

Copilot AI assigned Copilot and tomvothecoder Feb 24, 2026

Copilot started work on behalf of tomvothecoder February 24, 2026 22:11

Copilot AI and others added 2 commits February 24, 2026 22:19

Copilot AI changed the title ~~[WIP] Enhance performance_archive semantics and ingestion logic~~ Formalize performance_archive semantics and implement canonical run ingestion Feb 24, 2026

Copilot AI requested a review from tomvothecoder February 24, 2026 22:28

Copilot finished work on behalf of tomvothecoder February 24, 2026 22:28

Copilot started work on behalf of tomvothecoder February 24, 2026 22:39

Copilot stopped work on behalf of tomvothecoder due to an error February 24, 2026 22:39
The session was cancelled by the user.

Remove single_value

c22374d

Copilot started work on behalf of tomvothecoder March 3, 2026 18:10

tomvothecoder and others added 3 commits March 3, 2026 10:11

Remove single_value test

02a40dd

Copilot finished work on behalf of tomvothecoder March 3, 2026 18:19

Copilot started work on behalf of tomvothecoder March 3, 2026 19:02

Copilot stopped work on behalf of tomvothecoder due to an error March 3, 2026 22:25
Copilot encountered an unknown Git error.

Copilot started work on behalf of tomvothecoder March 3, 2026 22:27

Copilot finished work on behalf of tomvothecoder March 3, 2026 22:30

Copilot started work on behalf of tomvothecoder March 3, 2026 22:33

Copilot AI and others added 3 commits March 3, 2026 22:50

Fix execution_id format in datetime parsing test

f9ca49d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot finished work on behalf of tomvothecoder March 4, 2026 23:47

tomvothecoder requested a review from Copilot March 4, 2026 23:51

Copilot started reviewing on behalf of tomvothecoder March 4, 2026 23:52

tomvothecoder added 7 commits March 5, 2026 10:45

Update docs in migration file for accuracy

c7debbd

Block ability to downgrade migration as it is destructive

0039b87

Add support for skipped_count in ingestion

e0f7caf

Update create_simulation API to set canonical on case

8fffa9d

- Fix UUID and PG_UUID mypy type annotation issues

Replace uuid.UUID with UUID and .first() with .one_or_none()

7abe6ef

Refactor ingest_archive and add complete code coverage

60db180

Fix UUID type annotation

60b65da

tomvothecoder added this to the FY26Q2 (01/01/26 – 03/31/26) milestone Mar 5, 2026

tomvothecoder requested a review from Copilot March 5, 2026 21:07

Copilot started reviewing on behalf of tomvothecoder March 5, 2026 21:07

Copilot AI reviewed Mar 5, 2026

tomvothecoder added 2 commits March 5, 2026 14:42

Update variable names for clarity

b167c15

Fix module docstring

c725b6f

tomvothecoder requested a review from Copilot March 5, 2026 22:45

This was referenced Mar 5, 2026

[Enhancement]: Align Simulation pages with execution-centric model #120

Closed

[Enhancement]: Support Case-Centered Browsing Across Home and Browse Views #121

Closed

tomvothecoder changed the title ~~Refactor: Case-based grouping, canonical baselines, per-execution simulations, and CASE_GROUP normalization~~ Redesign simulation data model around cases and executions Mar 5, 2026

tomvothecoder approved these changes Mar 5, 2026

tomvothecoder merged commit 040d00e into main Mar 5, 2026
1 check failed

tomvothecoder deleted the copilot/formalize-performance-archive-semantics branch March 5, 2026 23:00

This was referenced Mar 10, 2026

[Enhancement]: Evaluate CASE_HASH as the canonical Case identifier #136

Open

[Enhancement]: Drop parent_simulation_id from Simulation objects #137

Closed

Copilot AI mentioned this pull request Mar 23, 2026

Drop parent_simulation_id from Simulation surfaces #149

Merged

7 tasks

Copilot AI review requested due to automatic review settings March 23, 2026 23:31

tomvothecoder merged 42 commits into main from copilot/formalize-performance-archive-semantics

Copilot AI commented Feb 24, 2026 • edited by tomvothecoder

4 participants
