[Experimental PR] Self-Hosted Container Tests #1088

Malmahrouqi3 · 2025-12-15T03:13:06Z

User description

Description

Weekly tests on the latest gpu/cpu containers on Phoenix.

PR Type

Tests

Description

Add weekly Docker container testing workflow on Phoenix cluster
Test GPU and CPU containers with OpenMP and ACC interfaces
Execute MFC test suite with GPU-aware device configuration
Run tests on Friday midnight UTC schedule with manual trigger option

Diagram Walkthrough

flowchart LR
  A["GitHub Actions Scheduler"] -->|"Friday UTC"| B["Docker Test Workflow"]
  B -->|"Matrix Strategy"| C["GPU-ACC Tests"]
  B -->|"Matrix Strategy"| D["GPU-OMP Tests"]
  B -->|"Matrix Strategy"| E["CPU Tests"]
  C -->|"Execute"| F["Container Script"]
  D -->|"Execute"| F
  E -->|"Execute"| F
  F -->|"Run MFC Tests"| G["Phoenix Cluster"]

File Walkthrough

Relevant files

Tests

docker-test.yml `Weekly Docker container testing workflow definition` .github/workflows/docker-test.yml Define GitHub Actions workflow scheduled weekly on Fridays at midnight UTC Configure matrix strategy for GPU-ACC, GPU-OMP, and CPU test variants Set up self-hosted runner targeting Phoenix cluster with Georgia Tech labels Execute container test script with device and interface parameters	+45/-0
container.sh `Container build and MFC test execution script` .github/workflows/phoenix/container.sh Build Apptainer container images from Docker registry based on device and interface Execute MFC test suite with dry-run validation and actual test execution Configure GPU-specific settings including NVIDIA support and device enumeration Set test thread count dynamically based on GPU availability	+44/-0

CodeAnt-AI Description

Add weekly self-hosted container tests on the Phoenix cluster

What Changed

A scheduled GitHub Actions workflow now runs every Friday at midnight UTC (and can be triggered manually) to test container images on Phoenix
Tests run in a matrix: GPU with OpenACC, GPU with OpenMP, and CPU; they use self-hosted Phoenix runners labeled for Georgia Tech
Each job builds or pulls the container, runs the MFC test suite inside an Apptainer container with GPU-aware options, retries failing tests up to 3 times, and prints per-job logs after completion

Impact

✅ Earlier detection of container regressions
✅ Fewer late CI failures caused by missing GPU/CPU checks
✅ Clearer per-job test logs for GPU and CPU container runs

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

Summary by CodeRabbit

New Features
- Added a scheduled and manually-triggerable container test workflow and a supporting container execution script.
Chores
- Removed numerous CI workflows and cluster helper scripts (benchmarks, coverage, docs, formatting, linting, tooling, Homebrew, containerization, line counting, spelling, PMD, and cluster submit/build/test helpers).
- Consolidated CI surface to a single configured container test environment.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

PR Type

Other

Description

Removed multiple GitHub Actions workflows (bench, cleanliness, coverage, deploy-tap, docker, docs, formatting, homebrew, line-count, lint-source, lint-toolchain, pmd, spelling, test)
Added new docker-test.yml workflow for weekly container testing on Phoenix cluster
Created container.sh script for building and testing Apptainer containers with GPU/CPU support
Removed Frontier cluster test scripts (build.sh, bench.sh, submit.sh, test.sh)
Modified Phoenix submit.sh to comment out mfc.sh load command

Diagram Walkthrough

flowchart LR
  A["Old Workflows<br/>bench, test, docs, etc."] -->|"Removed"| B["Deleted Files"]
  C["New docker-test.yml"] -->|"Weekly Schedule"| D["Phoenix Cluster"]
  D -->|"Matrix Strategy"| E["GPU-ACC/OMP/CPU"]
  E -->|"Execute"| F["container.sh"]
  F -->|"Build & Test"| G["Apptainer Container"]
  G -->|"Run MFC Tests"| H["Test Results"]

File Walkthrough

Relevant files

Enhancement

2 files

docker-test.yml `New weekly Docker container testing workflow`	+47/-0
container.sh `Container build and MFC test execution script`	+46/-0

Configuration changes

1 files

submit.sh `Comment out mfc.sh load command`	+2/-2

Miscellaneous

19 files

bench.sh `Removed Frontier benchmark script`	+0/-22
build.sh `Removed Frontier build script`	+0/-26
submit-bench.sh `Removed Frontier benchmark submission script`	+0/-54
submit.sh `Removed Frontier submission script`	+0/-56
test.sh `Removed Frontier test script`	+0/-20
bench.yml `Removed benchmark workflow`	+0/-124
cleanliness.yml `Removed code cleanliness check workflow`	+0/-127
coverage.yml `Removed coverage check workflow`	+0/-50
deploy-tap.yml `Removed Homebrew tap deployment workflow`	+0/-120
docker.yml `Removed containerization workflow`	+0/-132
docs.yml `Removed documentation build workflow`	+0/-84
formatting.yml `Removed code formatting workflow`	+0/-19
homebrew.yml `Removed Homebrew formula test workflow`	+0/-259
line-count.yml `Removed line count workflow`	+0/-54
lint-source.yml `Removed source linting workflow`	+0/-53
lint-toolchain.yml `Removed toolchain linting workflow`	+0/-22
pmd.yml `Removed PMD source DRYness workflow`	+0/-131
spelling.yml `Removed spell check workflow`	+0/-17
test.yml `Removed main test suite workflow`	+0/-162

codeant-ai · 2025-12-15T03:13:10Z

CodeAnt AI is reviewing your PR.

Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

qodo-code-review · 2025-12-15T03:13:24Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Possible Issue Using --fakeroot in apptainer exec may fail on the self-hosted runner if user namespaces are not enabled or permitted; verify runner supports fakeroot or remove it for execution phase. apptainer exec \$NV_FLAG --fakeroot --writable-tmpfs \ --bind "\$SCRATCH_DIR":/scratch \ --env job_slug="\$job_slug" \ --env job_device="\$job_device" \ "\$CONTAINER" \ Unset Var Use device_opts is only set for GPU path; for CPU it remains empty but is referenced later. Ensure it is initialized to an empty string to avoid unbound variable issues if set -u is used or future changes enforce strict mode. device_opts="-g $gpu_ids" n_test_threads=`expr $gpu_count \* 2` fi ./mfc.sh test --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix ' Deprecated Node Forcing ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION to node16 with ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION may break future actions and reduces security; consider upgrading or removing this override. NODE_OPTIONS: ${{ matrix.lbl == 'gt' && '--max-old-space-size=2048' \|\| '' }} ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION: node16 ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true

coderabbitai · 2025-12-15T03:13:27Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a scheduled/manual GitHub Actions workflow (.github/workflows/docker-test.yml) and an Apptainer container script (.github/workflows/phoenix/container.sh), and removes many CI workflows and Frontier helper scripts under .github/workflows/... (bench, cleanliness, coverage, deploy-tap, docker, docs, formatting, homebrew, line-count, lint-, pmd, spelling, test, frontier/).

Changes

Cohort / File(s)	Summary
New workflow `\.github/workflows/docker-test\.yml`	Adds "Test Docker Containers" workflow with cron + workflow_dispatch triggers, a matrix (active `gt` label), self-hosted runner/group, 480-minute timeout, conditional `NODE_OPTIONS` by matrix label, checkout step, conditional cluster-specific test step, and unconditional log printing.
New container orchestration script `\.github/workflows/phoenix/container\.sh`	Adds Apptainer script selecting container image by interface/device, setting NV/GPU flags and binds, unsetting `LD_PRELOAD`, computing GPU/thread counts, performing a dry-run, and running `mfc.sh` tests with device-specific args and retries.
Removed top-level CI workflows `.github/workflows/(bench.yml	cleanliness.yml
Removed Frontier scripts & submit helpers `.github/workflows/frontier/(bench.sh	build.sh
Phoenix submit change `\.github/workflows/phoenix/submit\.sh`	Comments out the `. ./mfc.sh load -c p -m $2` line (disabled) and adds a trailing newline.

Sequence Diagram(s)

sequenceDiagram
    participant GH as GitHub Actions
    participant Runner as Self-hosted Runner
    participant Host as Runner Host
    participant Apptainer as Apptainer
    participant Container as Container Env
    participant MFC as mfc.sh

    GH->>Runner: trigger docker-test (cron/dispatch/push/PR)
    Runner->>Host: prepare env, set NODE_OPTIONS, enforce timeout
    Host->>Apptainer: select image by device/interface & launch
    Apptainer->>Container: instantiate with NV/GPU flags, binds, env
    Container->>MFC: run dry-run (mfc.sh --dry-run)
    MFC-->>Container: dry-run output
    Container->>Container: compute gpu_count, gpu_ids, thread counts
    Container->>MFC: run tests (mfc.sh with device opts, retries)
    MFC-->>Container: emit logs & exit status
    Container->>Host: return logs/artifacts
    Runner->>GH: print/upload logs, finish job

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Heterogeneous edits: new workflow + container runner script and large deletions across many CI workflows and Frontier scripts.
Pay special attention to:
- .github/workflows/docker-test.yml — matrix configuration, runner/group, timeouts, conditional steps.
- .github/workflows/phoenix/container.sh — Apptainer invocation, GPU detection, LD_PRELOAD handling, device/thread calculations, mfc.sh argument construction.
- References elsewhere to removed workflows/scripts (badges, docs, invocation scripts).

Possibly related issues

#1081 — Docker/CI changes: the new docker-test workflow and phoenix/container.sh implement container build/test automation referenced by the issue.

Possibly related PRs

#1078 — Modifies CI workflows (bench/test); strongly related due to overlapping workflow/script removals and changes.
#1066 — Changes lint-toolchain workflow which this PR removes; directly related/conflicting on that file.

Suggested labels

size:XL

Suggested reviewers

sbryngelson

Poem

🐇 I hop through builds at dawn and night,
I bind the paths and set the NV light,
A dry-run first, then threads I count with care,
I spin the tests and gather logs to share,
The Phoenix wakes — containers hum with delight.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title '[Experimental PR] Self-Hosted Container Tests' clearly and concisely summarizes the main change: adding experimental self-hosted container testing capabilities, matching the workflow additions in the changeset.
Description check	✅ Passed	The PR description is substantially complete with a clear summary of changes, type classification (Tests/Other), detailed diagram walkthrough, and file-by-file explanation of additions and removals. While some checklist items are unchecked (appropriate for infrastructure changes), the core required information is present and well-documented.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 45c7a6e and a5c0a47.

📒 Files selected for processing (1)

.github/workflows/phoenix/container.sh (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Draft a step-by-step plan before making changes; build after each step using `./mfc.sh build -t pre_process simulation -j $(nproc)`

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:16.713Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: After each successful build step, run focused tests using `./mfc.sh test -j $(nproc) -f EA8FA07E -t 9E2CA336` instead of running all tests

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability

Applied to files:

.github/workflows/phoenix/container.sh

🧬 Code graph analysis (1)

.github/workflows/phoenix/container.sh (1)

toolchain/mfc/run/run.py (1)

run (151-179)

🪛 Shellcheck (0.11.0)

.github/workflows/phoenix/container.sh

[warning] 3-3: job_interface is referenced but not assigned.

(SC2154)

[warning] 4-4: job_device is referenced but not assigned.

(SC2154)

[warning] 18-18: job_slug is referenced but not assigned.

(SC2154)

🔇 Additional comments (2)

.github/workflows/phoenix/container.sh (2)

23-31: Build options logic looks correct, contingent on job_interface fix.

The conditional build options construction for GPU with mp/acc variants is appropriate. However, this logic depends on job_interface being available inside the container—ensure the --env job_interface fix is applied.

33-44: Test execution logic is sound.

The dry-run followed by --no-build with retry logic (--max-attempts 3) and dynamic thread count based on GPU availability is well-structured.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

.github/workflows/docker-test.yml

.github/workflows/phoenix/container.sh

.github/workflows/docker-test.yml

codeant-ai · 2025-12-15T03:15:53Z

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review Stray/invalid token The file ends with a bare `EOT` token with no matching heredoc start. This will be interpreted as a command (and fail) when the script runs. Remove the stray `EOT` or correctly implement a heredoc if intended. Variable expansion bug Many occurrences escape shell variables (e.g. `\$job_device`, `\$job_slug`, `\$CONTAINER`, etc.). This prevents expansion on the host and likely passes literal `$...` strings into apptainer or the container environment, causing incorrect image names, env values, and logic to fail. Review whether variables should be expanded by the host or preserved for evaluation inside the container and remove unneeded backslashes. GPU count handling Code assumes `nvidia-smi` exists and that `gpu_count` > 0. If no GPUs are present or `nvidia-smi` is missing, `seq` with a negative end or unhandled zero value will error. Add checks and fallbacks for zero GPUs and handle `nvidia-smi` failures.

codeant-ai · 2025-12-15T03:16:55Z

.github/workflows/phoenix/container.sh

+    apptainer build mfc:latest-\$job_device.sif docker://sbryngelson/mfc:latest-mp-\$job_device
+elif [ "$job_interface" = "acc" ]; then
+    apptainer build mfc:latest-\$job_device.sif docker://sbryngelson/mfc:latest-\$job_device
+fi
+
+CONTAINER="mfc:latest-\$job_device.sif"


✅ Suggestion: Host-side variables are escaped (e.g. \$job_device) so they are not expanded by the shell on the host; this produces literal strings like "$job_device" for built image names and the CONTAINER variable, causing wrong container filenames and failed builds or execs. Remove the backslashes so variables expand on the host and quote the resulting values to avoid word-splitting. [logic error]

Severity Level: Minor ⚠️

Suggested change

apptainer build mfc:latest-\$job_device.sif docker://sbryngelson/mfc:latest-mp-\$job_device

elif [ "$job_interface" = "acc" ]; then

apptainer build mfc:latest-\$job_device.sif docker://sbryngelson/mfc:latest-\$job_device

fi

CONTAINER="mfc:latest-\$job_device.sif"

apptainer build "mfc:latest-$job_device.sif" "docker://sbryngelson/mfc:latest-mp-$job_device"

elif [ "$job_interface" = "acc" ]; then

apptainer build "mfc:latest-$job_device.sif" "docker://sbryngelson/mfc:latest-$job_device"

fi

CONTAINER="mfc:latest-$job_device.sif"

Why it matters? ⭐

This is a real logic bug: those backslashes prevent the host shell from expanding $job_device when running apptainer build and when setting CONTAINER, producing literal names like "mfc:latest-$job_device.sif" instead of the intended concrete filenames. Unescaping and quoting the expansions fixes the filenames and avoids word-splitting.

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** .github/workflows/phoenix/container.sh **Line:** 4:9 **Comment:** *Logic Error: Host-side variables are escaped (e.g. `\$job_device`) so they are not expanded by the shell on the host; this produces literal strings like "$job_device" for built image names and the `CONTAINER` variable, causing wrong container filenames and failed builds or execs. Remove the backslashes so variables expand on the host and quote the resulting values to avoid word-splitting. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

codeant-ai · 2025-12-15T03:16:55Z

.github/workflows/phoenix/container.sh

+[ "\$job_device" = "gpu" ] && NV_FLAG="--nv"
+
+apptainer exec \$NV_FLAG --fakeroot --writable-tmpfs \
+  --bind "\$SCRATCH_DIR":/scratch \
+  --env job_slug="\$job_slug" \
+  --env job_device="\$job_device" \
+  "\$CONTAINER" \


✅ Suggestion: The apptainer exec invocation and its arguments contain escaped variables (e.g. \$SCRATCH_DIR, "\$CONTAINER", \$NV_FLAG) so the host will pass literal tokens rather than expanded values; unescape these so the runtime receives actual paths, env values and the container name. [logic error]

Severity Level: Minor ⚠️

Suggested change

[ "\$job_device" = "gpu" ] && NV_FLAG="--nv"

apptainer exec \$NV_FLAG --fakeroot --writable-tmpfs \

--bind "\$SCRATCH_DIR":/scratch \

--env job_slug="\$job_slug" \

--env job_device="\$job_device" \

"\$CONTAINER" \

[ "$job_device" = "gpu" ] && NV_FLAG="--nv"

apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \

--bind "$SCRATCH_DIR":/scratch \

--env job_slug="$job_slug" \

--env job_device="$job_device" \

"$CONTAINER" \

Why it matters? ⭐

Correct: the exec arguments (NV_FLAG, --bind, --env, container name) must be expanded by the host before calling apptainer. The current escaping passes literal tokens like "$SCRATCH_DIR" or "$CONTAINER" which is wrong. Unescaping these fixes the environment and path propagation into the container.

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** .github/workflows/phoenix/container.sh **Line:** 12:18 **Comment:** *Logic Error: The apptainer exec invocation and its arguments contain escaped variables (e.g. `\$SCRATCH_DIR`, `"\$CONTAINER"`, `\$NV_FLAG`) so the host will pass literal tokens rather than expanded values; unescape these so the runtime receives actual paths, env values and the container name. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

.github/workflows/phoenix/container.sh

codeant-ai · 2025-12-15T03:17:01Z

CodeAnt AI finished reviewing your PR.

Copilot

Pull request overview

This PR adds weekly automated testing of MFC's Docker containers (CPU and GPU variants) on the Phoenix cluster at Georgia Tech. The testing uses Apptainer to pull and run the latest published Docker images, executing the full MFC test suite within each container to verify the containerized builds work correctly.

Adds a new GitHub Actions workflow (docker-test.yml) that runs weekly on Fridays
Creates a new script (container.sh) that pulls Docker images via Apptainer and runs tests inside containers
Tests three configurations: GPU with OpenACC, GPU with OpenMP, and CPU

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 16 comments.

File	Description
`.github/workflows/docker-test.yml`	Defines the weekly workflow schedule and matrix for testing CPU and GPU containers with different parallelization interfaces
`.github/workflows/phoenix/container.sh`	Script that pulls Docker containers via Apptainer and executes MFC test suite inside them

.github/workflows/phoenix/container.sh

.github/workflows/docker-test.yml

.github/workflows/phoenix/container.sh

.github/workflows/docker-test.yml

.github/workflows/phoenix/container.sh

cubic-dev-ai

3 issues found across 2 files

Prompt for AI agents (all 3 issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name=".github/workflows/docker-test.yml">

<violation number="1" location=".github/workflows/docker-test.yml:45">
P2: Log filename mismatch: The workflow expects `test-{device}-{interface}.out` but submit.sh generates `container-{device}-{interface}.out` based on the script basename. This will cause the &#39;Print Logs&#39; step to fail with &#39;file not found&#39;.</violation>
</file>

<file name=".github/workflows/phoenix/container.sh">

<violation number="1" location=".github/workflows/phoenix/container.sh:4">
P1: Inconsistent variable escaping: `$job_interface` is unescaped (expands normally) but `\$job_device` is escaped (won&#39;t expand, remains literal). Remove the backslashes to allow `$job_device` to expand properly.</violation>

<violation number="2" location=".github/workflows/phoenix/container.sh:44">
P1: Orphan `EOT` heredoc terminator without a corresponding heredoc start. This will cause a shell error when the script executes. Either remove this line or add the missing heredoc opening statement.</violation>
</file>

Since this is your first cubic review, here's how it works:

cubic automatically reviews your code and comments on bugs and improvements
Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
Ask questions if you need clarification on any suggestion

_{Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR}

.github/workflows/docker-test.yml

.github/workflows/phoenix/container.sh

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (6)

.github/workflows/phoenix/container.sh (5)
43-44: Remove the stray EOT token—it will cause a "command not found" error.

Line 44 contains an orphan EOT token with no corresponding here-document. The bash -c command string properly ends on line 43 with a closing single quote. The EOT will be executed as a shell command at runtime, causing the script to fail with "command not found."

Apply this diff to remove the extraneous line:
./mfc.sh test --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix
  '
-EOT
1-2: Add set -e for fail-fast error handling.

Shell scripts should exit immediately on command failure rather than continue silently. This ensures that errors in apptainer invocations or test execution are caught early instead of masking failures.

Apply this diff to add error handling:
 #!/bin/bash
+set -e

 if [ "$job_interface" = "omp" ]; then
3-7: Add missing CPU case for container build.

The script handles only GPU interfaces (omp and acc), leaving the CPU case undefined. When job_device=cpu and job_interface=none (as configured in the workflow matrix on lines 28-29), the script falls through to the else block and attempts to build with undefined values, causing a build failure.

Apply this diff to handle the CPU case:
 if [ "$job_interface" = "omp" ]; then
     apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-mp-$job_device
 elif [ "$job_interface" = "acc" ]; then
     apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-$job_device
+elif [ "$job_device" = "cpu" ]; then
+    apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-$job_device
 fi
9-18: Fix variable escaping in apptainer invocation—variables must expand on host.

The variables on lines 14-18 are escaped with backslashes (\$NV_FLAG, \$SCRATCH_DIR, etc.), which prevents the shell from expanding them. This means apptainer receives literal strings like "$CONTAINER" instead of the actual container name, causing container lookup failures and incorrect environment bindings.

Variables set on the host (like $CONTAINER on line 9) must be unescaped so they expand before being passed to apptainer.

Apply this diff to fix the escaping:
 CONTAINER="mfc:latest-$job_device.sif"

 NV_FLAG=""
-[ "\$job_device" = "gpu" ] && NV_FLAG="--nv"
+[ "$job_device" = "gpu" ] && NV_FLAG="--nv"

-apptainer exec \$NV_FLAG --fakeroot --writable-tmpfs \
-  --bind "\$SCRATCH_DIR":/scratch \
-  --env job_slug="\$job_slug" \
-  --env job_device="\$job_device" \
-  "\$CONTAINER" \
+apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
+  --bind "$SCRATCH_DIR":/scratch \
+  --env job_slug="$job_slug" \
+  --env job_device="$job_device" \
+  "$CONTAINER" \
15-15: Define the SCRATCH_DIR environment variable before running this script.

The --bind "\$SCRATCH_DIR":/scratch mount on line 15 will fail if SCRATCH_DIR is undefined, resulting in an invalid apptainer bind argument --bind "":/scratch. This variable is not defined in the script or passed via the SLURM environment. Either define it explicitly (e.g., from $TMPDIR or a hardcoded Phoenix scratch path) or add a fallback:
 NV_FLAG=""
 [ "\$job_device" = "gpu" ] && NV_FLAG="--nv"

+SCRATCH_DIR="${SCRATCH_DIR:-.}"  # Use SCRATCH_DIR if set, else current directory
+
 apptainer exec \$NV_FLAG --fakeroot --writable-tmpfs \
.github/workflows/docker-test.yml (1)
45-47: Fix log filename mismatch and add wait for async job completion.

Two issues prevent reliable log retrieval:

Filename mismatch (critical): The "Print Logs" step expects test-{device}-{interface}.out, but submit.sh generates log files using the script basename, producing container-{device}-{interface}.out. This will cause "file not found" errors.

Race condition (major): The submit.sh script submits a job to the cluster scheduler and returns immediately, but test execution happens asynchronously on the cluster. The "Print Logs" step runs without waiting, so the log file may not exist yet.

Apply this diff to fix the filename and add explicit waiting:
       - name: Pull & Test
         if:   matrix.lbl == 'gt'
         run:  bash .github/workflows/phoenix/submit.sh .github/workflows/phoenix/container.sh ${{ matrix.device }} ${{ matrix.interface }}

+      - name: Wait for job completion
+        if:   matrix.lbl == 'gt'
+        run:  |
+          timeout=3600  # 1 hour timeout
+          elapsed=0
+          while [ $elapsed -lt $timeout ]; do
+            if [ -f "container-${{ matrix.device }}-${{ matrix.interface }}.out" ]; then
+              echo "Log file found"
+              exit 0
+            fi
+            sleep 10
+            elapsed=$((elapsed + 10))
+          done
+          echo "Timeout waiting for job completion"
+          exit 1
+
       - name: Print Logs
         if:   always()
-        run:  cat test-${{ matrix.device }}-${{ matrix.interface }}.out
+        run:  cat container-${{ matrix.device }}-${{ matrix.interface }}.out || echo "Log file not found"

🧹 Nitpick comments (2)

.github/workflows/phoenix/container.sh (1)
39-39: Modernize command substitution syntax from backticks to $(…).

Line 39 uses deprecated backtick syntax for command substitution, which is inconsistent with the $(...) syntax already used on lines 36-37. Backticks are harder to nest and read; the modern form is preferred in all modern bash style guides.

Apply this diff:
-    n_test_threads=`expr $gpu_count \* 2`
+    n_test_threads=$(expr $gpu_count \* 2)
Alternatively, use arithmetic expansion for clearer intent:
-    n_test_threads=`expr $gpu_count \* 2`
+    n_test_threads=$((gpu_count * 2))
.github/workflows/docker-test.yml (1)
41-43: Remove redundant conditional—all matrix entries have lbl: 'gt'.

The if: matrix.lbl == 'gt' conditional on line 42 is redundant because all matrix configurations (commented-out and active) specify lbl: 'gt'. This conditional logic adds unnecessary complexity without providing flexibility. Either remove the conditional if all jobs should always run this step, or clarify the intent if other labels will be added in the future.

Simplify by removing the conditional:
       - name: Pull & Test
-        if:   matrix.lbl == 'gt'
         run:  bash .github/workflows/phoenix/submit.sh .github/workflows/phoenix/container.sh ${{ matrix.device }} ${{ matrix.interface }}

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81ec440 and bce6956.

📒 Files selected for processing (21)

.github/workflows/bench.yml (0 hunks)
.github/workflows/cleanliness.yml (0 hunks)
.github/workflows/coverage.yml (0 hunks)
.github/workflows/deploy-tap.yml (0 hunks)
.github/workflows/docker-test.yml (1 hunks)
.github/workflows/docker.yml (0 hunks)
.github/workflows/docs.yml (0 hunks)
.github/workflows/formatting.yml (0 hunks)
.github/workflows/frontier/bench.sh (0 hunks)
.github/workflows/frontier/build.sh (0 hunks)
.github/workflows/frontier/submit-bench.sh (0 hunks)
.github/workflows/frontier/submit.sh (0 hunks)
.github/workflows/frontier/test.sh (0 hunks)
.github/workflows/homebrew.yml (0 hunks)
.github/workflows/line-count.yml (0 hunks)
.github/workflows/lint-source.yml (0 hunks)
.github/workflows/lint-toolchain.yml (0 hunks)
.github/workflows/phoenix/container.sh (1 hunks)
.github/workflows/pmd.yml (0 hunks)
.github/workflows/spelling.yml (0 hunks)
.github/workflows/test.yml (0 hunks)

💤 Files with no reviewable changes (19)

.github/workflows/lint-source.yml
.github/workflows/deploy-tap.yml
.github/workflows/coverage.yml
.github/workflows/cleanliness.yml
.github/workflows/formatting.yml
.github/workflows/lint-toolchain.yml
.github/workflows/frontier/bench.sh
.github/workflows/test.yml
.github/workflows/frontier/submit-bench.sh
.github/workflows/docker.yml
.github/workflows/frontier/test.sh
.github/workflows/line-count.yml
.github/workflows/bench.yml
.github/workflows/frontier/submit.sh
.github/workflows/frontier/build.sh
.github/workflows/docs.yml
.github/workflows/homebrew.yml
.github/workflows/spelling.yml
.github/workflows/pmd.yml

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Draft a step-by-step plan before making changes; build after each step using `./mfc.sh build -t pre_process simulation -j $(nproc)`

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:16.713Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: After each successful build step, run focused tests using `./mfc.sh test -j $(nproc) -f EA8FA07E -t 9E2CA336` instead of running all tests

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability

Applied to files:

.github/workflows/phoenix/container.sh

🪛 actionlint (1.7.9)

.github/workflows/docker-test.yml

26-26: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

🪛 Shellcheck (0.11.0)

.github/workflows/phoenix/container.sh

[warning] 3-3: job_interface is referenced but not assigned.

(SC2154)

[warning] 4-4: job_device is referenced but not assigned.

(SC2154)

[warning] 9-9: CONTAINER appears unused. Verify use (or export if used externally).

(SC2034)

[warning] 12-12: This expression is constant. Did you forget the $ on a variable?

(SC2050)

[warning] 12-12: NV_FLAG appears unused. Verify use (or export if used externally).

(SC2034)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Georgia Tech | Phoenix (cpu)

coderabbitai · 2025-12-15T05:16:07Z

.github/workflows/docker-test.yml

+      NODE_OPTIONS: ${{ matrix.lbl == 'gt' && '--max-old-space-size=2048' || '' }}
+      ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION: node16
+      ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true


⚠️ Potential issue | 🟠 Major

Upgrade Node.js to a supported LTS version—Node 16 is end-of-life.

The workflow forces Node.js 16 and explicitly allows running insecure node versions. Node 16 reached end-of-life in September 2023 and contains known security vulnerabilities. Modern GitHub Actions runners include Node.js 20 LTS or later. Remove these version overrides and allow the runner's default Node.js to be used, or explicitly specify a current LTS version like node20.

Apply this diff to remove the insecure overrides:

env: NODE_OPTIONS: ${{ matrix.lbl == 'gt' && '--max-old-space-size=2048' || '' }} - ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION: node16 - ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true

If you need to enforce a specific Node.js version, use setup-node@v4 action in a step instead:

- name: Setup Node.js uses: actions/setup-node@v4 with: node-version: '20'

🤖 Prompt for AI Agents

.github/workflows/docker-test.yml lines 34-36: the workflow forces Node.js 16 and allows insecure Node versions; remove the unsafe overrides and either let the runner default Node be used or explicitly set a current LTS via setup-node@v4. Delete the ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION and ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION lines, and if you must pin a version add a setup-node@v4 step with node-version: '20' (or another supported LTS) instead of forcing the runner environment.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

.github/workflows/phoenix/container.sh (2)
19-27: Pass job_interface to the container via --env so the GPU interface detection works.

Inside the bash -c command (single-quoted, lines 19–43), the script references $job_interface on line 24 to choose between OpenMP (mp) and OpenACC (acc). However, job_interface is not exported to the container environment (compare with job_device and job_slug on lines 16–17). Since the bash -c uses single quotes, shell variables won't expand from the outer scope—only environment variables will be accessible. Without --env job_interface="$job_interface", the conditional on line 24 will fail or behave unexpectedly.
 apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
   --bind "$SCRATCH_DIR":/scratch \
   --env job_slug="$job_slug" \
   --env job_device="$job_device" \
+  --env job_interface="$job_interface" \
   "$CONTAINER" \
44-44: Remove the stray EOT line—it will cause a "command not found" error.

The bash -c command ends with the closing single quote on line 43, so the orphaned EOT token on line 44 will be executed as a shell command at runtime and fail.
 ./mfc.sh test --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix
   '
-EOT

🧹 Nitpick comments (3)

.github/workflows/phoenix/container.sh (3)
39-39: Replace deprecated backticks with modern $() command substitution syntax.

The use of backticks for command substitution is outdated in modern Bash. Adopt the $() syntax for consistency and readability.
-    n_test_threads=`expr $gpu_count \* 2`
+    n_test_threads=$(expr $gpu_count \* 2)
1-2: Add set -e to fail fast on errors.

The script lacks error handling. Without set -e, if apptainer build, apptainer exec, or any MFC test command fails, the script will continue silently and the workflow may report false success. Compare with other Phoenix scripts (e.g., submit.sh) which use set -e to exit immediately on errors.
 #!/bin/bash
+set -e
3-7: Consider adding explicit handling for the cpu test case.

The current container image selection (lines 3–7) only explicitly handles the omp and acc GPU interfaces. If a cpu job runs (as indicated in the PR summary's matrix strategy), the else branch will attempt to build mfc:latest-cpu. Verify that this image exists in the Docker registry, or add an explicit cpu case for clarity. For example:
 if [ "$job_interface" = "omp" ]; then
     apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-mp-$job_device
 elif [ "$job_interface" = "acc" ]; then
     apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-$job_device
+elif [ "$job_device" = "cpu" ]; then
+    apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-cpu
 fi

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bce6956 and 506ba34.

📒 Files selected for processing (1)

.github/workflows/phoenix/container.sh (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Draft a step-by-step plan before making changes; build after each step using `./mfc.sh build -t pre_process simulation -j $(nproc)`

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:16.713Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: After each successful build step, run focused tests using `./mfc.sh test -j $(nproc) -f EA8FA07E -t 9E2CA336` instead of running all tests

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability

Applied to files:

.github/workflows/phoenix/container.sh

🪛 Shellcheck (0.11.0)

.github/workflows/phoenix/container.sh

[warning] 3-3: job_interface is referenced but not assigned.

(SC2154)

[warning] 4-4: job_device is referenced but not assigned.

(SC2154)

[warning] 16-16: job_slug is referenced but not assigned.

(SC2154)

coderabbitai · 2025-12-15T05:20:09Z

.github/workflows/phoenix/container.sh

+apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
+  --bind "$SCRATCH_DIR":/scratch \


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's find and read the container.sh file find . -name "container.sh" -type f | head -20

Repository: MFlowCode/MFC

Length of output: 97

🏁 Script executed:

# List the phoenix workflows directory fd . .github/workflows/phoenix/ 2>/dev/null | head -20

Repository: MFlowCode/MFC

Length of output: 242

🏁 Script executed:

# Search for SCRATCH_DIR references rg "SCRATCH_DIR" --no-heading

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

cat -n .github/workflows/phoenix/container.sh

Repository: MFlowCode/MFC

Length of output: 1522

🏁 Script executed:

# Search for calls to container.sh rg "container\.sh" --no-heading -A 5 -B 5

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

# List all YAML files in the workflows directory fd "\.ya?ml$" .github/workflows/

Repository: MFlowCode/MFC

Length of output: 90

🏁 Script executed:

# Find all YAML workflow files find .github/workflows -name "*.yml" -o -name "*.yaml"

Repository: MFlowCode/MFC

Length of output: 90

🏁 Script executed:

# Search for where container.sh is called rg "container\.sh" .github/workflows/ -l

Repository: MFlowCode/MFC

Length of output: 90

🏁 Script executed:

# Check the phoenix directory for workflow files cat .github/workflows/phoenix/*.yml 2>/dev/null || echo "No .yml files found" cat .github/workflows/phoenix/*.yaml 2>/dev/null || echo "No .yaml files found"

Repository: MFlowCode/MFC

Length of output: 97

🏁 Script executed:

cat -n .github/workflows/docker-test.yml

Repository: MFlowCode/MFC

Length of output: 1852

🏁 Script executed:

# Search for any file that might call the container.sh script rg "container\.sh" --no-heading -B 10 -A 10

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

cat -n .github/workflows/phoenix/submit.sh

Repository: MFlowCode/MFC

Length of output: 1943

🏁 Script executed:

# Also check for any environment variable definitions or exports rg "SCRATCH_DIR\s*=" .github/workflows/ -B 5 -A 5

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

# Check if mfc.sh defines or sets SCRATCH_DIR rg "SCRATCH_DIR" . --no-heading -B 5 -A 5

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

# Look at mfc.sh file, particularly the load command cat -n mfc.sh | head -100

Repository: MFlowCode/MFC

Length of output: 2473

🏁 Script executed:

# Search for SCRATCH_DIR in mfc.sh rg "SCRATCH_DIR" mfc.sh -B 5 -A 5

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

# Check the modules.sh file which is called by "mfc.sh load" cat -n toolchain/bootstrap/modules.sh | head -150

Repository: MFlowCode/MFC

Length of output: 5022

🏁 Script executed:

# Search for SCRATCH in the entire repo more broadly rg "SCRATCH" . --no-heading -B 3 -A 3

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

# Check if there are any environment setup scripts fd "\.sh$" toolchain/ | head -20

Repository: MFlowCode/MFC

Length of output: 326

🏁 Script executed:

# Check the toolchain/modules file which contains environment variables cat -n toolchain/modules | grep -A 10 -B 5 "^p"

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

# Also search for SCRATCH in toolchain rg "SCRATCH" toolchain/ --no-heading -B 3 -A 3

Repository: MFlowCode/MFC

Length of output: 39

🏁 Script executed:

# Let's look at the full toolchain/modules file to understand the format cat toolchain/modules

Repository: MFlowCode/MFC

Length of output: 3098

🏁 Script executed:

# Search for SLURM-related variables or SCRATCH references in comments rg "SLURM|SCRATCH|TMPDIR" . --no-heading -B 2 -A 2

Repository: MFlowCode/MFC

Length of output: 3570

🏁 Script executed:

# Check if there's a README or documentation that mentions SCRATCH_DIR fd "README|readme|SETUP|setup" . -type f | head -10

Repository: MFlowCode/MFC

Length of output: 227

🏁 Script executed:

# Look for any CI/CD documentation or workflow notes find . -name "*.md" -type f | xargs rg "SCRATCH_DIR|container\.sh" 2>/dev/null || true

Repository: MFlowCode/MFC

Length of output: 39

Ensure SCRATCH_DIR environment variable is defined before use.

The script binds $SCRATCH_DIR to /scratch inside the container, but this variable is never defined in the codebase. It must be set by the self-hosted runner environment or the workflow. If not provided, the bind mount will fail. Define a fallback (e.g., using TMPDIR or a hardcoded path) to prevent runtime failures.

🤖 Prompt for AI Agents

.github/workflows/phoenix/container.sh around lines 14-15: the script uses $SCRATCH_DIR in the apptainer bind but never guarantees it exists; set a safe fallback before the apptainer exec (for example default SCRATCH_DIR to ${TMPDIR:-/tmp} if unset) and ensure the variable is exported/validated (exists and is writable) so the bind mount cannot fail at runtime.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

.github/workflows/phoenix/container.sh (3)
43-44: Remove orphan EOT terminator.

Line 44 contains a stray EOT token with no matching heredoc start (<<EOT). This will be executed as a command and fail with a "command not found" error. The bash -c command correctly ends at line 43 with the closing single quote.

Apply this diff:
   ./mfc.sh test --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix
   '
-EOT
14-18: Export job_interface as environment variable to container.

job_interface is used at line 24 inside the container, but it is not passed via --env. Since the bash -c command is single-quoted, $job_interface will not expand and the conditional will fail, breaking GPU interface detection.

Apply this diff:
 apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
   --bind "$SCRATCH_DIR":/scratch \
   --env job_slug="$job_slug" \
   --env job_device="$job_device" \
+  --env job_interface="$job_interface" \
   "$CONTAINER" \
14-15: Define or validate SCRATCH_DIR before use.

SCRATCH_DIR is used in the bind mount but is never defined in this script or in the codebase. If the Phoenix cluster environment does not provide this variable, the apptainer bind will fail at runtime. Either define a fallback (e.g., using TMPDIR or a hardcoded path) or add validation to ensure it exists.

Apply this diff to add a safe fallback:
 #!/bin/bash
 set -e
 
+# Use SCRATCH_DIR if set, otherwise default to a safe temporary location
+SCRATCH_DIR="${SCRATCH_DIR:-${TMPDIR:-/tmp}}"
+
 if [ "$job_interface" = "omp" ]; then

🧹 Nitpick comments (2)

.github/workflows/phoenix/container.sh (2)
1-2: Add error handling to prevent silent failures.

Without set -e, failed apptainer or mfc.sh commands may go unnoticed, allowing the workflow to continue despite errors. This is especially critical for critical infrastructure like CI/CD pipelines.

Apply this diff:
 #!/bin/bash
+set -e
 
39-39: Use modern $() syntax instead of deprecated backticks.

Line 39 uses backticks for command substitution, which is deprecated. Lines 36–37 already use the modern $(...) syntax; for consistency, update this line as well.

Apply this diff:
-    n_test_threads=`expr $gpu_count \* 2`
+    n_test_threads=$((gpu_count * 2))
Note: $((...)) is preferred over expr for arithmetic and is more efficient.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 506ba34 and 2fe5084.

📒 Files selected for processing (1)

.github/workflows/phoenix/container.sh (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Draft a step-by-step plan before making changes; build after each step using `./mfc.sh build -t pre_process simulation -j $(nproc)`

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:16.713Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: After each successful build step, run focused tests using `./mfc.sh test -j $(nproc) -f EA8FA07E -t 9E2CA336` instead of running all tests

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability

Applied to files:

.github/workflows/phoenix/container.sh

🪛 Shellcheck (0.11.0)

.github/workflows/phoenix/container.sh

[warning] 3-3: job_interface is referenced but not assigned.

(SC2154)

[warning] 4-4: job_device is referenced but not assigned.

(SC2154)

[warning] 16-16: job_slug is referenced but not assigned.

(SC2154)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Georgia Tech | Phoenix (cpu)

🔇 Additional comments (2)

.github/workflows/phoenix/container.sh (2)

3-7: Container image selection logic is correct.

The conditional build correctly differentiates between omp and acc interfaces, pulling appropriate images from the registry.

9-12: GPU flag and container name setup is correct.

The NV_FLAG is properly set only for GPU jobs, and CONTAINER name matches the built image.

Added a new matrix configuration for GPU testing.

Malmahrouqi3 · 2025-12-15T05:48:04Z

unset LD_PRELOAD to avoid bombardment of warnings (xalt/3.1.4/lib64/libxalt_init.so' from LD_PRELOAD cannot be preloaded)

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (4)

.github/workflows/phoenix/container.sh (3)
44-46: Critical: Remove the orphan EOT line.

Line 45 closes the bash -c command with a single quote '. Line 46 contains a stray EOT token with no matching heredoc, which will cause a "command not found: EOT" runtime error when the script executes.

Remove line 46:
  ./mfc.sh test --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix
    '
-EOT
\ No newline at end of file
14-17: Major: Validate that SCRATCH_DIR is defined before use.

Line 17 binds $SCRATCH_DIR into the container as /scratch, but the variable is never defined or validated in this script. If the self-hosted Phoenix runner does not export SCRATCH_DIR, the bind mount will fail or bind an incorrect path, causing the container execution to fail silently or incorrectly.

Add validation at the start of the script to ensure SCRATCH_DIR is set, or define a safe fallback:
 #!/bin/bash
+set -e
+
+# Ensure SCRATCH_DIR is defined; fall back to TMPDIR or /tmp if not set
+export SCRATCH_DIR="${SCRATCH_DIR:-${TMPDIR:-/tmp}}"
16-29: Critical: job_interface is not passed to the container but is used inside it.

Lines 26 and 28 reference $job_interface inside the bash -c command block, but the variable is not exported via --env (only job_slug and job_device are on lines 18–19). Inside the container, $job_interface will be empty or undefined, breaking GPU interface detection logic.

Add job_interface to the environment variables:
  apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
    --bind "$SCRATCH_DIR":/scratch \
    --env job_slug="$job_slug" \
    --env job_device="$job_device" \
+   --env job_interface="$job_interface" \
    "$CONTAINER" \
.github/workflows/docker-test.yml (1)
34-36: Major: Remove Node.js 16 EOL configuration and explicitly allow insecure versions.

Node.js 16 reached end-of-life in September 2023 and contains known security vulnerabilities. Forcing the runner to use Node 16 and explicitly allowing insecure versions (lines 35–36) introduces unnecessary security risk. Modern GitHub Actions runners include Node 20 LTS or later by default.

Remove the version override lines and let the runner's default Node.js be used, or explicitly pin a current LTS version:
     env:
       NODE_OPTIONS: ${{ matrix.lbl == 'gt' && '--max-old-space-size=2048' || '' }}
-      ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION: node16
-      ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
     steps:
       - name: Clone
If you must enforce a specific Node.js version, add a setup-node@v4 step in the workflow instead of forcing it via environment variables.

🧹 Nitpick comments (2)

.github/workflows/phoenix/container.sh (1)
1-1: Use modern $(...) command substitution syntax and add set -e for safety.

Line 41 uses backticks for command substitution, which is deprecated. Lines 38–39 already use the modern $(...) syntax—keep it consistent.

The script lacks set -e, which means failures in apptainer or MFC test commands may go unnoticed, allowing the workflow to proceed despite errors.

Apply this diff to add error handling and modernize syntax:
 #!/bin/bash
+set -e
 
 if [ "$job_interface" = "omp" ]; then
     apptainer build mfc:latest-$job_device.sif docker://sbryngelson/mfc:latest-mp-$job_device
@@ -38,7 +40,7 @@ if [ "$job_device" = "gpu" ]; then
     gpu_count=$(nvidia-smi -L | wc -l)        # number of GPUs on node
     gpu_ids=$(seq -s " " 0 $(($gpu_count-1))) # 0,1,2,...,gpu_count-1
     device_opts="-g $gpu_ids"
-    n_test_threads=`expr $gpu_count \* 2`
+    n_test_threads=$(expr $gpu_count \* 2)
 fi
Also applies to: 41-41
.github/workflows/docker-test.yml (1)
42-42: Optional: Simplify redundant matrix label condition.

Line 42 checks if: matrix.lbl == 'gt', but all active and commented-out matrix entries have lbl: 'gt' (lines 18, 23, 27). The conditional adds complexity without filtering. If all jobs should always run this step, remove the condition. If this condition is intentional for future extensibility with other labels, consider adding a clarifying comment.
       - name: Pull & Test
-        if:   matrix.lbl == 'gt'
         run:  bash .github/workflows/phoenix/submit.sh .github/workflows/phoenix/container.sh ${{ matrix.device }} ${{ matrix.interface }}

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fe5084 and d275dab.

📒 Files selected for processing (2)

.github/workflows/docker-test.yml (1 hunks)
.github/workflows/phoenix/container.sh (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Draft a step-by-step plan before making changes; build after each step using `./mfc.sh build -t pre_process simulation -j $(nproc)`

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: After each successful build step, run focused tests using `./mfc.sh test -j $(nproc) -f EA8FA07E -t 9E2CA336` instead of running all tests

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:16.713Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability

Applied to files:

.github/workflows/phoenix/container.sh

🪛 actionlint (1.7.9)

.github/workflows/docker-test.yml

18-18: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

🪛 Shellcheck (0.11.0)

.github/workflows/phoenix/container.sh

[warning] 3-3: job_interface is referenced but not assigned.

(SC2154)

[warning] 4-4: job_device is referenced but not assigned.

(SC2154)

[warning] 18-18: job_slug is referenced but not assigned.

(SC2154)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Georgia Tech | Phoenix (gpu-acc)

Malmahrouqi3 · 2025-12-15T06:18:31Z

Almost done with the implementation, but it requires a little bit of debugging. Overall, the concept is shown to be feasible as far as I am concerned. Will deprecate for now and reinstate if needed.

Comment out the unset LD_PRELOAD command in container.sh

Malmahrouqi3 · 2025-12-16T01:11:13Z

[ 16%] Building Fortran object CMakeFiles/syscheck_lib.dir/modules/syscheck/m_thermochem.f90.o
[ 33%] Building Fortran object CMakeFiles/syscheck_lib.dir/fypp/syscheck/syscheck.fpp.f90.o
nvfortran-Fatal-nvfortran-Fatal-The value of CUDAROOT is not a directory: /usr/local/pace-apps/spack/packages/linux-rhel9-x86_64_v3/gcc-11.3.1/cuda-12.1.1-ebglvvqo7uhjvhvff2qlsjtjd54louafThe value of CUDAROOT is not a directory: /usr/local/pace-apps/spack/packages/linux-rhel9-x86_64_v3/gcc-11.3.1/cuda-12.1.1-ebglvvqo7uhjvhvff2qlsjtjd54louaf

nvfortran-Error-nvfortran-Error-A CUDA toolkit matching the current driver version (12.9) or a supported older version (11.0 or 12.3) was not installed with this HPC SDK.A CUDA toolkit matching the current driver version (12.9) or a supported older version (11.0 or 12.3) was not installed with this HPC SDK.

gmake[3]: *** [CMakeFiles/syscheck_lib.dir/build.make:80: CMakeFiles/syscheck_lib.dir/modules/syscheck/m_thermochem.f90.o] Error 2
gmake[3]: *** Waiting for unfinished jobs....
gmake[3]: *** [CMakeFiles/syscheck_lib.dir/build.make:93: CMakeFiles/syscheck_lib.dir/fypp/syscheck/syscheck.fpp.f90.o] Error 2
gmake[2]: *** [CMakeFiles/Makefile2:111: CMakeFiles/syscheck_lib.dir/all] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:92: CMakeFiles/syscheck.dir/rule] Error 2
gmake: *** [Makefile:170: syscheck] Error 2
 

Error: Failed to build the syscheck target.

Terminated

Comment out the loading of mfc.sh in submit.sh

Updated Docker test workflow to include new interface configurations.

codeant-ai · 2025-12-16T01:15:05Z

CodeAnt AI is running Incremental review

Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

qodo-code-review · 2025-12-16T01:15:53Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 Security concerns Insecure runtime configuration: The workflow sets `ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true` and forces `ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION: node16`. Allowing an insecure/outdated Node runtime can increase exposure to known vulnerabilities in the Actions runtime/tooling and should be avoided unless strictly necessary (and ideally time-bounded with a clear migration plan).
⚡ Recommended focus areas for review Possible Issue The script appears to end with a stray `EOT` token even though no heredoc is opened in this file. This will likely cause a runtime error when the script is executed and should be removed or aligned with an actual heredoc usage. n_test_threads=`expr $gpu_count \* 2` fi ./mfc.sh test --no-build --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix ' EOT Possible Issue The matrix uses `interface: mp` for the GPU OpenMP variant, but `container.sh` branches on `job_interface` values like `omp`/`acc`. This mismatch likely makes the OpenMP-specific image/build options never apply (and may pull the wrong container tag). matrix: include: - lbl: 'gt' cluster_name: 'Georgia Tech \| Phoenix' device: 'gpu' interface: 'acc' - lbl: 'gt' cluster_name: 'Georgia Tech \| Phoenix' device: 'gpu' interface: 'mp' - lbl: 'gt' cluster_name: 'Georgia Tech \| Phoenix' device: 'cpu' interface: 'none' Robustness GPU enumeration and device options can fail in edge cases: if `nvidia-smi` is unavailable/returns no GPUs, `gpu_count` may be 0 leading to `seq ... 0 -1` and invalid `-g` args, and `expr` usage is brittle. Consider guarding for `gpu_count >= 1` and quoting variables to avoid word-splitting issues in `apptainer exec` and other command invocations. ./mfc.sh test --dry-run -j 8 $build_opts n_test_threads=8 if [ "$job_device" = "gpu" ]; then gpu_count=$(nvidia-smi -L \| wc -l) # number of GPUs on node gpu_ids=$(seq -s " " 0 $(($gpu_count-1))) # 0,1,2,...,gpu_count-1 device_opts="-g $gpu_ids" n_test_threads=`expr $gpu_count \* 2` fi ./mfc.sh test --no-build --max-attempts 3 -a -j $n_test_threads $device_opts -- -c phoenix

qodo-code-review · 2025-12-16T01:16:57Z

.github/workflows/test.yml

High-level Suggestion

The PR should be revised to restore the many CI workflows that were deleted, as their removal severely damages test coverage and repository automation. Only the new container testing workflow should be added, not replace the existing infrastructure. [High-level, importance: 10]

Solution Walkthrough:

Before:

# .github/workflows/ - bench.yml - cleanliness.yml - coverage.yml - docker.yml - docs.yml - test.yml - ... (12 other workflow files deleted) + docker-test.yml

After:

# .github/workflows/ - bench.yml (reinstated) - cleanliness.yml (reinstated) - coverage.yml (reinstated) - docker.yml (reinstated) - docs.yml (reinstated) - test.yml (reinstated) - ... (all other workflows reinstated) + docker-test.yml (added)

qodo-code-review · 2025-12-16T01:16:58Z

.github/workflows/docker-test.yml

+          - lbl: 'gt'
+            cluster_name: 'Georgia Tech | Phoenix'
+            device: 'gpu'
+            interface: 'mp'


Suggestion: In .github/workflows/docker-test.yml, change the interface for the GPU test case from 'mp' to 'omp'. This ensures it matches the value expected by the .github/workflows/phoenix/container.sh script. [possible issue, importance: 7]

Suggested change

- lbl: 'gt'

cluster_name: 'Georgia Tech | Phoenix'

device: 'gpu'

interface: 'mp'

- lbl: 'gt'

cluster_name: 'Georgia Tech | Phoenix'

device: 'gpu'

interface: 'omp'

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

.github/workflows/phoenix/container.sh (1)
16-20: Pass job_interface to the container environment.

The script references $job_interface inside the container (lines 26-29) but only passes job_slug and job_device via --env. This will cause the interface-specific build options to fail silently.

Apply this diff:
 apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
   --bind "$SCRATCH_DIR":/scratch \
   --env job_slug="$job_slug" \
   --env job_device="$job_device" \
+  --env job_interface="$job_interface" \
   "$CONTAINER" \
Based on past review comments flagging this issue.
.github/workflows/docker-test.yml (1)

35-36: Node.js 16 end-of-life security risk—extensively flagged in previous reviews.

These lines force Node.js 16 (EOL September 2023) and allow insecure versions, creating known security vulnerabilities. This issue has been thoroughly documented by multiple reviewers with detailed remediation steps.

See prior review comments from coderabbitai and qodo-code-review for complete analysis and fix suggestions.

🧹 Nitpick comments (1)

.github/workflows/phoenix/container.sh (1)
14-14: Clarify the intent of the LD_PRELOAD comment.

The comment suggests LD_PRELOAD should be unset (per PR objectives, to suppress xalt warnings), but no unset command is present.

If the intent is to unset the variable, apply this diff:
-# unset LD_PRELOAD
+unset LD_PRELOAD || true
If this is intentionally commented out for debugging, consider adding a note explaining why.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d275dab and 45c7a6e.

📒 Files selected for processing (3)

.github/workflows/docker-test.yml (1 hunks)
.github/workflows/phoenix/container.sh (1 hunks)
.github/workflows/phoenix/submit.sh (1 hunks)

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Draft a step-by-step plan before making changes; build after each step using `./mfc.sh build -t pre_process simulation -j $(nproc)`

Applied to files:

.github/workflows/phoenix/submit.sh
.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:16.713Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: After each successful build step, run focused tests using `./mfc.sh test -j $(nproc) -f EA8FA07E -t 9E2CA336` instead of running all tests

Applied to files:

.github/workflows/phoenix/container.sh

📚 Learning: 2025-11-24T21:50:46.909Z

Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability

Applied to files:

.github/workflows/phoenix/container.sh

🪛 actionlint (1.7.9)

.github/workflows/docker-test.yml

18-18: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

22-22: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

26-26: label "gt" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xl", "macos-latest-xlarge", "macos-latest-large", "macos-26-xlarge", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xl", "macos-14-xlarge", "macos-14-large", "macos-14", "macos-13-xl", "macos-13-xlarge", "macos-13-large", "macos-13", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

🪛 Shellcheck (0.11.0)

.github/workflows/phoenix/container.sh

[warning] 3-3: job_interface is referenced but not assigned.

(SC2154)

[warning] 4-4: job_device is referenced but not assigned.

(SC2154)

[warning] 18-18: job_slug is referenced but not assigned.

(SC2154)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Georgia Tech | Phoenix (gpu-acc)

🔇 Additional comments (1)

.github/workflows/phoenix/submit.sh (1)

60-60: No action needed. The commented-out mfc.sh load step is intentional for container-based execution.

Since submit.sh submits container.sh, which runs all work inside an Apptainer container, the host environment modules loaded by mfc.sh load are unnecessary. The container is self-contained with all dependencies pre-installed and pre-configured in the Dockerfile.

Likely an incorrect or invalid review comment.

Malmahrouqi3 · 2025-12-16T14:46:26Z

It took <50 min ea. which is perfect for the intended purpose.

self-hosted tests

09d678d

Malmahrouqi3 requested a review from sbryngelson as a code owner December 15, 2025 03:13

Copilot AI review requested due to automatic review settings December 15, 2025 03:13

qodo-code-review bot added the Review effort 2/5 label Dec 15, 2025

Copilot started reviewing on behalf of Malmahrouqi3 December 15, 2025 03:13 View session

codeant-ai bot added the size:M This PR changes 30-99 lines, ignoring generated files label Dec 15, 2025

varying triggers

81ec440

qodo-code-review bot reviewed Dec 15, 2025

View reviewed changes

.github/workflows/docker-test.yml Show resolved Hide resolved

qodo-code-review bot reviewed Dec 15, 2025

View reviewed changes

.github/workflows/phoenix/container.sh Show resolved Hide resolved

.github/workflows/docker-test.yml Outdated Show resolved Hide resolved

codeant-ai bot reviewed Dec 15, 2025

View reviewed changes

Copilot AI reviewed Dec 15, 2025

View reviewed changes

cubic-dev-ai bot reviewed Dec 15, 2025

View reviewed changes

.github/workflows/docker-test.yml Outdated Show resolved Hide resolved

.github/workflows/phoenix/container.sh Outdated Show resolved Hide resolved

.github/workflows/phoenix/container.sh Outdated Show resolved Hide resolved

Malmahrouqi3 added 4 commits December 14, 2025 22:23

removed all tests, and fixed cpu issue sorta

2ca0d44

my bad, commented wrong line

9f04550

fixed syntax errors

bce6956

replaced all "\$" with "$" to not be escaped ig

506ba34

coderabbitai bot reviewed Dec 15, 2025

View reviewed changes

Fix quotes in GPU ID sequence generation

2fe5084

coderabbitai bot reviewed Dec 15, 2025

View reviewed changes

Add GPU configuration to docker-test workflow

1259b82

Added a new matrix configuration for GPU testing.

new changes to see logs

d275dab

coderabbitai bot reviewed Dec 15, 2025

View reviewed changes

Malmahrouqi3 closed this Dec 15, 2025

Malmahrouqi3 added 2 commits December 15, 2025 10:08

Modify mfc.sh command to include --no-build option

0a74d81

Comment out LD_PRELOAD unset in container script

a06151a

Comment out the unset LD_PRELOAD command in container.sh

Malmahrouqi3 added 2 commits December 15, 2025 20:13

Comment out mfc.sh load command

9e56be5

Comment out the loading of mfc.sh in submit.sh

Modify Docker test workflow for GPU and CPU interfaces

45c7a6e

Updated Docker test workflow to include new interface configurations.

Malmahrouqi3 reopened this Dec 16, 2025

qodo-code-review bot added Possible security concern Review effort 3/5 labels Dec 16, 2025

qodo-code-review bot reviewed Dec 16, 2025

View reviewed changes

coderabbitai bot reviewed Dec 16, 2025

View reviewed changes

Fix formatting in container.sh script

a5c0a47

Malmahrouqi3 closed this Dec 16, 2025

Malmahrouqi3 mentioned this pull request Dec 16, 2025

Containerization Enhancements #1092

Closed

		apptainer exec $NV_FLAG --fakeroot --writable-tmpfs \
		--bind "$SCRATCH_DIR":/scratch \

[Experimental PR] Self-Hosted Container Tests #1088

[Experimental PR] Self-Hosted Container Tests #1088

Uh oh!

Conversation

Malmahrouqi3 commented Dec 15, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

User description

User description

Description

PR Type

Description

Diagram Walkthrough

File Walkthrough

CodeAnt-AI Description

What Changed

Impact

Checking Your Pull Request

Talking to CodeAnt AI

Example

Preserve Org Learnings with CodeAnt

Example

Retrigger review

Check Your Repository Health

Summary by CodeRabbit

PR Type

Description

Diagram Walkthrough

File Walkthrough

Uh oh!

codeant-ai bot commented Dec 15, 2025

Thanks for using CodeAnt! 🎉

Uh oh!

qodo-code-review bot commented Dec 15, 2025

PR Reviewer Guide 🔍

Uh oh!

coderabbitai bot commented Dec 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codeant-ai bot commented Dec 15, 2025

Nitpicks 🔍

Uh oh!

codeant-ai bot Dec 15, 2025 • edited by qodo-code-review bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codeant-ai bot Dec 15, 2025 • edited by qodo-code-review bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codeant-ai bot commented Dec 15, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Malmahrouqi3 commented Dec 15, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 15, 2025 •

edited

Loading

codeant-ai bot Dec 15, 2025 •

edited by qodo-code-review bot

Loading

codeant-ai bot Dec 15, 2025 •

edited by qodo-code-review bot

Loading