
Add network-tests and other script updates #77562

Open
wabouhamad wants to merge 1 commit into openshift:main from
wabouhamad:update-sanity-deploy-cluster-2026-04-06
Conversation

@wabouhamad
Contributor

@wabouhamad wabouhamad commented Apr 8, 2026

Added network-tests and other script updates:

  • modified: ci-operator/config/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main.yaml
  • modified: ci-operator/jobs/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main-presubmits.yaml
  • modified: ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh
  • new file: ci-operator/step-registry/dpf/hypervisor/network-tests/OWNERS
  • new file: ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh
  • new file: ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.metadata.json
  • new file: ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.yaml
  • modified: ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh

Summary by CodeRabbit

  • New Features

    • Added a network-tests CI job with presubmit trigger to run Kubernetes Network Flow tests and collect logs.
    • Added a deploy-cluster CI job for full cluster deployment runs.
  • Chores

    • Slack reporting now includes network-tests alongside sanity and deploy-cluster.
    • Increased sanity timeout to 2h; set deploy-cluster timeout to 6h.
    • Added ownership/metadata for the new network-tests step.
  • Refactor

    • Improved test sequencing and pre-checks in deployment and sanity scripts for more reliable verification.

@openshift-ci openshift-ci bot requested review from linoyaslan and szigmon April 8, 2026 19:40
@openshift-ci
Contributor

openshift-ci bot commented Apr 8, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wabouhamad

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 8, 2026
@wabouhamad
Contributor Author

/uncc @szigmon @linoyaslan

@openshift-ci openshift-ci bot removed request for linoyaslan and szigmon April 8, 2026 19:41
@eelgaev

eelgaev commented Apr 8, 2026

/pj-rehearse network-access-allowed

@openshift-ci-robot
Contributor

@eelgaev: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

@eelgaev: not allowed to allow network access rehearsals. This must be done by a member of the openshift org

@tsorya
Contributor

tsorya commented Apr 9, 2026

/pj-rehearse network-access-allowed

@openshift-ci-robot
Contributor

@tsorya: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-sanity

@openshift-ci-robot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster

@openshift-ci-robot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-sanity

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-sanity

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad wabouhamad force-pushed the update-sanity-deploy-cluster-2026-04-06 branch from f7886d0 to 6430aee on April 13, 2026 14:24
@coderabbitai
Contributor

coderabbitai bot commented Apr 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a new network-tests CI step and presubmit; extends Slack reporting to include network-tests; adjusts test timeouts and matrix entries; and reorders/augments remote env verification and SSH handling in deploy-cluster and sanity scripts.

Changes

  • Slack Reporter Configuration: ci-operator/config/rh-ecosystem-edge/openshift-dpf/.config.prowgen
    Extended slack_reporter.job_names to include network-tests.
  • CI Job Matrix: ci-operator/config/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main.yaml
    Added a deploy-cluster test (runs dpf-hypervisor-deploy-cluster, timeout 6h); increased sanity timeout 1h→2h; replaced the prior deploy-cluster entry with network-tests (runs dpf-hypervisor-network-tests, timeout 2h).
  • Presubmit Jobs: ci-operator/jobs/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main-presubmits.yaml
    Added presubmit pull-ci-...-network-tests (trigger /test network-tests, rerun /test network-tests, --target=network-tests, timeout 2h); updated sanity presubmit timeout 1h→2h.
  • Step Registry, Network Tests Metadata & Ownership: ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.yaml, ...-ref.metadata.json, .../OWNERS
    New step ref, metadata, and OWNERS; owners/approvers/reviewers: szigmon, tsorya, wabouhamad, linoyaslan. The step mounts dpf-ci creds, sets a default REMOTE_HOST, requests 500m/1Gi, timeout 2h.
  • Network Tests Script: ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh
    New executable script: decodes the Vault key, sets SSH opts, verifies SSH, persists env, mutates the remote .env, runs remote verification (make verify-*) and traffic-flow tests (make run-traffic-flow-tests), captures logs, and exits non-zero on failures.
  • Deploy Cluster Script: ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh
    Reformatted remote SSH command blocks to multi-line; moved/added the create/update of last-openshift-dpf-dir.sh earlier (pre-make all) and removed the older post-make all block.
  • Sanity Existing Script: ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh
    Added local pre-flight debug (scp/source last-openshift-dpf-dir.sh, rewrite of the kubeconfig server), extra oc diagnostics, remote .env VERIFY_* mutations, capture of the remote verification-result, and an exit gated on SANITY_TESTS_RESULT.

Sequence Diagram(s)

sequenceDiagram
    participant CI as CI/Prow
    participant Pod as Test Pod
    participant SSH as SSH Client
    participant Hypervisor as Hypervisor
    participant K8s as Kubernetes API
    participant TFT as Traffic Flow Tests

    CI->>Pod: trigger network-tests step
    Pod->>Pod: decode Vault key -> /tmp/id_rsa\nset SSH_OPTS
    Pod->>SSH: test SSH connectivity to ${REMOTE_HOST}
    alt SSH fails
        SSH-->>Pod: connection error
        Pod->>CI: emit diagnostics, exit 1
    else SSH succeeds
        Pod->>Pod: persist REMOTE_HOST & SSH_OPTS to ${SHARED_DIR}/dpf-env
        Pod->>Hypervisor: Phase 1 - SSH run: update .env, run make verify-*
        Hypervisor->>K8s: perform oc queries / verifications
        Hypervisor-->>Pod: verification result
        alt verification failed
            Pod->>CI: exit 1
        else verification passed
            Pod->>Hypervisor: Phase 2 - SSH run: set TFT nodes, make run-traffic-flow-tests
            Hypervisor->>TFT: execute tests, tee logs
            Hypervisor-->>Pod: logs + status
            alt tests passed
                Pod->>CI: exit 0
            else tests failed
                Pod->>CI: exit 1
            end
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check (❓ Inconclusive): The title 'Add network-tests and other script updates' is vague and uses non-descriptive terms like 'other script updates' that obscure the full scope of changes without conveying meaningful information about what was actually modified. Resolution: consider a more specific title that clearly identifies the main changes, such as 'Add network-tests step-registry and update sanity/deploy-cluster jobs', to better convey the PR's primary objectives.

✅ Passed checks (9 passed)

  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Docstring Coverage (✅ Passed): No functions found in the changed files to evaluate docstring coverage; check skipped.
  • Stable And Deterministic Test Names (✅ Passed): This PR does not contain any Go test files or Ginkgo test declarations, so the custom check for Ginkgo test name stability is not applicable.
  • Test Structure And Quality (✅ Passed): This PR contains only CI/CD infrastructure files (YAML configs, bash scripts, and step registry metadata) with no Ginkgo test code present.
  • Microshift Test Compatibility (✅ Passed): This pull request does not add any new Ginkgo e2e tests; the changes consist of CI/CD infrastructure configuration files and bash scripts, not Go test code with Ginkgo test blocks.
  • Single Node Openshift (Sno) Test Compatibility (✅ Passed): The custom check targets Ginkgo e2e tests, but this PR adds only shell scripts and CI/CD configuration files, not Ginkgo test patterns.
  • Topology-Aware Scheduling Compatibility (✅ Passed): The pull request modifies only CI/CD pipeline infrastructure files with no topology-aware scheduling constraints or workload scheduling directives.
  • Ote Binary Stdout Contract (✅ Passed): This PR does not contain any OTE binaries or Go test code, only CI/Prow configuration files, bash scripts, and governance metadata files.
  • Ipv6 And Disconnected Network Test Compatibility (✅ Passed): The PR adds CI/CD configuration and bash scripts, not Ginkgo e2e tests; the custom check does not apply.


@openshift-merge-bot
Contributor

@wabouhamad, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: file /var/tmp/gitrepo1491198238/ci-operator/step-registry/dpf/hypervisor/networking-tests/dpf-hypervisor-network-tests-commands.sh has incorrect prefix. Prefix should be dpf-hypervisor-networking-tests
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh (1)

118-125: ⚠️ Potential issue | 🟠 Major

Avoid dumping full env/.env to CI logs.

Lines in these blocks print full environment and .env content, which can leak secrets/tokens in job output.

Suggested safe logging pattern
-    env; \
+    env | grep -E '^(PATH|HOME|SHELL|USER|KUBECONFIG)='; \
@@
-  cat .env; \
+  grep -E '^(KUBECONFIG|VERIFY_DEPLOYMENT|VERIFY_MAX_RETRIES|VERIFY_SLEEP_SECONDS)=' .env; \

Also applies to: 153-162, 177-181

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh`
around lines 118 - 125, The script currently dumps full environment contents
with commands like env and source ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}
which can leak secrets; update the SSH command blocks (where SSH_OPTS,
REMOTE_HOST, REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION and LAST_OPENSHIFT_DPF are
used) to stop printing raw env/.env: remove or replace plain env/source outputs
with explicit, whitelisted variable echoing (e.g., echo only non-sensitive
variables you need) or a redacted summary, and apply the same change to the
other SSH blocks in this file that call env or source the .env (the similar
blocks later that print env). Ensure no full .env or env output is sent to CI
logs.
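The whitelist idea in the suggested pattern can be demonstrated locally. This is a minimal sketch; SECRET_TOKEN and the variable whitelist below are illustrative examples, not names from the actual step.

```shell
#!/bin/bash
# Sketch: log only a whitelist of environment variables so secrets never
# reach CI output. SECRET_TOKEN and the whitelist are illustrative.
set -eu

export SECRET_TOKEN='hunter2'
export KUBECONFIG='/tmp/kubeconfig'

# A bare `env` would print SECRET_TOKEN; the anchored grep keeps it out.
SAFE_ENV=$(env | grep -E '^(PATH|HOME|USER|KUBECONFIG)=' || true)
echo "${SAFE_ENV}"
```

The same anchored-grep filter can be dropped into each remote SSH block in place of the bare `env` calls.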
♻️ Duplicate comments (8)
ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh (2)

93-94: ⚠️ Potential issue | 🟠 Major

Stop printing full remote environment in deploy logs.

env in these SSH blocks can expose sensitive values in CI output.

Also applies to: 110-111

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh`
around lines 93 - 94, The SSH command blocks that run ssh ${SSH_OPTS}
root@${REMOTE_HOST} currently include env which prints the full remote
environment into CI logs; remove the env invocation from those SSH command
strings (and any duplicated occurrences around the SSH_OPTS/REMOTE_HOST usage,
e.g., the blocks at the shown locations) so sensitive environment variables are
not emitted—if environment verification is required, replace env with a
non-sensitive check (e.g., print specific, safe variables or use a guarded echo)
rather than the global env.

120-137: ⚠️ Potential issue | 🟠 Major

Update LAST_OPENSHIFT_DPF only after a successful deployment.

This block advances the pointer right after clone, before make all is confirmed. If deployment fails later, downstream steps can resolve to an incomplete workspace.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh`
around lines 120 - 137, The current block that creates/updates
REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION and sets LAST_OPENSHIFT_DPF runs before
deployment completes; move the update of LAST_OPENSHIFT_DPF (the sed invocation
that edits last-openshift-dpf-dir.sh and the echo that writes
LAST_OPENSHIFT_DPF=${REMOTE_WORK_DIR}/openshift-dpf) out of this pre-deploy
block and only execute it after a successful deployment (e.g., after the
successful completion of make all / the deploy step). Keep the initial
existence-check/creation logic (so the file exists) but remove/disable the
sed/update path here and instead call the same update logic in the post-deploy
success path, referencing REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION,
LAST_OPENSHIFT_DPF, and the sed command used to change LAST_OPENSHIFT_DPF.
ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh (4)

82-82: ⚠️ Potential issue | 🟠 Major

Don’t force kubeconfig.doca8; reuse the kubeconfig selected by deploy step.

Line 82 overrides kubeconfig path unconditionally and can diverge from deploy output (.env/generated path).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
at line 82, The script currently unconditionally sets KUBECONFIG to
${LAST_OPENSHIFT_DPF}/kubeconfig.doca8 which can override the kubeconfig chosen
by the deploy step; change the logic in the section that exports KUBECONFIG
(referencing the KUBECONFIG variable and LAST_OPENSHIFT_DPF) so it only sets
KUBECONFIG when it is not already defined or when the deploy-produced path (from
the deploy step/.env or generated kubeconfig variable) is absent—i.e., check for
an existing KUBECONFIG or use the deploy-provided kubeconfig variable if
present, otherwise fall back to ${LAST_OPENSHIFT_DPF}/kubeconfig.doca8.

75-78: ⚠️ Potential issue | 🟠 Major

Do not log full env / .env in CI output.

These lines can leak credentials and sensitive runtime config.

Also applies to: 109-114, 130-130

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 75 - 78, Remove the unconditional env dumps and any echo that might
print secrets; specifically delete or replace the bare env commands and the echo
${LAST_OPENSHIFT_DPF} around the source
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION} invocation so CI does not leak
environment variables. If you need to log state for debugging, print only a
whitelist of safe vars or use a redaction helper (e.g., iterate over a safe list
or grep for non-sensitive keys) and replace the raw env/echo uses in the same
blocks where env is called (the env; source
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}; echo ${LAST_OPENSHIFT_DPF}; env;
sequences and the other occurrences reported).

115-116: ⚠️ Potential issue | 🟠 Major

Remove hardcoded worker node names.

Fixed node names make this step cluster-specific and brittle.

Portable approach
-  export TFT_SERVER_NODE=worker-303ea712f414; \
-  export TFT_CLIENT_NODE=worker-303ea712f378; \
+  export TFT_SERVER_NODE=$(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[0].metadata.name}'); \
+  export TFT_CLIENT_NODE=$(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[1].metadata.name}'); \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 115 - 116, The script currently hardcodes TFT_SERVER_NODE and
TFT_CLIENT_NODE exports, which makes the step cluster-specific; replace those
fixed assignments with dynamic resolution or configurable environment variables:
allow TFT_SERVER_NODE and TFT_CLIENT_NODE to be supplied from the environment if
present, otherwise discover appropriate worker node names at runtime (e.g., pick
two worker nodes using cluster queries such as selecting nodes with label
node-role.kubernetes.io/worker or by finding nodes matching the test pod
placement) and assign their names to TFT_SERVER_NODE and TFT_CLIENT_NODE; update
the export lines (the TFT_SERVER_NODE and TFT_CLIENT_NODE assignments) to
implement this fallback behavior so tests are portable across clusters.
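The env-override-with-discovery fallback described above can be sketched with `${VAR:=default}` expansion. `discover_worker` here is a hypothetical stand-in for the real `oc get nodes -l node-role.kubernetes.io/worker -o jsonpath` query, so the snippet runs without a cluster.

```shell
#!/bin/bash
# Sketch only: honor pre-set TFT_* variables, otherwise discover worker nodes.
# discover_worker is a hypothetical stand-in for an `oc get nodes` lookup.
set -eu

discover_worker() {
  # Pretend cluster query: return the Nth worker node name.
  echo "worker-$1"
}

# ${VAR:=default} assigns only when the variable is unset or empty,
# so values supplied by the CI environment win over discovery.
: "${TFT_SERVER_NODE:=$(discover_worker 0)}"
: "${TFT_CLIENT_NODE:=$(discover_worker 1)}"
export TFT_SERVER_NODE TFT_CLIENT_NODE

echo "server=${TFT_SERVER_NODE} client=${TFT_CLIENT_NODE}"
```

With this shape, a cluster-specific job can still pin node names via the step's env, while the default path stays portable.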

33-34: ⚠️ Potential issue | 🟠 Major

Write SSH_OPTS as a sourceable shell assignment.

At Line 34, unquoted SSH_OPTS value is split into tokens when dpf-env is sourced downstream.

Proposed fix
-echo "REMOTE_HOST=${REMOTE_HOST}" >> ${SHARED_DIR}/dpf-env
-echo "SSH_OPTS=${SSH_OPTS}" >> ${SHARED_DIR}/dpf-env
+printf 'REMOTE_HOST=%q\n' "${REMOTE_HOST}" >> "${SHARED_DIR}/dpf-env"
+printf 'SSH_OPTS=%q\n' "${SSH_OPTS}" >> "${SHARED_DIR}/dpf-env"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 33 - 34, The SSH_OPTS value is written raw and will be split when
dpf-env is later sourced; replace the echo that writes SSH_OPTS into dpf-env
with a source-safe assignment by shell-escaping the variable (e.g., emitting
SSH_OPTS='...') so the full value and quoting are preserved when sourced; update
the write that currently uses SSH_OPTS and dpf-env to produce a properly
escaped/quoted assignment for SSH_OPTS (leave the REMOTE_HOST write as-is).
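As a quick illustration of the quoting problem (a sketch with made-up option values, not the PR's exact flags): `printf %q` emits a shell-escaped assignment that survives a later `source`, whereas a bare `echo` leaves the spaces unquoted.

```shell
#!/bin/bash
# Sketch: write a multi-word variable so it round-trips through `source`.
# The option string below is illustrative, not the step's exact SSH_OPTS.
set -eu

ENV_FILE=$(mktemp)
SSH_OPTS='-o StrictHostKeyChecking=no -o ConnectTimeout=10 -i /tmp/id_rsa'

# Unsafe: `echo "SSH_OPTS=${SSH_OPTS}"` emits an unquoted line that, when
# sourced, treats everything after the first space as a command.

# Safe: %q shell-escapes the value, so sourcing restores it verbatim.
printf 'SSH_OPTS=%q\n' "${SSH_OPTS}" > "${ENV_FILE}"

unset SSH_OPTS
# shellcheck disable=SC1090
source "${ENV_FILE}"
echo "${SSH_OPTS}"
```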
ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh (2)

98-110: ⚠️ Potential issue | 🟠 Major

Harden .env updates to avoid silent no-op replacements.

This is still using broad sed patterns without asserting results, so missing keys can pass undetected.

Proposed hardening
-if ssh ${SSH_OPTS} root@${REMOTE_HOST} "set -e;\
+if ssh ${SSH_OPTS} root@${REMOTE_HOST} "set -euo pipefail;\
     source ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}; \
     echo \${LAST_OPENSHIFT_DPF}; \
     cd \${LAST_OPENSHIFT_DPF}; \
     pwd ; \
-    set -e; \
     test -f .env ; \
-    cat .env | grep VERIFY ; \
     cp .env .env_orig ; \
-    sed -i 's|VERIFY_DEPLOYMENT=.*|VERIFY_DEPLOYMENT=true|' .env ; \
-    sed -i 's|VERIFY_MAX_RETRIES=.*|VERIFY_MAX_RETRIES=4|' .env ; \
-    sed -i 's|VERIFY_SLEEP_SECONDS=.*|VERIFY_SLEEP_SECONDS=3|' .env ; \
-    cat .env | grep VERIFY ; \ "; then
+    sed -i -E 's|^VERIFY_DEPLOYMENT=.*$|VERIFY_DEPLOYMENT=true|' .env ; \
+    sed -i -E 's|^VERIFY_MAX_RETRIES=.*$|VERIFY_MAX_RETRIES=4|' .env ; \
+    sed -i -E 's|^VERIFY_SLEEP_SECONDS=.*$|VERIFY_SLEEP_SECONDS=3|' .env ; \
+    grep -qx 'VERIFY_DEPLOYMENT=true' .env ; \
+    grep -qx 'VERIFY_MAX_RETRIES=4' .env ; \
+    grep -qx 'VERIFY_SLEEP_SECONDS=3' .env ; "; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh`
around lines 98 - 110, The sed replacements for VERIFY_* in the .env file can
silently no-op if keys are missing; before running the sed commands in the SSH
block, assert presence of the keys (e.g. run grep -q '^VERIFY_DEPLOYMENT=' .env
&& grep -q '^VERIFY_MAX_RETRIES=' .env && grep -q '^VERIFY_SLEEP_SECONDS=' .env)
and exit non-zero with a clear message if any are missing, or alternatively
implement idempotent logic that appends the key if absent (detect .env, then for
each symbol VERIFY_DEPLOYMENT, VERIFY_MAX_RETRIES, VERIFY_SLEEP_SECONDS either
replace with sed as now or append "KEY=value" when grep fails); ensure the SSH
compound command exits on failure so missing keys cause the step to fail rather
than silently proceeding.
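The replace-or-append logic suggested above can be sketched as a small helper (hypothetical, not code from the PR): it anchors the pattern at start-of-line, appends missing keys, and post-checks the exact resulting line so a silent no-op fails loudly.

```shell
#!/bin/bash
# Sketch: idempotently set KEY=value in a .env file and verify the result.
# set_env_kv is a hypothetical helper, not code from the PR. Uses GNU sed -i.
set -euo pipefail

set_env_kv() {
  local file=$1 key=$2 value=$3
  if grep -qE "^${key}=" "${file}"; then
    # Anchored replace: only whole-line KEY=... entries are touched.
    sed -i -E "s|^${key}=.*$|${key}=${value}|" "${file}"
  else
    echo "${key}=${value}" >> "${file}"
  fi
  # Fail loudly if the exact line did not land.
  grep -qx "${key}=${value}" "${file}"
}

ENV_FILE=$(mktemp)
printf 'VERIFY_DEPLOYMENT=false\nOTHER=1\n' > "${ENV_FILE}"

set_env_kv "${ENV_FILE}" VERIFY_DEPLOYMENT true   # replaces the existing key
set_env_kv "${ENV_FILE}" VERIFY_MAX_RETRIES 4     # appends the missing key

grep VERIFY "${ENV_FILE}"
```

The same function body could be inlined into the remote SSH compound command, keeping its non-zero exit on any failed post-check.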

165-167: ⚠️ Potential issue | 🟠 Major

Resolve the remaining sanity result TODO before relying on PASS/FAIL.

The script still documents unresolved pass/fail parsing and leaves a TODO. SANITY_TESTS_RESULT is currently tied to SSH command success only.

I can help wire this to a definitive marker (status file/log token parsing) and prepare a small follow-up patch if you want.

Also applies to: 190-191

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh`
around lines 165 - 167, The script currently sets SANITY_TESTS_RESULT based
solely on the SSH command exit, leaving a TODO for robust pass/fail detection;
change the remote sanity invocation to write an explicit marker (e.g., echo
"SANITY:PASS" or "SANITY:FAIL" to /tmp/sanity_status) regardless of the SSH exit
code (for example append "&& echo 'SANITY:PASS' >/tmp/sanity_status || echo
'SANITY:FAIL' >/tmp/sanity_status" to the remote test command), then after the
SSH completes read and parse /tmp/sanity_status on the remote host (or cat it
over SSH) and set SANITY_TESTS_RESULT by matching the token
("SANITY:PASS"/"SANITY:FAIL") instead of relying on SSH status; update any
references to SANITY_TESTS_RESULT and remove the TODO so downstream logic uses
the parsed token.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`:
- Around line 50-63: The sed replacements for VERIFY_* in the remote .env
(inside the SSH block) are fragile and can partially match or silently no-op;
update the sequence that touches .env (the commands referencing .env, sed -i
's|VERIFY_DEPLOYMENT=.*|...|', sed -i 's|VERIFY_MAX_RETRIES=.*|...|', sed -i
's|VERIFY_SLEEP_SECONDS=.*|...|') to use anchored regexes that match
start-of-line (e.g. ^VERIFY_DEPLOYMENT=) and, if a key is missing, append the
proper key=value line instead of silently doing nothing; after editing (you
already copy to .env_orig) run explicit post-checks using grep -E
'^VERIFY_DEPLOYMENT=' '^VERIFY_MAX_RETRIES=' '^VERIFY_SLEEP_SECONDS=' and fail
the SSH command if any expected key/value is not present so the overall if-test
detects the problem.

In
`@ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh`:
- Around line 66-70: The scp uses ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}
which points to a file, not the install directory, so change the remote path to
use the install-directory variable LAST_OPENSHIFT_DPF (or its remote equivalent)
when copying the kubeconfig; update the scp invocation that currently references
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}/kubeconfig.doca8 to reference
${LAST_OPENSHIFT_DPF}/kubeconfig.doca8 (keeping ${SSH_OPTS} and ${REMOTE_HOST}
unchanged) and verify LAST_OPENSHIFT_DPF is set to the remote DPF install
directory before the scp call.

solely on the SSH command exit, leaving a TODO for robust pass/fail detection;
change the remote sanity invocation to write an explicit marker (e.g., echo
"SANITY:PASS" or "SANITY:FAIL" to /tmp/sanity_status) regardless of the SSH exit
code (for example append "&& echo 'SANITY:PASS' >/tmp/sanity_status || echo
'SANITY:FAIL' >/tmp/sanity_status" to the remote test command), then after the
SSH completes read and parse /tmp/sanity_status on the remote host (or cat it
over SSH) and set SANITY_TESTS_RESULT by matching the token
("SANITY:PASS"/"SANITY:FAIL") instead of relying on SSH status; update any
references to SANITY_TESTS_RESULT and remove the TODO so downstream logic uses
the parsed token.
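A minimal local sketch of this marker approach (`run_sanity` and the status-file path are stand-ins; the real version would write /tmp/sanity_status on the remote host and cat it back over SSH):

```shell
#!/bin/bash
set -euo pipefail

status_file=$(mktemp)

# Stand-in for the remote sanity invocation; `false` simulates a failing
# test run whose SSH transport nevertheless exits 0.
run_sanity() { false; }

# Write an explicit marker regardless of the command's exit code.
run_sanity && echo 'SANITY:PASS' > "${status_file}" || echo 'SANITY:FAIL' > "${status_file}"

# Decide from the token, not from the (transport) exit status.
if grep -qx 'SANITY:PASS' "${status_file}"; then
  SANITY_TESTS_RESULT=PASS
else
  SANITY_TESTS_RESULT=FAIL
fi
echo "SANITY_TESTS_RESULT=${SANITY_TESTS_RESULT}"
```

Because the marker is written on both branches, a dropped SSH connection is the only case left where the status file is missing, which downstream logic can treat as a distinct error.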
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0978686b-424c-4621-a116-6702a1e6006b

📥 Commits

Reviewing files that changed from the base of the PR and between fcf275a and 8ea88f4.

📒 Files selected for processing (9)
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/.config.prowgen
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main.yaml
  • ci-operator/jobs/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main-presubmits.yaml
  • ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh
  • ci-operator/step-registry/dpf/hypervisor/network-tests/OWNERS
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.metadata.json
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.yaml
  • ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh
✅ Files skipped from review due to trivial changes (4)
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.metadata.json
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/.config.prowgen
  • ci-operator/step-registry/dpf/hypervisor/network-tests/OWNERS
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main.yaml

@wabouhamad wabouhamad force-pushed the update-sanity-deploy-cluster-2026-04-06 branch 2 times, most recently from 6fd5b77 to 7f95d04 Compare April 14, 2026 18:40
@tsorya
Contributor

tsorya commented Apr 14, 2026

/pj-rehearse network-access-allowed

@openshift-merge-bot
Contributor

@tsorya: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@tsorya
Contributor

tsorya commented Apr 14, 2026

/pj-rehearse network-access-allowed

@openshift-merge-bot
Contributor

@tsorya: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh (1)

121-143: ⚠️ Potential issue | 🟠 Major

Stop printing full environments and .env contents from the hypervisor.

These env calls and cat .env dump everything sourced in the remote workspace into CI logs. That is likely to leak credentials from env.user_* or the generated .env; please whitelist only the few safe values you need for debugging.

Also applies to: 156-166, 180-184

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh`
around lines 121 - 143, The SSH command block is printing full environment and
`.env` contents (via `env` and `cat .env`), which can leak secrets; remove those
broad dumps and instead echo only a small, explicit whitelist of safe variables
needed for debugging (e.g. LAST_OPENSHIFT_DPF, KUBECONFIG, any non-secret
DEBUG_* flags), leaving other commands (pwd, ls -ltr, source
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}, oc commands, make targets) intact;
update the SSH string around the symbols SSH_OPTS, REMOTE_HOST,
REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION, LAST_OPENSHIFT_DPF and the KUBECONFIG
export so that you print only those specific variables and no longer run `env`
or `cat .env`.
♻️ Duplicate comments (7)
ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh (5)

115-116: ⚠️ Potential issue | 🟠 Major

Discover the traffic-test workers dynamically.

These node names are cluster-specific. Any fresh deployment or renamed worker will fail the step before run-traffic-flow-tests even starts; query the worker names from the cluster or pass them in as environment variables.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 115 - 116, Replace the hardcoded worker node names by first
honoring any externally provided environment variables TFT_SERVER_NODE and
TFT_CLIENT_NODE, and if they are unset, dynamically discover suitable worker
nodes in the dpf-hypervisor-network-tests script
(dpf-hypervisor-network-tests-commands.sh) by querying the cluster (e.g., using
kubectl/oc to list worker nodes or nodes with the appropriate label/annotation)
and assigning the discovered names to TFT_SERVER_NODE and TFT_CLIENT_NODE before
running run-traffic-flow-tests; ensure the discovery logic picks two distinct
worker nodes and falls back to a clear error message if insufficient nodes are
found.
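A sketch of that fallback logic, with the `oc` query stubbed out so the example is self-contained (the label selector is the conventional one for worker nodes; the node names here are placeholders):

```shell
#!/bin/bash
set -euo pipefail

# Stub for:
#   oc get nodes -l node-role.kubernetes.io/worker -o jsonpath='{.items[*].metadata.name}'
get_worker_nodes() {
  echo "worker-0 worker-1 worker-2"   # placeholder output for illustration
}

# Honor externally supplied values; otherwise discover two distinct workers.
if [ -z "${TFT_SERVER_NODE:-}" ] || [ -z "${TFT_CLIENT_NODE:-}" ]; then
  read -r -a workers <<< "$(get_worker_nodes)"
  if [ "${#workers[@]}" -lt 2 ]; then
    echo "need at least two worker nodes for traffic-flow tests" >&2
    exit 1
  fi
  TFT_SERVER_NODE="${TFT_SERVER_NODE:-${workers[0]}}"
  TFT_CLIENT_NODE="${TFT_CLIENT_NODE:-${workers[1]}}"
fi
export TFT_SERVER_NODE TFT_CLIENT_NODE
echo "server=${TFT_SERVER_NODE} client=${TFT_CLIENT_NODE}"
```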

33-34: ⚠️ Potential issue | 🟠 Major

Persist SSH_OPTS as a quoted shell assignment.

This writes SSH_OPTS=-i /tmp/id_rsa ..., which is not sourceable. Any downstream step that sources ${SHARED_DIR}/dpf-env will try to execute /tmp/id_rsa instead of restoring the SSH options.

🛠️ Proposed fix
-echo "REMOTE_HOST=${REMOTE_HOST}" >> ${SHARED_DIR}/dpf-env
-echo "SSH_OPTS=${SSH_OPTS}" >> ${SHARED_DIR}/dpf-env
+printf 'REMOTE_HOST=%q\n' "${REMOTE_HOST}" >> "${SHARED_DIR}/dpf-env"
+printf 'SSH_OPTS=%q\n' "${SSH_OPTS}" >> "${SHARED_DIR}/dpf-env"

Run this minimal repro to verify the current form is not sourceable:

#!/bin/bash
set -euo pipefail

if bash -lc 'source <(printf "%s\n" "SSH_OPTS=-i /tmp/id_rsa -o StrictHostKeyChecking=no")'; then
  echo "unexpected success"
else
  echo "sourcing fails because the value is parsed as a command invocation"
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 33 - 34, The SSH_OPTS line currently appends an unquoted, unsafely
formatted assignment to ${SHARED_DIR}/dpf-env which is not sourceable; replace
that echo with a safe, shell-assignable format (e.g. use printf 'SSH_OPTS=%q\n'
"${SSH_OPTS}" >> "${SHARED_DIR}/dpf-env") so the value is properly
quoted/escaped when written; keep REMOTE_HOST behavior as-is or apply the same
safe formatting if needed.

48-63: ⚠️ Potential issue | 🟠 Major

Harden the VERIFY_* rewrites the same way sanity-existing now does.

These substitutions are still unanchored and only print the result afterward, so missing or partially matched keys can slip through silently. Please switch to anchored replacements plus exact post-write assertions before running the verify targets.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 48 - 63, The sed substitutions for VERIFY_DEPLOYMENT,
VERIFY_MAX_RETRIES and VERIFY_SLEEP_SECONDS are unanchored and can silently miss
or partially match keys; change each sed to anchored forms like sed -i
's|^VERIFY_DEPLOYMENT=.*$|VERIFY_DEPLOYMENT=true|' (and similarly for
VERIFY_MAX_RETRIES and VERIFY_SLEEP_SECONDS), then add exact post-write
assertions using grep -x (e.g. grep -x '^VERIFY_DEPLOYMENT=true$' .env) for each
variable and fail the remote command if any assertion fails before proceeding to
run the verify targets.
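The anchored replace-or-append pattern can be sketched as a small helper (the file layout and values are illustrative; GNU `sed -i` is assumed, as on Linux CI hosts):

```shell
#!/bin/bash
set -euo pipefail

ENV_FILE=$(mktemp)
# Sample .env deliberately missing VERIFY_SLEEP_SECONDS to exercise the append path.
printf 'VERIFY_DEPLOYMENT=false\nVERIFY_MAX_RETRIES=3\n' > "${ENV_FILE}"

set_env_key() {
  local key=$1 value=$2
  if grep -q "^${key}=" "${ENV_FILE}"; then
    sed -i "s|^${key}=.*$|${key}=${value}|" "${ENV_FILE}"   # anchored: whole-key match only
  else
    echo "${key}=${value}" >> "${ENV_FILE}"
  fi
  grep -qx "${key}=${value}" "${ENV_FILE}"   # assert the write landed; fail loudly otherwise
}

set_env_key VERIFY_DEPLOYMENT true
set_env_key VERIFY_MAX_RETRIES 60
set_env_key VERIFY_SLEEP_SECONDS 30
cat "${ENV_FILE}"
```

Under `set -e`, the trailing `grep -qx` assertion aborts the step the moment a rewrite silently no-ops, which is exactly the failure mode the unanchored seds allow.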

72-82: ⚠️ Potential issue | 🟠 Major

Do not override KUBECONFIG with kubeconfig.doca8.

ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh rewrites .env to KUBECONFIG=${REMOTE_WORK_DIR}/openshift-dpf/kubeconfig-mno, but this block forces ${LAST_OPENSHIFT_DPF}/kubeconfig.doca8. If that fallback file is absent or stale, every oc and make verify-* call here is pointed at the wrong cluster.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 72 - 82, The script forcibly overrides KUBECONFIG to
${LAST_OPENSHIFT_DPF}/kubeconfig.doca8 which ignores the .env rewrite (e.g.
KUBECONFIG=${REMOTE_WORK_DIR}/openshift-dpf/kubeconfig-mno); change the export
line in the SSH block (the one referencing KUBECONFIG and kubeconfig.doca8) to
not unconditionally overwrite KUBECONFIG — either remove that export entirely or
make it conditional, e.g. export
KUBECONFIG=${KUBECONFIG:-${LAST_OPENSHIFT_DPF}/kubeconfig.doca8}, so the
previously sourced environment (from ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}
and LAST_OPENSHIFT_DPF) can supply the correct kubeconfig.
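The conditional export amounts to a single parameter expansion; a runnable sketch with placeholder paths:

```shell
#!/bin/bash
set -euo pipefail

LAST_OPENSHIFT_DPF=/tmp/openshift-dpf-example   # placeholder for the sourced value

pick_kubeconfig() {
  # ${KUBECONFIG:-fallback} keeps any deploy-provided value and only falls
  # back to kubeconfig.doca8 when nothing was set upstream.
  echo "${KUBECONFIG:-${LAST_OPENSHIFT_DPF}/kubeconfig.doca8}"
}

# Case 1: the deploy step already exported KUBECONFIG -> it wins.
KUBECONFIG=/tmp/kubeconfig-mno
first=$(pick_kubeconfig)

# Case 2: nothing set upstream -> fall back.
unset KUBECONFIG
second=$(pick_kubeconfig)

echo "preset:   ${first}"
echo "fallback: ${second}"
```

Note that `${KUBECONFIG:-...}` is safe under `set -u`, so the fallback works even when the variable was never defined.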

72-88: ⚠️ Potential issue | 🟠 Major

Avoid logging full remote environments or .env content.

The env calls and cat .env here publish whatever credentials were sourced on the hypervisor into prow logs. Please replace them with a small allowlist of non-sensitive variables.

Also applies to: 107-114, 128-132

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`
around lines 72 - 88, The script currently runs raw env and possibly
cats/sources remote .env inside the SSH command (see the SSH command invoking
env, source ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}, and any cat .env usage),
which can leak secrets to logs; replace these with an explicit allowlist: remove
generic env and any cat/source of .env and instead echo/print only specific
non-sensitive variables (e.g., echo "$LAST_OPENSHIFT_DPF", echo "$KUBECONFIG" or
use printenv VAR_NAME for each allowed VAR), and ensure KUBECONFIG is set via
the export KUBECONFIG assignment already present; apply the same change to the
other SSH blocks that call env or cat .env (the repeated blocks around the later
env/cat usages).
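A sketch of allowlist-only printing (variable names and values are illustrative stand-ins for what sourcing the remote .env would bring in):

```shell
#!/bin/bash
set -euo pipefail

# Stand-ins for what `source .env` would set on the hypervisor.
export LAST_OPENSHIFT_DPF=/tmp/openshift-dpf-example
export KUBECONFIG=${LAST_OPENSHIFT_DPF}/kubeconfig
export REGISTRY_TOKEN=super-secret   # must never reach the CI log

# Print only an explicit allowlist instead of running a bare `env`.
print_safe_env() {
  local var
  for var in LAST_OPENSHIFT_DPF KUBECONFIG; do
    printf '%s=%s\n' "${var}" "${!var:-<unset>}"
  done
}
print_safe_env
```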
ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh (2)

93-98: ⚠️ Potential issue | 🟠 Major

Stop dumping the remote environment into CI logs.

env here prints the full hypervisor environment into prow output. That can expose credentials or tokens from the remote host; please replace it with a small allowlist or explicit echo statements.

Also applies to: 110-113

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh`
around lines 93 - 98, The remote ssh command currently runs `env` which dumps
the entire hypervisor environment into CI logs; replace that call with a limited
allowlist or explicit safe echoes (e.g., only echo REMOTE_MAIN_WORK_DIR,
datetime_string, and git branch) so secrets aren’t leaked. Locate the ssh
invocation that uses SSH_OPTS and root@${REMOTE_HOST} (the block that cd's into
${REMOTE_MAIN_WORK_DIR} and creates openshift-dpf-${datetime_string} and clones
${OPENSHIFT_DPF_GITHUB_REPO_URL} on branch ${OPENSHIFT_DPF_BRANCH}) and remove
`env`, adding only specific echo statements for non-sensitive variables or an
explicit allowlist check instead; apply the same change to the duplicate ssh
block later that mirrors this behavior.

120-137: ⚠️ Potential issue | 🟠 Major

Update last-openshift-dpf-dir.sh only after the new workspace is ready.

ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh and ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh source this file immediately. Writing it right after git clone means a failed generate-env or make all can redirect later jobs to an incomplete directory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh`
around lines 120 - 137, The script currently writes/updates
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION} immediately after git clone; instead
ensure the update happens only after the new workspace is fully ready (i.e.,
after successful generate-env and make all). Change the logic in this file so it
either defers creation/updating of ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION} or
adds a readiness check on the remote ${REMOTE_WORK_DIR}/openshift-dpf (for
example test for a stable artifact like a Makefile or the generate-env output)
before writing LAST_OPENSHIFT_DPF=${REMOTE_WORK_DIR}/openshift-dpf; if the
readiness check fails, do not create/update the hypervisor file and exit with an
error so downstream scripts that source LAST_OPENSHIFT_DPF won't be pointed at
an incomplete workspace.
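A sketch of the deferred write with a readiness gate (local temp paths stand in for the remote workspace, and the Makefile check is an assumed readiness signal):

```shell
#!/bin/bash
set -euo pipefail

REMOTE_WORK_DIR=$(mktemp -d)
LAST_DIR_FILE="${REMOTE_WORK_DIR}/last-openshift-dpf-dir.sh"
WORKSPACE="${REMOTE_WORK_DIR}/openshift-dpf"

mkdir -p "${WORKSPACE}"           # stands in for `git clone`
# ... generate-env / make all would run here ...
touch "${WORKSPACE}/Makefile"     # simulate a completed, usable deploy

# Only redirect later jobs once the workspace passes the readiness check.
if [ -f "${WORKSPACE}/Makefile" ]; then
  echo "LAST_OPENSHIFT_DPF=${WORKSPACE}" > "${LAST_DIR_FILE}"
else
  echo "workspace not ready; leaving ${LAST_DIR_FILE} untouched" >&2
  exit 1
fi
```

Writing the pointer file only on the success path means a failed deploy leaves the previous (known-good) workspace in place for sanity and network-tests runs.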
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh`:
- Around line 231-232: The piped remote commands running `make clean-all` and
`make all` (which set `CLEAN_ALL_SUCCESS` and `DEPLOYMENT_SUCCESS`) are losing
failures because `tee` masks errors; enable pipefail in those remote bash
sessions so the pipeline returns the make failure. Update the remote invocation
that executes `make clean-all | tee ...` and `make all | tee ...` to run with
pipefail (e.g., enable `set -o pipefail` or run bash with `-o pipefail`) so the
exit status reflects the `make` failure and the existing logic that writes
`last-openshift-dpf-dir.sh` and sets `CLEAN_ALL_SUCCESS`/`DEPLOYMENT_SUCCESS`
behaves correctly.

In
`@ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh`:
- Around line 66-80: The step currently hardcodes /tmp/kubeconfig.doca8 when
SCPing, modifying, and exporting KUBECONFIG; instead read the kubeconfig path
written by the deploy step from the .env file (the variable the deploy script
rewrites) and use that variable in place of /tmp/kubeconfig.doca8 across the
scp, cp, sed and export commands so operations use the actual generated
kubeconfig; keep using LAST_OPENSHIFT_DPF and REMOTE_HOST for the remote path
resolution but substitute the hardcoded filename for the kubeconfig variable
read from .env before running scp/cp/sed/export.

---

Outside diff comments:
In
`@ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh`:
- Around line 121-143: The SSH command block is printing full environment and
`.env` contents (via `env` and `cat .env`), which can leak secrets; remove those
broad dumps and instead echo only a small, explicit whitelist of safe variables
needed for debugging (e.g. LAST_OPENSHIFT_DPF, KUBECONFIG, any non-secret
DEBUG_* flags), leaving other commands (pwd, ls -ltr, source
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}, oc commands, make targets) intact;
update the SSH string around the symbols SSH_OPTS, REMOTE_HOST,
REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION, LAST_OPENSHIFT_DPF and the KUBECONFIG
export so that you print only those specific variables and no longer run `env`
or `cat .env`.

---

Duplicate comments:
In
`@ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh`:
- Around line 93-98: The remote ssh command currently runs `env` which dumps the
entire hypervisor environment into CI logs; replace that call with a limited
allowlist or explicit safe echoes (e.g., only echo REMOTE_MAIN_WORK_DIR,
datetime_string, and git branch) so secrets aren’t leaked. Locate the ssh
invocation that uses SSH_OPTS and root@${REMOTE_HOST} (the block that cd's into
${REMOTE_MAIN_WORK_DIR} and creates openshift-dpf-${datetime_string} and clones
${OPENSHIFT_DPF_GITHUB_REPO_URL} on branch ${OPENSHIFT_DPF_BRANCH}) and remove
`env`, adding only specific echo statements for non-sensitive variables or an
explicit allowlist check instead; apply the same change to the duplicate ssh
block later that mirrors this behavior.
- Around line 120-137: The script currently writes/updates
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION} immediately after git clone; instead
ensure the update happens only after the new workspace is fully ready (i.e.,
after successful generate-env and make all). Change the logic in this file so it
either defers creation/updating of ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION} or
adds a readiness check on the remote ${REMOTE_WORK_DIR}/openshift-dpf (for
example test for a stable artifact like a Makefile or the generate-env output)
before writing LAST_OPENSHIFT_DPF=${REMOTE_WORK_DIR}/openshift-dpf; if the
readiness check fails, do not create/update the hypervisor file and exit with an
error so downstream scripts that source LAST_OPENSHIFT_DPF won't be pointed at
an incomplete workspace.

In
`@ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh`:
- Around line 115-116: Replace the hardcoded worker node names by first honoring
any externally provided environment variables TFT_SERVER_NODE and
TFT_CLIENT_NODE, and if they are unset, dynamically discover suitable worker
nodes in the dpf-hypervisor-network-tests script
(dpf-hypervisor-network-tests-commands.sh) by querying the cluster (e.g., using
kubectl/oc to list worker nodes or nodes with the appropriate label/annotation)
and assigning the discovered names to TFT_SERVER_NODE and TFT_CLIENT_NODE before
running run-traffic-flow-tests; ensure the discovery logic picks two distinct
worker nodes and falls back to a clear error message if insufficient nodes are
found.
- Around line 33-34: The SSH_OPTS line currently appends an unquoted, unsafely
formatted assignment to ${SHARED_DIR}/dpf-env which is not sourceable; replace
that echo with a safe, shell-assignable format (e.g. use printf 'SSH_OPTS=%q\n'
"${SSH_OPTS}" >> "${SHARED_DIR}/dpf-env") so the value is properly
quoted/escaped when written; keep REMOTE_HOST behavior as-is or apply the same
safe formatting if needed.
- Around line 48-63: The sed substitutions for VERIFY_DEPLOYMENT,
VERIFY_MAX_RETRIES and VERIFY_SLEEP_SECONDS are unanchored and can silently miss
or partially match keys; change each sed to anchored forms like sed -i
's|^VERIFY_DEPLOYMENT=.*$|VERIFY_DEPLOYMENT=true|' (and similarly for
VERIFY_MAX_RETRIES and VERIFY_SLEEP_SECONDS), then add exact post-write
assertions using grep -x (e.g. grep -x '^VERIFY_DEPLOYMENT=true$' .env) for each
variable and fail the remote command if any assertion fails before proceeding to
run the verify targets.
- Around line 72-82: The script forcibly overrides KUBECONFIG to
${LAST_OPENSHIFT_DPF}/kubeconfig.doca8 which ignores the .env rewrite (e.g.
KUBECONFIG=${REMOTE_WORK_DIR}/openshift-dpf/kubeconfig-mno); change the export
line in the SSH block (the one referencing KUBECONFIG and kubeconfig.doca8) to
not unconditionally overwrite KUBECONFIG — either remove that export entirely or
make it conditional, e.g. export
KUBECONFIG=${KUBECONFIG:-${LAST_OPENSHIFT_DPF}/kubeconfig.doca8}, so the
previously sourced environment (from ${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}
and LAST_OPENSHIFT_DPF) can supply the correct kubeconfig.
- Around line 72-88: The script currently runs raw env and possibly cats/sources
remote .env inside the SSH command (see the SSH command invoking env, source
${REMOTE_LAST_OPENSHIFT_DPF_DIR_LOCATION}, and any cat .env usage), which can
leak secrets to logs; replace these with an explicit allowlist: remove generic
env and any cat/source of .env and instead echo/print only specific
non-sensitive variables (e.g., echo "$LAST_OPENSHIFT_DPF", echo "$KUBECONFIG" or
use printenv VAR_NAME for each allowed VAR), and ensure KUBECONFIG is set via
the export KUBECONFIG assignment already present; apply the same change to the
other SSH blocks that call env or cat .env (the repeated blocks around the later
env/cat usages).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 89fe4823-9026-4e3b-a125-df60d44d36ae

📥 Commits

Reviewing files that changed from the base of the PR and between 8ea88f4 and 7ac5014.

📒 Files selected for processing (9)
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/.config.prowgen
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main.yaml
  • ci-operator/jobs/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main-presubmits.yaml
  • ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh
  • ci-operator/step-registry/dpf/hypervisor/network-tests/OWNERS
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.metadata.json
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.yaml
  • ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh
✅ Files skipped from review due to trivial changes (4)
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.metadata.json
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/.config.prowgen
  • ci-operator/step-registry/dpf/hypervisor/network-tests/OWNERS
  • ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • ci-operator/config/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main.yaml

Comment on lines +231 to +232
# Need basic oc commands to verify make all step passed

Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

if bash -lc 'false | tee /dev/null'; then
  echo "pipeline succeeded without pipefail"
fi

if bash -lc 'set -o pipefail; false | tee /dev/null'; then
  echo "unexpected success"
else
  echo "pipeline failed once pipefail was enabled"
fi

Repository: openshift/release

Length of output: 391


🏁 Script executed:

# First, locate and examine the file mentioned in the review
cat -n ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh | head -250

Repository: openshift/release

Length of output: 11759


Add set -o pipefail to remote bash sessions running make clean-all and make all.

Lines 198 and 209 pipe make commands through tee without set -o pipefail. If either make clean-all or make all fails, the pipeline still exits with 0 because tee succeeds, causing the conditional to pass and incorrectly setting CLEAN_ALL_SUCCESS=true or DEPLOYMENT_SUCCESS=true. Since the directory path is written to last-openshift-dpf-dir.sh before these commands execute (lines 123, 130–131), a silent failure will propagate a broken deployment path to subsequent steps.

Fix locations
-if ssh ${SSH_OPTS} root@${REMOTE_HOST} "cd ${REMOTE_WORK_DIR}/openshift-dpf && mkdir -p ${REMOTE_LOGS_DIR} && make clean-all 2>&1 | tee ${CLEAN_ALL_LOG}"; then
+if ssh ${SSH_OPTS} root@${REMOTE_HOST} "set -o pipefail; cd ${REMOTE_WORK_DIR}/openshift-dpf && mkdir -p ${REMOTE_LOGS_DIR} && make clean-all 2>&1 | tee ${CLEAN_ALL_LOG}"; then
-if ssh ${SSH_OPTS} root@${REMOTE_HOST} "cd ${REMOTE_WORK_DIR}/openshift-dpf && mkdir -p ${REMOTE_LOGS_DIR} && make all 2>&1 | tee ${DEPLOYMENT_LOG}"; then
+if ssh ${SSH_OPTS} root@${REMOTE_HOST} "set -o pipefail; cd ${REMOTE_WORK_DIR}/openshift-dpf && mkdir -p ${REMOTE_LOGS_DIR} && make all 2>&1 | tee ${DEPLOYMENT_LOG}"; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh`
around lines 231 - 232, The piped remote commands running `make clean-all` and
`make all` (which set `CLEAN_ALL_SUCCESS` and `DEPLOYMENT_SUCCESS`) are losing
failures because `tee` masks errors; enable pipefail in those remote bash
sessions so the pipeline returns the make failure. Update the remote invocation
that executes `make clean-all | tee ...` and `make all | tee ...` to run with
pipefail (e.g., enable `set -o pipefail` or run bash with `-o pipefail`) so the
exit status reflects the `make` failure and the existing logic that writes
`last-openshift-dpf-dir.sh` and sets `CLEAN_ALL_SUCCESS`/`DEPLOYMENT_SUCCESS`
behaves correctly.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-sanity

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@aopincar
Contributor

/pj-rehearse network-access-allowed

@openshift-merge-bot
Contributor

@aopincar: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Added network-tests and other script updates:
-  modified:   ci-operator/config/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main.yaml
-  modified:   ci-operator/jobs/rh-ecosystem-edge/openshift-dpf/rh-ecosystem-edge-openshift-dpf-main-presubmits.yaml
-  modified:   ci-operator/step-registry/dpf/hypervisor/deploy-cluster/dpf-hypervisor-deploy-cluster-commands.sh
-  new file:   ci-operator/step-registry/dpf/hypervisor/network-tests/OWNERS
-  new file:   ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-commands.sh
-  new file:   ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.metadata.json
-  new file:   ci-operator/step-registry/dpf/hypervisor/network-tests/dpf-hypervisor-network-tests-ref.yaml
-  modified:   ci-operator/step-registry/dpf/hypervisor/sanity-existing/dpf-hypervisor-sanity-existing-commands.sh
@wabouhamad wabouhamad force-pushed the update-sanity-deploy-cluster-2026-04-06 branch from 7ac5014 to 3d0452a Compare April 15, 2026 14:38
@openshift-merge-bot
Contributor

[REHEARSALNOTIFIER]
@wabouhamad: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name Repo Type Reason
pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster rh-ecosystem-edge/openshift-dpf presubmit Registry content changed

The following jobs are not rehearsable without the network-access-rehearsals-ok and approved labels present on this PR. This is due to the restrict_network_access field being set to false. The network-access-rehearsals-ok label can be added by any openshift org member other than the PR's author by commenting: /pj-rehearse network-access-allowed:

Test name
pull-ci-rh-ecosystem-edge-openshift-dpf-main-sanity
pull-ci-rh-ecosystem-edge-openshift-dpf-main-network-tests
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@aopincar
Contributor

/pj-rehearse network-access-allowed

@openshift-merge-bot
Contributor

@aopincar: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@wabouhamad
Contributor Author

/pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster

@openshift-merge-bot
Contributor

@wabouhamad: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci
Contributor

openshift-ci bot commented Apr 17, 2026

@wabouhamad: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/rh-ecosystem-edge/openshift-dpf/main/network-tests | eab27e9 | link | unknown | /pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-network-tests |
| ci/rehearse/rh-ecosystem-edge/openshift-dpf/main/sanity | 7ac5014 | link | unknown | /pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-sanity |
| ci/rehearse/rh-ecosystem-edge/openshift-dpf/main/deploy-cluster | 3d0452a | link | unknown | /pj-rehearse pull-ci-rh-ecosystem-edge-openshift-dpf-main-deploy-cluster |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

approved (indicates a PR has been approved by an approver from all required OWNERS files), network-access-rehearsals-ok
