test(ci): trigger forward-compatibility workflow on PR for testing #2248

Open
rajathagasthya wants to merge 15 commits into NVIDIA:main from rajathagasthya:forward-compat-test

@rajathagasthya
Contributor

Summary

  • Temporarily adds push trigger on pull-request/* branches to the forward-compatibility workflow so it runs during PR validation
  • This is for testing only — the push trigger must be reverted before merging

Test plan

  • Verify forward-compatibility workflow triggers on this PR
  • Revert push trigger before merging

Break the monolithic ci.yaml into focused, reusable workflow files:
code-scanning, config-checks, golang-checks, image-builds, and
release. Centralize shared variables in variables.yaml. Standardize
copyright headers across all workflow files.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

Add e2e-tests.yaml reusable workflow for end-to-end GPU operator
testing. Introduce env-to-values.sh to convert environment variables
to Helm values overrides. Update install-operator.sh to use yq-based
YAML merging for component image configuration.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

Add forward-compatibility.yaml workflow that runs weekly to test the
GPU operator against latest upstream component images (toolkit,
device-plugin, mig-manager). Includes get-latest-images.sh with
retry/backoff for image verification and generate-values-overrides.sh
for Helm values generation.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
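The retry/backoff behaviour mentioned for get-latest-images.sh can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `retry` and its argument order are assumptions made for the example.

```bash
# retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay between attempts (exponential backoff).
# Hypothetical helper; the real script may be structured differently.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt ${attempt} failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}
```

A caller would wrap whatever command verifies that an image tag exists, for example `retry 5 2 docker manifest inspect "$image"`, so that a transient registry error does not fail the whole weekly run.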
- Restore Holodeck to v0.2.18 matching the original ci.yaml
- Remove unused operator_version input from release workflow and caller
- Skip Slack notification when SLACK_BOT_TOKEN secret is not configured

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

- Remove push: triggers from reusable workflows (config-checks,
  golang-checks, image-builds, release) to prevent double-execution
  when called from ci.yaml
- Eliminate ~80 lines of duplicated variable calculation logic in
  image-builds.yaml and release.yaml
- Make yq a hard requirement for YAML merging instead of falling back
  to cat concatenation, which produces invalid YAML
- Replace echo -e with printf '%b' in env-to-values.sh for portability
- Add use_values_override input to e2e-tests workflow_dispatch
- Fix shebang consistency in get-latest-images.sh (#!/usr/bin/env bash)
- Document 8-char SHA truncation assumption in get-latest-images.sh
- Remove non-deterministic date from generated values file headers
- Remove trailing whitespace in code-scanning.yaml

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
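For the `echo -e` to `printf '%b'` change listed above, a small sketch of why the replacement is more portable. The function name `render_escapes` and the YAML fragment are illustrative assumptions, not taken from env-to-values.sh.

```bash
# Render backslash escapes (\n, \t, ...) contained in a string.
# POSIX does not specify "echo -e"; some shells print the "-e" literally,
# whereas printf's %b conversion is standardized.
render_escapes() {
  printf '%b\n' "$1"
}

# Illustrative input: turns the "\n" escape into a real line break,
# producing a two-line YAML fragment.
render_escapes 'driver:\n  enabled: false'
```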
- Pin mikefarah/yq action to v4 version tag
- Restore SHA-pinned regctl action (v0.11.1)
- Add workflow-level permissions: {} for least privilege
- Move expression interpolations from run: to env: blocks

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

- Restore bundle CSV update and commit-SHA-tagged bundle image
- Derive OPERATOR_IMAGE_BASE from GITHUB_REPOSITORY_OWNER
- Decouple coverage from blocking image builds
- Remove duplicate pull_request trigger from code-scanning

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

- Add timeout-minutes to all workflow jobs
- Add set -euo pipefail and quote variables in install-operator.sh

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>

The addition of `set -euo pipefail` made `${SKIP_INSTALL}` fail when
unset. Use `${SKIP_INSTALL:-}` for safe default expansion.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
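The `${SKIP_INSTALL:-}` fix can be illustrated with a short sketch. The helper name `should_skip_install` and the comparison against `"true"` are assumptions for the example; only the `:-` expansion itself comes from the commit message.

```bash
# Returns success when SKIP_INSTALL is set to "true".
# "${SKIP_INSTALL:-}" expands to an empty string when the variable is
# unset or null, so it is safe under "set -u" (nounset).
# A bare "${SKIP_INSTALL}" would abort with "unbound variable" instead.
should_skip_install() {
  [ "${SKIP_INSTALL:-}" = "true" ]
}

# Demonstration in a subshell so nounset does not leak into the caller.
(
  set -u
  unset SKIP_INSTALL
  if should_skip_install; then
    echo "skipping install"
  else
    echo "installing"
  fi
)
```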
The `set -euo pipefail` in install-operator.sh propagates to sourced
.definitions.sh where `${DEBUG}` is used without a default value.
Use `${DEBUG:-}` for safe expansion under nounset.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
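Shell options set in a script also apply to any file it sources, which is why the `${DEBUG}` reference started failing. A minimal sketch of that propagation follows; the file contents and the names `load_definitions_strict` and `DEFINITIONS_LOADED` are hypothetical stand-ins for .definitions.sh.

```bash
# load_definitions_strict FILE
# Sources FILE with strict options enabled, inside a subshell so the
# options do not leak into the caller.
load_definitions_strict() {
  (
    set -euo pipefail
    unset DEBUG
    # "source" runs FILE in the current shell, so nounset applies inside it.
    source "$1"
    echo "loaded: ${DEFINITIONS_LOADED}"
  )
}

# Stand-in for the sourced definitions file (hypothetical contents).
defs=$(mktemp)
cat > "$defs" <<'EOF'
# ${DEBUG:-} expands to "" when DEBUG is unset, so nounset does not trip.
if [ -n "${DEBUG:-}" ]; then
  set -x
fi
DEFINITIONS_LOADED=yes
EOF

load_definitions_strict "$defs"
rm -f "$defs"
```

Replacing `${DEBUG:-}` with a bare `${DEBUG}` in the stand-in file makes the subshell exit with an "unbound variable" error, which mirrors the failure described above.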
Add push trigger on pull-request/* branches so the forward
compatibility workflow runs during PR validation. This is temporary
and must be reverted before merging.

Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>

The reusable e2e-tests workflow requires contents:read and
id-token:write but the caller job had no permissions granted
(inherited permissions:{} from the top level). Add the required
permissions to the run-e2e-tests job.

The mig-manager container image is published from the NVIDIA/mig-parted
repository, not NVIDIA/k8s-mig-manager (which doesn't exist). Fix the
GitHub repo mapping so the script can fetch the latest commit SHA.
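A component-to-repository mapping of the kind this fix touches might look like the sketch below. Only the mig-manager to NVIDIA/mig-parted entry is taken from the commit message; the function names, the other two entries, and the use of a `case` statement are assumptions for illustration. The `short_sha` helper reflects the 8-character SHA truncation noted in an earlier commit.

```bash
# Map a component name to the GitHub repository that publishes its image.
# Hypothetical helper; only the mig-manager entry is confirmed by the fix.
component_repo() {
  case "$1" in
    toolkit)       echo "NVIDIA/nvidia-container-toolkit" ;;
    device-plugin) echo "NVIDIA/k8s-device-plugin" ;;
    mig-manager)   echo "NVIDIA/mig-parted" ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Truncate a full commit SHA to its first 8 characters.
short_sha() {
  local sha=$1
  printf '%s\n' "${sha:0:8}"
}
```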
The remote test machines don't have yq installed, causing the forward
compatibility workflow to fail when merging the values override file
with the env-generated values. Instead of depending on yq, use Helm's
native multi-file support (-f file1 -f file2) where later files take
precedence.
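The multi-file approach relies on Helm's documented behaviour that, when `-f` is given several times, the rightmost file wins on conflicting keys. A rough sketch of assembling such a command line is shown here; the helper name, release name, chart path, file names, and the choice of which file goes last are all hypothetical.

```bash
# Build "helm install" arguments from a list of values files.
# Files are passed in order; when the same key appears in several files,
# Helm uses the value from the file given last.
helm_install_args() {
  local release=$1 chart=$2
  shift 2
  local args=(install "$release" "$chart")
  local f
  for f in "$@"; do
    args+=(-f "$f")
  done
  # Print one argument per line so the result is easy to inspect.
  printf '%s\n' "${args[@]}"
}

# Hypothetical file names: env-generated values first, override file last,
# so the override wins on conflicting keys.
helm_install_args gpu-operator ./chart env-values.yaml values-override.yaml
```

Because the merge happens inside Helm, nothing beyond the Helm client needs to be installed on the remote test machine.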
The values-file code path in install-operator.sh was not passing
OPERATOR_OPTIONS to helm install, causing test-case-specific flags
like --set driver.nvidiaDriverCRD.enabled=true to be silently
dropped. This caused the nvidia-driver e2e test to fail because the
NVIDIADriver CR was never created.
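The class of bug described here, one code path forgetting to forward extra flags, can be avoided by appending the options in a single shared helper. The sketch below assumes OPERATOR_OPTIONS is a whitespace-separated string with no spaces inside individual values; the helper name is hypothetical.

```bash
# Print the given base arguments followed by the flags in OPERATOR_OPTIONS,
# one argument per line.
# Assumes OPERATOR_OPTIONS holds whitespace-separated flags.
append_operator_options() {
  local -a extra=()
  # read -a splits on whitespace without performing glob expansion.
  read -r -a extra <<< "${OPERATOR_OPTIONS:-}"
  printf '%s\n' "$@" "${extra[@]}"
}
```

Calling it from both the values-file path and the default path, for example `append_operator_options install gpu-operator ./chart -f values.yaml`, keeps test-case flags such as `--set driver.nvidiaDriverCRD.enabled=true` from being dropped in either branch.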

2 participants