TCL-4378: Add Kind-based operator lifecycle E2E test#14
Open
jplimack-ai wants to merge 26 commits intojplimack/tcl-4373-update-sriov-network-operator-fork-deps-to-k8s-v0342from
Open
Conversation
|
Thanks for your PR,
To skip the vendors CIs, Maintainers can use one of:
|
Pure Go E2E test that validates operator deployment and reconciliation in a real Kind cluster without SR-IOV hardware. Uses Kind SDK, Docker SDK, and Helm SDK as Go libraries. Adds CI job on ubuntu-24.04-4core. TCL-4378
Enable the `modernize` golangci-lint checker and auto-fix all 12 issues:
- interface{} -> any
- for loops -> range over int
- manual contains loops -> slices.Contains
- []byte(fmt.Sprintf...) -> fmt.Appendf
Also fix 3 staticcheck SA5011 nil-deref warnings in conformance tests.
ca935fc to
9df8902
Compare
added 6 commits
March 5, 2026 07:42
Add validation-style tests for network-resources-injector and operator-webhook DaemonSets. Also fix setup-go to use go-version-file instead of hardcoded version.
Add a new `virtual-k8s-conformance` CI job that runs the existing kcli-based virtual cluster conformance tests on GitHub-hosted ubuntu-24.04-4core runners with KVM, removing the dependency on self-hosted [sriov] runners. Gated behind ENABLE_VIRTUAL_E2E repo var. Also includes remaining modernize linter fixes.
The IsKernelArgsSet mock expectations in the daemon plugin_test.go BeforeEach block need .AnyTimes() since not all test cases exercise the generic plugin (e.g., VirtualOpenstack only loads the virtual plugin). Also quote $USER in workflow to fix shellcheck SC2086.
- udev_test.go: add gomock.Any() for context arg in LoadUdevRules "Failed to trigger rules" test - generic_plugin_test.go: fix RunCommand mock to match 5 args (ctx + command + 3 variadic) instead of 6
The BeforeEach .AnyTimes() expectation for SetRDMASubsystem("")
consumes all matching calls, making the exact-once expectation
in the "should not configure RDMA kernel args" test always fail
as "missing call".
t.Setenv cannot be used in parallel tests (panics on Go 1.22+). Remove t.Parallel() from TestStaticValidateSriovNetworkNodePolicyWithInvalidVendorDevMode.
Pull Request Test Coverage Report for Build 22747019920Details
💛 - Coveralls |
added 5 commits
March 5, 2026 08:49
These tests share global state (interfaceSelected, snclient) and cannot safely run in parallel. The modernize linter incorrectly added t.Parallel() to them.
Tests in validate_test.go share global state (interfaceSelected, snclient) and cannot safely run in parallel.
ubuntu-24.04-4core runners are not available in this org. Switch kind-e2e and virtual-k8s-conformance to ubuntu-22.04-4core.
Kind v0.31.0 defaults to k8s 1.33 which uses kubeadm v1beta4, but the operator targets k8s 1.28. Pin the node image to match and increase WaitForReady to 5 minutes for slower CI runners.
Add DisplayUsage/DisplaySalutation options and log full error before asserting to diagnose kubeadm init failures in CI.
Use v1.31.4 node image (default for Kind v0.31.0) instead of v1.28.15 which fails kubeadm init on ubuntu-22.04 runners. Add verbose logging for Kind cluster creation failures.
dd934b4 to
dc8344d
Compare
|
Thanks for your PR,
To skip the vendors CIs, Maintainers can use one of:
|
added 4 commits
March 5, 2026 16:19
Remove explicit node image pin (let Kind use its default) and add docker container log dump when kubeadm init fails to get the actual error output instead of Gomega-truncated byte arrays.
Kubernetes v1.35 (default for Kind v0.31.0) rejects node-role.kubernetes.io/worker as a --node-labels kubelet flag because it's not in the allowed kubernetes.io label prefix set. Apply the label post-creation via kubectl instead.
The build scripts (hack/build-go.sh) use git rev-parse inside the Docker build. Excluding .git from the tar context caused 'git rev-parse --show-cdup' to fail with exit code 128.
added 7 commits
March 5, 2026 17:44
- Dump pod status and daemon logs when SriovNetworkNodeState timeout - Set disableDrain=true for single-node Kind cluster
The config daemon's init containers require external CNI images (sriov-cni, infiniband-cni) that aren't loaded into Kind, causing them to hang in Init state. Relax BeforeSuite to only wait for SriovNetworkNodeState objects to exist (created by the operator controller) rather than waiting for SyncStatus=Succeeded which requires the daemon pod to fully start.
- Config daemon: check DaemonSet is scheduled (not ready), since init containers need external images not loaded into Kind - Webhook: use Eventually to wait for MutatingWebhookConfiguration since it depends on cert-manager issuing certificates
The MutatingWebhookConfiguration depends on cert-manager issuing certificates, which is unreliable in Kind. Check that the operator-webhook DaemonSet is created instead.
DaemonSet pods may not reach Ready state in Kind due to init containers pulling external images. Check DesiredNumberScheduled > 0 instead of DesiredNumberScheduled == NumberReady. Remove duplicate operator-webhook test.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds a pure Go E2E test suite that validates the SR-IOV operator's full deployment pipeline in a real Kind cluster, no SR-IOV hardware or KVM needed. The config daemon reaches
SyncStatus: "Succeeded"with no devices viashouldSkipReconciliation(), so we can test the entire operator lifecycle on standard CI runners.What's new:
test/kind/test suite — Ginkgo tests behind//go:build kindthat create a Kind cluster, build and load Docker images, deploy cert-manager, install the Helm chart, and assert the operator is healthymake test-e2e-kind-virtual— new Makefile target to run the suitekind-e2ejob onubuntu-24.04-4core, runs after build/test/golangci with no special runner requirementsUses Kind SDK, Docker SDK, and Helm SDK as Go libraries (pinned to versions compatible with the project's k8s v0.28.x deps). The only
os/execcall is forkubectl applyof the cert-manager manifest.Ticket: TCL-4378