WIP 🐛 Workload should still resilient when catalog is deleted #2439
Pull request overview
This PR adds comprehensive end-to-end tests to verify that installed OLM extensions continue functioning correctly when their source catalog is deleted. The tests cover both standard runtime and experimental Boxcutter runtime scenarios.
Changes:
- Added new feature file with 8 scenarios testing catalog deletion resilience
- Implemented CatalogIsDeleted function to support catalog deletion in tests
- Added step registrations for ClusterExtension update operations
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/e2e/steps/steps.go | Adds CatalogIsDeleted function and step registrations for testing catalog deletion and ClusterExtension updates |
| test/e2e/features/catalog-deletion-resilience.feature | Defines 8 test scenarios covering extension resilience, resource restoration, config changes, version upgrades, and revision behavior when catalog is deleted |
d3cbb5a to f31b184
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
f31b184 to dce6d68
    l.Info("skipping unpack - using installed bundle content")
    // imageFS will remain nil - the applier will use the existing installed content
    return nil, nil
}
PROBLEM: Always tries to pull image, even when using installed bundle
// If contentFS is nil, we're maintaining the current state without catalog access.
// In this case, reconcile the existing Helm release if it exists.
if contentFS == nil {
This will fail if it is not able to get the content.
PROBLEM: Immediately tries to build chart from contentFS
FIX: Reconcile the existing release and watch the release objects to ensure they're maintained
dce6d68 to b15c262
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
b15c262 to b1d259e
b1d259e to c6870c5
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
c6870c5 to 36e9069
36e9069 to 6799025
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
986e945 to 95f39fb
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
// If contentFS is nil, we're maintaining the current state without catalog access.
// In this case, we should use the existing installed revision without generating a new one.
if contentFS == nil {
    if len(existingRevisions) == 0 {
        return false, "", fmt.Errorf("no bundle content available and no existing revisions found")
    }
    // Use the most recent revision and rely on its existing controller loop (don't create a new one).
    // Returning true here signals that the rollout has succeeded using the current revision; the
    // ClusterExtensionRevision controller will continue to reconcile and maintain the resources
    // independently of this apply call.
    return true, "", nil
}
Copilot AI, Jan 11, 2026
The contentFS nil handling in the apply method introduces a new code path for maintaining state when the catalog is unavailable but lacks unit test coverage. Consider adding unit tests to verify that this path correctly handles cases where no existing revisions are found versus when revisions exist.
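One possible shape for the suggested coverage is a table-driven check over the two cases named in the comment. The sketch below is only illustrative: `applyRevision` is a simplified, hypothetical copy of the nil-contentFS branch quoted above (not the real applier method), the non-nil return value is a placeholder, and a real test would use the `testing` package rather than a plain helper function.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// applyRevision is a hypothetical, simplified stand-in that mirrors only the
// nil-contentFS branch from the diff excerpt.
func applyRevision(contentFS fs.FS, existingRevisions []string) (bool, string, error) {
	if contentFS == nil {
		if len(existingRevisions) == 0 {
			return false, "", errors.New("no bundle content available and no existing revisions found")
		}
		// Existing revisions are left to their own controller loop.
		return true, "", nil
	}
	// Placeholder for the normal rollout path, which is out of scope here.
	return false, "rollout in progress", nil
}

// nilContentCase describes one expectation for the nil-contentFS path.
type nilContentCase struct {
	name      string
	revisions []string
	wantOK    bool
	wantErr   bool
}

// runNilContentCases runs both cases and returns a description of each failure.
func runNilContentCases() []string {
	cases := []nilContentCase{
		{name: "no existing revisions", revisions: nil, wantOK: false, wantErr: true},
		{name: "existing revisions", revisions: []string{"ext-1", "ext-2"}, wantOK: true, wantErr: false},
	}
	var failures []string
	for _, tc := range cases {
		ok, _, err := applyRevision(nil, tc.revisions)
		if ok != tc.wantOK || (err != nil) != tc.wantErr {
			failures = append(failures, fmt.Sprintf("%s: got ok=%v err=%v", tc.name, ok, err))
		}
	}
	return failures
}

func main() {
	failures := runNilContentCases()
	for _, f := range failures {
		fmt.Println("FAIL:", f)
	}
	fmt.Println("failures:", len(failures))
}
```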
When upgrading OLM from standard (Helm runtime) to experimental (Boxcutter runtime), the BoxcutterStorageMigrator creates a ClusterExtensionRevision from the existing Helm release. However, the migrated revision was created without status conditions, causing a race condition where it wasn't recognized as "Installed". This fix sets an initial Succeeded status on migrated revisions, ensuring they're immediately recognized and allowing version upgrades to proceed correctly after OLM upgrades. Fixes test-upgrade-st2ex-e2e failures.
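As a rough illustration of the race described in this commit message, the sketch below models a revision that is only treated as installed once it carries a Succeeded condition. The `condition` and `revision` types, `isInstalled`, `markMigrated`, and the "Migrated" reason are all hypothetical simplifications, not the types or names used by the BoxcutterStorageMigrator.

```go
package main

import "fmt"

// condition is a hypothetical, pared-down status condition.
type condition struct {
	Type   string
	Status string
	Reason string
}

// revision is a hypothetical stand-in for a ClusterExtensionRevision.
type revision struct {
	Name       string
	Conditions []condition
}

// isInstalled reports whether the revision carries a true Succeeded condition.
func isInstalled(rev revision) bool {
	for _, c := range rev.Conditions {
		if c.Type == "Succeeded" && c.Status == "True" {
			return true
		}
	}
	return false
}

// markMigrated sets the initial Succeeded condition on a migrated revision so
// that it is recognized right away instead of waiting for a later reconcile.
func markMigrated(rev *revision) {
	if isInstalled(*rev) {
		return // already marked; keep the call idempotent
	}
	rev.Conditions = append(rev.Conditions, condition{
		Type:   "Succeeded",
		Status: "True",
		Reason: "Migrated", // assumed reason string, for illustration only
	})
}

func main() {
	rev := revision{Name: "my-ext-1"}
	fmt.Println("installed before migration status:", isInstalled(rev))
	markMigrated(&rev)
	fmt.Println("installed after migration status:", isInstalled(rev))
}
```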
Enables installed extensions to continue working when their source catalog becomes unavailable or is deleted. When resolution fails due to catalog unavailability, the operator now continues reconciling with the currently installed bundle instead of failing.
Changes:
- Resolution falls back to installed bundle when catalog unavailable
- Unpacking skipped when maintaining current installed state
- Helm and Boxcutter appliers handle nil contentFS gracefully
- Version upgrades properly blocked without catalog access
This ensures workloads remain stable and operational even when the catalog they were installed from is temporarily unavailable or deleted, while appropriately preventing version changes that require catalog access.
95f39fb to 8851183
To ensure that we do not break workloads when a catalog is removed.