WIP 🐛 (fix) Helm to Boxcutter migration during OLM upgrade #2440
When upgrading OLM from standard (Helm runtime) to experimental (Boxcutter runtime), the BoxcutterStorageMigrator creates a ClusterExtensionRevision from the existing Helm release. However, the migrated revision was created without status conditions, causing a race condition where it wasn't recognized as "Installed". This fix sets an initial Succeeded status on migrated revisions, ensuring they're immediately recognized and allowing version upgrades to proceed correctly after OLM upgrades. Fixes test-upgrade-st2ex-e2e failures.
Pull request overview
This pull request fixes a race condition that occurs during the upgrade from standard OLM (Helm runtime) to experimental OLM (Boxcutter runtime). The issue arose because migrated ClusterExtensionRevisions were created without a Succeeded=True status condition, causing them not to be recognized as "Installed" until the ClusterExtensionRevision controller reconciled them. This timing gap led to version resolution failures during OLM upgrades.
Changes:
- Added a new `ClusterExtensionRevisionReasonMigrated` constant for tracking migration status
- Set initial `Succeeded=True` status condition on migrated revisions immediately after creation
- Enhanced documentation explaining the race condition and its resolution
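A minimal sketch of the second change, for illustration only. It does not reproduce the code in `boxcutter.go`: `Condition` and `Revision` are simplified stand-ins for the real Kubernetes and OLM types, the helper names `setCondition` and `markMigrated` are hypothetical, and the string value `"Migrated"` is an assumption, since this page names the `ClusterExtensionRevisionReasonMigrated` constant but does not show its value.

```go
package main

import (
	"fmt"
	"time"
)

// Condition is a simplified stand-in for a Kubernetes status condition.
type Condition struct {
	Type               string
	Status             string // "True", "False", or "Unknown"
	Reason             string
	Message            string
	ObservedGeneration int64
	LastTransitionTime time.Time
}

// Assumed values: the PR names the reason constant, but its value is not shown here.
const (
	TypeSucceeded  = "Succeeded"
	ReasonMigrated = "Migrated"
)

// Revision is a hypothetical, trimmed-down ClusterExtensionRevision.
type Revision struct {
	Name       string
	Generation int64
	Conditions []Condition
}

// setCondition inserts or updates a condition by Type.
// LastTransitionTime only moves when Status actually changes.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type != c.Type {
			continue
		}
		if conds[i].Status != c.Status {
			conds[i].Status = c.Status
			conds[i].LastTransitionTime = time.Now()
		}
		conds[i].Reason = c.Reason
		conds[i].Message = c.Message
		conds[i].ObservedGeneration = c.ObservedGeneration
		return conds
	}
	c.LastTransitionTime = time.Now()
	return append(conds, c)
}

// markMigrated records Succeeded=True on a revision right after the
// migrator creates it, so it does not wait for a later reconcile.
func markMigrated(rev *Revision, helmRelease string) {
	rev.Conditions = setCondition(rev.Conditions, Condition{
		Type:               TypeSucceeded,
		Status:             "True",
		Reason:             ReasonMigrated,
		Message:            fmt.Sprintf("migrated from Helm release %q", helmRelease),
		ObservedGeneration: rev.Generation,
	})
}

func main() {
	rev := &Revision{Name: "postgres-1", Generation: 1}
	markMigrated(rev, "postgres")
	c := rev.Conditions[0]
	fmt.Println(c.Type, c.Status, c.Reason)
}
```

Calling `markMigrated` twice leaves a single `Succeeded` condition and keeps its original transition time, so repeating the migration step is harmless in this sketch.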
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| api/v1/clusterextensionrevision_types.go | Added new ClusterExtensionRevisionReasonMigrated constant for status condition reasons |
| internal/operator-controller/applier/boxcutter.go | Added status update logic to set Succeeded=True condition on migrated revisions with comprehensive documentation |
Codecov Report
❌ Patch coverage is
Additional details and impacted files: Coverage Diff

| Metric | main | #2440 | +/- |
|---|---|---|---|
| Coverage | 73.05% | 73.05% | |
| Files | 100 | 100 | |
| Lines | 7641 | 7650 | +9 |
| Hits | 5582 | 5589 | +7 |
| Misses | 1623 | 1624 | +1 |
| Partials | 436 | 437 | +1 |
Fixes test-upgrade-st2ex-e2e flake failures.
Encountered while starting to validate the resilience of a workload when the catalog is deleted.
Example: https://github.com/operator-framework/operator-controller/actions/runs/20890017311/job/60019736069
What is the problem
When we upgrade OLM itself from standard to experimental, our installed extensions get "stuck" and can't be upgraded anymore.
Real-World Scenario
What We're Doing
Day 1: You install OLM standard edition and install PostgreSQL operator v2.0.0
Day 2: You want to try the new Boxcutter runtime (experimental features)
Day 3: PostgreSQL v2.1.0 is released with bug fixes you need
What Was Happening (Before Fix)
The Migration Process
When OLM upgrades from Helm to Boxcutter:
4. Race condition - System checks what's installed before status is set
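The race in step 4 can be sketched as follows. This is an illustration under stated assumptions, not the operator-controller implementation: it assumes "installed" means "the newest revision carrying `Succeeded=True`", and the `Revision` type and the `installedVersion` and `describe` helpers are hypothetical. The version number is borrowed from the PostgreSQL scenario above.

```go
package main

import "fmt"

// Condition and Revision are simplified, hypothetical stand-ins
// for the real status condition and ClusterExtensionRevision types.
type Condition struct{ Type, Status string }

type Revision struct {
	Name       string
	Version    string
	Conditions []Condition
}

// isSucceeded reports whether the revision carries Succeeded=True.
func isSucceeded(r Revision) bool {
	for _, c := range r.Conditions {
		if c.Type == "Succeeded" && c.Status == "True" {
			return true
		}
	}
	return false
}

// installedVersion returns the version of the newest succeeded revision.
// Revisions are assumed to be ordered oldest to newest.
func installedVersion(revs []Revision) (string, bool) {
	for i := len(revs) - 1; i >= 0; i-- {
		if isSucceeded(revs[i]) {
			return revs[i].Version, true
		}
	}
	return "", false
}

// describe renders the result of the installed check as text.
func describe(revs []Revision) string {
	if v, ok := installedVersion(revs); ok {
		return fmt.Sprintf("installed: %s", v)
	}
	return "nothing installed"
}

func main() {
	// Before the fix: the migrated revision exists but has no conditions yet.
	migrated := Revision{Name: "postgres-1", Version: "2.0.0"}
	fmt.Println(describe([]Revision{migrated})) // prints "nothing installed"

	// After the fix: the condition is present as soon as the revision is created.
	migrated.Conditions = []Condition{{Type: "Succeeded", Status: "True"}}
	fmt.Println(describe([]Revision{migrated})) // prints "installed: 2.0.0"
}
```

Any check of this shape that runs in the window between revision creation and the first status update sees no installed version, which matches the symptom the PR describes.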
The Timing Issue
What's Fixed Now
After the Fix
When OLM upgrades from Helm to Boxcutter:
Comparison Table