HIVE-3002: IBMCloud MachinePools: Snowflake FailureDomain matching #2825

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:master from 2uasimojo:HIVE-3002/ibmcloud-fd-matching
Jan 27, 2026

Conversation

@2uasimojo
Member

We've been using cluster-control-plane-machine-set-operator (CPMS) utilities to extract failure domains from MachineSets so we can compare them when matching generated msets with remote ones. CPMS doesn't support IBMCloud, so we were silently getting back a generic (empty) failure domain for every IBMCloud mset. That produced false positive matches and bad controller behavior (namely deleting all but one of the remote MachineSets).

Here we add local code to interpret the Zone from the providerSpec as the failure domain and match on it.
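
For readers who want the idea in code, here is a minimal sketch, not the code from this PR. It assumes the providerSpec arrives as raw JSON with a `zone` key; the type `ibmCloudProviderSpec` and the functions `zoneFromProviderSpec` and `sameFailureDomain` are hypothetical names invented for illustration. It also assumes that an empty zone should be treated as an error, so that two "empty" failure domains can never compare equal by accident.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ibmCloudProviderSpec is a hypothetical, trimmed-down stand-in for the
// IBMCloud machine providerSpec. Only the zone field is modeled here.
type ibmCloudProviderSpec struct {
	Zone string `json:"zone"`
}

// zoneFromProviderSpec decodes raw providerSpec JSON and returns its zone.
// An empty zone is reported as an error rather than returned, so callers
// cannot mistake "no failure domain" for a real one.
func zoneFromProviderSpec(raw []byte) (string, error) {
	var spec ibmCloudProviderSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		return "", fmt.Errorf("decoding providerSpec: %w", err)
	}
	if spec.Zone == "" {
		return "", errors.New("providerSpec has no zone")
	}
	return spec.Zone, nil
}

// sameFailureDomain reports whether a generated and a remote providerSpec
// name the same zone.
func sameFailureDomain(generated, remote []byte) (bool, error) {
	g, err := zoneFromProviderSpec(generated)
	if err != nil {
		return false, err
	}
	r, err := zoneFromProviderSpec(remote)
	if err != nil {
		return false, err
	}
	return g == r, nil
}

func main() {
	generated := []byte(`{"zone":"us-south-1"}`)
	remotes := [][]byte{
		[]byte(`{"zone":"us-south-1"}`),
		[]byte(`{"zone":"us-south-2"}`),
	}
	for _, remote := range remotes {
		match, err := sameFailureDomain(generated, remote)
		fmt.Println(string(remote), match, err)
	}
}
```

With a generic (empty) failure domain on both sides, every remote MachineSet would look like a match for the same generated one, which is the false positive described above. Comparing on the zone makes each MachineSet match only its own counterpart.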

@openshift-ci-robot added the jira/valid-reference label Jan 6, 2026
@openshift-ci-robot

openshift-ci-robot commented Jan 6, 2026

@2uasimojo: This pull request references HIVE-3002 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@2uasimojo
Member Author

/hold for QE

@openshift-ci Bot added the do-not-merge/hold label Jan 6, 2026
@openshift-ci Bot requested review from dlom and suhanime January 6, 2026 23:24
@openshift-ci Bot added the approved label Jan 6, 2026
@codecov

codecov Bot commented Jan 7, 2026

Codecov Report

❌ Patch coverage is 46.15385% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.70%. Comparing base (26db049) to head (3f207cb).
⚠️ Report is 18 commits behind head on master.

Files with missing lines | Patch % | Lines
...g/controller/machinepool/machinepool_controller.go | 46.15% | 3 Missing and 4 partials ⚠️

@@            Coverage Diff             @@
##           master    #2825      +/-   ##
==========================================
+ Coverage   50.40%   50.70%   +0.30%     
==========================================
  Files         279      279              
  Lines       34194    34534     +340     
==========================================
+ Hits        17236    17511     +275     
- Misses      15597    15642      +45     
- Partials     1361     1381      +20     
Files with missing lines | Coverage Δ
...g/controller/machinepool/machinepool_controller.go | 67.00% <46.15%> (+3.77%) ⬆️

... and 1 file with indirect coverage changes

@huangmingxia
Contributor

/test konflux

@openshift-ci
Contributor

openshift-ci Bot commented Jan 14, 2026

@huangmingxia: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test coverage
/test e2e
/test e2e-azure
/test e2e-gcp
/test e2e-openstack
/test e2e-pool
/test e2e-vsphere
/test images
/test periodic-images
/test security
/test unit
/test verify

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-hive-master-coverage
pull-ci-openshift-hive-master-e2e
pull-ci-openshift-hive-master-e2e-pool
pull-ci-openshift-hive-master-images
pull-ci-openshift-hive-master-periodic-images
pull-ci-openshift-hive-master-security
pull-ci-openshift-hive-master-unit
pull-ci-openshift-hive-master-verify

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@huangmingxia
Contributor

/retest

@red-hat-konflux
Contributor

Caution

There are some errors in your PipelineRun template.

PipelineRun Error
hive-mce-210-on-pull-request CEL expression evaluation error: expression "event == \"pull_request\"\n&& !body.pull_request.draft\n&& target_branch == \"master\"\n&& !files.all.all(x, x.matches('^docs/|\\\\.md$|^(?:.*/)?(?:\\\\.gitignore|OWNERS|PROJECT|LICENSE)$'))\n" failed to evaluate: no such key: pull_request
hive-mce-211-on-pull-request CEL expression evaluation error: expression "event == \"pull_request\"\n&& !body.pull_request.draft\n&& target_branch == \"master\"\n&& !files.all.all(x, x.matches('^docs/|\\\\.md$|^(?:.*/)?(?:\\\\.gitignore|OWNERS|PROJECT|LICENSE)$'))\n" failed to evaluate: no such key: pull_request
hive-mce-26-on-pull-request CEL expression evaluation error: expression "event == \"pull_request\"\n&& !body.pull_request.draft\n&& target_branch == \"master\"\n&& !files.all.all(x, x.matches('^docs/|\\\\.md$|^(?:.*/)?(?:\\\\.gitignore|OWNERS|PROJECT|LICENSE)$'))\n" failed to evaluate: no such key: pull_request
hive-mce-27-on-pull-request CEL expression evaluation error: expression "event == \"pull_request\"\n&& !body.pull_request.draft\n&& target_branch == \"master\"\n&& !files.all.all(x, x.matches('^docs/|\\\\.md$|^(?:.*/)?(?:\\\\.gitignore|OWNERS|PROJECT|LICENSE)$'))\n" failed to evaluate: no such key: pull_request
hive-mce-28-on-pull-request CEL expression evaluation error: expression "event == \"pull_request\"\n&& !body.pull_request.draft\n&& target_branch == \"master\"\n&& !files.all.all(x, x.matches('^docs/|\\\\.md$|^(?:.*/)?(?:\\\\.gitignore|OWNERS|PROJECT|LICENSE)$'))\n" failed to evaluate: no such key: pull_request
hive-mce-29-on-pull-request CEL expression evaluation error: expression "event == \"pull_request\"\n&& !body.pull_request.draft\n&& target_branch == \"master\"\n&& !files.all.all(x, x.matches('^docs/|\\\\.md$|^(?:.*/)?(?:\\\\.gitignore|OWNERS|PROJECT|LICENSE)$'))\n" failed to evaluate: no such key: pull_request
hive-on-pull-request CEL expression evaluation error: expression "event == \"pull_request\"\n&& !body.pull_request.draft\n&& target_branch == \"master\"\n&& !files.all.all(x, x.matches('^docs/|\\\\.md$|^(?:.*/)?(?:\\\\.gitignore|OWNERS|PROJECT|LICENSE)$'))\n" failed to evaluate: no such key: pull_request

Comment thread pkg/controller/machinepool/machinepool_controller.go Outdated
@2uasimojo force-pushed the HIVE-3002/ibmcloud-fd-matching branch from 86596dd to 3f207cb January 19, 2026 16:58
@2uasimojo
Member Author

/test security

@2uasimojo
Member Author

/retest hive-mce-28

@openshift openshift deleted a comment from openshift-ci Bot Jan 26, 2026
@2uasimojo 2uasimojo closed this Jan 26, 2026
@2uasimojo 2uasimojo reopened this Jan 26, 2026
@openshift-ci-robot

openshift-ci-robot commented Jan 26, 2026

@2uasimojo: This pull request references HIVE-3002 which is a valid jira issue.

@2uasimojo
Member Author

/test e2e

@2uasimojo
Member Author

/test hive-mce-26-on-pull-request

@openshift openshift deleted a comment from openshift-ci Bot Jan 26, 2026
@2uasimojo
Member Author

/test verify
/override "Red Hat Konflux"
/assign @dlom
/hold cancel

@openshift-ci Bot removed the do-not-merge/hold label Jan 27, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Jan 27, 2026

@2uasimojo: Overrode contexts on behalf of 2uasimojo: Red Hat Konflux

@dlom
Contributor

dlom commented Jan 27, 2026

/lgtm

@openshift-ci Bot added the lgtm label Jan 27, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Jan 27, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, dlom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@2uasimojo
Member Author

/override "Red Hat Konflux"

?

@openshift openshift deleted a comment from openshift-ci Bot Jan 27, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Jan 27, 2026

@2uasimojo: all tests passed!

@openshift-merge-bot openshift-merge-bot Bot merged commit e33d703 into openshift:master Jan 27, 2026
23 of 24 checks passed
@2uasimojo 2uasimojo deleted the HIVE-3002/ibmcloud-fd-matching branch January 27, 2026 21:50

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.

4 participants