NO-JIRA: fix: report all degraded controller conditions instead of first only#444 by RadekManak · Pull Request #444 · openshift/cluster-cloud-controller-manager-operator

RadekManak · 2026-04-10T14:08:04Z

Summary

checkControllerConditions() iterated over the ClusterOperator (CO) status conditions and returned early on the first degraded condition it found. When both CloudConfigController and TrustedCABundleController were degraded simultaneously, only one was reflected in the operator's Degraded status message: whichever appeared first in the slice.
This caused a flake in the e2e test for OCP-70566 (openshift/openshift-tests-private#29732), where the test expected TrustedCABundleControllerControllerDegraded in the message but got CloudConfigControllerDegraded instead.

Fix

Collect all degraded conditions in a single pass and return a combined error message, so all failing controllers are always reported.

Test plan

Re-run [sig-cluster-lifecycle] Cluster_Infrastructure CCM Medium-70566-Garbage in cloud-controller-manager status. The Degraded message will now contain both conditions when both controllers fail, making the test deterministic.

coderabbitai · 2026-04-10T14:08:21Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

📥 Commits

Reviewing files that changed from the base of the PR and between 07c949d and 099bb22.

📒 Files selected for processing (1)

pkg/controllers/clusteroperator_controller.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/controllers/clusteroperator_controller.go

Walkthrough

Modified checkControllerConditions in the cluster operator controller to collect degraded condition messages for both cloudConfigControllerDegradedCondition and trustedCABundleControllerDegradedCondition, sort them deterministically, then return a single aggregated error instead of returning on the first degraded condition. Added sort and strings imports.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Cluster Operator Controller `pkg/controllers/clusteroperator_controller.go` | Changed `checkControllerConditions` to accumulate matching degraded condition messages into a slice, sort the messages, and return one aggregated error (joined with `"; "`). Updated imports to include `sort` and `strings`. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

coderabbitai

🧹 Nitpick comments (1)

pkg/controllers/clusteroperator_controller.go (1)

324-326: Consider sorting degraded messages before join to make error text fully deterministic.

The current output order depends on co.Status.Conditions iteration order. Sorting before strings.Join makes exact message text stable across reconciles/tests.

Suggested diff

 import (
 	"context"
 	"fmt"
+	"sort"
 	"strings"
@@
 	if len(degradedMessages) > 0 {
+		sort.Strings(degradedMessages)
 		return false, fmt.Errorf("failed to apply resources because %s", strings.Join(degradedMessages, "; "))
 	}
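
The effect of sorting before the join can be checked in isolation. The sketch below is a standalone illustration under assumptions: `joinSorted` is a hypothetical helper, and the message strings are placeholders rather than the operator's real condition messages. It sorts a copy of the slice so the caller's data is left untouched, which is a design choice of this example and not something the suggested diff does.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// joinSorted returns the same string for any ordering of msgs.
// It sorts a copy so the caller's slice is not reordered.
func joinSorted(msgs []string) string {
	sorted := append([]string(nil), msgs...)
	sort.Strings(sorted)
	return strings.Join(sorted, "; ")
}

func main() {
	// Placeholder messages; the two orders simulate different
	// iteration orders over the status conditions.
	a := []string{"TrustedCABundleControllerControllerDegraded", "CloudConfigControllerDegraded"}
	b := []string{"CloudConfigControllerDegraded", "TrustedCABundleControllerControllerDegraded"}

	fmt.Println(joinSorted(a) == joinSorted(b)) // prints "true"
}
```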

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/controllers/clusteroperator_controller.go` around lines 324 - 326, The
error string built from degradedMessages is non-deterministic because it
reflects iteration order; before calling strings.Join(degradedMessages, "; "),
sort the slice (e.g., use sort.Strings(degradedMessages)) so the output is
stable across runs, and add the necessary import for "sort"; update the block
that returns fmt.Errorf("failed to apply resources because %s",
strings.Join(degradedMessages, "; ")) to sort degradedMessages first.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between b4e6117 and 07c949d.

📒 Files selected for processing (1)

pkg/controllers/clusteroperator_controller.go

checkControllerConditions() was returning early on the first degraded condition it encountered, so when multiple controllers were degraded simultaneously only one would be reflected in the operator's Degraded status message. Collect all degraded conditions and report them together.

nrb · 2026-04-10T15:06:12Z

I should probably integrate this into #436.

RadekManak · 2026-04-10T15:07:28Z

Feel free to do so.

openshift-ci · 2026-04-10T17:27:57Z

@RadekManak: all tests passed!

nrb · 2026-04-10T17:44:41Z

/approve
/lgtm
/verified by CI

openshift-ci-robot · 2026-04-10T17:44:52Z

@nrb: This PR has been marked as verified by CI.

nrb · 2026-04-10T17:45:03Z

/retitle NO-JIRA: fix: report all degraded controller conditions instead of first only#444

openshift-ci · 2026-04-10T17:45:05Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nrb

openshift-ci-robot · 2026-04-10T17:45:10Z

@RadekManak: This pull request explicitly references no jira issue.

openshift-ci bot requested review from nrb and racheljpg April 10, 2026 14:08

coderabbitai bot reviewed Apr 10, 2026

RadekManak force-pushed the feature-ccm-flake-fix branch from 07c949d to 099bb22 Compare April 10, 2026 14:13

RadekManak mentioned this pull request Apr 10, 2026

NO-JIRA: Merge https://github.com/kubernetes/cloud-provider-vsphere:release-1.35 (93af0f7) into main openshift/cloud-provider-vsphere#113

Merged

openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Apr 10, 2026

openshift-ci bot assigned nrb Apr 10, 2026

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 10, 2026

openshift-ci bot changed the title ~~fix: report all degraded controller conditions instead of first only~~ NO-JIRA: fix: report all degraded controller conditions instead of first only#444 Apr 10, 2026

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 10, 2026

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 10, 2026

openshift-merge-bot bot merged commit 1eccfe3 into openshift:main Apr 10, 2026
13 checks passed
