
(bug) Fix HelmReleaseSummary.FailureMessage never persisted#1693

Merged
gianlucam76 merged 1 commit into projectsveltos:main from gianlucam76:bug-helm-error
Apr 5, 2026
Conversation

@gianlucam76
Member

@gianlucam76 gianlucam76 commented Apr 5, 2026

When a Helm chart deployment fails, setHelmFailureMessageOnHelmChartSummary correctly sets FailureMessage on clusterSummary.Status.HelmReleaseSummaries[i] in memory, but this value was never written back to the API server.

The root cause is a sequencing issue: updateStatusForReferencedHelmReleases (which persists HelmReleaseSummaries) runs before walkChartsAndDeploy. After walkChartsAndDeploy fails and sets the FailureMessage in memory, updateStatusForNonReferencedHelmReleases re-fetches a fresh ClusterSummary from the API and overwrites the status, discarding the in-memory change entirely.

As a result, HelmReleaseSummaries[].FailureMessage always remained empty even after repeated failures, while featureSummaries[].failureMessage (set through a separate path) correctly reflected the error.

A secondary bug: setHelmFailureMessageOnHelmChartSummary was passed currentChart (the raw, pre-template spec) instead of instantiatedChart. The lookup compares ReleaseName/ReleaseNamespace against values stored in HelmReleaseSummaries, which are the instantiated values. When those fields contain Go templates, the lookup would silently find no match and the FailureMessage would not be set at all.
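The lookup mismatch can be illustrated with a minimal sketch. It is not the Sveltos code: `helmChart`, `releaseSummary`, `instantiate`, and `setFailureMessage` are hypothetical, simplified stand-ins, and the template data (`ClusterName`) is an assumption chosen only for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// helmChart and releaseSummary are simplified, hypothetical stand-ins for the
// Sveltos HelmChart spec and HelmReleaseSummary status entries.
type helmChart struct {
	ReleaseName      string
	ReleaseNamespace string
}

type releaseSummary struct {
	ReleaseName      string
	ReleaseNamespace string
	FailureMessage   string
}

// instantiate renders a Go template string against the given data.
func instantiate(text string, data any) (string, error) {
	tmpl, err := template.New("field").Parse(text)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// setFailureMessage records msg on the summary whose name and namespace match
// the chart exactly. It reports whether a matching entry was found.
func setFailureMessage(summaries []releaseSummary, chart helmChart, msg string) bool {
	for i := range summaries {
		if summaries[i].ReleaseName == chart.ReleaseName &&
			summaries[i].ReleaseNamespace == chart.ReleaseNamespace {
			summaries[i].FailureMessage = msg
			return true
		}
	}
	return false
}

func main() {
	data := map[string]string{"ClusterName": "prod"}
	raw := helmChart{ReleaseName: "nginx-{{ .ClusterName }}", ReleaseNamespace: "nginx"}

	name, err := instantiate(raw.ReleaseName, data)
	if err != nil {
		panic(err)
	}
	instantiated := helmChart{ReleaseName: name, ReleaseNamespace: raw.ReleaseNamespace}

	// The status stores the instantiated values, not the raw templates.
	summaries := []releaseSummary{{ReleaseName: "nginx-prod", ReleaseNamespace: "nginx"}}

	// The raw chart still contains the template text, so nothing matches.
	fmt.Println(setFailureMessage(summaries, raw, "context deadline exceeded"))
	// The instantiated chart matches the stored entry.
	fmt.Println(setFailureMessage(summaries, instantiated, "context deadline exceeded"))
}
```

In this sketch the first call finds no entry and returns false without any error, which mirrors why the mismatch went unnoticed.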

The fix consists of:

  1. Pass instantiatedChart (post-template) instead of currentChart to setHelmFailureMessageOnHelmChartSummary so the lookup always matches what is stored in HelmReleaseSummaries.
  2. In updateStatusForNonReferencedHelmReleases, before overwriting the status with the freshly fetched ClusterSummary, build an index of the in-memory FailureMessage values set by walkChartsAndDeploy and merge them into the entries being written. This ensures the failure from the current reconciliation round is persisted to the API.
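The merge in step 2 could look roughly like the sketch below. This is an illustration under assumptions, not the code from the commit: `releaseSummary`, `releaseKey`, and `mergeFailureMessages` are hypothetical names, and the real types carry more fields.

```go
package main

import "fmt"

// releaseSummary is a simplified, hypothetical stand-in for a
// HelmReleaseSummary status entry.
type releaseSummary struct {
	ReleaseName      string
	ReleaseNamespace string
	FailureMessage   string
}

// releaseKey identifies a release by namespace and name.
type releaseKey struct {
	Namespace string
	Name      string
}

// mergeFailureMessages copies non-empty FailureMessage values from the
// in-memory summaries into the matching entries of the freshly fetched
// summaries and returns the merged slice. Neither input is modified.
func mergeFailureMessages(inMemory, fetched []releaseSummary) []releaseSummary {
	// Index the failures recorded during the current reconciliation round.
	failures := make(map[releaseKey]string, len(inMemory))
	for _, s := range inMemory {
		if s.FailureMessage != "" {
			failures[releaseKey{s.ReleaseNamespace, s.ReleaseName}] = s.FailureMessage
		}
	}

	// Apply them to a copy of the entries that are about to be written.
	merged := make([]releaseSummary, len(fetched))
	copy(merged, fetched)
	for i := range merged {
		key := releaseKey{merged[i].ReleaseNamespace, merged[i].ReleaseName}
		if msg, ok := failures[key]; ok {
			merged[i].FailureMessage = msg
		}
	}
	return merged
}

func main() {
	// In-memory state after a failed deployment of the nginx release.
	inMemory := []releaseSummary{
		{ReleaseName: "nginx", ReleaseNamespace: "nginx", FailureMessage: "context deadline exceeded"},
		{ReleaseName: "postgres-operator", ReleaseNamespace: "postgres-operator"},
	}
	// Freshly fetched state, which does not yet know about the failure.
	fetched := []releaseSummary{
		{ReleaseName: "nginx", ReleaseNamespace: "nginx"},
		{ReleaseName: "postgres-operator", ReleaseNamespace: "postgres-operator"},
	}

	for _, s := range mergeFailureMessages(inMemory, fetched) {
		fmt.Printf("%s/%s: %q\n", s.ReleaseNamespace, s.ReleaseName, s.FailureMessage)
	}
}
```

Keying the index on namespace and name together avoids mixing up releases that share a name across namespaces. Entries without a recorded failure keep whatever the fetched object already contained.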

Fixes #1692

@gianlucam76
Member Author

  status:
    dependencies: no dependencies
    featureSummaries:
    - consecutiveFailures: 1
      failureMessage: |
        chart: ingress-nginx, release: nginx, context deadline exceeded
      featureID: Helm
      hash: Ckr8tUz78ZCxNGIA/lR5QmvKVBZpFzuNCcjlUq/mE7k=
      lastAppliedTime: "2026-04-05T15:01:36Z"
      status: Failed
    helmReleaseSummaries:
    - failureMessage: context deadline exceeded
      releaseName: nginx
      releaseNamespace: nginx
      status: Managing
    - releaseName: postgres-operator
      releaseNamespace: postgres-operator
      status: Managing
      valuesHash: yj0WO6sFU4GCciYUBWjzvvfqrBh869doeOC2Pp5EI1Y=
    nextReconcileTime: "2026-04-05T15:01:46Z"

@gianlucam76 gianlucam76 changed the title (bug) Fix HelmReleaseSummary.FailureMessage never persisted after cha… (bug) Fix HelmReleaseSummary.FailureMessage never persisted Apr 5, 2026
@gianlucam76
Member Author

A test has been added to functional verification.

@gianlucam76 gianlucam76 merged commit 527c226 into projectsveltos:main Apr 5, 2026
23 of 24 checks passed
@gianlucam76 gianlucam76 deleted the bug-helm-error branch April 5, 2026 16:15
Development

Successfully merging this pull request may close these issues.

BUG: Failure message not showing up in helmReleaseSummaries

1 participant