Conversation

@sadasu
Contributor

@sadasu sadasu commented Dec 1, 2025

Detect when the infrastructure manifest could not be updated within the Bootstrap ignition and report that.

@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Dec 1, 2025
@openshift-ci-robot
Contributor

@sadasu: This pull request references Jira Issue OCPBUGS-65566, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Detect when the infrastructure manifest could not be updated within the Bootstrap ignition and report that.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from rna-afk and rwsu December 1, 2025 22:19
@sadasu
Contributor Author

sadasu commented Dec 1, 2025

/test e2e-gcp-custom-dns

@sadasu
Contributor Author

sadasu commented Dec 1, 2025

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Dec 1, 2025
@openshift-ci-robot
Contributor

@sadasu: This pull request references Jira Issue OCPBUGS-65566, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @jinyunma

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from jinyunma December 1, 2025 22:22
@rna-afk
Contributor

rna-afk commented Dec 1, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 1, 2025
@sadasu
Contributor Author

sadasu commented Dec 2, 2025

/retest-required

@sadasu
Contributor Author

sadasu commented Dec 2, 2025

/test e2e-gcp-custom-dns

Comment on lines +177 to 184
// Reset error state
updateError = nil

break
Contributor

It seems more intuitive to me that we would return nil here and return the error at the end of the function.
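
The control flow suggested in this review comment could look roughly like the sketch below. It is only an illustration, not the installer's code: `ignFile`, `updateInfraManifest`, and `errManifestNotFound` are hypothetical names, and the manifest path is the one quoted in the error message elsewhere in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// ignFile is a hypothetical stand-in for one file entry in the bootstrap ignition.
type ignFile struct {
	Path     string
	Contents string
}

// infraManifestPath is the path that appears in the error message in this thread.
const infraManifestPath = "/opt/openshift/manifests/cluster-infrastructure-02-config.yml"

// errManifestNotFound is a hypothetical sentinel error for the not-found case.
var errManifestNotFound = errors.New("unable to find infrastructure manifest within bootstrap ignition to update")

// updateInfraManifest returns nil as soon as the manifest has been found and
// edited. The not-found error is returned only after the loop has finished,
// so no error variable has to be reset before a break.
func updateInfraManifest(files []ignFile, edit func(string) (string, error)) error {
	for i := range files {
		if files[i].Path != infraManifestPath {
			continue
		}
		updated, err := edit(files[i].Contents)
		if err != nil {
			return fmt.Errorf("failed to update %s: %w", infraManifestPath, err)
		}
		files[i].Contents = updated
		return nil
	}
	return errManifestNotFound
}

func main() {
	files := []ignFile{{Path: infraManifestPath, Contents: "spec: {}"}}
	addLB := func(s string) (string, error) { return s + " # load balancer IPs added", nil }

	fmt.Println(updateInfraManifest(files, addLB)) // manifest present: prints <nil>
	fmt.Println(updateInfraManifest(nil, addLB))   // manifest missing: prints the not-found error
}
```

With this shape the success path and the failure path each have exactly one exit point, which is the readability gain the comment is asking for.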

@patrickdillon
Contributor

/approve

In the bug, this was listed as one of the options for an acceptable fix, but it's unclear to me with this path how we will resolve the CI failures. With this code the job would start to throw an error. Are we planning on disabling promtail in this job?

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 2, 2025
@sadasu
Contributor Author

sadasu commented Dec 3, 2025

/approve

In the bug, this was listed as one of the options for an acceptable fix, but it's unclear to me with this path how we will resolve the CI failures. With this code the job would start to throw an error. Are we planning on disabling promtail in this job?

@patrickdillon, I am not sure why/how enabling promtail would cause us not to be able to parse the bootstrap ignition and update the infrastructure manifest. That is essentially what is happening here.

@jinyunma
Contributor

jinyunma commented Dec 3, 2025

With this code the job would start to throw an error. Are we planning on disabling promtail in this job?

Yes, promtail is already disabled in the custom-dns job via a release PR.

I am not sure why/how enabling promtail would cause us not to be able to parse the bootstrap ignition and update the infrastructure manifest.

What I observed is that the data in the bootstrap ignition file is base64 encoded and gzip compressed when promtail is enabled. In the normal bootstrap ignition file, it is plain text.
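
For readers unfamiliar with that encoding, here is a minimal sketch of how a base64-encoded, optionally gzip-compressed payload can be turned back into plain text using only the Go standard library. `decodePayload` and `encodePayload` are hypothetical helper names, and the sample JSON is a placeholder; nothing here is taken from the installer.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
	"io"
)

// decodePayload base64-decodes s. If the decoded bytes start with the gzip
// magic number (0x1f 0x8b), it also decompresses them.
func decodePayload(s string) ([]byte, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, fmt.Errorf("base64 decode: %w", err)
	}
	if len(raw) < 2 || raw[0] != 0x1f || raw[1] != 0x8b {
		return raw, nil // not gzip: return the decoded bytes unchanged
	}
	zr, err := gzip.NewReader(bytes.NewReader(raw))
	if err != nil {
		return nil, fmt.Errorf("gzip header: %w", err)
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	if err != nil {
		return nil, fmt.Errorf("gzip read: %w", err)
	}
	return out, nil
}

// encodePayload gzip-compresses and base64-encodes data. It exists only so
// the example can build its own input.
func encodePayload(data []byte) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

func main() {
	enc, err := encodePayload([]byte(`{"example":"placeholder manifest"}`))
	if err != nil {
		panic(err)
	}
	dec, err := decodePayload(enc)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(dec))
}
```

Checking the magic bytes lets one helper handle both the compressed and the uncompressed case, which matters if both kinds of bootstrap ignition need to be supported.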

Pre-merge tested: the job with promtail enabled failed with this error:

level=error msg=failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed preparing ignition data: failed to edit bootstrap, master or worker ignition: failed to add load balancers to ignition config: unable to find infrastructure manifest /opt/openshift/manifests/cluster-infrastructure-02-config.yml within bootstrap ignition to update 

I guess the fix should be acceptable. @sadasu @patrickdillon wdyt?

@jinyunma
Contributor

jinyunma commented Dec 3, 2025

/test e2e-azure-custom-dns

@openshift-ci
Contributor

openshift-ci bot commented Dec 3, 2025

@jinyunma: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test artifacts-images
/test e2e-agent-compact-ipv4
/test e2e-aws-ovn
/test e2e-aws-ovn-edge-zones-manifest-validation
/test e2e-aws-ovn-upi
/test e2e-azure-nat-gateway-single-zone
/test e2e-azure-ovn
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upi
/test e2e-metal-ipi-ovn-ipv6
/test e2e-openstack-ovn
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi
/test gofmt
/test golint
/test govet
/test images
/test integration-tests
/test integration-tests-nodejoiner
/test okd-scos-images
/test openstack-manifests
/test shellcheck
/test unit
/test verify-codegen
/test verify-deps
/test verify-vendor
/test yaml-lint

The following commands are available to trigger optional jobs:

/test aws-private
/test azure-ovn-marketplace-images
/test azure-private
/test e2e-agent-4control-ipv4
/test e2e-agent-5control-ipv4
/test e2e-agent-compact-ipv4-appliance-diskimage
/test e2e-agent-compact-ipv4-iso-no-registry
/test e2e-agent-compact-ipv4-none-platform
/test e2e-agent-compact-ipv6-minimaliso
/test e2e-agent-ha-dualstack
/test e2e-agent-sno-ipv4-pxe
/test e2e-agent-sno-ipv6
/test e2e-agent-two-node-fencing-ipv4
/test e2e-aws-byo-subnet-role-security-groups
/test e2e-aws-custom-dns-techpreview
/test e2e-aws-default-config
/test e2e-aws-overlay-mtu-ovn-1200
/test e2e-aws-ovn-custom-iam-profile
/test e2e-aws-ovn-edge-zones
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-heterogeneous
/test e2e-aws-ovn-imdsv2
/test e2e-aws-ovn-proxy
/test e2e-aws-ovn-public-ipv4-pool
/test e2e-aws-ovn-public-ipv4-pool-disabled
/test e2e-aws-ovn-public-subnets
/test e2e-aws-ovn-shared-vpc-custom-security-groups
/test e2e-aws-ovn-shared-vpc-edge-zones
/test e2e-aws-ovn-single-node
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-upgrade
/test e2e-aws-upi-proxy
/test e2e-azure-custom-dns-techpreview
/test e2e-azure-default-config
/test e2e-azure-ovn-multidisk-techpreview
/test e2e-azure-ovn-resourcegroup
/test e2e-azure-ovn-shared-vpc
/test e2e-azure-ovn-techpreview
/test e2e-azure-ovn-upi
/test e2e-azurestack
/test e2e-azurestack-upi
/test e2e-crc
/test e2e-external-aws
/test e2e-external-aws-ccm
/test e2e-gcp-custom-dns
/test e2e-gcp-custom-endpoints
/test e2e-gcp-default-config
/test e2e-gcp-ovn-byo-vpc
/test e2e-gcp-ovn-heterogeneous
/test e2e-gcp-ovn-techpreview
/test e2e-gcp-ovn-xpn
/test e2e-gcp-secureboot
/test e2e-gcp-upgrade
/test e2e-gcp-upi-xpn
/test e2e-gcp-xpn-dedicated-dns-project
/test e2e-ibmcloud-ovn
/test e2e-metal-assisted
/test e2e-metal-ipi-ovn
/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-swapped-hosts
/test e2e-metal-ipi-ovn-virtualmedia
/test e2e-metal-ovn-two-node-arbiter
/test e2e-metal-ovn-two-node-fencing
/test e2e-metal-single-node-live-iso
/test e2e-nutanix-ovn
/test e2e-openstack-ccpmso
/test e2e-openstack-ccpmso-zone
/test e2e-openstack-dualstack
/test e2e-openstack-dualstack-upi
/test e2e-openstack-externallb
/test e2e-openstack-nfv-intel
/test e2e-openstack-proxy
/test e2e-openstack-singlestackv6
/test e2e-powervs-capi-ovn
/test e2e-vsphere-externallb-ovn
/test e2e-vsphere-host-groups-ovn-techpreview
/test e2e-vsphere-multi-vcenter-ovn
/test e2e-vsphere-ovn-disk-setup-techpreview
/test e2e-vsphere-ovn-hybrid-env
/test e2e-vsphere-ovn-multi-disk
/test e2e-vsphere-ovn-multi-network
/test e2e-vsphere-ovn-techpreview
/test e2e-vsphere-ovn-upi-zones
/test e2e-vsphere-ovn-zones
/test e2e-vsphere-ovn-zones-techpreview
/test e2e-vsphere-static-ovn
/test gcp-custom-endpoints-proxy-wif
/test gcp-private
/test okd-scos-e2e-aws-ovn
/test okd-scos-e2e-vsphere-ovn

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-installer-main-artifacts-images
pull-ci-openshift-installer-main-e2e-aws-ovn
pull-ci-openshift-installer-main-gofmt
pull-ci-openshift-installer-main-golint
pull-ci-openshift-installer-main-govet
pull-ci-openshift-installer-main-images
pull-ci-openshift-installer-main-okd-scos-e2e-vsphere-ovn
pull-ci-openshift-installer-main-okd-scos-images
pull-ci-openshift-installer-main-shellcheck
pull-ci-openshift-installer-main-unit
pull-ci-openshift-installer-main-verify-codegen
pull-ci-openshift-installer-main-verify-deps
pull-ci-openshift-installer-main-verify-vendor
pull-ci-openshift-installer-main-yaml-lint
Details

In response to this:

/test e2e-azure-custom-dns

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jinyunma
Contributor

jinyunma commented Dec 3, 2025

/test e2e-azure-custom-dns-techpreview

@sadasu sadasu force-pushed the add-custom-dns-error branch from 9686416 to 7571f2e Compare December 4, 2025 17:01
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2025
@openshift-ci
Contributor

openshift-ci bot commented Dec 4, 2025

New changes are detected. LGTM label has been removed.

@sadasu sadasu force-pushed the add-custom-dns-error branch from 7571f2e to b6dfa3b Compare December 4, 2025 17:17
Detect when the infrastructure manifest could not be updated
within the Bootstrap ignition and report that.
When userProvisionedDNS is enabled, we edit the bootstrap Ignition
after it is created to insert Load Balancer IP information. It is
possible for a user to start their cluster install with their own
bootstrap ignition that could be base64 encoded. So, decode the
bootstrap ignition before updating it.
@sadasu sadasu force-pushed the add-custom-dns-error branch from b6dfa3b to 92fff40 Compare December 4, 2025 17:30
@openshift-ci
Contributor

openshift-ci bot commented Dec 4, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: patrickdillon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sadasu
Contributor Author

sadasu commented Dec 4, 2025

/test e2e-gcp-custom-dns

@sadasu
Contributor Author

sadasu commented Dec 4, 2025

@patrickdillon, the bootstrap Ignition file being modified with clusterapi code is obtained from the bootstrap Ign asset. If the user provided their own updated bootstrap ignition file as would be the case after the promtail injection step in CI, would it present itself as the bootstrap ign asset?

@jinyunma thanks for trying the version of the fix that just reports an error when it can't read the Infrastructure manifest in the bootstrap ignition. If regular installs work with promtail enabled, then maybe we should support it with custom-dns too.

My 2nd commit just tries to base64 decode the entire ignition file before JSON unmarshaling it. Based on this comment it appears that the bootstrap ign is also gzipped. Could you please confirm? (If so, my 2nd commit is insufficient.)
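
To make the concern concrete, here is a minimal sketch of the "base64 decode the whole file, then JSON unmarshal" idea. It is not the code from the 2nd commit: `parseIgnition` is a hypothetical helper, the fallback order is an assumption, and the sample JSON is a placeholder. If the decoded bytes turn out to be compressed, the second unmarshal fails, which is the insufficiency raised above.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// parseIgnition first tries to parse data as plain JSON. If that fails, it
// base64-decodes the whole input and tries again. It does not decompress
// anything, so a gzip-compressed payload is reported as an error.
func parseIgnition(data []byte) (map[string]interface{}, error) {
	var cfg map[string]interface{}
	if err := json.Unmarshal(data, &cfg); err == nil {
		return cfg, nil
	}
	decoded, err := base64.StdEncoding.DecodeString(strings.TrimSpace(string(data)))
	if err != nil {
		return nil, fmt.Errorf("input is neither JSON nor base64: %w", err)
	}
	var decodedCfg map[string]interface{}
	if err := json.Unmarshal(decoded, &decodedCfg); err != nil {
		return nil, fmt.Errorf("base64 payload is not JSON (it may be compressed): %w", err)
	}
	return decodedCfg, nil
}

func main() {
	plain := []byte(`{"example":"placeholder"}`)
	encoded := []byte(base64.StdEncoding.EncodeToString(plain))

	// Both the plain and the base64-encoded input parse to the same config.
	for _, in := range [][]byte{plain, encoded} {
		cfg, err := parseIgnition(in)
		fmt.Println(cfg, err)
	}
}
```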

Also, it appears that a user provided Bootstrap ignition to create a cluster only happens with UPI where the user is required to provision other infrastructure too. Is this within the scope of custom-dns IPI?

@openshift-ci
Contributor

openshift-ci bot commented Dec 4, 2025

@sadasu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azure-custom-dns-techpreview | 9686416 | link | false | /test e2e-azure-custom-dns-techpreview |
| ci/prow/e2e-gcp-secureboot | b6dfa3b | link | false | /test e2e-gcp-secureboot |
| ci/prow/e2e-gcp-custom-endpoints | b6dfa3b | link | false | /test e2e-gcp-custom-endpoints |
| ci/prow/e2e-vsphere-ovn | b6dfa3b | link | true | /test e2e-vsphere-ovn |
| ci/prow/gcp-custom-endpoints-proxy-wif | b6dfa3b | link | false | /test gcp-custom-endpoints-proxy-wif |
| ci/prow/e2e-gcp-custom-dns | 92fff40 | link | false | /test e2e-gcp-custom-dns |
| ci/prow/e2e-gcp-ovn | b6dfa3b | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-vsphere-ovn-techpreview | b6dfa3b | link | false | /test e2e-vsphere-ovn-techpreview |
| ci/prow/e2e-gcp-xpn-dedicated-dns-project | b6dfa3b | link | false | /test e2e-gcp-xpn-dedicated-dns-project |
| ci/prow/e2e-gcp-default-config | b6dfa3b | link | false | /test e2e-gcp-default-config |
| ci/prow/okd-scos-e2e-vsphere-ovn | 92fff40 | link | false | /test okd-scos-e2e-vsphere-ovn |
| ci/prow/e2e-openstack-ovn | b6dfa3b | link | true | /test e2e-openstack-ovn |
| ci/prow/gcp-private | b6dfa3b | link | false | /test gcp-private |
| ci/prow/e2e-vsphere-ovn-zones | b6dfa3b | link | false | /test e2e-vsphere-ovn-zones |
| ci/prow/e2e-gcp-ovn-xpn | b6dfa3b | link | false | /test e2e-gcp-ovn-xpn |
| ci/prow/e2e-gcp-ovn-byo-vpc | b6dfa3b | link | false | /test e2e-gcp-ovn-byo-vpc |
| ci/prow/e2e-vsphere-ovn-hybrid-env | b6dfa3b | link | false | /test e2e-vsphere-ovn-hybrid-env |
| ci/prow/e2e-vsphere-multi-vcenter-ovn | b6dfa3b | link | false | /test e2e-vsphere-multi-vcenter-ovn |
| ci/prow/e2e-azurestack | b6dfa3b | link | false | /test e2e-azurestack |
| ci/prow/e2e-openstack-nfv-intel | b6dfa3b | link | false | /test e2e-openstack-nfv-intel |
| ci/prow/e2e-openstack-proxy | b6dfa3b | link | false | /test e2e-openstack-proxy |
| ci/prow/e2e-vsphere-ovn-disk-setup-techpreview | b6dfa3b | link | false | /test e2e-vsphere-ovn-disk-setup-techpreview |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jinyunma
Contributor

jinyunma commented Dec 5, 2025

Based on this comment it appears that the bootstrap ign is also gzipped. Could you please confirm?

I downloaded the bootstrap.ign file from the installer-created storage account in a running CI cluster and confirmed that the data is gzip compressed and base64 encoded. I also sent the file to you via Slack DM.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

5 participants