
Conversation

@dprince
Contributor

@dprince dprince commented Nov 14, 2025

Rescaffold the nova-operator to operator-sdk 1.41.1, which includes:

  • Reorganize project structure (pkg/ -> internal/)
  • Move webhook implementations to internal/webhook/v1beta1/
  • Add new cmd/main.go entrypoint with updated controller initialization
  • Update RBAC, certmanager, and prometheus configurations
  • Enhance network policies for metrics and webhook traffic
  • Remove auto-generated test suite scaffolding
  • Update build workflow and Dockerfile to version 1.41.1

This upgrade modernizes the operator structure and aligns with the latest operator-sdk best practices.
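
For reviewers less familiar with the go/v4 layout, below is a minimal sketch of roughly what the rescaffolded cmd/main.go entrypoint looks like. It is illustrative only: the leader-election ID and the placeholder comments for Nova controller/webhook registration are assumptions, not copied from this PR.

```go
// Illustrative sketch of an operator-sdk 1.41.1 / kubebuilder go/v4 style
// cmd/main.go entrypoint; Nova-specific controller and webhook wiring from
// internal/controller and internal/webhook/v1beta1 is elided.
package main

import (
	"crypto/tls"
	"flag"
	"os"

	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/healthz"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
	metricsserver "sigs.k8s.io/controller-runtime/pkg/metrics/server"
	"sigs.k8s.io/controller-runtime/pkg/webhook"
)

var (
	scheme   = runtime.NewScheme()
	setupLog = ctrl.Log.WithName("setup")
)

func init() {
	// The real entrypoint also registers the nova-operator v1beta1 API group here.
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))
}

func main() {
	var metricsAddr, probeAddr string
	var enableLeaderElection bool
	var tlsOpts []func(*tls.Config)

	flag.StringVar(&metricsAddr, "metrics-bind-address", "0",
		"The address the metrics endpoint binds to. Use :8443 for HTTPS or :8080 for HTTP, or leave as 0 to disable the metrics service.")
	flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081",
		"The address the probe endpoint binds to.")
	flag.BoolVar(&enableLeaderElection, "leader-elect", false,
		"Enable leader election for the controller manager.")
	opts := zap.Options{Development: true}
	opts.BindFlags(flag.CommandLine)
	flag.Parse()

	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme:                 scheme,
		Metrics:                metricsserver.Options{BindAddress: metricsAddr},
		WebhookServer:          webhook.NewServer(webhook.Options{TLSOpts: tlsOpts}),
		HealthProbeBindAddress: probeAddr,
		LeaderElection:         enableLeaderElection,
		LeaderElectionID:       "nova-operator-lock", // illustrative ID only
	})
	if err != nil {
		setupLog.Error(err, "unable to start manager")
		os.Exit(1)
	}

	// Controllers and webhooks would be set up against mgr here.

	if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up health check")
		os.Exit(1)
	}
	if err := mgr.AddReadyzCheck("readyz", healthz.Ping); err != nil {
		setupLog.Error(err, "unable to set up ready check")
		os.Exit(1)
	}

	setupLog.Info("starting manager")
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		setupLog.Error(err, "problem running manager")
		os.Exit(1)
	}
}
```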

Jira: OSPRH-21969

Depends-On: openstack-k8s-operators/openstack-operator#1683

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/9a74a948246442dfb17dea450b4c74a2

❌ openstack-meta-content-provider FAILURE in 11m 42s
⚠️ nova-operator-kuttl SKIPPED Skipped due to failed job openstack-meta-content-provider
⚠️ nova-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-meta-content-provider
⚠️ nova-operator-tempest-multinode-ceph SKIPPED Skipped due to failed job openstack-meta-content-provider

@dprince dprince force-pushed the operator_sdk_1.41.1 branch from 76db895 to 41a2f63 on November 14, 2025 21:42
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b77ba055c14a4fb896e67692835bbe63

❌ openstack-meta-content-provider FAILURE in 16m 35s
⚠️ nova-operator-kuttl SKIPPED Skipped due to failed job openstack-meta-content-provider
⚠️ nova-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-meta-content-provider
⚠️ nova-operator-tempest-multinode-ceph SKIPPED Skipped due to failed job openstack-meta-content-provider

@danpawlik
Contributor

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/8f758e3f635f4b039bd378d654dbf317

❌ openstack-meta-content-provider FAILURE in 14m 51s
⚠️ nova-operator-kuttl SKIPPED Skipped due to failed job openstack-meta-content-provider
⚠️ nova-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-meta-content-provider
⚠️ nova-operator-tempest-multinode-ceph SKIPPED Skipped due to failed job openstack-meta-content-provider

@danpawlik
Contributor

recheck

@danpawlik
Contributor

On an earlier held node, setting project_layout to v4 seems to help:

-    operators.operatorframework.io/project_layout: go.kubebuilder.io/v3
+    operators.operatorframework.io/project_layout: go.kubebuilder.io/v4

but that should not be necessary, because openstack-k8s-operators/openstack-operator#1683 already contains that change.

@softwarefactory-project-zuul

This change depends on a change that failed to merge.

Change openstack-k8s-operators/openstack-operator#1683 is needed.

@danpawlik
Contributor

Updated the Depends-On in the first comment.

@danpawlik
Contributor

recheck

@danpawlik
Contributor

Wondering if the error in the nova-operator-tempest-multinode job:

2025-11-17 08:42:59.440354 | controller |         File "/tmp/ansible_kubernetes.core.k8s_payload_9gcprbir/ansible_kubernetes.core.k8s_payload.zip/ansible_collections/kubernetes/core/plugins/module_utils/k8s/service.py", line 201, in retrieve
2025-11-17 08:42:59.440360 | controller |       ansible_collections.kubernetes.core.plugins.module_utils.k8s.exceptions.CoreException: Failed to retrieve requested object: HTTPSConnectionPool(host='api.crc.testing', port=6443): Max retries exceeded with url: /api/v1/namespaces/openstack (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f5889f7e6d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
2025-11-17 08:42:59.440366 | controller |

is related to recent changes in crc-cloud: crc-org/crc-cloud#209.
I ran the same command two minutes after the failure and the cluster seems to be up and ready...

Let's try with a recheck; if this turns out to be flaky, I'll open another PR that adds a retry + delay.

So far, I think operator-sdk is at the correct version.
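
If it does turn out to be flaky, the actual retry + delay would belong in the Ansible CI role, but the idea is simply to poll the API endpoint until it answers instead of failing on the first connection refused. A rough Go sketch of that wait loop, with the endpoint, attempt count, and delay as assumptions:

```go
// Illustrative only: poll the CRC API server until it accepts connections,
// instead of failing on the first "connection refused".
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

func waitForAPI(url string, attempts int, delay time.Duration) error {
	// The CRC cluster uses a self-signed certificate, so skip verification
	// for this readiness probe only.
	client := &http.Client{
		Timeout:   5 * time.Second,
		Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			resp.Body.Close()
			return nil // the API server is answering; an auth error is fine here
		}
		lastErr = err
		time.Sleep(delay)
	}
	return fmt.Errorf("API server not reachable after %d attempts: %w", attempts, lastErr)
}

func main() {
	// Endpoint, attempt count, and delay are assumptions for illustration.
	if err := waitForAPI("https://api.crc.testing:6443/healthz", 30, 10*time.Second); err != nil {
		panic(err)
	}
	fmt.Println("cluster API is up")
}
```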

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ab36e589caeb4c0fb83160c69b89aa41

✔️ openstack-meta-content-provider SUCCESS in 3h 05m 42s
❌ nova-operator-kuttl FAILURE in 38m 34s
❌ nova-operator-tempest-multinode FAILURE in 21m 51s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 49m 27s

@dprince
Contributor Author

dprince commented Nov 17, 2025

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/62cf6212155e4c6ea9419d73dafdb617

✔️ openstack-meta-content-provider SUCCESS in 2h 43m 47s
❌ nova-operator-kuttl FAILURE in 39m 20s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 26m 09s
❌ nova-operator-tempest-multinode-ceph FAILURE in 21m 29s

@dprince
Contributor Author

dprince commented Nov 17, 2025

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/02a0c3ee95164448bb616b28a857ab3e

✔️ openstack-meta-content-provider SUCCESS in 3h 17m 11s
❌ nova-operator-kuttl FAILURE in 39m 25s
❌ nova-operator-tempest-multinode FAILURE in 21m 57s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 55m 25s

@dprince dprince force-pushed the operator_sdk_1.41.1 branch from 41a2f63 to 8e33039 on November 18, 2025 12:21
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4c902cf5b693468f98c0c6515fb3a2be

✔️ openstack-meta-content-provider SUCCESS in 3h 18m 57s
❌ nova-operator-kuttl FAILURE in 38m 57s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 25m 35s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 56m 56s

@dprince
Contributor Author

dprince commented Nov 18, 2025

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/40ac4540071444dd8ba36feae8cc9791

✔️ openstack-meta-content-provider SUCCESS in 2h 56m 34s
❌ nova-operator-kuttl FAILURE in 32m 26s
✔️ nova-operator-tempest-multinode SUCCESS in 2h 20m 37s
✔️ nova-operator-tempest-multinode-ceph SUCCESS in 2h 39m 05s

Explicitly delete any running nova-operator deployments from openstack-operator here, as
label selectors can change, and installing a service catalog/index like this alongside
openstack-operator (which is what CI appears to do?) is not recommended unless
the initialization resource controller in openstack-operator is paused
and existing deployments are cleaned up properly
var tlsOpts []func(*tls.Config)
flag.StringVar(&metricsAddr, "metrics-bind-address", "0", "The address the metrics endpoint binds to. "+
"Use :8443 for HTTPS or :8080 for HTTP, or leave as 0 to disable the metrics service.")
flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
Contributor

pprofBindAddress is missing from the current implementation. Let's add this when we bump to the new infra-op version.
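
For reference, a small sketch of how that could look once we bump: controller-runtime's manager Options already exposes PprofBindAddress, so only a flag and one Options field are needed. The flag name and default below are assumptions mirroring the metrics flag style, not part of this PR.

```go
// Hypothetical sketch: wiring a pprof endpoint into the manager once the
// newer infra-op baseline is in place.
package main

import (
	"flag"
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	var pprofAddr string
	// Assumed flag name and default, mirroring the scaffolded metrics flag.
	flag.StringVar(&pprofAddr, "pprof-bind-address", "",
		"The address the pprof endpoint binds to. Leave empty to disable pprof.")
	flag.Parse()

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		// ...the scaffolded Scheme, Metrics, WebhookServer, etc. stay as-is...
		PprofBindAddress: pprofAddr,
	})
	if err != nil {
		os.Exit(1)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```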

Comment on lines +451 to +454
# explicitly to delete any running nova-operator deployments from openstack-operator here as
# label selectors can change and installing a service catalog/index like this alongside
# openstack-operator (what CI appears to do?) is not recommended
oc delete deployment nova-operator-controller-manager -n openstack-operators --ignore-not-found=true
Contributor

In the context mentioned in the comment, wouldn't the openstack-operator-controller-operator pod just recreate the Deployment again right after we delete it here? I wonder if we need to use the OpenStack interface to drop the replicas for the Nova operator to 0 [1]?

[1] https://github.com/openstack-k8s-operators/openstack-operator/blob/17b1faec894dfcad58164b52f38cf6acda76f9dc/api/operator/v1beta1/openstack_types.go#L223

Contributor Author

No, because in CI they are setting the initialization openstack-operator-controller-operator replicas to 0.

Contributor

@stuggi stuggi left a comment

/lgtm

@openshift-ci
Contributor

openshift-ci bot commented Nov 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprince, stuggi

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 5c47a41 into openstack-k8s-operators:main Nov 25, 2025
7 checks passed