USHIFT-6951: Add node-exporter as optional MicroShift component#6810
USHIFT-6951: Add node-exporter as optional MicroShift component#6810copejon wants to merge 4 commits into
Conversation
|
@copejon: This pull request references USHIFT-6951 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an optional node-exporter component: Kubernetes manifests, kustomize overlays, OTEL collector scrape drop-in, RPM packaging, and optional-workload registration with namespace-aware workload merging. ChangesNode-exporter optional workload integration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 1 warning)
✅ Passed checks (13 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: copejon The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@assets/optional/node-exporter/03-daemonset.yaml`:
- Line 69: The DaemonSet currently has an empty securityContext and an init
container ("init-textfile") running as root/privileged; update the manifest to
enforce least-privilege defaults: add a pod/container securityContext with
runAsNonRoot: true, set a non-root runAsUser (e.g. an unprivileged UID), set
readOnlyRootFilesystem: true, and set allowPrivilegeEscalation: false for both
the main node-exporter container and the init-textfile container; remove
privileged: true and any runAs: 0 usages unless you have a documented, justified
exception (if an exception is required, add a comment explaining why). Also
apply the same hardening to the other securityContext instance referenced around
lines 144-147.
- Around line 65-68: The DaemonSet containers currently only set
resources.requests (e.g., the resources block shown with cpu: 8m and memory:
32Mi); add matching resources.limits for cpu and memory for every container in
the DaemonSet spec to prevent noisy-neighbor issues — update the resources
section to include limits (reasonable values higher than requests) for the
container(s) defined in this manifest and repeat the same change for the other
containers mentioned (the similar resources blocks at the other occurrences).
Ensure each container stanza has both resources.requests and resources.limits
entries.
- Around line 33-121: Add livenessProbe and readinessProbe blocks for both
containers: for the node-exporter container (identify by name "node-exporter"
and its arg --web.listen-address=127.0.0.1:9101) add httpGet probes against host
127.0.0.1 port 9101 path /metrics (scheme HTTP) with sensible defaults (e.g.
initialDelaySeconds 10, periodSeconds 10, timeoutSeconds 5, failureThreshold 3);
for the kube-rbac-proxy container (identify by name "kube-rbac-proxy" and its
arg --secure-listen-address=[$(IP)]:9100) add readinessProbe and livenessProbe
using httpGet with scheme HTTPS to port 9100 path /healthz (same timing
defaults) so the pod correctly reports ready/healthy state and restarts on
failure.
In `@assets/optional/node-exporter/kustomization.yaml`:
- Around line 3-12: Add a NetworkPolicy resource to this kustomization so
standalone deployments constrain namespace traffic: create a new manifest (e.g.,
05-networkpolicy.yaml) that defines a NetworkPolicy allowing required
ingress/egress (for node-exporter, allow pod-to-pod and node-exporter ports as
needed) and then add " - 05-networkpolicy.yaml" to the resources list in
assets/optional/node-exporter/kustomization.yaml (which currently lists
00-namespace.yaml through 04-service.yaml); ensure the NetworkPolicy's
metadata.namespace matches the namespace in 00-namespace.yaml and name it
clearly (e.g., node-exporter-networkpolicy) so it can be located and reviewed.
In `@packaging/observability/otelcol.d/microshift-metrics-node-exporter.yaml`:
- Around line 7-11: Change the static scrape target from the loopback address to
the same bind/address used by kube-rbac-proxy and ensure TLS identity matches
the cert: replace the targets entry (currently "127.0.0.1:9100") with the
node-bound address used by the kube-rbac-proxy listener (the $(IP) style value
or an equivalent host IP substitution), keep scheme: https and
tls_config.ca_file as-is, and add tls_config.server_name (or update it) to the
service-issued certificate's SAN/common-name so the TLS host name validation
matches the cert presented by kube-rbac-proxy.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 05931de1-4901-4b4c-a254-2b372cfd24ae
📒 Files selected for processing (16)
assets/optional/node-exporter/00-namespace.yamlassets/optional/node-exporter/01-cluster-role-binding.yamlassets/optional/node-exporter/01-cluster-role.yamlassets/optional/node-exporter/01-security-context-constraints.yamlassets/optional/node-exporter/01-service-account.yamlassets/optional/node-exporter/02-accelerators-collector-configmap.yamlassets/optional/node-exporter/02-kube-rbac-proxy-secret.yamlassets/optional/node-exporter/03-daemonset.yamlassets/optional/node-exporter/04-service.yamlassets/optional/node-exporter/kustomization.aarch64.yamlassets/optional/node-exporter/kustomization.x86_64.yamlassets/optional/node-exporter/kustomization.yamlpackaging/observability/otelcol.d/microshift-metrics-node-exporter.yamlpackaging/rpm/microshift.specpkg/healthcheck/microshift_optional_workloads.gotest/bin/common.sh
| containers: | ||
| - args: | ||
| - --web.listen-address=127.0.0.1:9101 | ||
| - --path.sysfs=/host/sys | ||
| - --path.rootfs=/host/root | ||
| - --path.procfs=/host/root/proc | ||
| - --path.udev.data=/host/root/run/udev/data | ||
| - --no-collector.wifi | ||
| - --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|run/k3s/containerd/.+|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/) | ||
| - --collector.netclass.ignored-devices=^.*$ | ||
| - --collector.netdev.device-exclude=^.*$ | ||
| - --collector.cpu.info | ||
| - --collector.textfile.directory=/var/node_exporter/textfile | ||
| - --no-collector.btrfs | ||
| command: | ||
| - /bin/sh | ||
| - -c | ||
| - | | ||
| export GOMAXPROCS=4 | ||
| # We don't take CPU affinity into account as the container doesn't have integer CPU requests. | ||
| # In case of error, fallback to the default value. | ||
| NUM_CPUS=$(grep -c '^processor' "/proc/cpuinfo" 2>/dev/null || echo "0") | ||
| if [ "$NUM_CPUS" -lt "$GOMAXPROCS" ]; then | ||
| export GOMAXPROCS="$NUM_CPUS" | ||
| fi | ||
| echo "ts=$(date --iso-8601=seconds) num_cpus=$NUM_CPUS gomaxprocs=$GOMAXPROCS" | ||
| exec /bin/node_exporter "$0" "$@" | ||
| env: | ||
| - name: DBUS_SYSTEM_BUS_ADDRESS | ||
| value: unix:path=/host/root/var/run/dbus/system_bus_socket | ||
| image: quay.io/openshift/node-exporter | ||
| name: node-exporter | ||
| resources: | ||
| requests: | ||
| cpu: 8m | ||
| memory: 32Mi | ||
| securityContext: {} | ||
| terminationMessagePolicy: FallbackToLogsOnError | ||
| volumeMounts: | ||
| - mountPath: /host/sys | ||
| mountPropagation: HostToContainer | ||
| name: sys | ||
| readOnly: true | ||
| - mountPath: /host/root | ||
| mountPropagation: HostToContainer | ||
| name: root | ||
| readOnly: true | ||
| - mountPath: /var/node_exporter/textfile | ||
| name: node-exporter-textfile | ||
| readOnly: true | ||
| - mountPath: /var/node_exporter/accelerators_collector_config | ||
| name: node-exporter-accelerators-collector-config | ||
| readOnly: true | ||
| workingDir: /var/node_exporter/textfile | ||
| - args: | ||
| - --secure-listen-address=[$(IP)]:9100 | ||
| - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305 | ||
| - --upstream=http://127.0.0.1:9101/ | ||
| - --tls-cert-file=/etc/tls/private/tls.crt | ||
| - --tls-private-key-file=/etc/tls/private/tls.key | ||
| - --config-file=/etc/kube-rbac-policy/config.yaml | ||
| env: | ||
| - name: IP | ||
| valueFrom: | ||
| fieldRef: | ||
| fieldPath: status.podIP | ||
| image: quay.io/openshift/kube-rbac-proxy | ||
| name: kube-rbac-proxy | ||
| ports: | ||
| - containerPort: 9100 | ||
| hostPort: 9100 | ||
| name: https | ||
| resources: | ||
| requests: | ||
| cpu: 1m | ||
| memory: 15Mi | ||
| securityContext: | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: | ||
| - ALL | ||
| readOnlyRootFilesystem: true | ||
| runAsGroup: 65532 | ||
| runAsNonRoot: true | ||
| runAsUser: 65532 | ||
| seccompProfile: | ||
| type: RuntimeDefault | ||
| terminationMessagePolicy: FallbackToLogsOnError | ||
| volumeMounts: |
There was a problem hiding this comment.
Define readiness and liveness probes for both containers.
Neither node-exporter nor kube-rbac-proxy has probes, so stalled processes can remain running but unhealthy.
Suggested patch
- args:
- --web.listen-address=127.0.0.1:9101
@@
name: node-exporter
+ readinessProbe:
+ tcpSocket:
+ port: 9101
+ periodSeconds: 10
+ livenessProbe:
+ tcpSocket:
+ port: 9101
+ initialDelaySeconds: 15
+ periodSeconds: 20
@@
- args:
- --secure-listen-address=[$(IP)]:9100
@@
name: kube-rbac-proxy
+ readinessProbe:
+ tcpSocket:
+ port: 9100
+ periodSeconds: 10
+ livenessProbe:
+ tcpSocket:
+ port: 9100
+ initialDelaySeconds: 15
+ periodSeconds: 20🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/optional/node-exporter/03-daemonset.yaml` around lines 33 - 121, Add
livenessProbe and readinessProbe blocks for both containers: for the
node-exporter container (identify by name "node-exporter" and its arg
--web.listen-address=127.0.0.1:9101) add httpGet probes against host 127.0.0.1
port 9101 path /metrics (scheme HTTP) with sensible defaults (e.g.
initialDelaySeconds 10, periodSeconds 10, timeoutSeconds 5, failureThreshold 3);
for the kube-rbac-proxy container (identify by name "kube-rbac-proxy" and its
arg --secure-listen-address=[$(IP)]:9100) add readinessProbe and livenessProbe
using httpGet with scheme HTTPS to port 9100 path /healthz (same timing
defaults) so the pod correctly reports ready/healthy state and restarts on
failure.
| resources: | ||
| requests: | ||
| cpu: 8m | ||
| memory: 32Mi |
There was a problem hiding this comment.
Add CPU/memory limits for every container.
Only requests are set today. For a node-wide DaemonSet, missing limits can create noisy-neighbor and recovery issues under pressure.
Suggested patch
resources:
+ limits:
+ cpu: 100m
+ memory: 128Mi
requests:
cpu: 8m
memory: 32Mi
@@
resources:
+ limits:
+ cpu: 50m
+ memory: 64Mi
requests:
cpu: 1m
memory: 15Mi
@@
resources:
+ limits:
+ cpu: 20m
+ memory: 32Mi
requests:
cpu: 1m
memory: 1MiAlso applies to: 105-108, 140-143
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/optional/node-exporter/03-daemonset.yaml` around lines 65 - 68, The
DaemonSet containers currently only set resources.requests (e.g., the resources
block shown with cpu: 8m and memory: 32Mi); add matching resources.limits for
cpu and memory for every container in the DaemonSet spec to prevent
noisy-neighbor issues — update the resources section to include limits
(reasonable values higher than requests) for the container(s) defined in this
manifest and repeat the same change for the other containers mentioned (the
similar resources blocks at the other occurrences). Ensure each container stanza
has both resources.requests and resources.limits entries.
| requests: | ||
| cpu: 8m | ||
| memory: 32Mi | ||
| securityContext: {} |
There was a problem hiding this comment.
Harden container security context and avoid privileged root init where possible.
node-exporter uses an empty securityContext, and init-textfile runs privileged as UID 0. This is a broad escalation surface on a host-mounted DaemonSet. Please explicitly set least-privilege defaults and remove privileged/root execution unless strictly required and documented as an exception.
As per coding guidelines, "**/*.{yaml,yml,json}: Flag privileged: true ... running as root without justificationandsecurityContext: runAsNonRoot, readOnlyRootFilesystem, allowPrivilegeEscalation: false`."
Also applies to: 144-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/optional/node-exporter/03-daemonset.yaml` at line 69, The DaemonSet
currently has an empty securityContext and an init container ("init-textfile")
running as root/privileged; update the manifest to enforce least-privilege
defaults: add a pod/container securityContext with runAsNonRoot: true, set a
non-root runAsUser (e.g. an unprivileged UID), set readOnlyRootFilesystem: true,
and set allowPrivilegeEscalation: false for both the main node-exporter
container and the init-textfile container; remove privileged: true and any
runAs: 0 usages unless you have a documented, justified exception (if an
exception is required, add a comment explaining why). Also apply the same
hardening to the other securityContext instance referenced around lines 144-147.
| resources: | ||
| - 00-namespace.yaml | ||
| - 01-service-account.yaml | ||
| - 01-cluster-role.yaml | ||
| - 01-cluster-role-binding.yaml | ||
| - 01-security-context-constraints.yaml | ||
| - 02-kube-rbac-proxy-secret.yaml | ||
| - 02-accelerators-collector-configmap.yaml | ||
| - 03-daemonset.yaml | ||
| - 04-service.yaml |
There was a problem hiding this comment.
Add a NetworkPolicy resource to this bundle.
This kustomization is independently deployable but does not include a NetworkPolicy, so namespace traffic remains unconstrained when deployed standalone.
Suggested minimal wiring
resources:
- 00-namespace.yaml
- 01-service-account.yaml
- 01-cluster-role.yaml
- 01-cluster-role-binding.yaml
- 01-security-context-constraints.yaml
- 02-kube-rbac-proxy-secret.yaml
- 02-accelerators-collector-configmap.yaml
- 03-daemonset.yaml
- 04-service.yaml
+ - 05-network-policy.yamlAs per coding guidelines, If this is a Kubernetes/OpenShift manifest or Helm template: ... NetworkPolicy defined for the namespace.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/optional/node-exporter/kustomization.yaml` around lines 3 - 12, Add a
NetworkPolicy resource to this kustomization so standalone deployments constrain
namespace traffic: create a new manifest (e.g., 05-networkpolicy.yaml) that
defines a NetworkPolicy allowing required ingress/egress (for node-exporter,
allow pod-to-pod and node-exporter ports as needed) and then add " -
05-networkpolicy.yaml" to the resources list in
assets/optional/node-exporter/kustomization.yaml (which currently lists
00-namespace.yaml through 04-service.yaml); ensure the NetworkPolicy's
metadata.namespace matches the namespace in 00-namespace.yaml and name it
clearly (e.g., node-exporter-networkpolicy) so it can be located and reviewed.
2a731ff to
5ef9333
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
assets/optional/node-exporter/03-daemonset.yaml (3)
65-68:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSet CPU/memory limits for every container.
Only requests are configured. Add
resources.limitsfornode-exporter,kube-rbac-proxy, andinit-textfile.As per coding guidelines, "
Resource limits (cpu, memory) on every container."Also applies to: 105-108, 140-143
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@assets/optional/node-exporter/03-daemonset.yaml` around lines 65 - 68, The resources section currently sets only requests; add resources.limits for each container named node-exporter, kube-rbac-proxy, and init-textfile so every container has both cpu and memory limits (mirror or slightly exceed the existing requests). Update the resource blocks for the containers referenced around the three occurrences (the blocks starting with "resources:" at lines corresponding to node-exporter, kube-rbac-proxy, and init-textfile) to include a limits subfield with cpu and memory values (e.g., matching the existing requests values or a higher safe value) to satisfy the "Resource limits (cpu, memory) on every container" guideline.
33-127:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd readiness and liveness probes for both app containers.
node-exporterandkube-rbac-proxycurrently have no probes, so unhealthy pods may stay serving/Running.As per coding guidelines, "
Liveness + readiness probes defined."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@assets/optional/node-exporter/03-daemonset.yaml` around lines 33 - 127, Add both livenessProbe and readinessProbe blocks to the two container specs (container name "node-exporter" and container name "kube-rbac-proxy"). For "node-exporter" add HTTP probes against path /metrics on port 9101 (scheme: HTTP) with sensible defaults (e.g. initialDelaySeconds 15, periodSeconds 10, timeoutSeconds 5, failureThreshold 3). For "kube-rbac-proxy" add HTTP probes against /metrics on port 9100 using scheme HTTPS (since the container serves TLS) with similar timing values (initialDelaySeconds 15, periodSeconds 10, timeoutSeconds 5, failureThreshold 3). Place the probe blocks at the same level as resources/volumeMounts in each container spec so Kubernetes will use them for liveness and readiness checks.
69-69:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHarden container security contexts and drop privileged root init.
node-exporterhas an empty security context, andinit-textfileruns privileged as UID 0. Enforce least privilege (runAsNonRoot,allowPrivilegeEscalation: false,readOnlyRootFilesystem, dropALLcapabilities), and avoid privileged/root unless justified as an explicit exception.As per coding guidelines, "
Flag privileged: true ... running as root without justificationandsecurityContext: runAsNonRoot, readOnlyRootFilesystem, allowPrivilegeEscalation: false."Also applies to: 144-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@assets/optional/node-exporter/03-daemonset.yaml` at line 69, Update the empty securityContext for the node-exporter container and the init-container named init-textfile to enforce least privilege: remove any privileged: true from init-textfile, set securityContext on both containers to runAsNonRoot: true with a non-root runAsUser (e.g. 65534 or another project-approved non-root UID), set allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, and drop capabilities (capabilities: { drop: ["ALL"] }); ensure these changes are applied to the main node-exporter container's securityContext and the initContainers block (init-textfile) so neither runs as privileged/root unless explicitly justified.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@assets/optional/node-exporter/03-daemonset.yaml`:
- Around line 128-129: The DaemonSet currently sets hostNetwork: true and
hostPID: true which violates the no-host-namespaces guideline; either remove
these keys from the DaemonSet spec (so the pod does not share host network or
PID) or add a documented, explicit exception explaining why both are strictly
required for this workload (include the minimum required justification, risks,
and any compensating controls). Locate the DaemonSet manifest that sets
hostNetwork and hostPID (the DaemonSet/spec.template.spec block) and either
delete those fields or replace them with a short comment and a written exception
in your PR description or an inline comment stating why hostNetwork and hostPID
are necessary and what mitigations (e.g., strict RBAC, nodeSelector, seccomp,
read-only root FS) are in place.
---
Duplicate comments:
In `@assets/optional/node-exporter/03-daemonset.yaml`:
- Around line 65-68: The resources section currently sets only requests; add
resources.limits for each container named node-exporter, kube-rbac-proxy, and
init-textfile so every container has both cpu and memory limits (mirror or
slightly exceed the existing requests). Update the resource blocks for the
containers referenced around the three occurrences (the blocks starting with
"resources:" at lines corresponding to node-exporter, kube-rbac-proxy, and
init-textfile) to include a limits subfield with cpu and memory values (e.g.,
matching the existing requests values or a higher safe value) to satisfy the
"Resource limits (cpu, memory) on every container" guideline.
- Around line 33-127: Add both livenessProbe and readinessProbe blocks to the
two container specs (container name "node-exporter" and container name
"kube-rbac-proxy"). For "node-exporter" add HTTP probes against path /metrics on
port 9101 (scheme: HTTP) with sensible defaults (e.g. initialDelaySeconds 15,
periodSeconds 10, timeoutSeconds 5, failureThreshold 3). For "kube-rbac-proxy"
add HTTP probes against /metrics on port 9100 using scheme HTTPS (since the
container serves TLS) with similar timing values (initialDelaySeconds 15,
periodSeconds 10, timeoutSeconds 5, failureThreshold 3). Place the probe blocks
at the same level as resources/volumeMounts in each container spec so Kubernetes
will use them for liveness and readiness checks.
- Line 69: Update the empty securityContext for the node-exporter container and
the init-container named init-textfile to enforce least privilege: remove any
privileged: true from init-textfile, set securityContext on both containers to
runAsNonRoot: true with a non-root runAsUser (e.g. 65534 or another
project-approved non-root UID), set allowPrivilegeEscalation: false,
readOnlyRootFilesystem: true, and drop capabilities (capabilities: { drop:
["ALL"] }); ensure these changes are applied to the main node-exporter
container's securityContext and the initContainers block (init-textfile) so
neither runs as privileged/root unless explicitly justified.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: a3a221a9-0630-48d6-a93b-3148eb3d5973
📒 Files selected for processing (17)
assets/optional/node-exporter/00-namespace.yamlassets/optional/node-exporter/01-cluster-role-binding.yamlassets/optional/node-exporter/01-cluster-role.yamlassets/optional/node-exporter/01-security-context-constraints.yamlassets/optional/node-exporter/01-service-account.yamlassets/optional/node-exporter/02-accelerators-collector-configmap.yamlassets/optional/node-exporter/02-kube-rbac-proxy-secret.yamlassets/optional/node-exporter/03-daemonset.yamlassets/optional/node-exporter/04-service.yamlassets/optional/node-exporter/kustomization.aarch64.yamlassets/optional/node-exporter/kustomization.x86_64.yamlassets/optional/node-exporter/kustomization.yamlpackaging/observability/otelcol.d/microshift-metrics-node-exporter.yamlpackaging/rpm/microshift.specpkg/healthcheck/microshift_optional_workloads.goscripts/auto-rebase/assets.yamltest/bin/common.sh
✅ Files skipped from review due to trivial changes (4)
- assets/optional/node-exporter/kustomization.x86_64.yaml
- assets/optional/node-exporter/kustomization.yaml
- assets/optional/node-exporter/02-accelerators-collector-configmap.yaml
- assets/optional/node-exporter/kustomization.aarch64.yaml
🚧 Files skipped from review as they are similar to previous changes (11)
- packaging/observability/otelcol.d/microshift-metrics-node-exporter.yaml
- assets/optional/node-exporter/01-cluster-role.yaml
- assets/optional/node-exporter/00-namespace.yaml
- test/bin/common.sh
- assets/optional/node-exporter/01-service-account.yaml
- assets/optional/node-exporter/02-kube-rbac-proxy-secret.yaml
- assets/optional/node-exporter/01-cluster-role-binding.yaml
- assets/optional/node-exporter/04-service.yaml
- assets/optional/node-exporter/01-security-context-constraints.yaml
- pkg/healthcheck/microshift_optional_workloads.go
- packaging/rpm/microshift.spec
| hostNetwork: true | ||
| hostPID: true |
There was a problem hiding this comment.
Avoid sharing host namespaces without an explicit exception.
hostNetwork: true and hostPID: true materially expand pod blast radius. Please remove them or document why both are strictly required for this workload.
As per coding guidelines, "No hostPID, hostNetwork, hostIPC, privileged: true."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@assets/optional/node-exporter/03-daemonset.yaml` around lines 128 - 129, The
DaemonSet currently sets hostNetwork: true and hostPID: true which violates the
no-host-namespaces guideline; either remove these keys from the DaemonSet spec
(so the pod does not share host network or PID) or add a documented, explicit
exception explaining why both are strictly required for this workload (include
the minimum required justification, risks, and any compensating controls).
Locate the DaemonSet manifest that sets hostNetwork and hostPID (the
DaemonSet/spec.template.spec block) and either delete those fields or replace
them with a short comment and a written exception in your PR description or an
inline comment stating why hostNetwork and hostPID are necessary and what
mitigations (e.g., strict RBAC, nodeSelector, seccomp, read-only root FS) are in
place.
5ef9333 to
bfd0a33
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@assets/optional/node-exporter/03-daemonset.yaml`:
- Around line 121-124: The TLS secret volumeMount for the node-exporter
container is set with write access but should be read-only; update the
volumeMount entry that references name: node-exporter-tls and mountPath:
/etc/tls/private (under the container's volumeMounts) to set readOnly: true
instead of false so kube-rbac-proxy can only read the cert/key.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 9d6cc3d0-5089-4e08-9254-565a79de66bf
📒 Files selected for processing (17)
assets/optional/node-exporter/00-namespace.yamlassets/optional/node-exporter/01-cluster-role-binding.yamlassets/optional/node-exporter/01-cluster-role.yamlassets/optional/node-exporter/01-security-context-constraints.yamlassets/optional/node-exporter/01-service-account.yamlassets/optional/node-exporter/02-accelerators-collector-configmap.yamlassets/optional/node-exporter/02-kube-rbac-proxy-secret.yamlassets/optional/node-exporter/03-daemonset.yamlassets/optional/node-exporter/04-service.yamlassets/optional/node-exporter/kustomization.aarch64.yamlassets/optional/node-exporter/kustomization.x86_64.yamlassets/optional/node-exporter/kustomization.yamlpackaging/observability/otelcol.d/microshift-metrics-node-exporter.yamlpackaging/rpm/microshift.specpkg/healthcheck/microshift_optional_workloads.goscripts/auto-rebase/assets.yamltest/bin/common.sh
✅ Files skipped from review due to trivial changes (3)
- assets/optional/node-exporter/kustomization.x86_64.yaml
- packaging/observability/otelcol.d/microshift-metrics-node-exporter.yaml
- assets/optional/node-exporter/kustomization.yaml
🚧 Files skipped from review as they are similar to previous changes (13)
- assets/optional/node-exporter/01-security-context-constraints.yaml
- assets/optional/node-exporter/01-cluster-role.yaml
- assets/optional/node-exporter/kustomization.aarch64.yaml
- assets/optional/node-exporter/04-service.yaml
- assets/optional/node-exporter/01-service-account.yaml
- assets/optional/node-exporter/01-cluster-role-binding.yaml
- test/bin/common.sh
- scripts/auto-rebase/assets.yaml
- assets/optional/node-exporter/02-accelerators-collector-configmap.yaml
- packaging/rpm/microshift.spec
- assets/optional/node-exporter/02-kube-rbac-proxy-secret.yaml
- assets/optional/node-exporter/00-namespace.yaml
- pkg/healthcheck/microshift_optional_workloads.go
44f0e6e to
27136ce
Compare
Add node-exporter kustomize manifests, healthcheck registration, RPM packaging, and OpenTelemetry collector drop-in configuration. Includes own 00-namespace.yaml for independent deployability and mergeWorkloads() for shared-namespace healthcheck support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
27136ce to
8b05e84
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@assets/optional/node-exporter/03-daemonset.yaml`:
- Around line 93-126: The kube-rbac-proxy sidecar (container name
kube-rbac-proxy) is enforcing TLS+RBAC for /metrics but the OpenTelemetry
collector scrapes https://127.0.0.1:9100 without presenting credentials, causing
401/403; remove the proxy sidecar or stop directing the localhost scrape through
it: delete the kube-rbac-proxy container block (the args including
--config-file=/etc/kube-rbac-policy/config.yaml), its volumeMounts
(node-exporter-tls and node-exporter-kube-rbac-proxy-config), and any related
ports/hostPort entries for 9100 so node-exporter is scraped directly, or
alternatively replace that approach by updating the RBAC policy file (referenced
in assets/optional/node-exporter/02-kube-rbac-proxy-secret.yaml) to allow
unauthenticated GET /metrics from the collector — pick one consistent change and
remove the kube-rbac-proxy references (container name kube-rbac-proxy, mounts,
and flags) to fix the auth mismatch.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 515d8549-f0c6-407f-8946-eee5a7ba3ced
📒 Files selected for processing (17)
assets/optional/node-exporter/00-namespace.yamlassets/optional/node-exporter/01-cluster-role-binding.yamlassets/optional/node-exporter/01-cluster-role.yamlassets/optional/node-exporter/01-security-context-constraints.yamlassets/optional/node-exporter/01-service-account.yamlassets/optional/node-exporter/02-accelerators-collector-configmap.yamlassets/optional/node-exporter/02-kube-rbac-proxy-secret.yamlassets/optional/node-exporter/03-daemonset.yamlassets/optional/node-exporter/04-service.yamlassets/optional/node-exporter/kustomization.aarch64.yamlassets/optional/node-exporter/kustomization.x86_64.yamlassets/optional/node-exporter/kustomization.yamlpackaging/observability/otelcol.d/microshift-metrics-node-exporter.yamlpackaging/rpm/microshift.specpkg/healthcheck/microshift_optional_workloads.goscripts/auto-rebase/assets.yamltest/bin/common.sh
✅ Files skipped from review due to trivial changes (2)
- assets/optional/node-exporter/kustomization.x86_64.yaml
- assets/optional/node-exporter/01-cluster-role-binding.yaml
🚧 Files skipped from review as they are similar to previous changes (12)
- test/bin/common.sh
- packaging/observability/otelcol.d/microshift-metrics-node-exporter.yaml
- assets/optional/node-exporter/02-kube-rbac-proxy-secret.yaml
- assets/optional/node-exporter/kustomization.aarch64.yaml
- assets/optional/node-exporter/00-namespace.yaml
- assets/optional/node-exporter/kustomization.yaml
- assets/optional/node-exporter/01-cluster-role.yaml
- assets/optional/node-exporter/04-service.yaml
- scripts/auto-rebase/assets.yaml
- assets/optional/node-exporter/02-accelerators-collector-configmap.yaml
- pkg/healthcheck/microshift_optional_workloads.go
- assets/optional/node-exporter/01-service-account.yaml
Add microshift-metrics-node-exporter-release-info RPM sub-package with per-component release JSON files for bootc/osbuilder image embedding. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MicroShift's otelcol runs as a systemd service and cannot present ServiceAccount credentials. Remove the prometheus-k8s user requirement so any GET /metrics over TLS is allowed on single-node MicroShift. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@CodeRabbit help |
ChatThere are 3 ways to chat with CodeRabbit:
CodeRabbit commands
Other keywords and placeholders
Status, support, documentation and community
|
|
@CodeRabbit re-evaluate past comments against current code and resolve those that have been satisfied |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
# Fetch all CodeRabbit review comments on PR `#6810`
gh api repos/openshift/microshift/pulls/6810/comments \
--paginate \
--jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, path: .path, body: (.body | .[0:300]), line: .line, position: .position, resolved: .resolved}' 2>/dev/null | head -200
echo "---REVIEW THREADS---"
# Also fetch review comments as threads
gh api repos/openshift/microshift/pulls/6810/reviews \
--jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, state: .state, submitted_at: .submitted_at}' 2>/dev/null
echo "---ISSUE COMMENTS---"
gh api repos/openshift/microshift/issues/6810/comments \
--jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, body: (.body | .[0:500])}' 2>/dev/null | head -100Length of output: 6024 🏁 Script executed: #!/bin/bash
echo "=== 03-daemonset.yaml ==="
cat assets/optional/node-exporter/03-daemonset.yaml
echo ""
echo "=== kustomization.yaml ==="
cat assets/optional/node-exporter/kustomization.yaml
echo ""
echo "=== microshift-metrics-node-exporter.yaml ==="
cat packaging/observability/otelcol.d/microshift-metrics-node-exporter.yaml
echo ""
echo "=== NetworkPolicy check ==="
fd "network-policy" assets/optional/node-exporter/Length of output: 8181 🏁 Script executed: #!/bin/bash
# Get full body of comment 3361390373
gh api repos/openshift/microshift/pulls/comments/3361390373 \
--jq '.body'
echo ""
echo "=== Resolving satisfied comments via GraphQL ==="
# First get the pull request node id and review thread IDs
gh api graphql -f query='
{
repository(owner: "openshift", name: "microshift") {
pullRequest(number: 6810) {
reviewThreads(first: 100) {
nodes {
id
isResolved
comments(first: 1) {
nodes {
databaseId
body
}
}
}
}
}
}
}' | jq '.data.repository.pullRequest.reviewThreads.nodes[] | {threadId: .id, isResolved: .isResolved, commentId: .comments.nodes[0].databaseId, body: (.comments.nodes[0].body | .[0:100])}'Length of output: 7251 🏁 Script executed: #!/bin/bash
# Resolve the scrape-endpoint thread (now satisfied: proxy binds 0.0.0.0:9100, scrape target 127.0.0.1:9100 is valid)
gh api graphql -f query='
mutation {
resolveReviewThread(input: {threadId: "PRRT_kwDOFZBjb86HQ1eH"}) {
thread {
id
isResolved
}
}
}'Length of output: 423
Resolved (3 total):
Still open (5):
|
|
@CodeRabbit review |
✅ Action performedReview finished.
|
|
@copejon: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Add node-exporter as an optional MicroShift component deployed via kustomize manifests and packaged as
microshift-metrics-node-exporterRPM sub-package.This is PR 3/3 splitting #6763 into independently-mergeable patches. The three PRs can merge in any order.
Siblings: #6808 (metrics-server), #6809 (kube-state-metrics)
What's included
assets/optional/node-exporter/) — DaemonSet with hostNetwork, hostPID, and privileged SCC; kube-rbac-proxy sidecar; accelerator/GPU collector ConfigMap; and own00-namespace.yamlfor independent deployabilitymergeWorkloads()function (identical to sibling PRs; git auto-merges)%package metrics-node-exporterwith per-file installs to082-microshift-node-exporter/microshift-metrics-node-exporter.yamlscrape config using static target127.0.0.1:9100test/bin/common.shSummary by CodeRabbit
New Features
Packaging
Tests / Chores