socket_tracer and perf_profiler BPF programs fail to initialize on GKE 1.34 (COS 125, kernel 6.12.55) #2344
Description
Describe the bug
After upgrading a GKE cluster from 1.33 to 1.34, Pixie's socket_tracer and perf_profiler probes fail to initialize, causing all network-related data tables (conn_stats, http_events, dns_events) to be missing. Scripts like px/cluster fail with "Table 'conn_stats' not found."
GKE 1.34 uses Container-Optimized OS (COS) milestone 125, which ships with Linux kernel 6.12.55. COS does not provide kernel headers on the host at /lib/modules/6.12.55+/build, so the PEM falls back to the closest headers packaged in its image (6.1.8). Those headers are too far from kernel 6.12 for the socket_tracer BPF program to compile.
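To illustrate the fallback described above, here is a minimal sketch of how a "closest packaged header" lookup could behave. It is not the PEM's actual code: the distance metric (the kernel's `KERNEL_VERSION` encoding) and the function names are assumptions, and the 5.15.0 archive in the usage example is hypothetical. Only the 6.1.8 archive appears in the logs.

```python
import re


def parse_version(text: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from e.g. '6.12.55+' or a header archive name."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no kernel version found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def version_code(version: tuple[int, int, int]) -> int:
    """Encode a version the way the kernel's KERNEL_VERSION macro does."""
    major, minor, patch = version
    return (major << 16) + (minor << 8) + min(patch, 255)


def closest_packaged(host_release: str, packaged: list[str]) -> str:
    """Return the packaged archive whose version is nearest the host kernel.

    Assumption: 'closest' means smallest absolute difference in version code.
    """
    host = version_code(parse_version(host_release))
    return min(packaged, key=lambda name: abs(version_code(parse_version(name)) - host))


if __name__ == "__main__":
    archives = [
        "linux-headers-x86_64-5.15.0.tar.gz",  # hypothetical entry
        "linux-headers-x86_64-6.1.8.tar.gz",   # seen in the PEM log
    ]
    print(closest_packaged("6.12.55+", archives))
```

Under this metric the nearest archive can still be eleven minor releases behind the running kernel, which is the situation in this report.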
As per #2275, the Pixie team runs primarily on GKE COS nodes. GKE 1.34 with COS 125 is now the current default version in GKE release channels, so this affects all GKE COS users who upgrade.
To Reproduce
Steps to reproduce the behavior:
- Deploy Pixie on a GKE 1.34 cluster with COS node images
- Check the stirling_error table: socket_tracer shows "Unable to initialize BCC BPF program"
- Run px/cluster, which fails with "Table 'conn_stats' not found"
Expected behavior
PEM should either bundle compatible headers for kernel 6.12.x, download them from Google's cos-tools bucket (https://storage.googleapis.com/cos-tools//), or use CO-RE/BTF to avoid the header dependency.
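As a rough illustration of the CO-RE/BTF option, the sketch below checks whether a node exposes kernel BTF at `/sys/kernel/btf/vmlinux`, the standard location on kernels built with `CONFIG_DEBUG_INFO_BTF`. The function names and the strategy labels are hypothetical, and whether COS 125 exposes BTF has not been verified in this issue.

```python
from pathlib import Path


def has_kernel_btf(sys_root: str = "/sys") -> bool:
    """Return True if the kernel's BTF blob is present under sys_root."""
    return (Path(sys_root) / "kernel" / "btf" / "vmlinux").is_file()


def pick_header_strategy(host_headers_dir: str, sys_root: str = "/sys") -> str:
    """Choose how a BPF program could obtain kernel type information.

    Hypothetical ordering: prefer real host headers, then BTF, then fall
    back to packaged headers (the path that fails in this report).
    """
    if Path(host_headers_dir).is_dir():
        return "host-headers"
    if has_kernel_btf(sys_root):
        return "btf"
    return "packaged-headers"


if __name__ == "__main__":
    # Path taken from the PEM log in this issue.
    print(pick_header_strategy("/host/lib/modules/6.12.55+/build"))
```

The `sys_root` parameter exists only so the check can be pointed at a mounted host filesystem or a test directory.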
Logs
Only the logs deemed relevant to the issue are attached below. We can share more logs if necessary.
# PEM startup — version and kernel detection
I20260402 06:52:11.168628 6953 pem_main.cc:69] Pixie PEM. Version: v0.14.15+Distribution.623e988.202501242347.1.RELEASE.jenkins, id: 5661a75d-639d-44ef-9440-2d829ea172a1, kernel version: 6.12.55
I20260402 06:52:11.168690 6953 kernel_version.cc:82] Obtained Linux version string from `uname`: 6.12.55+
I20260402 06:52:11.168726 6953 stirling.cc:958] Creating Stirling, registered sources: [process_stats, network_stats, jvm_stats, socket_tracer, perf_profiler, proc_exit_tracer, stirling_error]
# Header resolution — host headers not found, falls back to bundled 6.1.8
I20260402 06:52:11.169289 6953 source_connector.cc:35] Initializing source connector: socket_tracer
I20260402 06:52:11.169345 6953 linux_headers.cc:353] Detected kernel release (uname -r): 6.12.55+
I20260402 06:52:11.171020 6953 linux_headers.cc:369] Not Found : Could not find 'source' or 'build' under /lib/modules/6.12.55+.
I20260402 06:52:11.171041 6953 linux_headers.cc:215] Looking for host Linux headers at /host/lib/modules/6.12.55+/build.
I20260402 06:52:11.171056 6953 linux_headers.cc:372] Not Found : Did not find the host headers at path: /host/lib/modules/6.12.55+/build, No such file or directory.
I20260402 06:52:11.171062 6953 linux_headers.cc:314] Attempting to install packaged headers.
I20260402 06:52:11.171190 6953 linux_headers.cc:320] Using packaged header: /px/linux-headers-x86_64-6.1.8.tar.gz
I20260402 06:52:11.498617 6953 linux_headers.cc:345] Successfully installed packaged copy of headers at /lib/modules/6.12.55+/build
# BPF compilation fails — 6.1.8 headers incompatible with 6.12.55 kernel
In file included from src/stirling/source_connectors/socket_tracer/bcc_bpf/socket_trace.c:25:
In file included from src/stirling/bpf_tools/bcc_bpf/system-headers/linux/net.h:1:
In file included from include/linux/net.h:23:
In file included from include/linux/fs.h:13:
In file included from include/linux/list_lru.h:14:
include/linux/xarray.h:1151:20: error: use of undeclared identifier 'CONFIG_BASE_SMALL'
void __rcu *slots[XA_CHUNK_SIZE];
include/linux/xarray.h:1697:38: error: no member named 'marks' in 'struct xa_node'
unsigned long *addr = xas->xa_node->marks[(__force unsigned)mark];
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
# socket_tracer fails to initialize
W20260402 06:52:14.420778 6953 stirling.cc:416] Source Connector (registry name=socket_tracer) not instantiated, error: Internal : Unable to initialize BCC BPF program: Unable to initialize BPF program
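For anyone triaging similar reports, the following sketch scans PEM log text for the two lines that reveal the mismatch: the detected kernel release and the packaged header archive. The regular expressions are written against the log lines shown above, and the function name is hypothetical. It has not been tested against other Pixie versions, whose log wording may differ.

```python
import re

RELEASE_RE = re.compile(r"Detected kernel release \(uname -r\): (\S+)")
PACKAGED_RE = re.compile(r"Using packaged header: \S*?-(\d+\.\d+\.\d+)\.tar\.gz")


def header_mismatch(log_text: str):
    """Return (kernel_release, header_version, differs) or None.

    'differs' is True when the major.minor of the running kernel and of the
    packaged headers do not match. None means one of the lines was absent.
    """
    release = RELEASE_RE.search(log_text)
    packaged = PACKAGED_RE.search(log_text)
    if release is None or packaged is None:
        return None
    kernel = release.group(1)
    headers = packaged.group(1)
    kernel_mm = tuple(int(part) for part in kernel.split(".")[:2])
    header_mm = tuple(int(part) for part in headers.split(".")[:2])
    return (kernel, headers, kernel_mm != header_mm)


if __name__ == "__main__":
    sample = (
        "linux_headers.cc:353] Detected kernel release (uname -r): 6.12.55+\n"
        "linux_headers.cc:320] Using packaged header: "
        "/px/linux-headers-x86_64-6.1.8.tar.gz\n"
    )
    print(header_mismatch(sample))
```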
App information (please complete the following information):
- Pixie version: 0.14.15 (operator chart 0.1.7)
- Cloud: Cosmic Cloud (getcosmic.ai)
- K8s cluster version: GKE 1.34.4-gke.1193000
- Node OS: Container-Optimized OS (COS 125)
- Kernel: 6.12.55+
Additional context
Available data tables (missing network tables):
- present: process_stats, network_stats, jvm_stats, proc_exit_events, probe_status, stirling_error
- missing: conn_stats, http_events, dns_events, mysql_events, pgsql_events, redis_events
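The comparison above can be reproduced mechanically once the list of available tables is known. This is a small sketch using the table names from this report; how the available list is obtained from a cluster is left out, and the function name is hypothetical.

```python
def missing_tables(required: list[str], available: list[str]) -> list[str]:
    """Return the required table names that are not available, sorted."""
    return sorted(set(required) - set(available))


# Tables that px/cluster and related scripts depend on, per this report.
REQUIRED = ["conn_stats", "http_events", "dns_events"]

# Tables present on the affected cluster, per this report.
AVAILABLE = [
    "process_stats",
    "network_stats",
    "jvm_stats",
    "proc_exit_events",
    "probe_status",
    "stirling_error",
]

if __name__ == "__main__":
    print(missing_tables(REQUIRED, AVAILABLE))
```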