fix: correct pt_regs offset for kernel unwind #11352

Open
kylewanginchina wants to merge 1 commit into main from fix-arm-v8-dwarf

Conversation

@kylewanginchina
Contributor

One or more of:

  • Agent

Fixes fix: correct pt_regs offset for kernel unwind

Steps to reproduce the bug

  • Deploy DeepFlow Agent on an ARM64 (aarch64) Linux system.
  • Enable CPU profiling (eBPF continuous profiler).
  • Run a workload that requires DWARF-based stack unwinding (e.g., a Node.js application or C++ application with debug symbols).
  • Observe that mixed-mode stack traces (kernel + user) are broken or incomplete. Specifically, when a perf event fires in kernel mode, the unwinder fails to transition to user space because it cannot locate the saved user registers (pt_regs) on the kernel stack.

Changes to fix the bug

  • Modified the unwind_sysinfo_t structure in perf_profiler.h to explicitly include stack_ptregs_offset.
  • Updated unwind_tracer.c (user space) to calculate stack_ptregs_offset dynamically based on architecture:
    • For ARM64: THREAD_SIZE (16KB) - sizeof(struct pt_regs) (336 bytes).
    • For x86_64: THREAD_SIZE (16KB) - sizeof(struct pt_regs) (168 bytes).
  • Updated perf_profiler.bpf.c (eBPF kernel space) to use the stack_ptregs_offset from the map when calculating the address of user registers from the kernel stack pointer.
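
The offset arithmetic described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code from this PR: the helper names (`stack_ptregs_offset`, `user_regs_addr`), the `enum arch` type, and the idea of finding the stack base by masking the kernel stack pointer are all hypothetical, and the sizes are the ones quoted in the description (real `THREAD_SIZE` and `sizeof(struct pt_regs)` values depend on kernel version and configuration).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values copied from the PR description; actual kernels may
 * differ depending on version and build configuration. */
#define THREAD_SIZE_BYTES   (16u * 1024u)
#define PT_REGS_SIZE_ARM64  336u
#define PT_REGS_SIZE_X86_64 168u

/* Hypothetical architecture selector used only in this sketch. */
enum arch { ARCH_ARM64, ARCH_X86_64 };

/* Offset of the saved user pt_regs from the base of the kernel stack:
 * THREAD_SIZE - sizeof(struct pt_regs). */
uint32_t stack_ptregs_offset(enum arch a)
{
    uint32_t regs_size = (a == ARCH_ARM64) ? PT_REGS_SIZE_ARM64
                                           : PT_REGS_SIZE_X86_64;
    return THREAD_SIZE_BYTES - regs_size;
}

/* Assumption: the kernel stack is THREAD_SIZE-aligned, so the stack base
 * can be recovered by masking off the low bits of any stack pointer that
 * lies inside it. The saved user registers then sit at base + offset. */
uint64_t user_regs_addr(uint64_t kernel_sp, uint32_t offset)
{
    /* The mask trick only works for power-of-two stack sizes. */
    assert((THREAD_SIZE_BYTES & (THREAD_SIZE_BYTES - 1u)) == 0u);
    uint64_t stack_base = kernel_sp & ~((uint64_t)THREAD_SIZE_BYTES - 1u);
    return stack_base + offset;
}
```

With these assumed sizes the offsets come out to 16048 bytes on ARM64 and 16216 bytes on x86_64, which shows why a single hard-coded offset cannot be correct for both architectures.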

Affected branches

  • main

Checklist

  • Added unit test to verify the fix.
  • Verified eBPF program runs successfully on Linux 4.14.x.
  • Verified eBPF program runs successfully on Linux 4.19.x.
  • Verified eBPF program runs successfully on Linux 5.2.x.
  • Verified eBPF program runs successfully on Linux 5.4.x.
  • Verified eBPF program runs successfully on Linux 5.15.x.
