optimizer: include panic location and backtrace in caught panic errors #36877

Draft
antiguru wants to merge 1 commit into main from claude/practical-pasteur-3UtSX

antiguru commented Jun 2, 2026

Motivation

When a transform panics, catch_unwind_optimize demotes it to an internal optimizer error. Today we only get the panic message — no location, no backtrace — so an error like:

internal error in optimizer: internal transform error: unexpected panic during query optimization: index out of bounds: the len is 2 but the index is 18446744073709551615

gives no hint about where in the optimizer the panic happened, making these bugs hard to track down.

The panic location and backtrace are only available inside the panic handler (which runs before the stack is unwound), but mz_ore's enhanced panic handler returns early — without printing or reporting — while catching an unwind. So that context was discarded, and catch_unwind_str only recovered the payload message. There was even a long-standing TODO(teskje): collect and log a backtrace from the panic site at the call site.

Changes

  • mz_ore::panic — add an opt-in capture path:

    • catch_unwind_with_details(...) works like catch_unwind_str, but on panic returns a CaughtPanic { message, location, backtrace }.
    • While inside such a call, the enhanced panic handler stashes the panic location and a backtrace into a thread-local before letting the unwind proceed, instead of throwing them away.
    • This is a separate, parallel code path: the backtrace is only captured when a panic actually fires (the exceptional case), so there's no per-optimization cost, and other catch_unwind callers are completely unaffected. If the enhanced handler isn't installed, the message is still recovered and location/backtrace are simply absent.
  • mz_transform: catch_unwind_optimize now uses catch_unwind_with_details:

    • The user-facing error includes at least the panic location: ... unexpected panic during query optimization: <msg> (at <file:line:col>).
    • The full backtrace is logged via tracing::error! (resolving the old TODO).
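
For readers unfamiliar with the mechanism, here is a minimal standalone sketch of the capture path described above, using only the standard library. It is not the code in this PR: the field types of CaughtPanic, the STASHED thread-local, and the helpers install_capturing_hook and user_facing_message are hypothetical names chosen for illustration, and the real mz_ore handler does considerably more.

```rust
use std::backtrace::Backtrace;
use std::cell::{Cell, RefCell};
use std::panic::{self, UnwindSafe};
use std::sync::Once;

/// Details recovered from a caught panic. Field names follow the PR text;
/// the field types are assumptions made for this sketch.
#[derive(Debug)]
pub struct CaughtPanic {
    pub message: String,
    pub location: Option<String>,
    pub backtrace: Option<String>,
}

thread_local! {
    // True while this thread is inside `catch_unwind_with_details`.
    static CAPTURE_PANIC_DETAILS: Cell<bool> = Cell::new(false);
    // Hypothetical slot where the hook stashes (location, backtrace).
    static STASHED: RefCell<Option<(Option<String>, String)>> = RefCell::new(None);
}

static INSTALL_HOOK: Once = Once::new();

/// Hypothetical stand-in for installing the enhanced panic handler.
pub fn install_capturing_hook() {
    INSTALL_HOOK.call_once(|| {
        let previous = panic::take_hook();
        panic::set_hook(Box::new(move |info| {
            if CAPTURE_PANIC_DETAILS.with(|flag| flag.get()) {
                // The hook runs before the stack is unwound, so the
                // panicking frames are still visible to the backtrace.
                let location = info.location().map(|loc| loc.to_string());
                let backtrace = Backtrace::force_capture().to_string();
                STASHED.with(|slot| *slot.borrow_mut() = Some((location, backtrace)));
            } else {
                // Callers that did not opt in keep the previous behavior.
                previous(info);
            }
        }));
    });
}

/// Runs `f`; on panic, returns the message plus whatever the hook stashed.
pub fn catch_unwind_with_details<F, R>(f: F) -> Result<R, CaughtPanic>
where
    F: FnOnce() -> R + UnwindSafe,
{
    // Drop any stale details, then opt this thread in for the duration of `f`.
    STASHED.with(|slot| slot.borrow_mut().take());
    let was_capturing = CAPTURE_PANIC_DETAILS.with(|flag| flag.replace(true));
    let result = panic::catch_unwind(f);
    CAPTURE_PANIC_DETAILS.with(|flag| flag.set(was_capturing));

    result.map_err(|payload| {
        let message = if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic".to_string()
        };
        // If the hook was not installed, nothing was stashed and both
        // location and backtrace stay absent.
        let (location, backtrace) = match STASHED.with(|slot| slot.borrow_mut().take()) {
            Some((location, backtrace)) => (location, Some(backtrace)),
            None => (None, None),
        };
        CaughtPanic { message, location, backtrace }
    })
}

/// Formats the message in the shape quoted above: `<msg> (at <file:line:col>)`.
pub fn user_facing_message(caught: &CaughtPanic) -> String {
    let prefix = "unexpected panic during query optimization";
    match &caught.location {
        Some(location) => format!("{}: {} (at {})", prefix, caught.message, location),
        None => format!("{}: {}", prefix, caught.message),
    }
}
```

In this sketch a caller would run install_capturing_hook() once at startup, wrap the fallible work in catch_unwind_with_details, put user_facing_message(&caught) in the returned error, and log caught.backtrace separately. Because the flag and the stash are thread-local, panics on other threads, or on the same thread outside the wrapper, still go through the previously installed hook.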

Tips for reviewer

The behavioral change for existing catch_unwind/catch_unwind_str callers is nil — they don't set the CAPTURE_PANIC_DETAILS thread-local, so the handler's early-return path is unchanged for them.

This PR was drafted by Claude Code at the request of @mh.

https://claude.ai/code/session_01Av4N3HccSLB4Y3EAjvdngx


Generated by Claude Code

When a transform panics, `catch_unwind_optimize` demotes it to an internal
error. Previously we only recovered the panic payload (the message), so the
resulting error had no panic location and the backtrace was lost, making
internal optimizer errors hard to track down from a bare message like
"index out of bounds".

The panic location and backtrace are only available inside the panic
handler, which runs before the stack is unwound, but the enhanced handler
returns early (without printing) while catching an unwind, so that context
was discarded.

Add an opt-in capture path in `mz_ore::panic`: a new
`catch_unwind_with_details` instructs the enhanced panic handler to stash
the panic location and a backtrace into a thread-local before letting the
unwind proceed, and returns them alongside the message as a `CaughtPanic`.
The backtrace is only captured when a panic actually fires, so there's no
per-optimization cost. Other `catch_unwind` callers are unaffected.

Use it in the optimizer to surface the panic location in the user-facing
error and log the full backtrace.