Surface engine WARNING-level log messages in the Python client (parity with the CLI) #480

@rustyconover

I'd like to suggest a small feature that would bring the Python client closer to parity with the DuckDB CLI.

Background

The CLI shell already surfaces engine log messages to the user. At startup it registers a custom LogStorage (ShellLogStorage) and does roughly:

log_manager.RegisterLogStorage("shell_log_storage", storage_ptr);
log_manager.SetLogStorage(*db_instance, "shell_log_storage");
log_manager.SetEnableLogging(db_instance);
log_manager.SetLogLevel(duckdb::LogLevel::LOG_WARNING);

So a CLI user automatically sees WARNING-level messages (e.g. deprecated-syntax notices, the macOS Rosetta perf warning, GEOMETRY/CRS storage-version warnings) printed to the console.

The Python client does nothing analogous: engine logging stays at its defaults (disabled, memory storage), so these warnings are invisible to Python/Jupyter users unless they manually run SET enable_logging=true and then SELECT * FROM duckdb_logs. In effect, deprecation warnings that the CLI shows are silently dropped in Python/Jupyter.
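For reference, the manual workaround can be wrapped in a small helper. This is only a sketch: `drain_engine_warnings` is a hypothetical name, and it assumes that `duckdb_logs` exposes `log_level` and `message` columns and that warning rows carry a level string starting with "WARN". Neither assumption comes from this issue, so check them against your DuckDB version.

```python
import logging


def drain_engine_warnings(con, logger=None):
    """Forward warning rows from duckdb_logs to a Python logger.

    `con` is anything with execute(sql).fetchall(), e.g. a DuckDB connection.
    Returns the number of messages forwarded.
    """
    logger = logger or logging.getLogger("duckdb")
    # Assumption: duckdb_logs has `log_level` and `message` columns.
    rows = con.execute("SELECT log_level, message FROM duckdb_logs").fetchall()
    forwarded = 0
    for level, message in rows:
        # Assumption: warning rows have a level string starting with "WARN".
        if str(level).upper().startswith("WARN"):
            logger.warning(message)
            forwarded += 1
    return forwarded


# Usage (requires the duckdb package):
#   import duckdb
#   con = duckdb.connect()
#   con.execute("SET enable_logging=true")
#   ... run queries ...
#   drain_engine_warnings(con)
```

Each call re-reads the whole log table, so repeated calls forward the same messages again. That is part of why a push-style sink would be more convenient than polling.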

Proposal

Add an (opt-in) Python LogStorage that forwards engine log entries to Python — ideally via the standard logging module, e.g. logging.getLogger("duckdb").warning(message) — so users get visibility through machinery they already control (handlers, levels, filters), and notebook users see them inline.
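To illustrate what this would look like from the user's side, here is a minimal sketch using only the standard library. `emit_engine_log` is a hypothetical stand-in for whatever the native sink would call per log entry, and `ListHandler` is an illustrative handler; neither exists in the client today.

```python
import logging

# Map engine level names to logging levels; unknown names fall back to WARNING.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def emit_engine_log(level_name, message):
    """Hypothetical stand-in for the call the native sink would make."""
    logging.getLogger("duckdb").log(_LEVELS.get(level_name, logging.WARNING), message)


class ListHandler(logging.Handler):
    """Illustrative handler that collects formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def demo():
    logger = logging.getLogger("duckdb")
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        emit_engine_log("WARNING", "example deprecation notice")  # captured
        logger.setLevel(logging.ERROR)  # the user silences warnings
        emit_engine_log("WARNING", "suppressed")  # dropped by the logger level
    finally:
        logger.setLevel(logging.NOTSET)
        logger.removeHandler(handler)
    return handler.messages
```

Calling `demo()` returns only the first message, which shows that the usual level, handler, and filter controls apply without any DuckDB-specific API.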

Prior art in this repo

There's already a clean precedent: the progress bar registers a custom display through ClientConfig::display_create_func (JupyterProgressBarDisplay in src/duckdb_py/jupyter/). A log sink would follow the same shape — a LogStorage subclass registered at connection time, alongside where the progress bar is wired up in SetDefaultConfigArguments().

Implementation notes / care points

  • GIL: log callbacks fire from executor threads while the GIL is released during query execution, so the sink must acquire it (py::gil_scoped_acquire) before touching Python, exactly as JupyterProgressBarDisplay::Update() already does.
  • Default off / opt-in: to avoid changing default output (which could disrupt output-diffing test harnesses like nbval, or add nondeterministic interleaving), this is probably best behind a connection flag, defaulting off — or at most on only in interactive sessions.
  • Routing through logging (rather than raw stdout/stderr like the CLI) keeps it un-surprising and easily silenced.
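The opt-in and threading points above can be modelled in pure Python. This is a rough sketch, not the proposed implementation: `EngineLogForwarder`, its `enabled` flag, and `run_workers` are hypothetical names, and plain Python threads stand in for executor threads. In the real C++ sink the GIL would have to be acquired explicitly, whereas Python threads hold it implicitly.

```python
import logging
import threading


class EngineLogForwarder:
    """Hypothetical Python-side sink; `enabled` models the opt-in connection flag."""

    def __init__(self, enabled=False):
        self.enabled = enabled  # default off, so existing output is unchanged
        self._logger = logging.getLogger("duckdb")

    def on_log_entry(self, message):
        if not self.enabled:
            return
        # logging serialises handler access with a lock, so concurrent calls
        # from several threads do not corrupt individual records.
        self._logger.warning(message)


def run_workers(forwarder, n_threads=4):
    """Emit one warning from each of n_threads threads (stand-ins for executor threads)."""
    threads = [
        threading.Thread(target=forwarder.on_log_entry, args=(f"warning from worker {i}",))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With `enabled=False` nothing reaches the logger. With `enabled=True` each worker's message arrives as a separate record, though the order across threads is not deterministic, which is the interleaving concern noted above.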

Why it's low-risk

WARNING-level emission is very sparse in the engine today (a handful of call sites, mostly deprecation notices), so the practical noise is minimal — but those are exactly the messages users most benefit from seeing.
