Commit d784c4d

committed
UNPICK fix: update PyCapsule failure analysis and suggest tasks for resolution
1 parent 91b90f4 commit d784c4d

1 file changed: +25 -2 lines changed

dev/notes/pycapsule_failure.md

Lines changed: 25 additions & 2 deletions
@@ -11,5 +11,28 @@ Commit range `9b4f1442^..d629ced2` replaced the `Table` wrapper-based API with a
 
 Prior to the refactor, callers could not pass arbitrary capsule-bearing objects to `SessionContext.read_table`; they first had to wrap them in `Table`/`RawTable`, which were only constructible through safe helpers that produced trusted capsules. The new auto-coercion path therefore widened the attack surface to unvalidated capsules, exposing the latent unsafety.
 
-## Suggested Fixes
-See the Suggested Tasks in the PR review comment for concrete follow-up work.
+## Runtime failure after 91b90f44
+Commit 91b90f44 changed :meth:`SessionContext.read_table` so that any object
+exposing ``__datafusion_table_provider__`` is normalized through
+``Table.from_table_provider_capsule`` before delegating to the Rust context.
+【F:python/datafusion/context.py†L1189-L1198】 That helper now calls into the
+private binding ``df_internal.catalog.RawTable.from_table_provider_capsule`` to
+wrap the capsule, but the ``RawTable`` type exported from
+``datafusion._internal`` does not currently expose such a constructor.
+【F:python/datafusion/catalog.py†L176-L187】 At runtime the lookup therefore
+raises ``AttributeError`` and prevents `examples/pycapsule_failure.py` from
+running, regressing the original reproducer from a segfault into a hard failure.
+
+## Suggested Tasks
+1. Export a ``RawTable.from_table_provider_capsule`` constructor from the Rust
+bindings and ensure it becomes available through
+``datafusion._internal.catalog`` during the wheel build so that the Python
+shim can locate it.
+2. Add an integration test that imports ``datafusion._internal`` and asserts
+``hasattr(df_internal.catalog.RawTable, "from_table_provider_capsule")``
+before exercising ``SessionContext.read_table`` with a raw capsule to catch
+regressions.
+3. Consider extending ``table_provider_from_pycapsule`` so that
+``RawTable.__new__`` can directly accept capsule instances (without going
+through the static helper) to reduce the surface area for Python/Rust API
+skew in the future.
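
The failure mode and the guard proposed in task 2 can be illustrated with a minimal sketch. It deliberately does not import `datafusion`: every class and function below (`RawTableWithoutCtor`, `RawTableWithCtor`, `CapsuleProvider`, `wrap_capsule`, `has_capsule_constructor`) is a hypothetical stand-in invented for illustration, and only the attribute names `from_table_provider_capsule` and `__datafusion_table_provider__` come from the notes above. The point it shows is that the missing constructor is only discovered when the attribute is looked up at call time, which is what a `hasattr`-style check would catch earlier.

```python
"""Stand-in sketch of the AttributeError path; no real datafusion import."""


class RawTableWithoutCtor:
    """Hypothetical stand-in for a RawTable binding lacking the constructor."""


class RawTableWithCtor:
    """Hypothetical stand-in for a RawTable binding that exports it."""

    def __init__(self, capsule):
        self.capsule = capsule

    @staticmethod
    def from_table_provider_capsule(capsule):
        return RawTableWithCtor(capsule)


class CapsuleProvider:
    """Hypothetical object exposing the provider hook named in the notes."""

    def __datafusion_table_provider__(self):
        return object()  # placeholder; a real provider would return a PyCapsule


def wrap_capsule(raw_table_cls, provider):
    """Mimic the Python shim: fetch the capsule, then call the static helper."""
    capsule = provider.__datafusion_table_provider__()
    # The attribute lookup happens here, at call time, so a missing
    # constructor only surfaces as AttributeError when this line runs.
    return raw_table_cls.from_table_provider_capsule(capsule)


def has_capsule_constructor(raw_table_cls):
    """Guard in the spirit of task 2: check the binding before using it."""
    return callable(getattr(raw_table_cls, "from_table_provider_capsule", None))


def demo():
    """Run the shim against both stand-ins and record the outcome."""
    provider = CapsuleProvider()
    results = {}
    for cls in (RawTableWithoutCtor, RawTableWithCtor):
        try:
            wrap_capsule(cls, provider)
            results[cls.__name__] = "ok"
        except AttributeError:
            results[cls.__name__] = "AttributeError"
    return results


if __name__ == "__main__":
    print(demo())
```

In an actual integration test the same check would presumably target the `RawTable` type reachable through `datafusion._internal.catalog`, as task 2 describes; the stand-ins here only model the lookup behaviour, not the Rust bindings.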
