Commit f29bdb2

Revert "UNPICK changes to review"
This reverts commit e637505.
1 parent e637505 commit f29bdb2

5 files changed

+667
-14
lines changed

docs/source/user-guide/dataframe/index.rst

Lines changed: 27 additions & 0 deletions
@@ -228,6 +228,33 @@ Core Classes
 * :py:meth:`~datafusion.SessionContext.from_pandas` - Create from Pandas DataFrame
 * :py:meth:`~datafusion.SessionContext.from_arrow` - Create from Arrow data
 
+``SessionContext`` can automatically resolve SQL table names that match
+in-scope Python data objects. When automatic lookup is enabled, a query
+such as ``ctx.sql("SELECT * FROM pdf")`` will register a pandas or
+PyArrow object named ``pdf`` without calling
+:py:meth:`~datafusion.SessionContext.from_pandas` or
+:py:meth:`~datafusion.SessionContext.from_arrow` explicitly. This requires
+the corresponding library (``pandas`` for pandas objects, ``pyarrow`` for
+Arrow objects) to be installed.
+
+.. code-block:: python
+
+    import pandas as pd
+    from datafusion import SessionContext
+
+    ctx = SessionContext(auto_register_python_objects=True)
+    pdf = pd.DataFrame({"value": [1, 2, 3]})
+
+    df = ctx.sql("SELECT SUM(value) AS total FROM pdf")
+    print(df.to_pandas())  # automatically registers `pdf`
+
+Automatic lookup is disabled by default. Enable it by passing
+``auto_register_python_objects=True`` when constructing the session or by
+configuring :py:class:`~datafusion.SessionConfig` with
+:py:meth:`~datafusion.SessionConfig.with_python_table_lookup`. Use
+:py:meth:`~datafusion.SessionContext.set_python_table_lookup` to toggle the
+behaviour at runtime.
+
 See: :py:class:`datafusion.SessionContext`
 
 Expression Classes

docs/source/user-guide/sql.rst

Lines changed: 26 additions & 1 deletion
@@ -36,4 +36,29 @@ DataFusion also offers a SQL API, read the full reference `here <https://arrow.a
     df = ctx.sql('SELECT "Attack"+"Defense", "Attack"-"Defense" FROM pokemon')
 
     # collect and convert to pandas DataFrame
-    df.to_pandas()
+    df.to_pandas()
+
+Automatic variable registration
+-------------------------------
+
+You can opt-in to DataFusion automatically registering Arrow-compatible Python
+objects that appear in SQL queries. This removes the need to call
+``register_*`` helpers explicitly when working with in-memory data structures.
+
+.. code-block:: python
+
+    import pyarrow as pa
+    from datafusion import SessionContext
+
+    ctx = SessionContext(auto_register_python_objects=True)
+
+    orders = pa.Table.from_pydict({"item": ["apple", "pear"], "qty": [5, 2]})
+
+    result = ctx.sql("SELECT item, qty FROM orders WHERE qty > 2")
+    print(result.to_pandas())
+
+The feature inspects the call stack for variables whose names match missing
+tables and registers them if they expose Arrow data (including pandas and
+Polars DataFrames). Existing contexts can enable or disable the behavior at
+runtime through :py:meth:`SessionContext.set_python_table_lookup` or by passing
+``auto_register_python_objects`` when constructing the session.
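
Note on the sql.rst text: it says the feature "inspects the call stack for variables whose names match missing tables". The following is a minimal, standard-library-only sketch of that general idea, offered as an illustration and not as DataFusion's actual implementation. The names ``lookup_in_call_stack``, ``accept``, and ``run_query_demo`` are hypothetical and do not come from the commit. The ``accept`` predicate stands in for whatever "exposes Arrow data" check the real code performs.

```python
import inspect
from typing import Any, Callable


def lookup_in_call_stack(
    name: str,
    accept: Callable[[Any], bool] = lambda obj: True,
) -> Any:
    """Walk up the call stack and return the first object bound to `name`.

    Hypothetical helper: each caller frame's locals are checked first,
    then its globals. Objects rejected by `accept` are skipped.
    """
    frame = inspect.currentframe()
    try:
        # Skip this helper's own frame and start at the direct caller.
        frame = frame.f_back if frame is not None else None
        while frame is not None:
            for scope in (frame.f_locals, frame.f_globals):
                if name in scope and accept(scope[name]):
                    return scope[name]
            frame = frame.f_back
    finally:
        # Drop the frame reference to avoid keeping a reference cycle alive.
        del frame
    raise LookupError(f"no in-scope Python object named {name!r}")


def run_query_demo() -> Any:
    # A local variable whose name matches the "missing table" name.
    orders = {"item": ["apple", "pear"], "qty": [5, 2]}
    return lookup_in_call_stack("orders")
```

A lookup of this kind depends on where the query is issued from, so the same SQL string can resolve to different objects in different callers. That is one plausible reason the documented behaviour is opt-in.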
