Commit fbba2a0

Document UDAF list-valued scalar returns
Add documented list-valued scalar returns for UDAF accumulators, including an example with pa.scalar and a note about unsupported pyarrow.Array returns from evaluate(). Also, introduce a UDAF FAQ entry detailing list-returning patterns and required return_type/state_type definitions.
1 parent 5f10176 commit fbba2a0

2 files changed: 26 additions, 1 deletion

docs/source/user-guide/common-operations/udf-and-udfa.rst

Lines changed: 11 additions & 0 deletions
@@ -149,6 +149,17 @@ also see how the inputs to ``update`` and ``merge`` differ.
 
 df.aggregate([], [my_udaf(col("a"), col("b")).alias("col_diff")])
 
+FAQ
+^^^
+
+**How do I return a list from a UDAF?**
+Use a list-valued scalar and declare list types for both the return and state
+definitions. Returning a ``pyarrow.Array`` from ``evaluate`` is not supported
+unless you convert it to a list scalar. For example, in ``evaluate`` you can
+return ``pa.scalar([...], type=pa.list_(pa.timestamp("ms")))`` and register the
+UDAF with ``return_type=pa.list_(pa.timestamp("ms"))`` and
+``state_type=[pa.list_(pa.timestamp("ms"))]``.
+
 Window Functions
 ----------------
 

python/datafusion/user_defined.py

Lines changed: 15 additions & 1 deletion
@@ -282,7 +282,21 @@ def merge(self, states: list[pa.Array]) -> None:
 
     @abstractmethod
     def evaluate(self) -> pa.Scalar:
-        """Return the resultant value."""
+        """Return the resultant value.
+
+        If you need to return a list, wrap it in a scalar with the correct
+        list type, for example::
+
+            import pyarrow as pa
+
+            return pa.scalar(
+                [pa.scalar("2024-01-01T00:00:00Z")],
+                type=pa.list_(pa.timestamp("ms")),
+            )
+
+        Returning a ``pyarrow.Array`` from ``evaluate`` is not supported unless
+        you explicitly convert it to a list-valued scalar.
+        """
 
 
 class AggregateUDFExportable(Protocol):
