Use cumsum from flox #10987

Illviljan · 2025-12-06T13:44:27Z

Closes #6528 (cumsum drops index coordinates)
Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst
New functions/methods are listed in api.rst

The non-flox version reduces chunksizes significantly:

x = xr.DataArray([1, 1, 1, 1, 1], name="x").chunk()
grp_idx = xr.DataArray([-1, 0, 0, -1, 1])
with xr.set_options(use_flox=False):
    print(x.groupby(grp_idx).cumsum())
<xarray.DataArray 'x' (dim_0: 5)> Size: 40B
dask.array<getitem, shape=(5,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
Dimensions without coordinates: dim_0

With flox the chunksize is retained:

x = xr.DataArray([1, 1, 1, 1, 1], name="x").chunk()
grp_idx = xr.DataArray([-1, 0, 0, -1, 1])
with xr.set_options(use_flox=True):
    print(x.groupby(grp_idx).cumsum())
<xarray.DataArray 'x' (dim_0: 5)> Size: 40B
dask.array<_finalize_scan, shape=(5,), dtype=int64, chunksize=(5,), chunktype=numpy.ndarray>
Dimensions without coordinates: dim_0
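The two outputs above show only the dask graph metadata, not the computed values. As a cross-check, here is a minimal pure-Python sketch of what a grouped cumulative sum should return for this input. It illustrates the semantics only and is not how xarray or flox implement it; `grouped_cumsum` is a hypothetical helper, and the sketch assumes every distinct label (including -1) is treated as an ordinary group.

```python
from collections import defaultdict


def grouped_cumsum(values, labels):
    """Cumulative sum of `values`, accumulated separately for each label.

    Hypothetical reference helper, not part of xarray or flox.
    """
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    running = defaultdict(int)  # running total per group label
    out = []
    for value, label in zip(values, labels):
        running[label] += value
        out.append(running[label])
    return out


x = [1, 1, 1, 1, 1]
grp_idx = [-1, 0, 0, -1, 1]
result = grouped_cumsum(x, grp_idx)
print(result)  # [1, 1, 2, 2, 1]
```

The result has the same length as the input, which is why a scan, unlike a reduction, can in principle keep the original chunking.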

dcherian · 2025-12-10T21:31:03Z

xarray/util/generate_aggregations.py

        # median isn't enabled yet, because it would break if a single group was present in multiple
        # chunks. The non-flox code path will just rechunk every group to a single chunk and execute the median
-        method_is_not_flox_supported = method.name in ("median", "cumsum", "cumprod")
+        method_is_not_flox_supported = method.name in ("median", "cumprod")


FYI: in a future PR, I'd like to use the new flox.is_supported_aggregation here; it's a little smarter about this dispatching. We'll also have to figure out what to do about median, which currently auto-rechunks so that it always works.
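The effect of the one-line change in the diff above can be pictured as a membership test on the method name. The sketch below is illustrative only: `NOT_FLOX_SUPPORTED` and `uses_flox_path` are hypothetical names, and it does not use or imitate `flox.is_supported_aggregation`, whose signature is not shown in this thread.

```python
# Hypothetical names; the tuple mirrors the "+" line of the diff above.
NOT_FLOX_SUPPORTED = ("median", "cumprod")


def uses_flox_path(method_name: str, use_flox: bool = True) -> bool:
    """Return True if this method would be routed to the flox code path."""
    return use_flox and method_name not in NOT_FLOX_SUPPORTED


for name in ("sum", "cumsum", "cumprod", "median"):
    print(name, uses_flox_path(name))
```

With the tuple from the "-" line, `cumsum` would be reported as unsupported as well, which is the behaviour this PR changes.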

dcherian · 2025-12-10T21:34:16Z

xarray/util/generate_aggregations.py

        )
        return f"""\
-        return self.reduce(
+        out = self.reduce(


is this needed?

Illviljan · 2025-12-10

I thought this looked nice, and it made clear that there was a tweak to fix cumsum/cumprod:

out = self.reduce(
    duck_array_ops.cumsum,
    dim=dim,
    skipna=skipna,
    keep_attrs=keep_attrs,
    **kwargs,
)
return out.assign_coords(self._obj.coords)

Then, for consistency and readability, I followed that pattern on the others.

dcherian · 2025-12-11

Can we apply the assign_coords fix for Dataset.cumsum et al. too?

dcherian · 2025-12-12T22:33:45Z

xarray/tests/test_groupby.py

-    assert_identical(expected.foo, actual)
-
+@pytest.mark.parametrize(
+    "method, expected_array, use_flox, use_dask",


The use_dask here should mean grouping a dask array by a numpy array. That will always work.

use cumsum from flox

776bc5a

github-actions bot added the topic-groupby label Dec 6, 2025

pre-commit-ci bot and others added 13 commits December 6, 2025 13:44

[pre-commit.ci] auto fixes from pre-commit.com hooks

ae27632

for more information, see https://pre-commit.ci

Update groupby.py

a5f9326

Update groupby.py

50ccca4

[pre-commit.ci] auto fixes from pre-commit.com hooks

f55531e

Update groupby.py

06ac372

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

31244e6

…o cumsum_flox

Update groupby.py

dd47536

[pre-commit.ci] auto fixes from pre-commit.com hooks

e867f12

Update groupby.py

88e0ebc

[pre-commit.ci] auto fixes from pre-commit.com hooks

181d4a3

use apply_ufunc for dataset and dataarray handling

a82ec39

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

6c6abed

…o cumsum_flox

[pre-commit.ci] auto fixes from pre-commit.com hooks

24c3f1d

dcherian reviewed Dec 6, 2025

xarray/core/groupby.py (resolved)

dcherian reviewed Dec 6, 2025

xarray/core/groupby.py (resolved)

Illviljan and others added 11 commits December 6, 2025 16:21

Update groupby.py

d8d0eaa

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

55ff46a

…o cumsum_flox

[pre-commit.ci] auto fixes from pre-commit.com hooks

33d1360

sync protocols with each other

c97ae98

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

06b52ae

…o cumsum_flox

typing

84f9b44

[pre-commit.ci] auto fixes from pre-commit.com hooks

2978877

add dataset and version requirement

0a9adee

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

ae9a3d8

…o cumsum_flox

[pre-commit.ci] auto fixes from pre-commit.com hooks

c056d1f

Update _aggregations.py

d4873b9

dcherian reviewed Dec 6, 2025

xarray/core/groupby.py (outdated, resolved)

Update xarray/core/groupby.py

21cbde2

Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>

Illviljan and others added 21 commits December 9, 2025 22:34

Update _aggregations.py

9721574

Update _aggregations.py

e1fba81

Update _aggregations.py

5137fd8

Update _aggregations.py

59a7f38

Update _aggregations.py

7f519f0

Update _aggregations.py

c4f5f83

Update _aggregations.py

bf5197d

Update _aggregations.py

5563600

Update _aggregations.py

510300d

Update _aggregations.py

5fe07df

Update _aggregations.py

293cc1f

Update _aggregations.py

d9f694c

Update _aggregations.py

c9814db

Update _aggregations.py

6ed0f99

Update test_groupby.py

43a827d

Update test_groupby.py

8d65562

[pre-commit.ci] auto fixes from pre-commit.com hooks

d19bbca

Update test_groupby.py

acf4022

Update generate_aggregations.py

f263da6

Update test_groupby.py

e56d0b8

Merge branch 'main' into cumsum_flox

8cbfd9d

dcherian reviewed Dec 10, 2025

Illviljan and others added 6 commits December 12, 2025 20:43

Update test_groupby.py

bcf61b0

Update test_groupby.py

1927923

[pre-commit.ci] auto fixes from pre-commit.com hooks

45e9423

Update generate_aggregations.py

3a9cee8

Update _aggregations.py

a3281e6

Update _aggregations.py

f29cdd0

dcherian reviewed Dec 12, 2025
