@d-v-b commented Jan 29, 2026

Depends on #3676 and #3677.

This PR demos a Zarr array class that supports lazy indexing. Indexing this array returns another array of the same type. This is a stark contrast to the behavior of zarr.Array, which returns a different kind of object (a NumPy array) when you index it. This implementation is heavily inspired by tensorstore.

Here's a self-contained demo:

# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "zarr @ git+https://github.com/d-v-b/zarr-python.git@feat/lazy-indexing",
#     "dask[array]",
#     "numpy",
# ]
# ///

import zarr
import numpy as np
from zarr.experimental.lazy_indexing import Array, merge
import dask.array as da

store = {}
np_data = np.arange(100)
zarr.create_array(store, data=np_data, chunks=(10,), fill_value=0, write_data=True)

# Use the lazy array
lazy_array = Array.open(store)
print(lazy_array)
# <Array memory://129773024766528 domain=IndexDomain([0, 100)) dtype=int64>

slice_a = slice(0, 10)
slice_b = slice(10, None)

# Declare the lower 10% of the array
subregion_a = lazy_array[slice_a]
print(subregion_a)
# <Array memory://129773024766528 domain=IndexDomain([0, 10)) dtype=int64>
assert np.array_equal(np.array(subregion_a), np_data[slice_a])

# Declare the upper 90% of the array
subregion_b = lazy_array[slice_b]
print(subregion_b)
# <Array memory://129773024766528 domain=IndexDomain([10, 100)) dtype=int64>
assert np.array_equal(np.array(subregion_b), np_data[slice_b])

# Test that merging inverts slicing
merged = merge([subregion_a, subregion_b])
assert merged == lazy_array
assert np.array_equal(np.array(merged), np_data)

# Test with dask
assert np.array_equal(da.from_array(lazy_array).compute(), np_data)

The Zarr array defined here should be thought of as a variably-sized view of another array.
Making this work requires adding attributes to the array that track the domain of the array's indices. These lazy arrays have an explicit origin in indexing space, and that origin can be negative. This means negative indexing does not wrap around! That's possibly the biggest semantic change here.
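
To make the "explicit origin" idea concrete, here is a toy sketch in plain NumPy. It is not the code in this PR: `DomainView`, its `origin`/`start`/`stop` attributes, and `read()` are hypothetical names made up for illustration, and raising `IndexError` for out-of-domain slices is an assumption of the sketch rather than documented behavior of the lazy array.

```python
import numpy as np


class DomainView:
    """Hypothetical 1-D lazy view that tracks an explicit index domain."""

    def __init__(self, base, origin, start, stop):
        self.base = base      # backing data
        self.origin = origin  # coordinate assigned to base[0]
        self.start = start    # inclusive lower bound of this view's domain
        self.stop = stop      # exclusive upper bound of this view's domain

    def __repr__(self):
        return f"<DomainView domain=[{self.start}, {self.stop})>"

    def __getitem__(self, key):
        if not isinstance(key, slice) or key.step not in (None, 1):
            raise TypeError("this sketch only supports unit-step slices")
        # Slice bounds are absolute coordinates in the domain, so a negative
        # bound is taken literally instead of being counted from the end.
        lo = self.start if key.start is None else key.start
        hi = self.stop if key.stop is None else key.stop
        if lo < self.start or hi > self.stop or lo > hi:
            raise IndexError(
                f"[{lo}, {hi}) is not inside [{self.start}, {self.stop})"
            )
        # No data is read here; the result is another view of the same type.
        return DomainView(self.base, self.origin, lo, hi)

    def read(self):
        # Translate domain coordinates to positions in the backing data.
        return self.base[self.start - self.origin : self.stop - self.origin]


full = DomainView(np.arange(100), origin=0, start=0, stop=100)
upper = full[10:]       # domain [10, 100)
window = upper[10:20]   # still [10, 20): coordinates are not re-based

# A view whose domain starts below zero: -5 is a real coordinate here.
shifted = DomainView(np.arange(10), origin=-5, start=-5, stop=5)
negative_half = shifted[-5:0]
```

Note the difference from NumPy: `np.arange(100)[10:][10:20]` starts at 20 because each slice re-bases the result to zero, whereas `window` above keeps the domain `[10, 20)`. In this sketch `full[-5:]` raises instead of returning the last five elements, because -5 lies outside `[0, 100)`.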

I'm still working on this, so I'm opening it as a draft for visibility. Lots of things might change, so expect an update or a new comment down the road that gives an overview once the design has cooled down.

@github-actions (bot) added the "needs release notes" label on Jan 29, 2026.