What happened?
When opening a Zarr array created with a NumPy 2 StringDType, the Xarray DataArray doesn't have the same chunks as the underlying Zarr array.
What did you expect to happen?
The Xarray chunking should be the same as the Zarr chunking. In the MVCE below, the chunking is the same if using a fixed-length unicode dtype (dtype="U").
Minimal Complete Verifiable Example
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# "zarr",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!
import numpy as np
import xarray as xr
import zarr
xr.show_versions()
# create chunked string array using zarr
root = zarr.create_group(store="g")
data = np.array([
    ["a", "b", "c", "d"],
    ["e", "f", "g", "h"],
    ["i", "j", "k", "l"],
    ["m", "n", "o", "p"],
], dtype="T")
a = root.create_array(name="a", data=data, chunks=(2, 2), dimension_names=["x", "y"])
# open with xarray and check chunk sizes
x = xr.open_zarr("g", consolidated=False)
assert x["a"].chunks == ((2, 2), (2, 2)) # fails
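For reference, the assertion above compares against dask-style chunks (one tuple of block sizes per dimension), whereas Zarr reports a single chunk shape such as (2, 2). The following is a minimal sketch of that conversion; `normalized_chunks` is a hypothetical helper written for illustration, not part of xarray or zarr.

```python
def normalized_chunks(shape, chunk_shape):
    """Expand a per-dimension chunk shape into dask-style block sizes.

    Hypothetical helper for illustration only; assumes regular chunking
    where only the last block along a dimension may be smaller.
    """
    result = []
    for size, chunk in zip(shape, chunk_shape):
        full, remainder = divmod(size, chunk)
        # `full` complete blocks, plus one trailing partial block if needed
        blocks = (chunk,) * full + ((remainder,) if remainder else ())
        result.append(blocks)
    return tuple(result)


# The 4x4 array from the example, stored with Zarr chunks of (2, 2)
expected = normalized_chunks((4, 4), (2, 2))
print(expected)
```

With the objects from the script above, `normalized_chunks(a.shape, a.chunks)` would give the value that `x["a"].chunks` is expected to match.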
Steps to reproduce
No response
MVCE confirmation
Relevant log output
Anything else we need to know?
No response
Environment
Details