Description
What happened?
I noticed an error when upgrading from Xarray 2025.9.1 to 2026.2.0. In the older version, chunks='auto' handled these datasets without issue, but the newer version raises a NotImplementedError when it encounters object dtypes. This suggests that the size-estimation logic in recent Xarray/Dask releases now enforces a check that was previously bypassed.
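The exception itself appears to come from Dask's chunk normalization, as the traceback shows (dask/array/core.py, normalize_chunks calling auto_chunks). A minimal sketch that isolates that step without the remote dataset, assuming numpy and dask are installed; the helper name auto_chunks_error and the shape (1000,) are made up for illustration:

```python
import numpy as np
from dask.array.core import normalize_chunks


def auto_chunks_error(dtype, shape=(1000,)):
    """Return the NotImplementedError message raised for chunks='auto', or None.

    Hypothetical helper for illustration only; it is not part of xarray or dask.
    """
    try:
        # Same entry point that appears in the traceback.
        normalize_chunks("auto", shape=shape, dtype=np.dtype(dtype))
    except NotImplementedError as err:
        return str(err)
    return None


# A fixed-size dtype can be auto-chunked; an object dtype cannot,
# because the byte size of each element is unknown.
print("float64:", auto_chunks_error("float64"))
print("object: ", auto_chunks_error(object))
```

This only demonstrates the Dask-side check. It does not show what changed on the Xarray side between 2025.9.1 and 2026.2.0.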
What did you expect to happen?
To load successfully.
Minimal Complete Verifiable Example
xr.open_dataset('https://data.gdex.ucar.edu/d640000/kerchunk/anl_isentrop-remote-https.parq', engine='kerchunk', chunks='auto')
Steps to reproduce
The conda environment is:
name: arco_test
channels:
  - conda-forge
  - defaults
dependencies:
  - python>3.11
  - pip
  # Core Data & Geospatial
  - xarray == 2026.2.0
  - netcdf4
  - zarr
  - fastparquet
  - kerchunk
  - dask-jobqueue
  - intake-esm >=2025.12.12
  # Visualization & Utilities
  - matplotlib
  - jupyterlab
  - pip:
    - pelicanfs>=1.3.1
Activate the environment and run the code above.
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
xr.open_dataset('https://data.gdex.ucar.edu/d640000/kerchunk/anl_isentrop-remote-https.parq', engine='kerchunk', chunks='auto')
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/backends/api.py", line 613, in open_dataset
ds = _dataset_from_backend_dataset(
backend_ds,
...<11 lines>...
**kwargs,
)
File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/backends/api.py", line 308, in _dataset_from_backend_dataset
ds = _chunk_ds(
ds,
...<7 lines>...
**extra_tokens,
)
File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/backends/api.py", line 251, in _chunk_ds
var_chunks = _get_chunk(
var._data,
...<3 lines>...
dims=var.dims,
)
File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/namedarray/utils.py", line 239, in _get_chunk
chunk_shape = chunkmanager.normalize_chunks(
chunk_shape,
...<3 lines>...
previous_chunks=preferred_chunk_shape,
)
File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/xarray/namedarray/daskmanager.py", line 57, in normalize_chunks
return normalize_chunks(
chunks,
...<3 lines>...
previous_chunks=previous_chunks,
) # type: ignore[no-untyped-call]
File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/dask/array/core.py", line 3198, in normalize_chunks
chunks = auto_chunks(chunks, shape, limit, dtype, previous_chunks)
File "/<REDACTED_PATH>/conda-envs/<ENV_NAME>/lib/python3.14/site-packages/dask/array/core.py", line 3302, in auto_chunks
raise NotImplementedError(
...<2 lines>...
)
NotImplementedError: Can not use auto rechunking with object dtype. We are unable to estimate the size in bytes of object data
Anything else we need to know?
Would it be possible for reading reference files to fall back to chunks={} when chunks='auto' cannot be applied?
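Until something like that exists, a user-side workaround could look roughly like the sketch below. It is not an xarray API: open_with_chunk_fallback is a hypothetical wrapper that takes the opener as an argument, tries chunks='auto' first, and retries with chunks={} only if NotImplementedError is raised.

```python
def open_with_chunk_fallback(opener, *args, chunks="auto", fallback_chunks=None, **kwargs):
    """Call opener(*args, chunks=..., **kwargs), retrying with fallback chunks.

    Hypothetical helper, not part of xarray. `opener` is any callable that
    accepts a `chunks` keyword, for example xr.open_dataset.
    """
    if fallback_chunks is None:
        # chunks={} is the fallback proposed in this issue.
        fallback_chunks = {}
    try:
        return opener(*args, chunks=chunks, **kwargs)
    except NotImplementedError:
        # Raised when auto-chunking hits an object dtype (see the traceback above).
        return opener(*args, chunks=fallback_chunks, **kwargs)


# Intended usage (assumes xarray and kerchunk are installed):
# import xarray as xr
# url = 'https://data.gdex.ucar.edu/d640000/kerchunk/anl_isentrop-remote-https.parq'
# ds = open_with_chunk_fallback(xr.open_dataset, url, engine='kerchunk')
```

The wrapper catches only NotImplementedError so that other failures, such as network or engine errors, still surface unchanged.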
Environment
INSTALLED VERSIONS
commit: None
python: 3.14.3 | packaged by conda-forge | (main, Feb 9 2026, 21:56:02) [GCC 14.3.0]
python-bits: 64
OS: Linux
OS-release: 6.4.0-150600.23.81-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.6
libnetcdf: 4.10.0
xarray: 2026.2.0
pandas: 3.0.1
numpy: 2.4.2
scipy: 1.17.1
netCDF4: 1.7.4
pydap: 3.5.8
h5netcdf: None
h5py: None
zarr: 3.1.5
cftime: 1.6.5
nc_time_axis: None
iris: None
bottleneck: None
dask: 2026.1.2
distributed: 2026.1.2
matplotlib: 3.10.8
cartopy: None
seaborn: None
numbagg: None
fsspec: 2026.2.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 82.0.0
pip: 26.0.1
conda: None
pytest: None
mypy: None
IPython: 9.11.0
sphinx: None