3 changes: 3 additions & 0 deletions .gitignore
@@ -84,6 +84,9 @@ doc/team-panel.txt
doc/external-examples-gallery.txt
doc/notebooks-examples-gallery.txt
doc/videos-gallery.txt
doc/*.zarr
doc/*.nc
doc/*.h5

# Until we support this properly, excluding from gitignore. (adding it to
# gitignore to make it _easier_ to work with `uv`, not as an indication that I
5 changes: 0 additions & 5 deletions doc/conf.py
@@ -178,11 +178,6 @@
# mermaid config
mermaid_version = "11.6.0"

# sphinx-llm config
# Some jupyter-execute cells are not thread-safe, so we need to build sequentially.
# See https://github.com/pydata/xarray/pull/11003#issuecomment-3641648868
llms_txt_build_parallel = False

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates", sphinx_autosummary_accessors.templates_path]

25 changes: 20 additions & 5 deletions doc/getting-started-guide/quick-overview.rst
@@ -213,17 +213,32 @@ You can directly read and write xarray objects to disk using :py:meth:`~xarray.D

.. jupyter-execute::

ds.to_netcdf("example.nc")
reopened = xr.open_dataset("example.nc")
reopened
filename = "example.nc"

.. jupyter-execute::
:hide-code:

import os
# Ensure the file is located in a unique temporary directory
# so that it doesn't conflict with parallel builds of the
# documentation.

import tempfile
import os.path

tempdir = tempfile.TemporaryDirectory()
filename = os.path.join(tempdir.name, filename)

.. jupyter-execute::

ds.to_netcdf(filename)
reopened = xr.open_dataset(filename)
reopened

.. jupyter-execute::
:hide-code:

reopened.close()
os.remove("example.nc")
tempdir.cleanup()


It is common for datasets to be distributed across multiple files (often one file per timestep). Xarray supports this use case by providing the :py:func:`~xarray.open_mfdataset` and :py:func:`~xarray.save_mfdataset` functions. For more, see :ref:`io`.
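A minimal sketch of this multi-file workflow (shown as a plain code block rather than an executed cell; the ``part-*.nc`` filenames and the small ``ds_multi`` dataset are made up for illustration, and :py:func:`~xarray.open_mfdataset` requires dask to be installed):

.. code-block:: python

    import numpy as np
    import xarray as xr

    # A small dataset with a "time" dimension, split into one file per timestep.
    ds_multi = xr.Dataset(
        {"t2m": (("time", "x"), np.random.randn(2, 3))},
        coords={"time": [0, 1], "x": [10, 20, 30]},
    )
    pieces = [ds_multi.isel(time=[0]), ds_multi.isel(time=[1])]
    paths = ["part-0.nc", "part-1.nc"]

    # save_mfdataset pairs datasets with paths one-to-one.
    xr.save_mfdataset(pieces, paths)

    # Reopen the pieces as a single dataset; the files are combined using
    # their coordinate values (combine="by_coords" by default).
    combined = xr.open_mfdataset("part-*.nc")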
83 changes: 59 additions & 24 deletions doc/internals/time-coding.rst
@@ -459,59 +459,94 @@ Default Time Unit

The current default time unit of xarray is ``'ns'``. When setting the keyword argument ``time_unit`` to ``'s'`` (the lowest resolution pandas allows), datetimes will be converted to at least ``'s'`` resolution, if possible. The same holds true for ``'ms'`` and ``'us'``.

.. jupyter-execute::

datetimes1_filename = "test-datetimes1.nc"

.. jupyter-execute::
:hide-code:

# Ensure the file is located in a unique temporary directory
# so that it doesn't conflict with parallel builds of the
# documentation.

import tempfile
import os.path

tempdir = tempfile.TemporaryDirectory()
datetimes1_filename = os.path.join(tempdir.name, datetimes1_filename)

.. jupyter-execute::

attrs = {"units": "hours since 2000-01-01"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-datetimes1.nc")
ds.to_netcdf(datetimes1_filename)

.. jupyter-execute::

xr.open_dataset("test-datetimes1.nc")
xr.open_dataset(datetimes1_filename)

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-datetimes1.nc", decode_times=coder)
xr.open_dataset(datetimes1_filename, decode_times=coder)

If a coarser unit is requested, the datetimes are decoded into their native
on-disk resolution, if possible.

.. jupyter-execute::

datetimes2_filename = "test-datetimes2.nc"

.. jupyter-execute::
:hide-code:

datetimes2_filename = os.path.join(tempdir.name, datetimes2_filename)

.. jupyter-execute::

attrs = {"units": "milliseconds since 2000-01-01"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-datetimes2.nc")
ds.to_netcdf(datetimes2_filename)

.. jupyter-execute::

xr.open_dataset("test-datetimes2.nc")
xr.open_dataset(datetimes2_filename)

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-datetimes2.nc", decode_times=coder)
xr.open_dataset(datetimes2_filename, decode_times=coder)

Similar logic applies for decoding timedelta values. The default resolution is
``"ns"``:

.. jupyter-execute::

timedeltas1_filename = "test-timedeltas1.nc"

.. jupyter-execute::
:hide-code:

timedeltas1_filename = os.path.join(tempdir.name, timedeltas1_filename)

.. jupyter-execute::

attrs = {"units": "hours"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-timedeltas1.nc")
ds.to_netcdf(timedeltas1_filename)

.. jupyter-execute::
:stderr:

xr.open_dataset("test-timedeltas1.nc")
xr.open_dataset(timedeltas1_filename)

By default, timedeltas will be decoded to the same resolution as datetimes:

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-timedeltas1.nc", decode_times=coder, decode_timedelta=True)
xr.open_dataset(timedeltas1_filename, decode_times=coder, decode_timedelta=True)

but if one would like to decode timedeltas to a different resolution, one can
provide a coder specifically for timedeltas to ``decode_timedelta``:
@@ -520,32 +555,41 @@ provide a coder specifically for timedeltas to ``decode_timedelta``:

timedelta_coder = xr.coders.CFTimedeltaCoder(time_unit="ms")
xr.open_dataset(
"test-timedeltas1.nc", decode_times=coder, decode_timedelta=timedelta_coder
timedeltas1_filename, decode_times=coder, decode_timedelta=timedelta_coder
)

As with datetimes, if a coarser unit is requested, the timedeltas are decoded
into their native on-disk resolution, if possible:

.. jupyter-execute::

timedeltas2_filename = "test-timedeltas2.nc"

.. jupyter-execute::
:hide-code:

timedeltas2_filename = os.path.join(tempdir.name, timedeltas2_filename)

.. jupyter-execute::

attrs = {"units": "milliseconds"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-timedeltas2.nc")
ds.to_netcdf(timedeltas2_filename)

.. jupyter-execute::

xr.open_dataset("test-timedeltas2.nc", decode_timedelta=True)
xr.open_dataset(timedeltas2_filename, decode_timedelta=True)

.. jupyter-execute::

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-timedeltas2.nc", decode_times=coder, decode_timedelta=True)
xr.open_dataset(timedeltas2_filename, decode_times=coder, decode_timedelta=True)

To opt out of timedelta decoding (see issue `Undesired decoding to timedelta64 <https://github.com/pydata/xarray/issues/1621>`_), pass ``False`` to ``decode_timedelta``:

.. jupyter-execute::

xr.open_dataset("test-timedeltas2.nc", decode_timedelta=False)
xr.open_dataset(timedeltas2_filename, decode_timedelta=False)

.. note::
Note that in the future the default value of ``decode_timedelta`` will be
@@ -557,13 +601,4 @@ To opt-out of timedelta decoding (see issue `Undesired decoding to timedelta64 <
:hide-code:

# Cleanup
import os

for f in [
"test-datetimes1.nc",
"test-datetimes2.nc",
"test-timedeltas1.nc",
"test-timedeltas2.nc",
]:
if os.path.exists(f):
os.remove(f)
tempdir.cleanup()
53 changes: 35 additions & 18 deletions doc/internals/zarr-encoding-spec.rst
@@ -90,6 +90,18 @@ with zarr-python.

**Example 1: Zarr V2 Format**

.. jupyter-execute::

zarr_v2_filename = "example_v2.zarr"

.. jupyter-execute::
:hide-code:

import tempfile
import os.path
tempdir = tempfile.TemporaryDirectory()
zarr_v2_filename = os.path.join(tempdir.name, zarr_v2_filename)

.. jupyter-execute::

import os
@@ -98,30 +110,33 @@ with zarr-python.

# Load tutorial dataset and write as Zarr V2
ds = xr.tutorial.load_dataset("rasm")
ds.to_zarr("rasm_v2.zarr", mode="w", consolidated=False, zarr_format=2)
ds.to_zarr(zarr_v2_filename, mode="w", consolidated=False, zarr_format=2)

# Open with zarr-python and examine attributes
zgroup = zarr.open("rasm_v2.zarr")
zgroup = zarr.open(zarr_v2_filename)
print("Zarr V2 - Tair attributes:")
tair_attrs = dict(zgroup["Tair"].attrs)
for key, value in tair_attrs.items():
print(f" '{key}': {repr(value)}")

**Example 2: Zarr V3 Format**

.. jupyter-execute::
:hide-code:

import shutil
shutil.rmtree("rasm_v2.zarr")
zarr_v3_filename = "example_v3.zarr"

**Example 2: Zarr V3 Format**
.. jupyter-execute::
:hide-code:

zarr_v3_filename = os.path.join(tempdir.name, zarr_v3_filename)

.. jupyter-execute::

# Write the same dataset as Zarr V3
ds.to_zarr("rasm_v3.zarr", mode="w", consolidated=False, zarr_format=3)
ds.to_zarr(zarr_v3_filename, mode="w", consolidated=False, zarr_format=3)

# Open with zarr-python and examine attributes
zgroup = zarr.open("rasm_v3.zarr")
zgroup = zarr.open(zarr_v3_filename)
print("Zarr V3 - Tair attributes:")
tair_attrs = dict(zgroup["Tair"].attrs)
for key, value in tair_attrs.items():
@@ -131,12 +146,6 @@ with zarr-python.
tair_array = zgroup["Tair"]
print(f"\nZarr V3 - dimension_names in metadata: {tair_array.metadata.dimension_names}")

.. jupyter-execute::
:hide-code:

import shutil
shutil.rmtree("rasm_v3.zarr")


Chunk Key Encoding
------------------
@@ -148,6 +157,16 @@ dimension separator in chunk keys.

For example, to specify a custom separator for chunk keys:


.. jupyter-execute::

example_filename = "example.zarr"

.. jupyter-execute::
:hide-code:

example_filename = os.path.join(tempdir.name, example_filename)

.. jupyter-execute::

import xarray as xr
@@ -161,7 +180,7 @@ For example, to specify a custom separator for chunk keys:
arr = np.ones((42, 100))
ds = xr.DataArray(arr, name="var1").to_dataset()
ds.to_zarr(
"example.zarr",
example_filename,
zarr_format=2,
mode="w",
encoding={"var1": {"chunks": (42, 50), "chunk_key_encoding": enc}},
@@ -181,6 +200,4 @@ when working with tools that expect a particular chunk key format.
.. jupyter-execute::
:hide-code:

import shutil

shutil.rmtree("example.zarr")
tempdir.cleanup()