Skip to content

Sync write_geotiff_gpu compression docstring against the actual codec set (#1644)#1649

Merged
brendancol merged 3 commits into
mainfrom
deep-sweep-api-consistency-geotiff-2026-05-11-1778545740
May 12, 2026
Merged

Sync write_geotiff_gpu compression docstring against the actual codec set (#1644)#1649
brendancol merged 3 commits into
mainfrom
deep-sweep-api-consistency-geotiff-2026-05-11-1778545740

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes #1644. The api-consistency sweep against xrspatial.geotiff on
2026-05-11 flagged that write_geotiff_gpu.__doc__ listed only four
codecs ('zstd', 'deflate', 'jpeg', 'none') under the
compression parameter, while the implementation actually accepts the
full codec set to_geotiff documents. gpu_compress_tiles dispatches
to nvCOMP for 'zstd' / 'deflate', to nvJPEG for 'jpeg', and falls
through to the CPU encoders for 'lzw', 'packbits', 'lz4',
'lerc', and 'jpeg2000' / 'j2k'. The resulting bytes match the
CPU writer byte-for-byte, but the docstring made the GPU writer look
much more restrictive than it actually is.

PR #1633 already addressed overview_resampling='cubic' and the
data type annotation but did not touch the codec list.

Regression test

xrspatial/geotiff/tests/test_compression_docstring_1644.py pins both
halves of the guarantee:

  • test_write_geotiff_gpu_docstring_lists_full_codec_set asserts the
    canonical 10 codec names appear in the compression block (matches
    _VALID_COMPRESSIONS plus the 'j2k' alias).
  • test_write_geotiff_gpu_accepts_cpu_fallback_codecs round-trips
    each CPU-fallback codec on a GPU host. JPEG 2000 / J2K use a uint16
    source so glymur's dtype check does not gate the encode path.

The new tests pass alongside the existing test_signature_parity_1631.py
suite (13/13 pass on a CUDA host).

Deprecation impact

None. Doc-only change. No signature change. Codecs that previously
worked continue to work; codecs that were previously documented
continue to be documented.

Test plan

  • pytest xrspatial/geotiff/tests/test_compression_docstring_1644.py (7/7 pass)
  • pytest xrspatial/geotiff/tests/test_signature_parity_1631.py (6/6 pass, no regression)
  • Verified each codec listed in the docstring writes a non-empty file via write_geotiff_gpu

Found by /sweep-api-consistency on 2026-05-11.

… set (#1644)

The api-consistency sweep on 2026-05-11 flagged that
write_geotiff_gpu.__doc__ listed only four codecs ('zstd', 'deflate',
'jpeg', 'none') while the implementation accepts every codec to_geotiff
does. Codecs without an nvCOMP encoder fall through to the CPU encoders
(lzw, packbits, lz4, lerc, jpeg2000 / j2k) so the output matches the
CPU writer byte-for-byte.

Mirror to_geotiff's wording in the compression block and note which
codecs run on GPU vs. CPU fallback so users understand the performance
trade-off without surprise.

Regression test test_compression_docstring_1644.py pins the codec list
and round-trips each CPU-fallback codec end-to-end on a GPU host.

Doc-only change. No signature change, no deprecation impact.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 12, 2026
@brendancol brendancol requested a review from Copilot May 12, 2026 00:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates xrspatial.geotiff.write_geotiff_gpu documentation to reflect the actual set of supported compression codec names (bringing it in line with to_geotiff) and adds a regression test to prevent future docstring drift.

Changes:

  • Expand write_geotiff_gpu’s compression docstring to list the full codec set and describe GPU vs CPU paths.
  • Add regression tests that pin the docstring codec list and exercise non-nvCOMP codecs on a CUDA host.
  • Update the api-consistency sweep state CSV to record issue #1644 as addressed.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
xrspatial/geotiff/tests/test_compression_docstring_1644.py New regression tests for docstring codec list parity and end-to-end acceptance of additional codecs on GPU hosts.
xrspatial/geotiff/__init__.py Updates write_geotiff_gpu docstring for compression to document the expanded codec set and backend behavior.
.claude/sweep-api-consistency-state.csv Records the api-consistency sweep finding and fix reference for #1644.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread xrspatial/geotiff/__init__.py Outdated
Comment on lines +2726 to +2737
Codec name. Accepts the same set as ``to_geotiff``: ``'none'``,
``'deflate'``, ``'lzw'``, ``'jpeg'``, ``'packbits'``, ``'zstd'``,
``'lz4'``, ``'jpeg2000'`` (alias ``'j2k'``), or ``'lerc'``.

``'zstd'`` (default) and ``'deflate'`` compress on the GPU via
nvCOMP batch compression -- the fastest paths and the reason to
use this entry point. ``'jpeg'`` uses nvJPEG when available and
falls back to Pillow otherwise. ``'jpeg2000'`` / ``'j2k'`` and
``'lerc'`` route to the CPU encoders so the output matches the
CPU writer byte-for-byte, but lose the GPU compression speedup.
``'lzw'``, ``'packbits'``, and ``'lz4'`` likewise fall through
to the CPU encoder for parity with ``to_geotiff``.
Comment thread xrspatial/geotiff/__init__.py Outdated
Comment on lines +2733 to +2737
falls back to Pillow otherwise. ``'jpeg2000'`` / ``'j2k'`` and
``'lerc'`` route to the CPU encoders so the output matches the
CPU writer byte-for-byte, but lose the GPU compression speedup.
``'lzw'``, ``'packbits'``, and ``'lz4'`` likewise fall through
to the CPU encoder for parity with ``to_geotiff``.
Comment on lines +7 to +11
parameter, while the implementation actually accepts every codec
``to_geotiff`` does. Codecs unsupported by nvCOMP fall through to the
CPU encoders (``lzw``, ``packbits``, ``lz4``, ``lerc``, ``jpeg2000`` /
``j2k``) so the output matches the CPU writer byte-for-byte. This
module pins the full codec list against future drift and confirms the
Comment on lines +85 to +90
same codec set as ``to_geotiff``. ``jpeg`` is exercised separately
by ``test_features.py`` because the test data must be 8-bit
integer. ``jpeg2000`` / ``j2k`` go through ``glymur`` which only
accepts uint8/uint16 -- pick a uint16 source for those codecs so
the encode path is the one users actually hit, not a dtype-rejected
pre-check inside glymur.
Four wording fixes; no behavioural change:

- compression='jpeg': the docstring previously implied parity with
  to_geotiff, but to_geotiff rejects jpeg at runtime (it omits the
  JPEGTables tag and produces files that don't round-trip through GDAL).
  Spell out that write_geotiff_gpu DOES accept jpeg, that the on-disk
  bytes are self-contained JFIF tiles, and that GDAL/rasterio interop is
  not guaranteed until the JPEGTables fix lands.
- compression='jpeg2000' / 'j2k': replace "route to the CPU encoders"
  with the actual conditional behaviour (nvJPEG2K GPU encode first, CPU
  glymur fallback when libnvjpeg2k is missing) and call out that the two
  paths are not byte-stable, so byte-for-byte parity with to_geotiff
  isn't a contract here.
- test module docstring: distinguish "not nvCOMP-accelerated" (lzw,
  packbits, lz4, lerc — truly CPU-only) from jpeg2000/j2k (GPU first
  with CPU fallback) and from jpeg (write_geotiff_gpu only, separate
  test module).
- "exercised by test_features.py" referenced a file that doesn't cover
  JPEG. Repoint the test docstring to
  test_gpu_writer_compression_modes_2026_05_11.py which actually pins
  JPEG round-trips for the GPU writer.

All 7 tests in test_compression_docstring_1644.py still pass.
The earlier paragraphs had a multi-line ``literal`` and unbalanced
parens nested inside it, which RST handles awkwardly and which may have
been contributing to slow/timing-out Sphinx builds on RTD. Convert the
per-codec block to a bullet list with no multi-line inline literals.
Same content, no behavioural change.

All 7 docstring tests still pass.
@brendancol brendancol merged commit 181056e into main May 12, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: write_geotiff_gpu compression docstring lists 4 codecs but accepts the full 9

2 participants