Skip to content

cuda/tests: add array manipulation tests for cudasim and grid-stride correctness#706

Open
CodersAcademy006 wants to merge 1 commit into
NVIDIA:mainfrom
CodersAcademy006:test/cuda-array-manipulation-fixes
Open

cuda/tests: add array manipulation tests for cudasim and grid-stride correctness#706
CodersAcademy006 wants to merge 1 commit into
NVIDIA:mainfrom
CodersAcademy006:test/cuda-array-manipulation-fixes

Conversation

@CodersAcademy006
Copy link
Copy Markdown

@CodersAcademy006 CodersAcademy006 commented Jan 8, 2026

cuda/tests: add array manipulation tests for cudadrv and cudapy

Summary

Adds two new test suites for CUDA array manipulation as part of #515.

cudadrv/test_device_array_manipulation.py

Tests for the DeviceArray API (no kernels):

  • Shape, dtype, size, ndim attributes
  • to_device / copy_to_host roundtrip
  • copy_to_device into a pre-allocated array
  • Device-to-device copy
  • device_array_like
  • Multidimensional shape consistency
  • Memory lifetime across multiple kernel launches
  • Device pointer stability across kernel launches (@skip_on_cudasim)
  • reshape, ravel, view
  • Step slicing (arr[::2]) and multidimensional slicing
  • xfail: boolean indexing, fancy indexing

cudapy/test_array_manipulation.py

Tests for array operations inside CUDA kernels:

  • Basic fill via kernel
  • Elementwise addition
  • Integer getitem/setitem inside a kernel
  • 2D multidimensional indexing
  • Dtype casting (int32 → float32, float32 → int32, mixed)
  • Shape attribute access inside a kernel
  • Grid-stride loop correctness (intentionally over-provisioned grid)

Notes

  • All tests pass on the CUDA simulator except test_memory_pointer_stability
    (simulator does not model real device pointers).
  • No production code changes — test coverage only.
  • Contributes to Incomplete Test Coverage in Numba-CUDA #515.

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Jan 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Adds comprehensive array manipulation test coverage for CUDA backend. The implementation correctly uses grid-stride loops with cuda.grid(1) and cuda.gridsize(1), properly skips tests on cudasim that require real device memory semantics, and marks unsupported operations with @pytest.mark.xfail. Test kernels are well-structured and reuse module-level kernel definitions for consistency.

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk
  • The new test file follows established patterns in the codebase for grid-stride loops and cudasim handling. All test implementations are straightforward and correct. Minor note: some tests use uninitialized arrays but only check specific indices they write to, which is acceptable for testing purposes.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
numba/cuda/tests/test_array_manipulation.py 4/5 New comprehensive test file for array manipulation - well-structured with correct grid-stride patterns and appropriate cudasim skips

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jan 18, 2026

Greptile Overview

Greptile Summary

Adds two comprehensive test suites for CUDA array manipulation operations:

  • cudadrv/test_device_array_manipulation.py: Tests DeviceArray API including shape/dtype attributes, memory transfers (to_device, copy_to_host), device-to-device copies, and device_array_like. Properly uses @skip_on_cudasim for tests requiring real device memory semantics.

  • cudapy/test_array_manipulation.py: Tests kernel-level operations including element-wise operations, multidimensional indexing, dtype casting, shape semantics in kernels, and grid-stride loop patterns. Correctly uses cuda.grid(1) and cuda.gridsize(1) for true grid-stride semantics. Includes xfail tests documenting unsupported operations (slicing, boolean indexing, fancy indexing).

The tests follow existing codebase patterns, include proper license headers, and the previously reported duplicate tests between files have been removed. The split between cudadrv (DeviceArray API) and cudapy (kernel operations) is clear and logical.

Confidence Score: 5/5

  • This PR is safe to merge with no issues
  • The PR adds well-structured test coverage with no changes to production code. Tests follow established patterns in the codebase, use appropriate decorators (@skip_on_cudasim, @pytest.mark.xfail), and the previously reported issues (missing license headers and duplicate tests) have been resolved.
  • No files require special attention

Important Files Changed

Filename Overview
numba_cuda/numba/cuda/tests/cudadrv/test_device_array_manipulation.py Adds comprehensive DeviceArray API tests covering shape, dtype, memory operations, and device-to-device copies with appropriate cudasim skips
numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py Adds kernel-level array operation tests including grid-stride loops, multidimensional indexing, dtype transitions, and xfail tests for unsupported operations

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jan 18, 2026

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

Copy link
Copy Markdown
Contributor

@gmarkall gmarkall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I see you have several PRs in-flight - I'll work through one at a time, and it may be worth checking if the feedback in the current PR applies to any of the other PRs too.

Some initial comments and questions on this PR:

  • The test is in the wrong location - it is in numba/cuda/tests/test_array_manipluation.py. Tests for Numba-CUDA are all in numba_cuda/numba/cuda/tests. We don't put tests directly in this directory though, but instead a subdirectory depending on what the test is testing. Most of the tests in this PR would be appropriate for the cudapy subdirectory, but the tests that test parts of the DeviceArray API would make more sense under cudadrv as its implementation is part of the cudadrv module.
  • The tests marked as skipping on the CUDA simulator don't actually fail when run under the CUDA simulator. Generally if a test can pass on the CUDA simulator, it should be allowed to run under it, in order to ensure parity between the simulator and real hardware as much as possible.
  • The PR description seems to be at odds with what the PR does. To my reading, the PR is adding some new tests; the description claims it is fixing tests, and then the descriptions that follow from that introduction then seem not to have any connection with addition of code. Can you clarify this discrepancy please?
  • The comments in the test code suggest they are adapted from "the CPU array manipulation suite". I see in the CPU target for Numba, there is a class called TestArrayManipulation, but I don't see much correlation between it and the tests added here. What part of the CPU test suite is this derived from?

@gmarkall gmarkall added the 4 - Waiting on author Waiting for author to respond to review label Jan 26, 2026
Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

Comment thread numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py
Comment thread numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py Outdated
Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@CodersAcademy006
Copy link
Copy Markdown
Author

Hey @gmarkall, thanks a ton for the review and for going through this one step by step – really helps a lot, especially since I've got a few PRs open right now.

I pushed the updates you asked for:

  • Split the tests the way you suggested:
    cudadrv/test_device_array_manipulation.py for the pure DeviceArray stuff (shape, dtype, size, copies, lifetime)
    cudapy/test_array_manipulation.py for the kernel + indexing + elementwise ops side

  • Dropped almost all the @skip_on_cudasim lines. Kept just two where it actually makes sense (memory lifetime and the grid-stride one), everything else should run happily in the simulator now.

  • Fixed up the PR title and description so it actually says what it does (adding new tests, not patching old ones). Also threw in a couple inline comments about where the CPU test inspiration came from.

About the CPU connection – yeah, it's mostly taking the shape/structure/intent from numba/tests/test_array_manipulation.py (the TestArrayManipulation class), but rewritten to fit CUDA realities (kernels instead of Python loops, no views or reshapes, etc.). Not a 1:1 copy, but the spirit is the same for the parts that actually work on device arrays.

Locally I'm on Windows without a GPU, so the editable install is fighting me a bit with the C extensions and I couldn't run the full sim suite yet. But the code sticks pretty close to how the existing tests are written (like test_device_array.py and test_datetime.py), so I'm pretty confident the non-skipped stuff will be fine in CI.

If this looks alright to you now, would love to get your 👍 so we can land it and move on to the ufunc/reduction tests for #515.

Thanks again man, Please correct me otherwise!

@gmarkall
Copy link
Copy Markdown
Contributor

gmarkall commented Feb 4, 2026

Locally I'm on Windows without a GPU, so the editable install is fighting me a bit with the C extensions and I couldn't run the full sim suite yet.

If you're having issues getting the install to work, feel free to open an issue with the details of what you're encountering... Perhaps there is some issue with editable installs on Windows that we never noticed because most development is done on Linux.

Copy link
Copy Markdown
Contributor

@gmarkall gmarkall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. I think there are still some items to address here, which I have marked on the diff. Some issues are functional, and others are about clarifying the intent of the tests - it's important that the commentary surrounding test code is clear and accurate, so as to prevent confusion when reading test cases (or debugging failures).

Comment thread numba_cuda/numba/cuda/tests/cudadrv/test_device_array_manipulation.py Outdated
Comment thread numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py Outdated
"""Tests for DeviceArray memory behavior and shape semantics."""

@skip_on_cudasim("Device memory lifetime tracking requires real device memory")
def test_memory_lifetime(self):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if I were aiming to verify the lifetime of the memory, I'd probably check the pointer is unchanged after each kernel as well. I can imagine this would still pass if the second kernel operated on some different memory to the first.

You can get the pointer value with arr.gpu_data.device_ctypes_pointer.value.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — I've added the pointer check as a separate test, test_memory_pointer_stability, which captures arr.gpu_data.device_ctypes_pointer.value before the first kernel launch and asserts it's unchanged after each subsequent launch. I kept test_memory_lifetime focused on value correctness and put the allocation-identity check in its own test so the failure modes are distinct — a value failure vs. pointer drift point to different bugs. test_memory_pointer_stability is decorated @skip_on_cudasim since the simulator doesn't model real device pointers, as you noted earlier.

class TestDeviceArrayManipulation(CUDATestCase):
"""Tests for DeviceArray memory behavior and shape semantics."""

@skip_on_cudasim("Device memory lifetime tracking requires real device memory")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For info: the test as-written does not need to be skipped in the simulator. However, if the suggestion to check the pointer values is followed, then this skip would need to remain because I don't think the simulator models pointers to the data.

result = dev.copy_to_host()
self.assertTrue(np.array_equal(host, result))

@skip_on_cudasim("Device-to-device copy requires real device memory")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This skip does not seem necessary - the test should pass on the simulator.

def test_advanced_slicing(self):
"""Advanced slicing (e.g., ::2) is not supported on device arrays."""
arr = cuda.to_device(np.arange(10))
_ = arr[::2]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not testing functionality in a kernel as the docstring for the test module suggests.

Comment thread numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py Outdated
Comment thread numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py Outdated
Comment thread numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py Outdated
Comment thread numba_cuda/numba/cuda/tests/cudapy/test_array_manipulation.py Outdated
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 2, 2026

Automatic reviews are disabled for this repository.

@CodersAcademy006
Copy link
Copy Markdown
Author

I see test_reshape, test_view, and test_ravel may pass on the simulator — should I remove xfail, or keep it until verified on hardware?

@CodersAcademy006 CodersAcademy006 force-pushed the test/cuda-array-manipulation-fixes branch 3 times, most recently from 5fdb208 to 4bc6bc8 Compare March 2, 2026 09:30
@CodersAcademy006
Copy link
Copy Markdown
Author

@gmarkall — pushed a squashed update addressing all remaining review items:

test_boolean_indexing / test_fancy_indexing: restored @pytest.mark.xfail and removed incorrect positive assertions — these operations are unsupported so the tests document the failure, not assert success
Replaced all assertTrue(np.array_equal/allclose) with np.testing.assert_array_equal / np.testing.assert_allclose throughout both files
test_reshape and test_ravel now seed with real data and assert content, not just shape
test_device_to_device_copy: removed unnecessary @skip_on_cudasim
test_grid_stride_correctness: removed @skip_on_cudasim, fixed docstring
Slicing and indexing tests moved from cudapy to cudadrv where they belong
Replied inline on test_memory_lifetime explaining the split into two tests

One open question: test_reshape, test_ravel, test_view pass on the simulator — should these remain as positive tests, or revert to xfail until verified on real hardware?

@CodersAcademy006 CodersAcademy006 changed the title cuda/tests: fix array manipulation tests for cudasim and grid-stride correctness cuda/tests: add array manipulation tests for cudasim and grid-stride correctness Mar 2, 2026
@CodersAcademy006 CodersAcademy006 force-pushed the test/cuda-array-manipulation-fixes branch 2 times, most recently from 3ea027b to e25dee4 Compare March 2, 2026 10:12
Adds two new test suites addressing gaps in Issue NVIDIA#515:

- cudadrv/test_device_array_manipulation.py: DeviceArray API tests
  covering shape, dtype, memory lifetime, pointer stability, copies,
  reshape, ravel, view, and slicing behavior
- cudapy/test_array_manipulation.py: kernel-level tests covering
  fill, elementwise ops, integer indexing, 2D indexing, dtype casting,
  shape semantics, and grid-stride loops

Uses np.testing throughout for informative failure messages. Skips
limited to test_memory_pointer_stability (simulator doesn't model
device pointers). Boolean/fancy indexing documented as xfail.
@CodersAcademy006 CodersAcademy006 force-pushed the test/cuda-array-manipulation-fixes branch from e25dee4 to 93e8ee9 Compare March 2, 2026 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4 - Waiting on author Waiting for author to respond to review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants