cuda/tests: add array manipulation tests for cudasim and grid-stride correctness#706
Conversation
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
Adds comprehensive array manipulation test coverage for CUDA backend. The implementation correctly uses grid-stride loops with cuda.grid(1) and cuda.gridsize(1), properly skips tests on cudasim that require real device memory semantics, and marks unsupported operations with @pytest.mark.xfail. Test kernels are well-structured and reuse module-level kernel definitions for consistency.
Confidence Score: 4/5
- This PR is safe to merge with minimal risk
- The new test file follows established patterns in the codebase for grid-stride loops and cudasim handling. All test implementations are straightforward and correct. Minor note: some tests use uninitialized arrays but only check specific indices they write to, which is acceptable for testing purposes.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| numba/cuda/tests/test_array_manipulation.py | 4/5 | New comprehensive test file for array manipulation - well-structured with correct grid-stride patterns and appropriate cudasim skips |
Greptile OverviewGreptile SummaryAdds two comprehensive test suites for CUDA array manipulation operations:
The tests follow existing codebase patterns, include proper license headers, and the previously reported duplicate tests between files have been removed. The split between Confidence Score: 5/5
Important Files Changed
|
Greptile's behavior is changing!From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section. This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR". |
gmarkall
left a comment
There was a problem hiding this comment.
Thanks for the PR! I see you have several PRs in-flight - I'll work through one at a time, and it may be worth checking if the feedback in the current PR applies to any of the other PRs too.
Some initial comments and questions on this PR:
- The test is in the wrong location - it is in
numba/cuda/tests/test_array_manipluation.py. Tests for Numba-CUDA are all innumba_cuda/numba/cuda/tests. We don't put tests directly in this directory though, but instead a subdirectory depending on what the test is testing. Most of the tests in this PR would be appropriate for thecudapysubdirectory, but the tests that test parts of theDeviceArrayAPI would make more sense undercudadrvas its implementation is part of thecudadrvmodule. - The tests marked as skipping on the CUDA simulator don't actually fail when run under the CUDA simulator. Generally if a test can pass on the CUDA simulator, it should be allowed to run under it, in order to ensure parity between the simulator and real hardware as much as possible.
- The PR description seems to be at odds with what the PR does. To my reading, the PR is adding some new tests; the description claims it is fixing tests, and then the descriptions that follow from that introduction then seem not to have any connection with addition of code. Can you clarify this discrepancy please?
- The comments in the test code suggest they are adapted from "the CPU array manipulation suite". I see in the CPU target for Numba, there is a class called
TestArrayManipulation, but I don't see much correlation between it and the tests added here. What part of the CPU test suite is this derived from?
|
Hey @gmarkall, thanks a ton for the review and for going through this one step by step – really helps a lot, especially since I've got a few PRs open right now. I pushed the updates you asked for:
About the CPU connection – yeah, it's mostly taking the shape/structure/intent from Locally I'm on Windows without a GPU, so the editable install is fighting me a bit with the C extensions and I couldn't run the full sim suite yet. But the code sticks pretty close to how the existing tests are written (like test_device_array.py and test_datetime.py), so I'm pretty confident the non-skipped stuff will be fine in CI. If this looks alright to you now, would love to get your 👍 so we can land it and move on to the ufunc/reduction tests for #515. Thanks again man, Please correct me otherwise! |
If you're having issues getting the install to work, feel free to open an issue with the details of what you're encountering... Perhaps there is some issue with editable installs on Windows that we never noticed because most development is done on Linux. |
gmarkall
left a comment
There was a problem hiding this comment.
Thanks for the updates. I think there are still some items to address here, which I have marked on the diff. Some issues are functional, and others are about clarifying the intent of the tests - it's important that the commentary surrounding test code is clear and accurate, so as to prevent confusion when reading test cases (or debugging failures).
| """Tests for DeviceArray memory behavior and shape semantics.""" | ||
|
|
||
| @skip_on_cudasim("Device memory lifetime tracking requires real device memory") | ||
| def test_memory_lifetime(self): |
There was a problem hiding this comment.
I think if I were aiming to verify the lifetime of the memory, I'd probably check the pointer is unchanged after each kernel as well. I can imagine this would still pass if the second kernel operated on some different memory to the first.
You can get the pointer value with arr.gpu_data.device_ctypes_pointer.value.
There was a problem hiding this comment.
Good point — I've added the pointer check as a separate test, test_memory_pointer_stability, which captures arr.gpu_data.device_ctypes_pointer.value before the first kernel launch and asserts it's unchanged after each subsequent launch. I kept test_memory_lifetime focused on value correctness and put the allocation-identity check in its own test so the failure modes are distinct — a value failure vs. pointer drift point to different bugs. test_memory_pointer_stability is decorated @skip_on_cudasim since the simulator doesn't model real device pointers, as you noted earlier.
| class TestDeviceArrayManipulation(CUDATestCase): | ||
| """Tests for DeviceArray memory behavior and shape semantics.""" | ||
|
|
||
| @skip_on_cudasim("Device memory lifetime tracking requires real device memory") |
There was a problem hiding this comment.
For info: the test as-written does not need to be skipped in the simulator. However, if the suggestion to check the pointer values is followed, then this skip would need to remain because I don't think the simulator models pointers to the data.
| result = dev.copy_to_host() | ||
| self.assertTrue(np.array_equal(host, result)) | ||
|
|
||
| @skip_on_cudasim("Device-to-device copy requires real device memory") |
There was a problem hiding this comment.
This skip does not seem necessary - the test should pass on the simulator.
| def test_advanced_slicing(self): | ||
| """Advanced slicing (e.g., ::2) is not supported on device arrays.""" | ||
| arr = cuda.to_device(np.arange(10)) | ||
| _ = arr[::2] |
There was a problem hiding this comment.
This is not testing functionality in a kernel as the docstring for the test module suggests.
|
Automatic reviews are disabled for this repository. |
|
I see |
5fdb208 to
4bc6bc8
Compare
|
@gmarkall — pushed a squashed update addressing all remaining review items:
One open question: |
3ea027b to
e25dee4
Compare
Adds two new test suites addressing gaps in Issue NVIDIA#515: - cudadrv/test_device_array_manipulation.py: DeviceArray API tests covering shape, dtype, memory lifetime, pointer stability, copies, reshape, ravel, view, and slicing behavior - cudapy/test_array_manipulation.py: kernel-level tests covering fill, elementwise ops, integer indexing, 2D indexing, dtype casting, shape semantics, and grid-stride loops Uses np.testing throughout for informative failure messages. Skips limited to test_memory_pointer_stability (simulator doesn't model device pointers). Boolean/fancy indexing documented as xfail.
e25dee4 to
93e8ee9
Compare
cuda/tests: add array manipulation tests for cudadrv and cudapy
Summary
Adds two new test suites for CUDA array manipulation as part of #515.
cudadrv/test_device_array_manipulation.pyTests for the
DeviceArrayAPI (no kernels):to_device/copy_to_hostroundtripcopy_to_deviceinto a pre-allocated arraydevice_array_like@skip_on_cudasim)reshape,ravel,viewarr[::2]) and multidimensional slicingxfail: boolean indexing, fancy indexingcudapy/test_array_manipulation.pyTests for array operations inside CUDA kernels:
getitem/setiteminside a kernelint32 → float32,float32 → int32, mixed)Notes
test_memory_pointer_stability(simulator does not model real device pointers).