Skip to content

Changed cudaExecutor to be const&#1104

Merged
cliffburdick merged 3 commits intomainfrom
exec_ref
Dec 19, 2025
Merged

Changed cudaExecutor to be const&#1104
cliffburdick merged 3 commits intomainfrom
exec_ref

Conversation

@cliffburdick
Copy link
Collaborator

@cliffburdick cliffburdick commented Dec 18, 2025

Fixes #1103

@copy-pr-bot
Copy link

copy-pr-bot bot commented Dec 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Dec 18, 2025

Greptile Summary

This PR fixes a critical SIGSEGV bug in allclose and other transform functions by changing cudaExecutor parameters from pass-by-value to pass-by-const-reference across 43 function signatures.

Key Changes:

  • include/matx/transforms/reduce.h: Updated 12 reduction functions (allclose, mean_impl, sum_impl, prod_impl, max_impl, min_impl, argmax_impl, argmin_impl, argminmax_impl, any_impl, all_impl, median_impl)
  • include/matx/transforms/cub.h: Updated 8 CUB wrapper functions (sort_impl, sort_impl_inner, sort_pairs_impl_inner, argsort_impl, cumsum_impl, find_impl, find_idx_impl, unique_impl)
  • include/matx/transforms/chol/chol_cuda.h: Replaced cudaExecutor member with cudaStream_t in DnCholCUDAParams_t struct to maintain copyability for use as std::unordered_map key
  • include/matx/transforms/transpose.h: Updated transpose_matrix_impl
  • docs_input/api/logic/truth/allclose.rst: Updated documentation

Root Cause: When cudaExecutor objects with profiling enabled are passed by value, the copy's destructor destroys CUDA events prematurely. When the original object's destructor runs, it attempts to destroy already-destroyed events, causing SIGSEGV.

Solution: Pass by const reference prevents copy construction, ensuring destructors only run once when the original object goes out of scope.

Confidence Score: 5/5

  • This PR is safe to merge - it fixes a critical bug with a clean, systematic solution
  • The changes are straightforward parameter type modifications that prevent double-free of CUDA events. The fix is well-tested (addresses a reproducible crash), follows C++ best practices (pass expensive objects by const reference), and maintains API compatibility. The chol_cuda.h refactoring correctly extracts the stream to maintain struct copyability.
  • No files require special attention

Important Files Changed

Filename Overview
include/matx/transforms/reduce.h Changed 12 function signatures to pass cudaExecutor by const reference, fixing the SIGSEGV issue in allclose and other reduction functions
include/matx/transforms/chol/chol_cuda.h Replaced cudaExecutor member with cudaStream_t in params struct to maintain copyability for use as map key
include/matx/transforms/cub.h Updated 8 function signatures to pass cudaExecutor by const reference for consistency
include/matx/transforms/transpose.h Changed transpose_matrix_impl to pass cudaExecutor by const reference
docs_input/api/logic/truth/allclose.rst Updated documentation to reflect new const reference parameter type for allclose

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant AllClose as allclose()
    participant Exec as cudaExecutor
    participant Events as CUDA Events

    Note over User,Events: Before Fix (Pass by Value)
    User->>+AllClose: allclose(dest, in1, in2, rtol, atol, exec)
    Note over AllClose: Copy constructor called
    AllClose->>Exec: Create copy of exec
    Note over AllClose: Function executes
    AllClose->>Exec: Destructor called on copy
    Exec->>Events: cudaEventDestroy(start_)
    Exec->>Events: cudaEventDestroy(stop_)
    AllClose-->>-User: Return
    User->>Exec: Destructor called on original
    Exec->>Events: cudaEventDestroy() on already destroyed events
    Note over Events: SIGSEGV! Events already destroyed

    Note over User,Events: After Fix (Pass by const&)
    User->>+AllClose: allclose(dest, in1, in2, rtol, atol, exec)
    Note over AllClose: Reference to original exec
    Note over AllClose: Function executes
    AllClose-->>-User: Return (no destructor called)
    User->>Exec: Destructor called on original (once)
    Exec->>Events: cudaEventDestroy(start_)
    Exec->>Events: cudaEventDestroy(stop_)
    Note over Events: Success! Events destroyed once
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional Comments (3)

  1. include/matx/transforms/chol/chol_cuda.h, line 131 (link)

    syntax: cannot assign to reference member after initialization

    C++ references cannot be reassigned. With the reference member on line 61, this line will not compile.

  2. include/matx/transforms/chol/chol_cuda.h, line 211 (link)

    syntax: update hash to use stream instead of executor reference

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  3. include/matx/transforms/chol/chol_cuda.h, line 226 (link)

    syntax: update equality to use stream instead of executor reference

4 files reviewed, 4 comments

Edit Code Review Agent Settings | Greptile

@cliffburdick
Copy link
Collaborator Author

/build

5 similar comments
@cliffburdick
Copy link
Collaborator Author

/build

@cliffburdick
Copy link
Collaborator Author

/build

@cliffburdick
Copy link
Collaborator Author

/build

@cliffburdick
Copy link
Collaborator Author

/build

@cliffburdick
Copy link
Collaborator Author

/build

@cliffburdick
Copy link
Collaborator Author

/build

1 similar comment
@cliffburdick
Copy link
Collaborator Author

/build

@cliffburdick cliffburdick merged commit 2d5fb4f into main Dec 19, 2025
2 checks passed
@cliffburdick cliffburdick deleted the exec_ref branch December 19, 2025 15:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG+FIX] SIGSEGV using allclose with user-owned exec with profiling activated

1 participant