Fixes for 32-bit builds. Tested w/ gcc 11.4 and CTK 12.9 #1120

Merged
tbensonatl merged 1 commit into main from bugfix/alternate-32b-build-fixes on Jan 16, 2026

Conversation

@tbensonatl
Collaborator

Adjust the make_tensor constraints to not match allocators to a ShapeType. Use MATX_INDEX_T_FMT in place of lld and explicitly convert indices in filter.cuh to index_t.

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
@tbensonatl tbensonatl self-assigned this Jan 15, 2026
@copy-pr-bot

copy-pr-bot bot commented Jan 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps bot commented Jan 15, 2026

Greptile Summary

Fixed compilation issues for 32-bit builds by addressing type mismatches and format string incompatibilities.

  • Added is_tuple_c constraint to make_tensor and make_tensor_p template functions to prevent allocator types from incorrectly matching the ShapeType parameter
  • Changed loop iteration variables from uint32_t to index_t in filter.cuh and added explicit casts for CUDA built-in variables (blockIdx.y) to avoid implicit conversion warnings when index_t is 32-bit
  • Replaced hardcoded %lld format specifier with MATX_INDEX_T_FMT macro in printf statement to correctly handle both 32-bit (int32_t) and 64-bit (long long int) index types

All changes are well-targeted fixes for 32-bit compatibility without affecting 64-bit builds or runtime behavior.
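
The constraint described in the first bullet can be illustrated with a minimal C++ sketch. This is not the MatX code: `is_tuple_sketch` and `make_tensor_sketch` are hypothetical stand-ins for `is_tuple_c` and `make_tensor`, and the two overloads are simplified assumptions about the kind of signatures involved. The point is only that a trait-based constraint removes the shape overload from consideration when the second argument is an allocator.

```cpp
#include <memory>
#include <tuple>
#include <type_traits>

// Hypothetical trait (stand-in for is_tuple_c): true only for std::tuple types.
template <typename T>
struct is_tuple_sketch : std::false_type {};
template <typename... Ts>
struct is_tuple_sketch<std::tuple<Ts...>> : std::true_type {};

// Overload A (hypothetical): pointer plus a tuple shape.
// The enable_if removes this overload whenever ShapeType is not a tuple,
// so an allocator argument can no longer be deduced as a "shape".
template <typename T, typename ShapeType,
          std::enable_if_t<is_tuple_sketch<std::decay_t<ShapeType>>::value, int> = 0>
int make_tensor_sketch(T* /*data*/, ShapeType&& /*shape*/) {
  return 1;  // marker value: the shape overload was selected
}

// Overload B (hypothetical): pointer plus an allocator.
template <typename T, typename U>
int make_tensor_sketch(T* /*data*/, std::allocator<U> /*alloc*/) {
  return 2;  // marker value: the allocator overload was selected
}
```

With the constraint in place, a call that passes a `std::tuple` selects overload A, and a call that passes a `std::allocator` can only select overload B.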

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • All changes are minimal, well-scoped fixes for 32-bit build compatibility. The type constraint addition prevents incorrect template matching, explicit casts ensure proper type conversions, and the format string macro usage follows existing patterns. Changes have been tested with gcc 11.4 and CTK 12.9.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| include/matx/core/make_tensor.h | Added is_tuple_c constraint to prevent allocators from matching the ShapeType parameter in make_tensor and make_tensor_p overloads |
| include/matx/kernels/filter.cuh | Changed loop counters from uint32_t to index_t and added explicit casts for blockIdx.y and index calculations to index_t for 32-bit build compatibility |
| examples/black_scholes.cu | Replaced hardcoded format string %lld with MATX_INDEX_T_FMT macro to handle both 32-bit and 64-bit index types correctly in printf statement |
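
The filter.cuh change can be sketched in host-side C++ as follows. This is an illustration under stated assumptions, not the kernel from the PR: `sketch_index_t` assumes a 32-bit signed index type, `BlockIdxSketch` is a hypothetical stand-in for CUDA's `blockIdx` (whose members are unsigned), and `sum_row` is an invented helper. It shows the pattern of converting the unsigned built-in to the index type once, explicitly, and keeping the loop counter in the index type so no mixed signed/unsigned arithmetic occurs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: a 32-bit index build, so the index type is a signed 32-bit integer.
using sketch_index_t = std::int32_t;

// Hypothetical stand-in for CUDA's blockIdx, whose .y member is an unsigned int.
struct BlockIdxSketch {
  unsigned int y;
};

// Sums one row of a row-major buffer. The row is chosen by blockIdx.y,
// converted explicitly to the index type instead of relying on an
// implicit unsigned-to-signed conversion.
inline float sum_row(const std::vector<float>& data, sketch_index_t cols,
                     BlockIdxSketch blockIdx) {
  const sketch_index_t row = static_cast<sketch_index_t>(blockIdx.y);
  float acc = 0.0f;
  // Loop counter uses the index type rather than uint32_t.
  for (sketch_index_t c = 0; c < cols; ++c) {
    acc += data[static_cast<std::size_t>(row * cols + c)];
  }
  return acc;
}
```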

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant MT as make_tensor/make_tensor_p
    participant Filter as RecursiveFilter Kernel
    participant Print as printf

    Note over User,Print: 32-bit Build Compilation Flow

    User->>MT: Call make_tensor(data, shape)
    Note over MT: Before: ShapeType could match allocators
    Note over MT: After: is_tuple_c constraint ensures<br/>ShapeType is a tuple
    MT-->>User: Return tensor_t

    User->>Filter: Launch kernel with tensor
    Note over Filter: Loop counters changed from<br/>uint32_t to index_t
    Filter->>Filter: Access d_in/d_out with<br/>static_cast<index_t>(blockIdx.y)
    Note over Filter: Explicit casts prevent<br/>implicit conversion warnings
    Filter-->>User: Kernel complete

    User->>Print: printf with index value
    Note over Print: Format string uses<br/>MATX_INDEX_T_FMT macro
    Note over Print: 32-bit: "d", 64-bit: "lld"
    Print-->>User: Correct output
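
The format-string step in the diagram can be sketched in C++ as below. The names here are hypothetical: `SKETCH_INDEX_T_FMT` and `fmt_index_t` stand in for `MATX_INDEX_T_FMT` and `index_t`, and the `SKETCH_INDEX_32_BIT` switch is an assumption about how a build might select the index width. The sketch also assumes a platform where `std::int32_t` is `int`, so that `%d` matches it.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical build switch selecting the index width.
#ifdef SKETCH_INDEX_32_BIT
using fmt_index_t = std::int32_t;
#define SKETCH_INDEX_T_FMT "d"
#else
using fmt_index_t = long long int;
#define SKETCH_INDEX_T_FMT "lld"
#endif

// Formats an index value. The conversion specifier comes from the macro,
// so the same call stays correct for either index width.
inline std::string format_size(fmt_index_t n) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "input size: %" SKETCH_INDEX_T_FMT, n);
  return std::string(buf);
}
```

Adjacent string literals are concatenated by the compiler, so `"%" SKETCH_INDEX_T_FMT` becomes `"%d"` or `"%lld"` depending on the build.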

@cliffburdick
Collaborator

/build

@cliffburdick
Collaborator

cliffburdick commented Jan 15, 2026

Fixes #1117

@tbensonatl tbensonatl merged commit b00d894 into main Jan 16, 2026
2 checks passed
@tbensonatl tbensonatl deleted the bugfix/alternate-32b-build-fixes branch January 23, 2026 14:09