apps/nccl: fix a bug in allreduce kernels for graph mode #502
Merged
chhwang merged 3 commits into microsoft:main on Apr 24, 2025
Copilot reviewed 1 out of 2 changed files in this pull request and generated no comments.
Files not reviewed (1)
- apps/nccl/src/nccl.cu: Language not supported
Comments suppressed due to low confidence (2)
apps/nccl/src/allreduce.hpp:451
- Casting the deviceFlag value from uint64_t to uint32_t may cause data truncation if the flag value exceeds 32 bits. Verify that this conversion is safe for all expected flag values.
uint32_t flag = (uint32_t) commFlag;
apps/nccl/src/allreduce.hpp:545
- Casting the deviceFlag value from uint64_t to uint32_t here may lead to truncation issues. Ensure this conversion will not lose significant bits in scenarios where the flag value grows beyond 32 bits.
uint32_t flag = (uint32_t) commFlag;
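For reference, the narrowing cast flagged in both suppressed comments keeps only the low 32 bits of the 64-bit value. The sketch below illustrates that behaviour in isolation; `truncateFlag` is a hypothetical helper introduced here, not a function from the repository. Whether the aliasing matters for the kernels depends on how the flags are compared, which this page does not show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the repository) that mirrors the cast
// `uint32_t flag = (uint32_t) commFlag;` quoted in the comments above.
// Converting uint64_t to uint32_t is well defined in C++: the result is
// the input modulo 2^32, i.e. the low 32 bits.
constexpr uint32_t truncateFlag(uint64_t commFlag) {
  return static_cast<uint32_t>(commFlag);
}

// Values below 2^32 survive unchanged; values at or above 2^32 alias
// with smaller ones.
static_assert(truncateFlag(5) == 5u, "small values are preserved");
static_assert(truncateFlag(0x100000000ULL) == 0u, "2^32 wraps to 0");
```

In this sketch a flag that is incremented once per collective call would need more than four billion increments before two distinct 64-bit values map to the same 32-bit flag.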
Binyang2014 reviewed Apr 16, 2025
Binyang2014 reviewed Apr 22, 2025
chhwang reviewed Apr 22, 2025
The `allreduce7` and `allreduceAllpairs` kernels were updating the LL protocol flag on the host side, so the update was not properly captured in graph mode. This PR fixes the issue by updating the flag inside the kernels.
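The failure mode can be illustrated with a host-only C++ model. This is a sketch under stated assumptions, not the code from this PR: `FakeGraph`, `replayHostFlag`, and `replayDeviceFlag` are hypothetical names, and the "graph" is a list of recorded closures standing in for captured kernel launches. It relies on one fact about graph capture: kernel arguments are frozen when the launch is recorded, and host code that runs between launches executes once at capture time, not on each replay.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Host-only stand-in for a captured graph (hypothetical, for illustration):
// recorded launches are replayed as-is, with arguments fixed at record time.
struct FakeGraph {
  std::vector<std::function<void()>> launches;
  void replay() const {
    for (const auto& launch : launches) launch();
  }
};

// Buggy pattern: the flag lives on the host and is passed by value.
// The host-side increment runs once, during capture, and is never replayed,
// so every replay sees the same flag value.
std::vector<uint64_t> replayHostFlag(int replays) {
  std::vector<uint64_t> seen;
  uint64_t hostFlag = 1;
  FakeGraph graph;
  graph.launches.push_back([flag = hostFlag, &seen] { seen.push_back(flag); });
  ++hostFlag;  // executed at capture time only
  for (int i = 0; i < replays; ++i) graph.replay();
  return seen;
}

// Fixed pattern: the flag lives in (simulated) device memory and the
// "kernel" itself reads and advances it, so the update is part of the
// recorded work and happens on every replay.
std::vector<uint64_t> replayDeviceFlag(int replays) {
  std::vector<uint64_t> seen;
  auto deviceFlag = std::make_shared<uint64_t>(1);
  FakeGraph graph;
  graph.launches.push_back([deviceFlag, &seen] {
    seen.push_back(*deviceFlag);
    ++*deviceFlag;
  });
  for (int i = 0; i < replays; ++i) graph.replay();
  return seen;
}
```

With three replays, `replayHostFlag(3)` yields the same flag value each time, while `replayDeviceFlag(3)` yields a fresh value per replay. A stale flag matters for a flag-based protocol because receivers use the flag to tell new data from old.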