
fix contiguity computation #5489

Merged
zasdfgbnm merged 6 commits into eval-segment from esfix on Nov 12, 2025

Conversation

@zasdfgbnm
Collaborator

No description provided.

@github-actions

github-actions bot commented Nov 6, 2025

Review updated until commit 5874b2b

Description

  • Fix contiguity computation for non-reduction dimensions

  • Improve allocation domain validation in tensor metadata

  • Correct argument manager update flag in kernel runtime

Changes walkthrough

Relevant files: Bug fix

fusion_cache_utils.cpp (csrc/runtime/fusion_cache_utils.cpp): Revise contiguity computation logic (+44/-11)
  • Rewrite contiguity inference to handle reduction and broadcast dims correctly
  • Use a single right-to-left pass with dynamic index tracking
  • Set contiguity based on stride relationships between consecutive non-skipped dimensions
  • Add bounds checks for sizes/strides consumption

fusion_kernel_runtime.cpp (csrc/runtime/fusion_kernel_runtime.cpp): Fix segment outputs update flag (+1/-1)
  • Change the updateWithSegmentOutputs flag from true to false

tensor_metadata.cpp (csrc/tensor_metadata.cpp): Validate allocation domain completeness (+11/-13)
  • Replace dummy values for missing IDs with an error check
  • Ensure all allocation domain IDs are present in active_ids
  • Simplify stride and size assignment logic
  • Remove unnecessary conditional branches
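The single right-to-left pass with dual-index tracking described above can be sketched as follows. This is a simplified standalone model, not the actual nvFuser code: the `Dim` struct and this `computeContiguity` helper are hypothetical stand-ins for `IterDomain` and the PR's inline logic.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for an allocation-domain IterDomain.
struct Dim {
  bool is_reduction = false;  // reduction dims have no sizes/strides entry
  bool is_broadcast = false;  // broadcast dims consume an entry, contiguity stays nullopt
};

// Sketch of the right-to-left dual-index contiguity inference:
// alloc_idx walks the allocation domain, sizes_idx walks sizes/strides.
std::vector<std::optional<bool>> computeContiguity(
    const std::vector<Dim>& alloc_domain,
    const std::vector<int64_t>& sizes,
    const std::vector<int64_t>& strides) {
  std::vector<std::optional<bool>> contiguity(alloc_domain.size(), std::nullopt);
  int64_t sizes_idx = static_cast<int64_t>(sizes.size()) - 1;
  // Stride the next non-skipped dimension must have to count as contiguous.
  int64_t expected_stride = 1;
  for (int64_t alloc_idx = static_cast<int64_t>(alloc_domain.size()) - 1;
       alloc_idx >= 0;
       alloc_idx--) {
    const Dim& id = alloc_domain[alloc_idx];
    if (id.is_reduction) {
      continue;  // no entry in sizes/strides; contiguity stays nullopt
    }
    assert(sizes_idx >= 0);  // bounds check on sizes/strides consumption
    if (id.is_broadcast) {
      sizes_idx--;  // consume the entry but leave contiguity as nullopt
      continue;
    }
    // A dim is contiguous if its stride equals the product of the extents of
    // the inner non-skipped dimensions consumed so far.
    contiguity[alloc_idx] = (strides[sizes_idx] == expected_stride);
    expected_stride = strides[sizes_idx] * sizes[sizes_idx];
    sizes_idx--;
  }
  return contiguity;
}
```

Note how a broadcast dimension consumes a sizes/strides entry without updating `expected_stride`, so an arbitrary broadcast stride does not break the contiguity of the surrounding dimensions, while a reduction dimension consumes nothing at all.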

PR Reviewer Guide

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Possible Issue

The contiguity computation may mishandle broadcast dimensions: it unconditionally decrements sizes_idx for a broadcast, but does not verify that the stride matches the expected contiguous pattern when resuming after skipped dimensions.

    if (id->isBroadcast()) {
      sizes_idx--;  // Move to next dimension in sizes/strides
      continue;
    }
    Correctness Concern

    The removal of fallback logic for IDs not found in active_ids may lead to errors in cases like grouped matmul where some IDs are expected to be missing; previously dummy values were used, but now a strict error is thrown.

    auto it = active_ids.find(id);
    NVF_ERROR(
        it != active_ids.end(),
        "Allocation domain of tensor ",
        tv->toString(),
        " is not complete. Missing ID: ",
        id->toString());
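The fail-fast behavior flagged above can be modeled in isolation as follows. This is a minimal sketch, not the nvFuser implementation: the string-keyed map and `lookupSizeStride` helper are hypothetical simplifications of the `active_ids` lookup, which in the real code maps IterDomains to inferred size/stride values.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical simplified model of the active_ids lookup: instead of
// substituting dummy sizes/strides when an allocation-domain ID is missing
// (the old fallback), the new code fails fast with a descriptive error.
std::pair<int64_t, int64_t> lookupSizeStride(
    const std::unordered_map<std::string, std::pair<int64_t, int64_t>>& active_ids,
    const std::string& id) {
  auto it = active_ids.find(id);
  if (it == active_ids.end()) {
    throw std::runtime_error(
        "Allocation domain is not complete. Missing ID: " + id);
  }
  return it->second;  // {size, stride} for this allocation-domain ID
}
```

Under the old fallback, a missing ID silently produced dummy values that could later surface as a wrong stride; the strict check moves the failure to the point where the inconsistency is introduced.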
    Behavior Change

    The update to updateWithSegmentOutputs changes the last argument from true to false, which may affect how outputs are managed or synchronized; the rationale for this change is not explained.

    group_to_run->outputs(), group_runtime_outputs, run_order_id, false);
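The effect of this flag, as depicted in the sequence diagram later in this review, can be sketched with a toy model. This is an illustrative assumption about the control flow, not the real ArgumentManager: the struct below only mimics the branch in which a false flag stores the segment outputs while skipping the per-TensorView contiguity reset.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical simplified model of ArgumentManager::updateWithSegmentOutputs.
struct ArgumentManager {
  std::unordered_map<std::string, int> tensor_map_;
  int reset_calls = 0;  // counts resetAllocationDomainAndContiguity invocations

  void updateWithSegmentOutputs(
      const std::vector<std::string>& outputs, bool update_contiguity) {
    for (const auto& name : outputs) {
      tensor_map_[name] = 1;  // store the segment output for later segments
      if (update_contiguity) {
        // In the real code this would reset the output TensorView's
        // allocation domain and contiguity from the produced tensor.
        reset_calls++;
      }
    }
  }
};
```

With the flag now false, segment outputs are stored as-is and their contiguity, computed by the fixed logic in fusion_cache_utils.cpp, is left untouched.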

    @zasdfgbnm
    Collaborator Author

    !test

    @zasdfgbnm
    Collaborator Author

    !test

    @zasdfgbnm
    Collaborator Author

    !test

    @zasdfgbnm zasdfgbnm marked this pull request as ready for review November 12, 2025 21:42
    @zasdfgbnm zasdfgbnm merged commit 25c3fee into eval-segment Nov 12, 2025
    56 of 60 checks passed
    @zasdfgbnm zasdfgbnm deleted the esfix branch November 12, 2025 21:42
    @greptile-apps
    Contributor

    greptile-apps bot commented Nov 12, 2025

Greptile Summary

    This PR fixes contiguity computation for tensor allocation domains by replacing a two-phase approach with a single-pass dual-index algorithm that correctly handles reduction and broadcast dimensions.

    Key changes:

    • csrc/runtime/fusion_cache_utils.cpp: Replaced computeContiguity() call with custom inline algorithm that iterates through allocation domain right-to-left, using two indices (alloc_idx for allocation domain, sizes_idx for sizes/strides array). This correctly handles the fact that reduction dimensions have no entry in sizes/strides while broadcast dimensions do.
    • csrc/tensor_metadata.cpp: Strengthened validation by removing fallback to dummy values when allocation domain IDs are missing from active_ids map. Now fails fast with clear error message.
    • csrc/runtime/fusion_kernel_runtime.cpp: Disabled contiguity updates in segment output processing by changing flag from true to false.

    Confidence Score: 4/5

    • Safe to merge with minor concerns about test coverage for edge cases
    • The core algorithm rewrite in fusion_cache_utils.cpp is well-commented and logically sound, with proper bounds checking. The removal of dummy value fallbacks in tensor_metadata.cpp makes behavior more predictable. However, the change from true to false in fusion_kernel_runtime.cpp appears unrelated to the contiguity fix and lacks explanation in the PR description.
    • csrc/runtime/fusion_kernel_runtime.cpp - verify the update_contiguity flag change is intentional and correct

Important Files Changed

File                                     Score  Overview
csrc/runtime/fusion_cache_utils.cpp      4/5    Rewrote the contiguity computation to correctly handle reduction and broadcast dimensions in the allocation domain via dual-index traversal
csrc/runtime/fusion_kernel_runtime.cpp   5/5    Changed the update_contiguity flag from true to false in the updateWithSegmentOutputs call
csrc/tensor_metadata.cpp                 5/5    Strengthened validation to require all allocation domain IDs in the active_ids map, removing the fallback to dummy values

    Sequence Diagram

    sequenceDiagram
        participant FKR as FusionKernelRuntime
        participant AM as ArgumentManager
        participant RAD as resetAllocationDomainAndContiguity
        participant IAS as inferAllocationSizesAndStrides
        participant TV as TensorView
    
        FKR->>AM: updateWithSegmentOutputs(outputs, runtime_outputs, id, false)
        Note over FKR,AM: Changed from true to false
        AM->>AM: Store outputs in tensor_map_
        
        alt update_contiguity == true
            loop For each TensorView output
                AM->>RAD: resetAllocationDomainAndContiguity(tv, tensor)
                RAD->>IAS: inferAllocationSizesAndStrides(tensor, tv)
                IAS->>IAS: Traverse logical to allocation domain
                IAS->>IAS: Validate all IDs present in active_ids
                Note over IAS: Now enforces strict validation
                IAS-->>RAD: Return (sizes, strides) without reductions
                RAD->>RAD: Iterate allocation domain right-to-left
                RAD->>RAD: Dual-index traversal (alloc_idx, sizes_idx)
                Note over RAD: Skip reductions (no sizes/strides entry)<br/>Skip broadcasts (nullopt contiguity)<br/>Compute contiguity for normal dims
                RAD->>TV: setContiguity(contiguity)
            end
        end
        
        AM->>AM: Clean up unused tensor_map_ entries
    

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 1 comment


  args_manager.updateWithSegmentOutputs(
-     group_to_run->outputs(), group_runtime_outputs, run_order_id, true);
+     group_to_run->outputs(), group_runtime_outputs, run_order_id, false);
    style: Check whether changing this flag from true to false is intentional and related to the contiguity computation fix, as it's not explained in the PR description.
