[DO NOT LAND] Test run for MLP bias accuracy issue by xuzhao9 · Pull Request #609 · meta-pytorch/tritonbench

xuzhao9 · 2025-10-31T14:25:47Z

torch.compile has a numeric issue when used with torch.amp.autocast and bias.

The problem is gone when:

Not using amp, or
Not using bias

Reproduce:

LD_LIBRARY_PATH="$HOME/.conda/envs/py312/lib" python run.py --op mlp  --metrics accuracy --use_bias --num-inputs 1

The issue is not actually related to amp; it is related only to the post_grad pass of PT2, which decomposes addmm into mm plus a Triton add. This should be easy to reproduce with a simpler test.

xuzhao9 · 2026-01-28T18:17:48Z

Closing, as this is an NVIDIA kernel issue.

xuzhao9 · 2026-02-04T17:48:47Z

A minimal reproduction:

import torch
input1 = torch.randn((960, 512), dtype=torch.bfloat16).cuda()
input2 = torch.randn((512, 2048), dtype=torch.bfloat16).cuda()
input3 = torch.randn((2048,), dtype=torch.bfloat16).cuda()
output1_int = torch.addmm(input3, input1, input2)
output1 = torch.randn(960, 2048, dtype=torch.bfloat16).cuda()
# torch.clamp_min(output1_int, 0, out=output1)
output1 = output1_int
output2 = torch.randn(960, 2048, dtype=torch.bfloat16).cuda()
output2_int = torch.mm(input1, input2)
output2_int2 = torch.add(output2_int, input3)
# torch.clamp_min(output2_int2, 0, out=output2)
output2 = output2_int2
torch.testing.assert_close(output1, output2)
## CLI output
## Traceback (most recent call last):
##   File "/data/users/xzhao9/tmp/test.py", line 19, in <module>
##     torch.testing.assert_close(output1, output2)
##   File "/data/users/xzhao9/uv_venvs/py312/lib/python3.12/site-packages/torch/testing/_comparison.py", line 1600, in assert_close
##     raise error_metas[0].to_error(msg)
## AssertionError: Tensor-likes are not close!
## Mismatched elements: 4924 / 1966080 (0.3%)

After discussing with PyTorch developers: the cause is that the post_grad pass decomposes addmm into mm and add by default, and the decomposed call needs to be torch.mm(..., out_dtype=torch.float32) for the result to be consistent with torch.addmm().
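The effect can be illustrated without a GPU. The sketch below is pure Python and is not taken from the PR: `to_bf16` is a hypothetical helper that emulates bfloat16 rounding, and the values of `acc` and `bias` are made up. It assumes the fused addmm path keeps the matmul accumulator in float32 and rounds to bfloat16 only once, after adding the bias, while the decomposed path rounds the mm output to bfloat16 before the add.

```python
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even), returned as a float.

    Illustrative helper, not a PyTorch API. NaN and inf are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    bits += 0x7FFF + ((bits >> 16) & 1)                  # round to nearest even
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


# Hypothetical values: one float32 matmul accumulator entry and its bias.
acc = 1.0 + 3 / 1024   # exact in float32, not representable in bfloat16
bias = 2 / 1024        # 2**-9, representable in bfloat16

# Fused path (assumed addmm behaviour): add the bias in float32, round once.
fused = to_bf16(acc + bias)

# Decomposed path: the mm output is rounded to bfloat16, then the add rounds again.
decomposed = to_bf16(to_bf16(acc) + bias)

print(fused, decomposed)  # prints: 1.0078125 1.0
```

The two paths differ by one bfloat16 step for this input, and only inputs that land near a rounding boundary are affected. Keeping the mm output in float32 removes the first rounding, which is presumably why the out_dtype=torch.float32 setting brings the decomposed result back in line with addmm.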

xuzhao9 · 2026-02-05T19:01:01Z

Fix: pytorch/pytorch#174403

meta-cla Bot added the cla signed label Oct 31, 2025

xuzhao9 force-pushed the xz9/chenning-main branch from ee418eb to 1daafda Compare December 19, 2025 15:46

Zhiyang-Z and others added 4 commits January 20, 2026 08:32

Add operator (5774edb)
simplify (005b00f)
fix mlp (62a2420)
run at bfloat16 (37727f5)

xuzhao9 force-pushed the xz9/chenning-main branch from 921b3bc to 37727f5 Compare January 20, 2026 16:32

xuzhao9 added 2 commits January 21, 2026 08:18

addmm-triton-discrepency (6ceae1a)
bugfix (84bc0dc)

xuzhao9 mentioned this pull request Jan 21, 2026

Output divergence in MLP with torch.compile + AMP when biases are enabled #607

Closed

xuzhao9 closed this Jan 28, 2026

xuzhao9 reopened this Feb 4, 2026

FindHao mentioned this pull request Feb 4, 2026

Suspicious trace unmatch facebookexperimental/CUTracer#70

Closed

xuzhao9 wants to merge 6 commits into main from xz9/chenning-main
