Disable SageAttention for Hunyuan3D v2.1 DiT#12772
comfyanonymous merged 1 commit into Comfy-Org:master
SageAttention's quantized kernels produce NaN in the Hunyuan3D v2.1 diffusion transformer, causing the downstream VoxelToMesh to generate zero vertices and crash in `save_glb`. Add `low_precision_attention=False` to both `optimized_attention` calls in the v2.1 DiT (the `CrossAttention` and `Attention` classes), following the same pattern used by ACE (`ace_step15.py`). This makes SageAttention fall back to PyTorch attention for Hunyuan3D only, while all other models keep the SageAttention speedup.

Root cause: the 3D occupancy/SDF prediction requires higher numerical precision at voxel boundaries than SageAttention's quantized kernels provide. Image and video diffusion tolerate this precision loss.

Fixes: Comfy-Org#10943

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
SageAttention produces NaN in the Hunyuan3D v2.1 DiT, causing empty meshes and a crash in `save_glb`. This adds `low_precision_attention=False` to both `optimized_attention` calls in the v2.1 DiT, following the same pattern as ACE Step 1.5 (#12297). SageAttention falls back to PyTorch attention for Hunyuan3D only.

Diagnosis
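The dispatch pattern the fix relies on can be sketched as follows. Only the names `optimized_attention` and `low_precision_attention` come from the PR; the dispatcher body and the two stub kernels are illustrative assumptions, not ComfyUI's actual implementation.

```python
import math

def quantized_attention(q, k, v):
    # Hypothetical stand-in for a low-precision quantized kernel:
    # models the all-NaN output observed from the Hunyuan3D v2.1 DiT.
    return [float("nan")] * len(v)

def pytorch_attention(q, k, v):
    # Full-precision softmax attention for a single query vector q
    # over key rows k and value rows v (toy list-based math).
    scores = [sum(qi * ki for qi, ki in zip(q, row)) for row in k]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [sum(w / z * row[j] for w, row in zip(weights, v))
            for j in range(len(v[0]))]

def optimized_attention(q, k, v, low_precision_attention=True):
    # Sketch of the dispatch: callers that cannot tolerate quantization
    # pass low_precision_attention=False to force the full-precision path.
    if low_precision_attention:
        return quantized_attention(q, k, v)
    return pytorch_attention(q, k, v)

q = [1.0, 0.0]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[2.0, 3.0], [4.0, 5.0]]
out = optimized_attention(q, k, v, low_precision_attention=False)
assert not any(math.isnan(x) for x in out)
```

The per-call opt-out keeps the speedup global while letting a single precision-sensitive model force the fallback, which is the same shape as the ACE Step 1.5 change.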
Debug logging on the VAE decode input confirmed the DiT output is entirely NaN when SageAttention is active. The 3D occupancy prediction requires higher precision at voxel boundaries than SageAttention's quantized kernels provide. Image/video diffusion tolerates this precision loss; 3D does not.
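The kind of check used to confirm an all-NaN DiT output can be sketched like this; `nan_fraction` is a hypothetical helper, not a function from the PR or from ComfyUI.

```python
import math

def nan_fraction(t):
    # Flatten a nested-list "tensor" and report what fraction of its
    # elements are NaN — a quick sanity probe on an intermediate output.
    flat, stack = [], [t]
    while stack:
        x = stack.pop()
        if isinstance(x, list):
            stack.extend(x)
        else:
            flat.append(x)
    return sum(math.isnan(v) for v in flat) / len(flat)

assert nan_fraction([[float("nan")] * 3] * 2) == 1.0
assert nan_fraction([[0.0, 1.0], [2.0, float("nan")]]) == 0.25
```

A fraction of 1.0 on the VAE decode input is what distinguishes a fully poisoned attention path from an occasional boundary overflow.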
Fixes #10943
Test plan
- Ran with `--use-sage-attention` enabled

🤖 Generated with Claude Code