[gguf][torch.compile time] Convert to plain tensor earlier in dequantize_gguf_tensor by anijain2305 · Pull Request #13166 · huggingface/diffusers

anijain2305 · 2026-02-19T23:02:10Z

Once dequantize_gguf_tensor fetches the quant_type attribute from the GGUFParameter tensor subclass, there is no further need to run the actual dequantize operations on the tensor subclass; we can convert to a plain tensor right away.

This not only makes PyTorch eager mode faster, but also cuts the torch.compile tracer's compile time from 36 seconds to 10 seconds for the AuraFlow quantized model, because there is a lot less code to trace.

cc @yiyixuxu @sayakpaul
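To make the overhead concrete, here is a minimal sketch of the idea, not the diffusers code itself. `CountingParam`, its `wrap` helper, and `toy_dequantize` are hypothetical stand-ins for `GGUFParameter` and `dequantize_gguf_tensor`; the arithmetic is made up and only serves to issue a few tensor ops. The subclass counts how many operations are routed through `__torch_function__`, with and without the early conversion to a plain tensor.

```python
import torch


class CountingParam(torch.Tensor):
    """Hypothetical stand-in for a quantized-parameter tensor subclass."""

    calls = 0  # number of __torch_function__ dispatches observed

    @classmethod
    def wrap(cls, data, quant_type):
        # Wrap existing data in the subclass and attach metadata to it.
        obj = torch.Tensor._make_subclass(cls, data, False)
        obj.quant_type = quant_type
        return obj

    def as_tensor(self):
        # Re-wrap the same storage as a plain torch.Tensor (no copy).
        return torch.Tensor._make_subclass(torch.Tensor, self, False)

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        cls.calls += 1
        return super().__torch_function__(func, types, args, kwargs or {})


def toy_dequantize(t, convert_early):
    # The metadata lives on the subclass, so read it first.
    quant_type = t.quant_type
    if convert_early:
        # From here on, ops bypass the subclass dispatch entirely.
        t = t.as_tensor()
    scale = 0.5 if quant_type == "toy" else 1.0  # made-up arithmetic
    out = (t.to(torch.float32) - 128.0) * scale
    return out.reshape(-1, 4)


packed = torch.arange(16, dtype=torch.uint8)
param = CountingParam.wrap(packed, quant_type="toy")

CountingParam.calls = 0
late = toy_dequantize(param, convert_early=False)
late_calls = CountingParam.calls

CountingParam.calls = 0
early = toy_dequantize(param, convert_early=True)
early_calls = CountingParam.calls

# Compare only after the counts have been recorded.
same_values = torch.equal(late, early)
print(late_calls, early_calls, same_values)
```

In this toy, every op on the subclass passes through `__torch_function__`, while the early-converted path skips that dispatch and still computes the same values, because the plain tensor shares the subclass's storage. The compile-time numbers quoted above come from the real model and are not reproduced by this sketch.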


sayakpaul · 2026-02-20T03:04:55Z

src/diffusers/quantizers/gguf/utils.py

+    # Convert to plain tensor to avoid unnecessary __torch_function__ overhead.
+    tensor = tensor.as_tensor()


Does it have any impact on quality?

DN6

Good catch! Thanks @anijain2305 👍🏽

HuggingFaceDocBuilderDev · 2026-02-20T03:19:05Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sayakpaul reviewed Feb 20, 2026


sayakpaul requested a review from DN6 February 20, 2026 03:05

DN6 approved these changes Feb 20, 2026


DN6 merged commit 01de02e into huggingface:main Feb 20, 2026
10 of 11 checks passed

DN6 merged 1 commit into huggingface:main from anijain2305:gguf-fix
