Reduce tiled decode peak memory #13050

Merged
comfyanonymous merged 1 commit into Comfy-Org:master from kijai:tiled_decode
Mar 19, 2026


@kijai kijai commented Mar 18, 2026

Reduces peak RAM usage on tiled VAE decode:

  • Reuse output as accumulator — avoid allocating a separate blend buffer.
  • Single-channel mask and blend weights — feathering is the same across channels, no need to store per-channel.
  • In-place normalization — divide in-place instead of allocating a temporary.
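The three changes above can be sketched in a dependency-free form. This is a minimal illustration, not the code from `comfy/utils.py`: it works on a 1-D `[channels][width]` list instead of tensors, and the names `feather_weights` and `tiled_apply`, the linear ramp, and the tile and overlap sizes are all assumptions made for the example.

```python
def feather_weights(length, feather):
    """Per-position blend weights for one tile: ramp up, flat, ramp down."""
    weights = [1.0] * length
    for i in range(min(feather, length // 2)):
        ramp = (i + 1) / feather
        weights[i] *= ramp
        weights[length - 1 - i] *= ramp
    return weights


def tiled_apply(samples, fn, tile=8, overlap=4):
    """Apply fn to overlapping tiles of a [channels][width] signal and blend.

    Hypothetical 1-D stand-in for a tiled decode. fn maps a list of
    per-channel tile slices to per-channel outputs of the same length.
    """
    channels, width = len(samples), len(samples[0])
    # The output doubles as the accumulator: no separate blend buffer.
    out = [[0.0] * width for _ in range(channels)]
    # One weight sum per position, shared by every channel.
    out_div = [0.0] * width

    step = tile - overlap
    for start in range(0, width, step):
        end = min(start + tile, width)
        decoded = fn([ch[start:end] for ch in samples])
        mask = feather_weights(end - start, overlap)  # single-channel mask
        for c in range(channels):
            row = out[c]
            for i, w in enumerate(mask):
                row[start + i] += decoded[c][i] * w
        for i, w in enumerate(mask):
            out_div[start + i] += w
        if end == width:
            break

    # In-place normalization: overwrite the accumulated sums.
    for row in out:
        for x in range(width):
            row[x] /= out_div[x]
    return out


signal = [[float(x) for x in range(20)], [2.0 * x for x in range(20)]]
blended = tiled_apply(signal, lambda tiles: [list(t) for t in tiles])
```

With an identity `fn`, the blended result should match the input up to floating-point rounding, because every position is divided by the same weight sum that was used to accumulate it.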

Test with LTX23: two decode runs of 601 frames at 720p, using spatial tiling:

Before: [screenshot: I87LTf4qXg]

After: [screenshot: WindowsTerminal_VkOZEjApoQ]

coderabbitai bot commented Mar 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ac9e336a-aa64-4154-83b0-c692b943ef38

📥 Commits

Reviewing files that changed from the base of the PR and between b67ed2a and 193fb9e.

📒 Files selected for processing (1)
  • comfy/utils.py

📝 Walkthrough

Walkthrough

The tiled_scale_multidim function in comfy/utils.py has been modified to change how intermediate accumulators are initialized and processed. The out tensor now references a zeroed slice from the output batch directly rather than allocating a new tensor. The out_div normalizer tensor has its channel dimension reduced to 1 instead of the full output channel count. The mask construction is simplified by explicitly specifying dimensions. The final composition operation is converted to an in-place division instead of slice assignment.
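To get a feel for why these buffers matter, here is a rough back-of-the-envelope count. It is a sketch under stated assumptions, not a measurement from this PR: it assumes a batch of one, 3 output channels, float32 storage, and a 1280x720 frame size for the 601-frame test, and it counts only the blend buffers named in the walkthrough (model activations and individual tiles are ignored). The helper names are hypothetical.

```python
def buffer_bytes(channels, frames, height, width, itemsize=4):
    """Size in bytes of a dense [1, channels, frames, height, width] buffer."""
    return channels * frames * height * width * itemsize


def blend_buffer_peak(channels, frames, height, width, itemsize=4):
    """Return (before, after) byte counts for the blend buffers alone."""
    full = buffer_bytes(channels, frames, height, width, itemsize)
    single = buffer_bytes(1, frames, height, width, itemsize)
    # Before: output batch + separate `out` + per-channel `out_div`
    # + the temporary produced by dividing `out` by `out_div`.
    before = 4 * full
    # After: `out` is a slice of the output batch, `out_div` has one
    # channel, and the division happens in place.
    after = full + single
    return before, after


# Assumed shape and dtype; none of these values are stated in the PR.
before, after = blend_buffer_peak(channels=3, frames=601, height=720, width=1280)
print(f"before: {before / 2**30:.2f} GiB, after: {after / 2**30:.2f} GiB")
```

Under these assumptions the blend buffers shrink by a factor of three; the real saving depends on the actual dtype, channel count, and whatever else is resident during decode.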

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely describes the main objective of the PR: reducing peak memory usage in tiled VAE decode operations. |
| Description check | ✅ Passed | The description is well-structured, explaining the specific memory optimization techniques used and providing concrete performance metrics with before/after comparisons. |


@comfyanonymous comfyanonymous merged commit fd0261d into Comfy-Org:master Mar 19, 2026
14 checks passed