
Further Reduce LTX VAE decode peak RAM usage #13052

Merged
comfyanonymous merged 1 commit into Comfy-Org:master from kijai:ltx2vae_ram
Mar 18, 2026

Conversation

@kijai
Contributor

@kijai kijai commented Mar 18, 2026

Further reduces LTX2 VAE decode peak RAM down to the level of the output tensor itself.

  • Make the VAE decoder write decoded chunks directly into a pre-allocated output buffer, eliminating intermediate allocations and the full-output torch.cat

  • Run unpatchify per-chunk on the GPU instead of on the full output on the CPU

  • When the VAE supports decode_output_shape, have the caller pass its output buffer directly to the decoder, eliminating the intermediate bf16 buffer entirely
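The buffer-writing scheme from the bullets above can be sketched roughly as follows. This is an illustrative toy, not the PR's actual code: `decode_output_shape` and `output_buffer` are names taken from the PR description, while `TinyDecoder`, `decode_chunk`, and the chunking loop are hypothetical stand-ins.

```python
# Illustrative sketch (not the real ComfyUI code) of chunked decoding into a
# pre-allocated output buffer instead of concatenating chunk results.
import torch

class TinyDecoder:
    def decode_output_shape(self, z):
        # Pixel-space shape for a latent batch (toy 2x spatial upscale).
        return (z.shape[0], 3, z.shape[-2] * 2, z.shape[-1] * 2)

    def decode_chunk(self, z_chunk):
        # Stand-in for the real per-chunk decode + unpatchify on GPU.
        return torch.zeros(self.decode_output_shape(z_chunk))

    def forward_orig(self, z, output_buffer=None, chunk=1):
        if output_buffer is None:
            # Fallback path: collect chunks, then a full-output torch.cat,
            # which briefly holds both the chunks and the concatenated copy.
            return torch.cat(
                [self.decode_chunk(z[i:i + chunk]) for i in range(0, z.shape[0], chunk)],
                dim=0,
            )
        for i in range(0, z.shape[0], chunk):
            # Write each decoded chunk straight into the caller's buffer,
            # so no second full-size allocation is ever made.
            output_buffer[i:i + chunk].copy_(self.decode_chunk(z[i:i + chunk]))
        return output_buffer
```

With the buffer path, peak memory is the output buffer plus one chunk, rather than roughly two full outputs during the concatenation.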

@coderabbitai

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Walkthrough

The changes introduce buffer-based decoding optimization to the video VAE pipeline. The Decoder class now supports preallocation of output buffers through a new decode_output_shape method and accepts an optional output_buffer parameter in forward_orig, enabling direct writes instead of tensor concatenation. VideoVAE delegates these new capabilities and passes the output_buffer through the decode path. The VAE.decode method detects support for decode_output_shape and conditionally preallocates pixel_samples for direct buffer writes when available, falling back to the original approach otherwise.
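The caller-side detection described in the walkthrough might look roughly like this. A hypothetical sketch: the attribute and kwarg names (`decode_output_shape`, `output_buffer`) follow the walkthrough, but the function body and the stub model are illustrative, not the actual `VAE.decode` implementation.

```python
# Sketch of the conditional preallocation path described in the walkthrough:
# preallocate pixel_samples only when the model advertises decode_output_shape.
import torch

def decode(first_stage_model, samples):
    if hasattr(first_stage_model, "decode_output_shape"):
        # Preallocate the final output once and let the decoder
        # write into it directly, skipping the intermediate buffer.
        pixel_samples = torch.empty(first_stage_model.decode_output_shape(samples))
        first_stage_model.decode(samples, output_buffer=pixel_samples)
    else:
        # Original approach: the decoder allocates and returns its own tensor.
        pixel_samples = first_stage_model.decode(samples)
    return pixel_samples
```

The `hasattr` check keeps older VAEs working unchanged while newer ones opt in by exposing `decode_output_shape`.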

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 11.11%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: The title clearly and concisely summarizes the main objective, reducing peak RAM usage during LTX VAE decoding, which is the primary focus of all changes.
  • Description check ✅ Passed: The description directly relates to the changeset by detailing the implementation approach: preallocated output buffers, per-chunk unpatchify, and decode_output_shape support.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy/sd.py`:
- Around lines 956-964: The code currently assumes that a model exposing first_stage_model.decode_output_shape also accepts an output_buffer kwarg in first_stage_model.decode, which can raise a TypeError when that assumption fails. Before setting preallocated to True and passing output_buffer to first_stage_model.decode, verify the decode() signature (e.g., via inspect.signature or a safe trial call). If decode() does not accept output_buffer, fall back to the safe copy path: call decode without output_buffer and copy the result into pixel_samples, so that decode_output_shape, decode, pixel_samples, preallocated, and vae_options are handled compatibly.
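One possible shape for the guard the review comment asks for, using signature inspection. This is a sketch under the review's assumptions, not code from the PR: the attribute and kwarg names come from the comment, and `safe_decode`, `WithBuf`, and `NoBuf` are hypothetical.

```python
# Hypothetical guard per the review comment: only use the output_buffer fast
# path when decode() provably accepts that kwarg, avoiding a TypeError.
import inspect
import torch

def safe_decode(first_stage_model, samples):
    shape_fn = getattr(first_stage_model, "decode_output_shape", None)
    if shape_fn is not None:
        # Inspect the bound decode() method for an output_buffer parameter.
        params = inspect.signature(first_stage_model.decode).parameters
        if "output_buffer" in params:
            pixel_samples = torch.empty(shape_fn(samples))
            first_stage_model.decode(samples, output_buffer=pixel_samples)
            return pixel_samples
    # Safe fallback: decode without the kwarg and let it allocate normally.
    return first_stage_model.decode(samples)
```

Checking `inspect.signature(...).parameters` is cheap and avoids the alternative of catching TypeError from a trial call, which could mask unrelated errors raised inside decode().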

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1f0bf258-0c0d-4fdf-aa54-c99f3a39d8c9

📥 Commits

Reviewing files that changed from the base of the PR and between b67ed2a and fbc97e0.

📒 Files selected for processing (2)
  • comfy/ldm/lightricks/vae/causal_video_autoencoder.py
  • comfy/sd.py

@comfyanonymous comfyanonymous merged commit 9fff091 into Comfy-Org:master Mar 18, 2026
14 checks passed
