
# MEGAPAK image for ComfyUI

Using PyTorch 2.9.1 and CUDA 12.8


MEGAPAK uses the same base mechanism as the slim images. The key differences are:

- Includes 40+ custom nodes. See the full list.
- Includes the CUDA development kit for compiling PyTorch C++ extensions, `.cu` files, etc.
- Includes performance-optimization libraries such as Nunchaku and SageAttention (powerful, but may have compatibility issues).
- Includes additional tools and dependencies.

What’s special about this cu128-megapak-pt29 image:

- Pinned to PyTorch 2.9.1 and CUDA 12.8.
- Pre-installed:
  - SageAttention 2.2.0
  - SpargeAttention
  - FlashAttention 2.8.3
  - Nunchaku
- With:
  - Python 3.12
  - GCC 11
  - glibc 2.38 (from openSUSE Leap 15.6)

## Usage

Please run the slim image successfully before attempting the megapak image. The prerequisites and setup sections are omitted from this document.

### Run with Docker

```shell
mkdir -p \
  storage \
  storage-models/models \
  storage-models/hf-hub \
  storage-models/torch-hub \
  storage-user/input \
  storage-user/output \
  storage-user/workflows
```

```shell
docker run -it --rm \
  --name comfyui-megapak \
  --runtime nvidia \
  --gpus all \
  -p 8188:8188 \
  -v "$(pwd)"/storage:/root \
  -v "$(pwd)"/storage-models/models:/root/ComfyUI/models \
  -v "$(pwd)"/storage-models/hf-hub:/root/.cache/huggingface/hub \
  -v "$(pwd)"/storage-models/torch-hub:/root/.cache/torch/hub \
  -v "$(pwd)"/storage-user/input:/root/ComfyUI/input \
  -v "$(pwd)"/storage-user/output:/root/ComfyUI/output \
  -v "$(pwd)"/storage-user/workflows:/root/ComfyUI/user/default/workflows \
  -e CLI_ARGS="" \
  yanwk/comfyui-boot:cu128-megapak-pt29
```
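Before launching the full stack, it can help to confirm that the container sees the GPU and ships the pinned PyTorch build. This is a sketch, assuming the image's default command can be overridden; the image tag is the one used above.

```shell
# Sanity-check sketch: print the PyTorch version and CUDA availability
# from inside the container (assumption: the default command is overridable).
IMAGE="yanwk/comfyui-boot:cu128-megapak-pt29"
docker run --rm --runtime nvidia --gpus all "$IMAGE" \
  python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

If the GPU is visible, the second value printed should be `True`.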

### Run with Podman

```shell
mkdir -p \
  storage \
  storage-models/models \
  storage-models/hf-hub \
  storage-models/torch-hub \
  storage-user/input \
  storage-user/output \
  storage-user/workflows
```

```shell
podman run -it --rm \
  --name comfyui-megapak \
  --device nvidia.com/gpu=all \
  --security-opt label=disable \
  -p 8188:8188 \
  -v "$(pwd)"/storage:/root \
  -v "$(pwd)"/storage-models/models:/root/ComfyUI/models \
  -v "$(pwd)"/storage-models/hf-hub:/root/.cache/huggingface/hub \
  -v "$(pwd)"/storage-models/torch-hub:/root/.cache/torch/hub \
  -v "$(pwd)"/storage-user/input:/root/ComfyUI/input \
  -v "$(pwd)"/storage-user/output:/root/ComfyUI/output \
  -v "$(pwd)"/storage-user/workflows:/root/ComfyUI/user/default/workflows \
  -e CLI_ARGS="" \
  docker.io/yanwk/comfyui-boot:cu128-megapak-pt29
```

## CLI_ARGS - Attention Selection

| args | description |
| --- | --- |
| `--use-sage-attention` | Use SageAttention. Keep current config for xFormers. |
| `--use-flash-attention` | Use FlashAttention. Keep current config for xFormers. |
| `--use-pytorch-cross-attention` | Use PyTorch's built-in cross-attention. Disable xFormers. |

- Only one attention implementation can be selected at a time. If none is specified, xFormers is enabled by default.
- For example, you may want to use one of these:
  - `--use-flash-attention`
  - `--use-flash-attention --disable-xformers`
  - `--use-sage-attention`
  - `--use-sage-attention --disable-xformers`
  - `--use-pytorch-cross-attention`
  - Leave empty (xFormers will be used by default).
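As a sketch, the selection is made by building a single space-separated string and passing it via `-e CLI_ARGS="$CLI_ARGS"` in the docker or podman commands above:

```shell
# Select SageAttention and explicitly disable xFormers; pass the resulting
# string to the container with -e CLI_ARGS="$CLI_ARGS".
CLI_ARGS="--use-sage-attention --disable-xformers"
echo "$CLI_ARGS"
```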

## Compatibility (only applies to this image)

| GPU Architecture | Blackwell | Hopper | Ada Lovelace | Ampere | Turing | Volta |
| --- | --- | --- | --- | --- | --- | --- |
| Example GPU | RTX 5090 | H100 | RTX 4090 | RTX 3090 | RTX 2080<br>GTX 1660 | TITAN V |
| SageAttention | ✔️ | ✔️ | ✔️ | | | |
| FlashAttention | ✔️ | ✔️ | ✔️ | ✔️ | | |
| xFormers | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | |
| PyTorch Native | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |

- Note: the xFormers compatibility issues on Blackwell GPUs were fixed in this version. xFormers is now enabled by default.

## CLI_ARGS - Frequently Used

| args | description |
| --- | --- |
| `--disable-xformers` | Disable xFormers. |
| `--fast` | Enable experimental optimizations (e.g. float8_e4m3fn matrix multiplication on Ada Lovelace and later GPUs). Might lower image quality. Turn it off if you want stability over speed. |
| `--disable-smart-memory` | Force ComfyUI to offload models from VRAM to RAM more frequently. Slows performance but reduces memory leaks. |
| `--lowvram` | Force ComfyUI to split the model (UNet) into parts to use less VRAM, at the cost of speed. Use only if your GPU has less than 6 GB of VRAM. |
| `--novram` | Use system RAM only, no VRAM at all. Very slow. |
| `--cpu` | Run on CPU. Very slow. Used for testing purposes. |

More CLI_ARGS are available in ComfyUI's cli_args.py.
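These flags can be combined in the same `CLI_ARGS` string as the attention flags. A sketch, e.g. favoring stability on a GPU with less than 6 GB of VRAM:

```shell
# Combine flags in one space-separated string and pass it via
# -e CLI_ARGS="$CLI_ARGS" in the run commands above.
CLI_ARGS="--lowvram --disable-smart-memory"
echo "$CLI_ARGS"
```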

## Environment Variables Reference

| Variable | Example Value | Memo |
| --- | --- | --- |
| HTTP_PROXY<br>HTTPS_PROXY | `http://localhost:1081`<br>`http://localhost:1081` | Set HTTP proxy. Works the same as set-proxy.sh. |
| PIP_INDEX_URL | `'https://pypi.org/simple'` | Set mirror site for the Python Package Index. |
| HF_ENDPOINT | `'https://huggingface.co'` | Set mirror site for HuggingFace Hub. |
| HF_TOKEN | `'hf_your_token'` | Set HuggingFace access token. More info |
| HF_XET_HIGH_PERFORMANCE | `1` | Enable HuggingFace Hub's high-performance mode. Only makes sense if you have a >5 Gbps and VERY STABLE connection (e.g. a cloud server). More info |
| TORCH_CUDA_ARCH_LIST | `8.6`<br>or<br>`'7.0;7.5;8.0;8.6;9.0;10.0;12.0+PTX'` | Build target for PyTorch and its extensions. For most users, no setup is needed, as it is selected automatically on Linux. When needed, set a single build target just for your GPU. More info |
| CMAKE_ARGS | `'-DBUILD_opencv_world=ON -DWITH_CUDA=ON -DCUDA_FAST_MATH=ON -DWITH_CUBLAS=ON -DWITH_NVCUVID=ON'` | Build options for CMake projects using CUDA. |
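Each of these variables is passed to the container with `-e`, just like `CLI_ARGS`. A minimal sketch, reusing the example values from the table above (adjust them for your environment):

```shell
# Pass environment variables at container start. The values below are the
# table's example values, not recommendations for every setup.
docker run -it --rm \
  --runtime nvidia --gpus all \
  -p 8188:8188 \
  -e HF_ENDPOINT="https://huggingface.co" \
  -e PIP_INDEX_URL="https://pypi.org/simple" \
  -e CLI_ARGS="" \
  yanwk/comfyui-boot:cu128-megapak-pt29
```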