Lhotse: add prefetch_factor option to LhotseDataLoadingConfig #15665
Open
XuesongYang wants to merge 1 commit into NVIDIA-NeMo:main from
Conversation
Add configurable prefetch_factor for the PyTorch DataLoader, allowing users to increase the per-worker prefetch buffer depth to absorb I/O latency spikes from network filesystems. Applies to both single-config and multi-config dataloader paths. When unset (None), PyTorch's default of 2 is used, preserving existing behavior.

Usage: model.train_ds.prefetch_factor=4

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
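For context, here is a minimal PyTorch-level sketch of what this knob controls; it is not code from this PR, and the dataset and batch size are placeholders. Each worker keeps up to prefetch_factor batches in flight, so the total prefetch buffer is roughly num_workers * prefetch_factor batches.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Placeholder dataset standing in for a Lhotse-backed dataset."""

    def __len__(self):
        return 1024

    def __getitem__(self, idx):
        return torch.randn(80, 100)  # fake feature matrix

# With num_workers=4 and prefetch_factor=4, up to ~16 batches can be
# buffered ahead of the training loop, smoothing out slow reads from
# a network filesystem. PyTorch's default prefetch_factor is 2.
loader = DataLoader(ToyDataset(), batch_size=8, num_workers=4, prefetch_factor=4)

# Note: prefetch_factor is only accepted when num_workers > 0; passing it
# with num_workers=0 raises a ValueError, hence the guards in this PR.
```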
Contributor
Pull request overview
Adds a new prefetch_factor knob to NeMo’s Lhotse-backed dataloader configuration so users can tune PyTorch DataLoader per-worker prefetch depth to better tolerate I/O latency spikes (e.g., on network filesystems), while keeping existing behavior when unset.
Changes:
- Introduced `prefetch_factor: int | None = None` in `LhotseDataLoadingConfig`.
- Passed `prefetch_factor` through to `torch.utils.data.DataLoader` when `num_workers > 0` in both single-config and multi-config dataloader creation paths.
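A rough sketch of the shape of this change, reconstructed from the review summary rather than copied from the PR: the real `LhotseDataLoadingConfig` has many more fields, and `make_dataloader` below is a hypothetical stand-in for NeMo's actual dataloader-construction code.

```python
from dataclasses import dataclass
from typing import Optional

from torch.utils.data import DataLoader


@dataclass
class LhotseDataLoadingConfig:
    # ... existing options elided ...
    num_workers: int = 0
    pin_memory: bool = False
    # New: per-worker prefetch depth; None keeps PyTorch's default (2).
    prefetch_factor: Optional[int] = None


def make_dataloader(dataset, config: LhotseDataLoadingConfig) -> DataLoader:
    """Illustrative only: forward prefetch_factor when it is safe to do so."""
    dloader_kwargs = dict(
        num_workers=config.num_workers,
        pin_memory=config.pin_memory,
    )
    # prefetch_factor is only accepted by DataLoader when workers are used.
    if config.num_workers > 0 and config.prefetch_factor is not None:
        dloader_kwargs["prefetch_factor"] = config.prefetch_factor
    return DataLoader(dataset, batch_size=None, **dloader_kwargs)
```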
Comment on lines +508 to +509
if shared_opts.num_workers > 0 and shared_opts.get("prefetch_factor") is not None:
    dloader_kwargs["prefetch_factor"] = shared_opts.prefetch_factor

    pin_memory=config.pin_memory,
)
if config.num_workers > 0 and config.get("prefetch_factor") is not None:
    dloader_kwargs["prefetch_factor"] = config.prefetch_factor
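The num_workers > 0 guard in both paths matters: PyTorch's DataLoader rejects an explicit prefetch_factor when num_workers == 0 (it raises a ValueError), and the is-not-None check keeps the default prefetch depth of 2 for existing configs that never set the option.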