Configurable Z Loss by francesco-bertolotti · Pull Request #2576 · pytorch/torchtitan

francesco-bertolotti · 2026-03-14T14:08:08Z

Overview

Following #2523, this PR introduces an initial implementation of z-loss and includes some refactoring to make the loss configuration more flexible.

Z-Loss

I added z-loss support to the cross-entropy loss. The implementation is inspired by the one used in OLMo-core:
https://github.com/allenai/OLMo-core/blob/main/src/olmo_core/nn/cross_entropy_loss.py

If z_loss_weight is nonzero, the z-loss is computed, scaled by z_loss_weight, and added to the cross-entropy loss. With the default of 0.0, only the plain cross-entropy loss is used.
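
As a rough illustration of the computation described above, here is a framework-free sketch in plain Python. It assumes the usual z-loss definition (the squared log-sum-exp of the logits, averaged over the non-ignored tokens), as in the OLMo-core reference. The function names and the returned tuple are hypothetical, and this is not the code in the PR.

```python
import math


def logsumexp(row):
    # Numerically stable log(sum(exp(x))) for one row of logits.
    m = max(row)
    return m + math.log(sum(math.exp(x - m) for x in row))


def cross_entropy_with_z_loss(logits, targets, z_loss_weight=0.0, ignore_index=-100):
    """Return (total, cross_entropy, unscaled_z_loss) as plain floats.

    Hypothetical helper: the z-loss is assumed to be mean(logsumexp(logits)**2)
    over tokens whose target is not ignore_index.
    """
    use_z = z_loss_weight != 0  # skip the extra work when the weight is 0
    ce_sum, z_sum, count = 0.0, 0.0, 0
    for row, target in zip(logits, targets):
        if target == ignore_index:
            continue  # ignored tokens contribute to neither term
        lse = logsumexp(row)
        ce_sum += lse - row[target]  # -log softmax(row)[target]
        if use_z:
            z_sum += lse ** 2
        count += 1
    denom = max(count, 1)  # avoid dividing by zero if every token is ignored
    ce = ce_sum / denom
    if not use_z:
        return ce, ce, None
    z = z_sum / denom
    return ce + z_loss_weight * z, ce, z
```

In a tensor framework the same quantities would be computed in a vectorized way; the loop form is used here only to make each term explicit.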

Refactoring

I also refactored the loss configuration so it can be defined via the CLI or through TOML configuration files. This is just a proposal and can be adapted if it does not align with the torchtitan design.

The current setup allows configuring multiple loss types (currently MSE and CrossEntropy). Each loss is defined as a configurable object with the following fields:

- enable: whether the loss is active (exactly one loss must be enabled)
- compile: if true, the loss module is compiled with torch.compile

Additional options are available for specific losses:

CrossEntropyLoss
- z_loss_weight (default: 0.0)
- ignore_index (default: -100)
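
To make the shape of this configuration concrete, here is a minimal sketch using plain dataclasses. The field names enable, compile, z_loss_weight, and ignore_index and the two stated defaults come from the description above. The class names, the defaults for enable and compile, the grouping into a LossConfig object, and the select_enabled_loss helper are assumptions made for illustration and may not match the PR.

```python
from dataclasses import dataclass, field


@dataclass
class MSELossConfig:  # hypothetical class name
    enable: bool = False   # default is an assumption
    compile: bool = False  # default is an assumption


@dataclass
class CrossEntropyLossConfig:  # hypothetical class name
    enable: bool = False   # default is an assumption
    compile: bool = False  # default is an assumption
    z_loss_weight: float = 0.0  # default stated in the PR description
    ignore_index: int = -100    # default stated in the PR description


@dataclass
class LossConfig:  # hypothetical container for all configurable losses
    mse: MSELossConfig = field(default_factory=MSELossConfig)
    cross_entropy: CrossEntropyLossConfig = field(
        default_factory=CrossEntropyLossConfig
    )


def select_enabled_loss(config: LossConfig):
    """Return (name, loss_config) for the single enabled loss.

    Enforces the rule that exactly one loss must be enabled.
    """
    candidates = {"mse": config.mse, "cross_entropy": config.cross_entropy}
    enabled = [(name, cfg) for name, cfg in candidates.items() if cfg.enable]
    if len(enabled) != 1:
        raise ValueError(f"exactly one loss must be enabled, got {len(enabled)}")
    return enabled[0]
```

Checking the "exactly one enabled" rule in a single place means a misconfigured run fails at startup rather than silently training with the wrong loss.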

Both CrossEntropy and MSE return a LossOutput object containing:

- main: the loss used for gradient computation (the one .backward() should be called on)
- aux: a dictionary containing auxiliary values intended only for logging

For example, the CrossEntropy loss populates LossOutput.aux with both the unscaled z-loss and the raw cross-entropy loss.
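
A container like this could look roughly as follows. Only the names LossOutput, main, and aux come from the description above. The field types, the aux key names, the format_aux helper, and the numeric values are placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class LossOutput:
    # In a real training step `main` would be a scalar tensor, and
    # `main.backward()` is what drives the gradient computation.
    main: Any
    # Values for logging only; nothing here should be backpropagated.
    aux: dict[str, Any] = field(default_factory=dict)


def format_aux(output: LossOutput) -> str:
    """Render the auxiliary values as a single log-friendly string."""
    return ", ".join(
        f"{key}={float(value):.4f}" for key, value in sorted(output.aux.items())
    )


# Placeholder numbers and hypothetical key names, for illustration only.
out = LossOutput(main=2.31, aux={"cross_entropy": 2.30, "z_loss": 100.0})
print(format_aux(out))
```

Keeping main and aux separate lets the training loop stay the same for every loss type: it calls backward on main and passes aux to the logger without knowing which loss produced it.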

francesco-bertolotti added 4 commits March 10, 2026 18:43

- tmp (4f26eaf)
- Merge branch 'main' into f14-z-loss (293cbf8)
- loss configurable (d1de307)
- pre-commit (c11d722)

francesco-bertolotti requested review from fegin, tianyu-l, wconstab and wwwjn as code owners March 14, 2026 14:08

meta-cla bot added the CLA Signed label Mar 14, 2026

francesco-bertolotti wants to merge 4 commits into pytorch:main from francesco-bertolotti:f14-z-loss
