
Conversation

@djsaunde (Collaborator) commented Dec 2, 2025

This PR is built on top of #3566, and should be merged after it.

This PR auto-sets padding_free=True when applicable (text-only SFT training) and computes sequence-length metadata behind the scenes, so we can use the varlen flash attention kernels, block-diagonal SDPA, or the xformers kernels.
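For context, here is a minimal sketch (not the code from this PR) of the kind of sequence-length metadata the varlen path relies on: once padding is removed, the batch is one packed tensor of tokens, and the kernel is told where each sequence starts and ends via cumulative sequence lengths (cu_seqlens). The example assumes flash-attn 2.x and uses its public flash_attn_varlen_func API; everything outside that API (helper name, shapes, example lengths) is illustrative.

```python
# Minimal sketch, for illustration only: building the sequence-length metadata
# that varlen attention kernels expect from an unpadded ("padding-free") batch.
import torch
from flash_attn import flash_attn_varlen_func

def build_cu_seqlens(seq_lens: list[int], device="cuda") -> tuple[torch.Tensor, int]:
    """Cumulative sequence lengths (int32, shape [batch + 1]) plus the max length."""
    lens = torch.tensor(seq_lens, dtype=torch.int32, device=device)
    cu_seqlens = torch.nn.functional.pad(
        torch.cumsum(lens, dim=0, dtype=torch.int32), (1, 0)
    )
    return cu_seqlens, int(lens.max())

# Example: three sequences of lengths 5, 3, and 7 concatenated into one packed
# tensor of 15 tokens, with no padding tokens in between.
seq_lens = [5, 3, 7]
total_tokens, n_heads, head_dim = sum(seq_lens), 8, 64
q = torch.randn(total_tokens, n_heads, head_dim, dtype=torch.float16, device="cuda")
k, v = torch.randn_like(q), torch.randn_like(q)

cu_seqlens, max_seqlen = build_cu_seqlens(seq_lens)
# cu_seqlens == tensor([0, 5, 8, 15]) marks the sequence boundaries, so the varlen
# kernel only attends within each document (equivalent to a block-diagonal mask).
out = flash_attn_varlen_func(
    q, k, v,
    cu_seqlens_q=cu_seqlens, cu_seqlens_k=cu_seqlens,
    max_seqlen_q=max_seqlen, max_seqlen_k=max_seqlen,
    causal=True,
)
```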

This gives us throughput gains with sufficiently large models / batch sizes; e.g., a very small model like unsloth/qwen2.5-0.5b needs per_device_train_batch_size = 16 or higher to see significant throughput gains, while unsloth/llama-3-8b needs only per_device_train_batch_size = 4 or higher. With unsloth/llama-3-8b at per_device_train_batch_size = 8, training is much faster: 52s padding-free vs. 96s without, roughly a 1.85x speedup.

[image: throughput comparison screenshot]

There are very slight loss and gradient-norm differences between the padding-free and non-padding-free settings; I think these can be chalked up to the different kernels being used (FA2 varlen vs. dense, and block-diagonal vs. causal kernels for xformers / SDPA).
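To illustrate where such differences can come from, here is a minimal pure-PyTorch sketch (again, not this PR's implementation) of the block-diagonal causal mask that the padding-free SDPA path corresponds to; the dense path instead applies a plain causal mask over a padded batch, and the two mask patterns dispatch to different kernels, so tiny numerical deltas are expected. The helper and shapes below are illustrative.

```python
# Minimal sketch, for illustration only: a block-diagonal causal mask for SDPA,
# built from the same per-sequence lengths as the cu_seqlens metadata above.
import torch
import torch.nn.functional as F

def block_diagonal_causal_mask(seq_lens: list[int], device="cpu") -> torch.Tensor:
    """Boolean mask of shape (total, total): True where attention is allowed."""
    total = sum(seq_lens)
    mask = torch.zeros(total, total, dtype=torch.bool, device=device)
    offset = 0
    for n in seq_lens:
        # Each packed sequence only attends to earlier tokens within itself.
        mask[offset:offset + n, offset:offset + n] = torch.tril(
            torch.ones(n, n, dtype=torch.bool, device=device)
        )
        offset += n
    return mask

seq_lens = [5, 3, 7]
total, n_heads, head_dim = sum(seq_lens), 8, 64
q = torch.randn(1, n_heads, total, head_dim)
k, v = torch.randn_like(q), torch.randn_like(q)

mask = block_diagonal_causal_mask(seq_lens)  # (total, total), broadcast over batch/heads
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
```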

@djsaunde force-pushed the padding-free-seqlen-metadata-v2 branch from d5b342f to a69f35b on December 9, 2025 22:03
@djsaunde (Collaborator, Author) commented:

Closing in favor of #3702.

@djsaunde closed this Dec 10, 2025
@djsaunde reopened this Dec 10, 2025
@danielhanchen merged commit 35606da into unslothai:main Dec 10, 2025
1 check passed