I am looking into how RoPE positions are handled. I noticed that the models support a `positions` field (torchtitan/torchtitan/models/qwen3/model.py, lines 57–63 at commit 5732118):
```python
def forward(
    self,
    x: torch.Tensor,
    freqs_cis: torch.Tensor,
    attention_masks: AttentionMasksType | None,
    positions: torch.Tensor | None = None,
):
```
However, when examining the dataloader and dataset implementation, it appears that `positions` is always `None`.
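To make the concern concrete, here is a minimal sketch of what I assume happens when `positions` is `None` (I have not confirmed this is exactly how the model fills in the default, but falling back to consecutive indices over the whole sequence is the typical behavior):

```python
import torch

# Assumed example: a packed sequence of length 8 holding two documents
# of length 4 each.
seq_len = 8

# Presumed fallback when `positions` is None: consecutive indices over
# the entire packed sequence.
positions = torch.arange(seq_len)
# The second document's first token then gets position 4, not 0.
```

Under that assumption, every document after the first starts at a nonzero position index.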
If document packing is used, this could lead to misaligned RoPE embeddings across documents: since position indices continue from the previous document in the packed sequence, the first token of a packed document does not receive position 0, so the RoPE embedding applied to it does not correspond to the start of that document.
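What I would have expected instead is something like the following sketch, where positions restart at 0 for each packed document (`packed_positions` and the `document_lengths` input are hypothetical, just to illustrate the idea; a real implementation might derive boundaries from EOS tokens or a document-id mask):

```python
import torch

def packed_positions(document_lengths: list[int]) -> torch.Tensor:
    # Hypothetical helper: build position ids for a packed sequence so
    # that each document's positions restart at 0.
    return torch.cat([torch.arange(n) for n in document_lengths])

# Two documents of lengths 3 and 4 packed into one sequence of length 7:
# each document's first token gets position 0 again.
pos = packed_positions([3, 4])
```

Passing such a tensor through the `positions` argument would keep RoPE aligned with document boundaries.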
Is this expected behavior, or should `positions` be reset per document when packing is enabled?