[Question] Layer-wise learning rate with both MAE and EVA-02 #375

* MAE and EVA-02 are both vision transformers.
* I would expect the layer-wise LR factor function to be applied either to both of them or to neither.
* However, the EVA-02 config applies the layer-wise LR factor function [here](https://github.com/IDEA-Research/detrex/blob/c56d32e3d0262cff9835ebe80a0642965ae0cb3e/projects/dino_eva/configs/dino-eva-02/dino_eva_02_vitdet_l_4attn_1024_lrd0p8_4scale_12ep.py#L52), while the MAE config does not apply it [here](https://github.com/IDEA-Research/detrex/blob/c56d32e3d0262cff9835ebe80a0642965ae0cb3e/projects/dino/configs/dino-vitdet/dino_vitdet_base_4scale_12ep.py#L40).
* I would like to know the reason for this difference.
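
For readers unfamiliar with the term, here is a minimal sketch of what a layer-wise LR factor function usually computes. It is not the detrex implementation: the function name `layerwise_lr_factor`, the parameter-name patterns (`backbone`, `.blocks.`, `.pos_embed`, `.patch_embed`), and the example values (decay rate 0.8, 12 layers) are assumptions chosen for illustration only.

```python
def layerwise_lr_factor(param_name: str, decay_rate: float, num_layers: int) -> float:
    """Return the multiplier applied to the base LR for one parameter.

    Hypothetical naming scheme: ViT blocks live under "backbone...blocks.<i>.".
    """
    # Default: anything that is not a ViT block or embedding keeps factor 1.0.
    layer_id = num_layers + 1
    if param_name.startswith("backbone"):
        if ".pos_embed" in param_name or ".patch_embed" in param_name:
            # Embeddings sit below the first block and get the smallest LR.
            layer_id = 0
        elif ".blocks." in param_name:
            # Block i (0-based) is treated as layer i + 1.
            layer_id = int(param_name.split(".blocks.")[1].split(".")[0]) + 1
    # Deeper layers (closer to the output) get a factor closer to 1.0.
    return decay_rate ** (num_layers + 1 - layer_id)


# Illustrative parameter names (assumed, not taken from either config).
for name in [
    "backbone.net.patch_embed.proj.weight",
    "backbone.net.blocks.0.attn.qkv.weight",
    "backbone.net.blocks.11.attn.qkv.weight",
    "transformer.decoder.layers.0.linear1.weight",
]:
    print(name, layerwise_lr_factor(name, decay_rate=0.8, num_layers=12))
```

With `decay_rate=1.0` every factor is 1.0, so the function has no effect; that is the sense in which a config can "not apply" layer-wise decay while still training the same kind of backbone.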
