
[Question] layer wise learning rate with both mae and eva02 #375

@chagmgang

  • Both MAE and EVA02 are vision transformers.
  • I think the layer-wise lr factor function should be applied either to both models or to neither.
  • However, EVA02 applies the layer-wise lr factor function here, while MAE does not apply it here.
  • I would like to know the reason for this difference.
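
For readers unfamiliar with the term, here is a minimal sketch of what a layer-wise lr factor function typically computes for a ViT backbone. It is not taken from the MAE or EVA02 configs discussed above. The function name `layer_wise_lr_factor`, the parameter-name prefixes (`patch_embed`, `pos_embed`, `cls_token`, `blocks.<i>.`), and the default values are assumptions for illustration only.

```python
def layer_wise_lr_factor(param_name: str, num_layers: int = 12,
                         decay_rate: float = 0.75) -> float:
    """Return the multiplier applied to the base lr for one parameter.

    Hypothetical helper: assumes ViT-style parameter names such as
    "patch_embed.proj.weight" or "blocks.3.attn.qkv.weight".
    """
    if param_name.startswith(("patch_embed", "pos_embed", "cls_token")):
        # Embedding parameters sit below the first transformer block.
        layer_id = 0
    elif param_name.startswith("blocks."):
        # "blocks.<i>.<...>" belongs to transformer block i (0-based).
        layer_id = int(param_name.split(".")[1]) + 1
    else:
        # Everything else (final norm, head, ...) keeps the full base lr.
        layer_id = num_layers + 1
    # Earlier layers get a smaller factor; the last group gets 1.0.
    return decay_rate ** (num_layers + 1 - layer_id)
```

With such a function, each parameter's lr would be the base lr times the returned factor, so the blocks closest to the input are updated most conservatively. A model whose config omits the function trains every layer at the same base lr.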
