Enhancement (credit to @rohan-varma):
"this can be done in a follow up PR, but let's maybe consider not defaulting things to torch.bfloat16 eventually. this is because it might be good to make this optimizer usable out of the box with the defaults on all HW architectures, but only A100 supports bfloat16 well at the moment.
But the downside here would be that the default optimizer won't be too interesting, it'd just be AdamW"
One way to accomplish this would be a simple native BF16 support check: if BF16 is not supported on the current device, revert the BF16 defaults to FP32 (and turn off Kahan summation as well, since it adds no benefit once the optimizer state is already full precision). A minimal sketch follows.
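Here is one possible shape for that check, assuming hypothetical parameter names (`momentum_dtype`, `variance_dtype`, `use_kahan_summation`); the actual optimizer signature may differ. It relies on `torch.cuda.is_bf16_supported()`, which PyTorch provides for exactly this purpose:

```python
import torch

# Hypothetical helper sketching the proposed fallback logic; the real
# optimizer's parameter names and defaults may differ.
def resolve_dtype_defaults(
    momentum_dtype=torch.bfloat16,
    variance_dtype=torch.bfloat16,
    use_kahan_summation=True,
):
    """Revert BF16 defaults to FP32 when native BF16 support is absent."""
    bf16_ok = torch.cuda.is_available() and torch.cuda.is_bf16_supported()
    if not bf16_ok:
        # Fall back to full-precision optimizer state; Kahan compensation
        # adds no benefit once the state is already FP32.
        momentum_dtype = torch.float32
        variance_dtype = torch.float32
        use_kahan_summation = False
    return momentum_dtype, variance_dtype, use_kahan_summation
```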
The dilemma is whether to warn the user about this silent change. The upside is that they learn they are not getting the BF16 benefits; the downside is that they may already be aware, and won't enjoy seeing the same one-line warning repeated across 128 GPUs. One mitigation is to emit the warning from rank 0 only, as sketched below.
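A sketch of a rank-0-only warning, assuming the job runs under `torch.distributed` (a single-process run warns normally). This is one possible mitigation, not a committed design:

```python
import warnings

import torch.distributed as dist

def warn_bf16_fallback_once():
    """Warn about the FP32 fallback from rank 0 only.

    In a 128-GPU job this prints one line instead of 128.
    """
    if dist.is_available() and dist.is_initialized() and dist.get_rank() != 0:
        return
    warnings.warn(
        "Native BF16 is not supported on this device; optimizer state "
        "defaults reverted to FP32 and Kahan summation disabled.",
        stacklevel=2,
    )
```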