🚀 The feature, motivation and pitch
It would be great if vLLM could also support serving GPT-OSS BF16 weights (either upcast or finetuned), like this one: https://huggingface.co/unsloth/gpt-oss-20b-BF16/discussions/1
Many people need to finetune or tweak GPT-OSS, but they won't be able to quantize it back to MXFP4 due to the lack of QAT capability.
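For illustration, the desired workflow would be something like the following, assuming vLLM recognizes the BF16 checkpoint (the model name comes from the linked Hugging Face repo; whether this currently works is exactly what this request is about):

```shell
# Hypothetical invocation: serve the upcast BF16 GPT-OSS checkpoint
# with an explicit bfloat16 dtype instead of the original MXFP4 weights.
vllm serve unsloth/gpt-oss-20b-BF16 --dtype bfloat16
```

The same would apply to a locally finetuned BF16 checkpoint passed by path.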
Alternatives
No response
Additional context
No response
Before submitting a new issue...