Prerequisites
Feature Description
Hello! NVIDIA recently released an embedding model that looks amazing.
I have tried to convert it to GGUF using the classic convert_hf_to_gguf.py and I am getting this message:
```
python3 convert_hf_to_gguf.py models/nvidia/llama-embed-nemotron-8b/ --outfile llama-embed-f16.gguf
INFO:hf-to-gguf:Loading model: llama-embed-nemotron-8b
INFO:hf-to-gguf:Model architecture: LlamaBidirectionalModel
ERROR:hf-to-gguf:Model LlamaBidirectionalModel is not supported
```
Is there any plan to support this architecture in the coming weeks?
Thanks in advance.
Motivation
This model seems to be the best FOSS embedding model available, so it would be fantastic to have it supported by llama.cpp.
Possible Implementation
No response