llama : add option to render special/control tokens #6807
Conversation
Performance dropped - maybe generation does not stop properly after the #6745 EOG changes?
Very likely, because we're using the phi-2 model, which does not have native support for chatml (so
I think we are incorrectly using a base model instead of an instruction-tuned one for this test: https://huggingface.co/microsoft/phi-2
Ah yeah, that's right. We can use dolphin-phi2 then. Here is the link: https://huggingface.co/TheBloke/dolphin-2_6-phi-2-GGUF
fix #6770
Setting `special == true` in `llama_token_to_piece()` will cause special/control tokens' text to be rendered in the output:

https://github.com/ggerganov/llama.cpp/blob/1f45c2adc7b10637c2035e622573f1851e403979/llama.h#L827-L837