Expected Behavior
The following code used to work in v0.3.14:

import llama_cpp

model = llama_cpp.Llama("mxbai-embed-xsmall-v1-q8_0.gguf", embedding=True)
embeddings = model.embed(["Hello", "World"])
Current Behavior
The same call now raises RuntimeError: llama_decode returned -1. The following messages are printed to the console:
init: invalid seq_id[3][0] = 1 >= 1
encode: failed to initialize batch
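The seq_id error suggests the batch is being built with a second sequence while the context only allows one. As a temporary workaround (a minimal sketch on my side, assuming single-string embedding is unaffected), embedding the texts one at a time avoids the multi-sequence batch:

import llama_cpp

# Sketch of a workaround, not a fix: embed each text separately so the batch
# never contains more than one sequence. Assumes single-string embed() still works.
model = llama_cpp.Llama("mxbai-embed-xsmall-v1-q8_0.gguf", embedding=True)
texts = ["Hello", "World"]
embeddings = [model.embed(t) for t in texts]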
Environment and Context
llama-cpp-python was compiled with CUDA support
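For completeness, the installed version and GPU offload support can be checked with something like the following (assuming llama_supports_gpu_offload is re-exported from the low-level bindings):

import llama_cpp

print(llama_cpp.__version__)                   # installed llama-cpp-python version
print(llama_cpp.llama_supports_gpu_offload())  # True when the build can offload to the GPU (CUDA)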
Failure Information (for bugs)
Steps to Reproduce
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import llama_cpp
>>> model = llama_cpp.Llama ("../models/mxbai-embed-xsmall-v1-q8_0.gguf", embedding = True)
...
>>> embeddings = model.embed (["Hello", "World"])
decode: cannot decode batches with this context (calling encode() instead)
init: invalid seq_id[3][0] = 1 >= 1
encode: failed to initialize batch
llama_decode: failed to decode, ret = -1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../site-packages/llama_cpp/llama.py", line 1108, in embed
decode_batch(s_batch)
File ".../site-packages/llama_cpp/llama.py", line 1045, in decode_batch
self._ctx.decode(self._batch)
File ".../site-packages/llama_cpp/_internals.py", line 327, in decode
raise RuntimeError(f"llama_decode returned {return_code}")
RuntimeError: llama_decode returned -1
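For convenience, a self-contained script that reproduces the traceback above (the model path is specific to my setup; if my reading of the error is right, any GGUF embedding model loaded with embedding=True and more than one input string should trigger it):

import llama_cpp

# Standalone reproduction; adjust the model path for your environment.
model = llama_cpp.Llama(
    "../models/mxbai-embed-xsmall-v1-q8_0.gguf",
    embedding=True,
)
# Passing more than one string builds a multi-sequence batch and raises
# RuntimeError: llama_decode returned -1 (see traceback above).
embeddings = model.embed(["Hello", "World"])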