- I have looked for existing issues (including closed) about this
Bug Report
Tool call from llama.cpp can't be deserialized.
CompletionError: JsonError: data did not match any variant of untagged enum ApiResponse
Reproduction
```rust
let client: openai::Client = openai::Client::builder()
    .base_url("llama.cpp url")
    .build()
    .expect("Failed to build client");

let agent = client
    .completions_api()
    .agent(&config.model.name)
    .preamble("You are a helpful assistant with access to tools.")
    .tool(HelloWorld)
    .build();

match agent.prompt("Call the hello_world tool and give me the output.").await {
    Ok(response) => {
        println!("\nResponse from model:");
        println!("  {}", response);
    }
    Err(e) => {
        eprintln!("Error: {}", e);
    }
}
```

The model gives:
```json
{
  "choices": [{
    "finish_reason": "tool_calls",
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "",
      "tool_calls": [{ "type": "function", "function": { "name": "hello_world", "arguments": {} }, "id": "xxx" }]
    }
  }],
  "created": 0,
  "model": "unsloth/Qwen3-Coder-Next-GGUF:Q8_0",
  "system_fingerprint": "b8113-xxxx",
  "object": "chat.completion",
  "usage": { "completion_tokens": 13, "prompt_tokens": 255, "total_tokens": 268 },
  "id": "xxx",
  "timings": {
    "cache_n": 0,
    "prompt_n": 255,
    "prompt_ms": 670,
    "prompt_per_token_ms": 2.63,
    "prompt_per_second": 380,
    "predicted_n": 13,
    "predicted_ms": 367,
    "predicted_per_token_ms": 28,
    "predicted_per_second": 35
  }
}
```

I get the error:
CompletionError: JsonError: data did not match any variant of untagged enum ApiResponse
Expected behavior
It would be nice if llama.cpp tool calling were supported, so that I could get the output of the hello_world tool.
Additional context
The problem is probably that llama.cpp is not fully compatible with the OpenAI API: it returns the tool call arguments as a JSON object (a map), while the OpenAI API returns them as a JSON-encoded string.
However, I think the way llama.cpp embeds the arguments is more natural (though sadly not fully OpenAI-compatible), so it would be nice to support both. For the Hugging Face provider there is already a deserialize_arguments helper that handles both JSON objects and stringified JSON objects.