Pull requests: ggml-org/llama.cpp

Pull requests list

common : gpt-oss handle builtins and unsolicited tool calls
Labels: testing
#21213 opened Mar 31, 2026 by aldehir
opencl: fix leak in Adreno q8_0 path
Labels: ggml, OpenCL
#21212 opened Mar 31, 2026 by lhez (Draft)
vendor : update BoringSSL to 0.20260327.0
#21211 opened Mar 31, 2026 by angt
arg: fix incorrect default for backend sampling
#21210 opened Mar 31, 2026 by Galunid
CI: Enable CPU and Vulkan ARM64 Release
Labels: devops
#21207 opened Mar 31, 2026 by ehfd
CANN: Add suport for Qwen35 ops
Labels: Ascend NPU, ggml, testing
#21204 opened Mar 31, 2026 by hipudding (Draft)
server: respect the ignore eos flag
Labels: examples, python, server
#21203 opened Mar 31, 2026 by ykhrustalev
[SYCL] Enhance flash-attention performance
Labels: ggml, SYCL
#21185 opened Mar 30, 2026 by arthw
tests: allow exporting graph ops from HF file without downloading weights
Labels: testing
#21182 opened Mar 30, 2026 by 0cc4m
Add API key server support with optional arguments --api-key and --ju…
Labels: examples, python
#21180 opened Mar 30, 2026 by gelim
common : init in params parser, add Windows UTF-8 support
Labels: examples, server, testing
#21176 opened Mar 30, 2026 by angt
ggml-cuda: fix ROCm multi-GPU illegal memory access in recurrent state restore
Labels: ggml, Nvidia GPU
#21170 opened Mar 30, 2026 by uaruss
ggml-cuda: ds_read_b128 for q4_0 and q4_1 mmq kernels
Labels: ggml, Nvidia GPU
#21168 opened Mar 30, 2026 by iacopPBK
contrib : clarify code origin guidelines
#21165 opened Mar 29, 2026 by ddh0
cpp: Adding new arch RUGPT3XL
Labels: model, python
#21161 opened Mar 29, 2026 by EvilFreelancer
Cross-backend profiler
Labels: Apple Metal, Ascend NPU, documentation, examples, ggml, Hexagon, IBM zDNN, Nvidia GPU, OpenCL, OpenVINO, python, SYCL, Vulkan, WebGPU
#21160 opened Mar 29, 2026 by pwilkin (Draft)
[CUDA ] Write an optimized flash_attn_stream_k_fixup kernel
Labels: ggml, Nvidia GPU
#21159 opened Mar 29, 2026 by gaugarg-nv
ggml-cpu: fix fallback for RVV kernels without zvfh
Labels: ggml
#21157 opened Mar 29, 2026 by taimur-10x
examples : add llama-eval
Labels: examples, python
#21152 opened Mar 29, 2026 by ggerganov (Draft)
5 tasks