UPSTREAM PR #18180: vulkan: fix im2col overflowing maxWorkgroupCount (#618)
Conversation
Performance Analysis Summary: PR #618

Overview: This PR addresses a Vulkan backend crash when processing large tensors in the im2col operation. The fix clamps workgroup counts to hardware limits and implements grid-stride loops in the shader. Changes are isolated to the Vulkan backend, with no impact on CPU inference paths or tokenization functions.

Key Findings

Performance-Critical Areas Impact: The modifications affect only the Vulkan backend's im2col operation, used in convolution preprocessing for vision models. Core inference functions (llama_decode, llama_encode, llama_tokenize) show no changes in response time or throughput. The CPU backend, which handles text-only inference, is unaffected.

Tokens Per Second Impact: No impact on tokens per second for LLM inference; tokenization and decode execute on CPU backend paths that are unchanged by this PR. Vision model processing may see 5-10% overhead in overflow scenarios, but this enables execution where crashes previously occurred rather than degrading existing performance.

Power Consumption Analysis: No measurable power consumption change for the llama-cli binary during text inference workloads. The Vulkan backend modifications only activate during vision model convolution operations, which are not part of the standard LLM token generation pipeline.

Modified Functions: The changes affect ggml_vk_im2col and ggml_vk_dispatch_pipeline within the Vulkan backend. These functions are not on the critical path for text generation; typical text-token workloads bypass them entirely, maintaining baseline performance.
Force-pushed e8bf2a6 to 9c8623e
Force-pushed 048ad94 to 6c1fde6
Force-pushed 823244c to bab7d39
Force-pushed 9ea4a65 to c001e9f
Mirrored from ggml-org/llama.cpp#18180
Fixes #18164.