chore: release 0.4.22 by github-actions[bot] · Pull Request #207 · defilantech/LLMKube

github-actions · 2026-03-04T09:41:32Z

🚀 Release 0.4.22

0.4.22 (2026-03-04)

Features

add --jinja flag for tool/function calling support (#162) (47624ca)
add 32B models to catalog with --context flag (#88) (6c06602)
add air-gapped deployment support for local model paths (#85) (31fe8d0)
Add benchmark command and reorganize documentation (58307be)
Add benchmark command and reorganize documentation (ac8888e), closes #6
add custom CA support and fix deprecated image tags (#124) (5ec912e)
add GPU contention visibility, queue position, and priority classes (#81) (c0220e5)
add GPU observability config and Grafana dashboard (#105) (571643f)
Add Helm chart for easy installation (5718804)
Add Helm chart for easy installation with comprehensive CI testing (3ea3bfd), closes #9
add license compliance scanning for GGUF models (#188) (c26400a)
Add Metal GPU support for macOS (Apple Silicon) (f673c26), closes #33
Add model catalog with 10 pre-configured models (404d722)
Add model catalog with 10 pre-configured models (Phase 1) (0fd969a)
Add persistent model cache to avoid re-downloading (83f844f), closes #52
add Prometheus metrics, OpenTelemetry tracing, and inference observability (#189) (c653ff1)
add PVC inspection to cache list for orphaned entry detection (#183) (2723d92)
Add Release Please automation and version-agnostic docs (dc2d54e)
agent: add --host-ip flag for remote K8s cluster support (#155) (b425569)
agent: add structured zap logging to metal agent (#164) (e9d143c)
cli: add comprehensive benchmark test suites and sweeps (#107) (323a28a)
cli: add stress testing mode to benchmark command (#104) (530c82e)
controller: make init container image configurable (#128) (38ccdf0)
deps: upgrade to Kubernetes 1.35 and controller-runtime v0.23.1 (#175) (3c323f4)
expose llama.cpp parallel slots in InferenceService CRD (#133) (cae7b52)
gguf: add native Go GGUF parser with CRD integration and CLI inspect (#140) (9d96ed4)
helm: Add image digest support for production deployments (a38801d)
helm: add optional NetworkPolicy for controller manager (#135) (8d61ce3)
Implement automatic port forwarding for benchmark command (472b3ae)
inference: add flashAttention and contextSize to sample manifest (914c929), closes #145
Multi-GPU support with layer-based sharding (#47) (4797609)
Persistent model cache with per-namespace PVC support (ab04261)
Set up Helm repository on GitHub Pages (8d62737)
Support configurable context size for llama.cpp server (#73) (6f8e04b)
Support per-namespace model cache PVCs (c3cb891)
update model catalog with DeepSeek R1 and refresh stale entries (#131) (89eb5a6)

Bug Fixes

Add cacheKey to CRD and restrict cache to llmkube-system namespace (464c23d)
Add CRD keep policy and improve security test reliability (ff32296)
Add Helm chart publishing to release workflow (8baf9c4)
Add Helm chart publishing to release workflow (03bab72)
Add Homebrew archive IDs and v0.3.0 release notes (cea933b)
Address lint issues in benchmark command (bf80610)
Address linter errors in catalog implementation (8932e4f)
Address linter issues in Metal agent code (3f1f678)
agent: filter InferenceServices by Metal accelerator type (#157) (5737bb7)
agent: read contextSize from InferenceService CRD (#160) (17f58d4)
agent: unregister service endpoints on metal process delete (#168) (147b9bc)
ci: use go-version-file in release workflows (3b14fc8)
Clean up release process - single release with proper notes (#66) (4deae85)
cli: use numeric comparison for version checking (#109) (05e0025)
controller: Add Model watch to InferenceService controller (cb4e201)
controller: use fully qualified image names for curl (#121) (213660b)
Correct CLI binary path in E2E tests (41af555)
correct Metal quickstart docs for selectorless services (#173) (89471ec)
Don't mark Helm chart release as latest (#70) (761b154)
enable controller metrics endpoint in Helm chart (#195) (70940af)
Fix GoReleaser Homebrew tap configuration for v0.3.0 (4e95c04)
Further increase Helm CI timeout and readiness probe delay (5453d66)
Further increase Helm CI timeout and readiness probe delay (fd577d3)
Handle resp.Body.Close error in version check (linter) (fb3adf5)
Increase Helm chart CI timeout from 2m to 5m (7a08b45)
Increase Helm chart CI timeout from 2m to 5m (ced2210)
inference: pass value to --flash-attn for newer llama.cpp versions (#148) (25e08d0)
InferenceService stuck in Pending when Model becomes Ready (4d20aec)
Metal agent production fixes and testing improvements (8744c7b)
prevent command injection in init container shell commands (#172) (3aa9cc3)
prevent model re-download of cached models after helm upgrade (#203) (a8f9a88)
remove mutable latest tags and pin container images (#174) (3c4569a)
Resolve Helm chart CI test failures (9919696)
Resolve staticcheck SA5011 lint errors and update CONTRIBUTING.md (#60) (c0b5824)
Sanitize Service names for DNS-1035 compliance (v0.3.3) (db81990)
Sanitize Service names to comply with DNS-1035 requirements (b431986)
Set empty component to prevent llmkube- prefix in releases (#68) (45b61c6)
Skip containerized Deployment for Metal accelerator and add version check (d300e64)
Skip containerized Deployment for Metal accelerator and add version check (8dab955)
Suppress Endpoints API deprecation warnings (e70a4b3)
Trigger GoReleaser and Helm release from Release Please workflow (#64) (9a37a77)
Update operator deployment to use correct container image (00fee75)
Update operator deployment to use correct container image (4c67a78)
Update version.go to 0.2.1 and add automation for future releases (8dd613d)
Update version.go to 0.2.1 and add automation for future releases (2ff68bd)
use Recreate strategy for GPU workloads to prevent rolling update deadlock (#196) (2e45181)
Use simple v* tag format for releases (#62) (bda9f19)
Use workspace path for kubeconform validation (fc066d8)

Documentation

add Apple Silicon Metal option to bug report template (#169) (e7689d8)
Add CLI option to quick start, keep kubectl as fallback (f6829ee)
add community standards and security policy (#92) (e7c9cad)
add getting started video to README (#76) (ceb83d7)
Add Metal Agent (Apple Silicon) support to README (#151) (3579426)
Add release notes for v0.3.2 (177abf8)
Add release notes for v0.3.2 (ca1bb12)
Add release notes for v0.4.0 (144b960)
Add release notes for v0.4.0 (a61321f)
Overhaul README and roadmap for public launch (b42c17e)
rewrite README for clarity, positioning, and growth (#190) (a7fc152)
Update binary download links to version 0.2.1 (fad530a)
Update binary download links to version 0.2.1 (63bb0fa)
update documentation for v0.4.9 GPU scheduling features (#83) (0934e8f)
Update Helm installation to use GitHub Pages repository (477e037)
Update MODEL-CACHE.md for per-namespace PVC pattern (0be3f46)
update README and Metal Agent guide for remote K8s architecture (#156) (79145b2)

This PR was generated with Release Please. See documentation.

Defilan · 2026-03-05T03:54:57Z

Closing stale release-please PR. v0.5.0 was already released. Release-please will create a fresh PR on the next qualifying commit.

chore: release 0.4.22

38b6d30

github-actions bot added the autorelease: pending label Mar 4, 2026

Defilan closed this Mar 5, 2026

Defilan deleted the release-please--branches--main branch March 5, 2026 03:54

github-actions[bot] wants to merge 1 commit into main from release-please--branches--main
