
chore: release 0.4.22#207

Closed
github-actions[bot] wants to merge 1 commit into main from
release-please--branches--main

github-actions bot (Contributor) commented Mar 4, 2026

🚀 Release ${version}

0.4.22 (2026-03-04)

Features

  • add --jinja flag for tool/function calling support (#162) (47624ca)
  • add 32B models to catalog with --context flag (#88) (6c06602)
  • add air-gapped deployment support for local model paths (#85) (31fe8d0)
  • Add benchmark command and reorganize documentation (58307be)
  • Add benchmark command and reorganize documentation (ac8888e), closes #6
  • add custom CA support and fix deprecated image tags (#124) (5ec912e)
  • add GPU contention visibility, queue position, and priority classes (#81) (c0220e5)
  • add GPU observability config and Grafana dashboard (#105) (571643f)
  • Add Helm chart for easy installation (5718804)
  • Add Helm chart for easy installation with comprehensive CI testing (3ea3bfd), closes #9
  • add license compliance scanning for GGUF models (#188) (c26400a)
  • Add Metal GPU support for macOS (Apple Silicon) (f673c26), closes #33
  • Add model catalog with 10 pre-configured models (404d722)
  • Add model catalog with 10 pre-configured models (Phase 1) (0fd969a)
  • Add persistent model cache to avoid re-downloading (83f844f), closes #52
  • add Prometheus metrics, OpenTelemetry tracing, and inference observability (#189) (c653ff1)
  • add PVC inspection to cache list for orphaned entry detection (#183) (2723d92)
  • Add Release Please automation and version-agnostic docs (dc2d54e)
  • agent: add --host-ip flag for remote K8s cluster support (#155) (b425569)
  • agent: add structured zap logging to metal agent (#164) (e9d143c)
  • cli: add comprehensive benchmark test suites and sweeps (#107) (323a28a)
  • cli: add stress testing mode to benchmark command (#104) (530c82e)
  • controller: make init container image configurable (#128) (38ccdf0)
  • deps: upgrade to Kubernetes 1.35 and controller-runtime v0.23.1 (#175) (3c323f4)
  • expose llama.cpp parallel slots in InferenceService CRD (#133) (cae7b52)
  • gguf: add native Go GGUF parser with CRD integration and CLI inspect (#140) (9d96ed4)
  • helm: Add image digest support for production deployments (a38801d)
  • helm: add optional NetworkPolicy for controller manager (#135) (8d61ce3)
  • Implement automatic port forwarding for benchmark command (472b3ae)
  • inference: add flashAttention and contextSize to sample manifest (914c929), closes #145
  • Multi-GPU support with layer-based sharding (#47) (4797609)
  • Persistent model cache with per-namespace PVC support (ab04261)
  • Set up Helm repository on GitHub Pages (8d62737)
  • Support configurable context size for llama.cpp server (#73) (6f8e04b)
  • Support per-namespace model cache PVCs (c3cb891)
  • update model catalog with DeepSeek R1 and refresh stale entries (#131) (89eb5a6)

Bug Fixes

  • Add cacheKey to CRD and restrict cache to llmkube-system namespace (464c23d)
  • Add CRD keep policy and improve security test reliability (ff32296)
  • Add Helm chart publishing to release workflow (8baf9c4)
  • Add Helm chart publishing to release workflow (03bab72)
  • Add Homebrew archive IDs and v0.3.0 release notes (cea933b)
  • Address lint issues in benchmark command (bf80610)
  • Address linter errors in catalog implementation (8932e4f)
  • Address linter issues in Metal agent code (3f1f678)
  • agent: filter InferenceServices by Metal accelerator type (#157) (5737bb7)
  • agent: read contextSize from InferenceService CRD (#160) (17f58d4)
  • agent: unregister service endpoints on metal process delete (#168) (147b9bc)
  • ci: use go-version-file in release workflows (3b14fc8)
  • Clean up release process - single release with proper notes (#66) (4deae85)
  • cli: use numeric comparison for version checking (#109) (05e0025)
  • controller: Add Model watch to InferenceService controller (cb4e201)
  • controller: use fully qualified image names for curl (#121) (213660b)
  • Correct CLI binary path in E2E tests (41af555)
  • correct Metal quickstart docs for selectorless services (#173) (89471ec)
  • Don't mark Helm chart release as latest (#70) (761b154)
  • enable controller metrics endpoint in Helm chart (#195) (70940af)
  • Fix GoReleaser Homebrew tap configuration for v0.3.0 (4e95c04)
  • Further increase Helm CI timeout and readiness probe delay (5453d66)
  • Further increase Helm CI timeout and readiness probe delay (fd577d3)
  • Handle resp.Body.Close error in version check (linter) (fb3adf5)
  • Increase Helm chart CI timeout from 2m to 5m (7a08b45)
  • Increase Helm chart CI timeout from 2m to 5m (ced2210)
  • inference: pass value to --flash-attn for newer llama.cpp versions (#148) (25e08d0)
  • InferenceService stuck in Pending when Model becomes Ready (4d20aec)
  • Metal agent production fixes and testing improvements (8744c7b)
  • prevent command injection in init container shell commands (#172) (3aa9cc3)
  • prevent model re-download of cached models after helm upgrade (#203) (a8f9a88)
  • remove mutable latest tags and pin container images (#174) (3c4569a)
  • Resolve Helm chart CI test failures (9919696)
  • Resolve staticcheck SA5011 lint errors and update CONTRIBUTING.md (#60) (c0b5824)
  • Sanitize Service names for DNS-1035 compliance (v0.3.3) (db81990)
  • Sanitize Service names to comply with DNS-1035 requirements (b431986)
  • Set empty component to prevent llmkube- prefix in releases (#68) (45b61c6)
  • Skip containerized Deployment for Metal accelerator and add version check (d300e64)
  • Skip containerized Deployment for Metal accelerator and add version check (8dab955)
  • Suppress Endpoints API deprecation warnings (e70a4b3)
  • Trigger GoReleaser and Helm release from Release Please workflow (#64) (9a37a77)
  • Update operator deployment to use correct container image (00fee75)
  • Update operator deployment to use correct container image (4c67a78)
  • Update version.go to 0.2.1 and add automation for future releases (8dd613d)
  • Update version.go to 0.2.1 and add automation for future releases (2ff68bd)
  • use Recreate strategy for GPU workloads to prevent rolling update deadlock (#196) (2e45181)
  • Use simple v* tag format for releases (#62) (bda9f19)
  • Use workspace path for kubeconform validation (fc066d8)
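
To illustrate the "cli: use numeric comparison for version checking (#109)" entry above, here is a minimal Go sketch of why comparing version strings lexically goes wrong and how a numeric comparison avoids it. This is not the project's actual implementation: the helper names `compareVersions` and `part` are hypothetical, and the sketch assumes plain dotted numeric versions with an optional `v` prefix (no pre-release suffixes).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compareVersions is a hypothetical helper (an assumption for illustration,
// not the project's code). It returns -1, 0, or 1 when a is lower than,
// equal to, or higher than b, comparing each dotted component as an integer.
func compareVersions(a, b string) int {
	as := strings.Split(strings.TrimPrefix(a, "v"), ".")
	bs := strings.Split(strings.TrimPrefix(b, "v"), ".")
	n := len(as)
	if len(bs) > n {
		n = len(bs)
	}
	for i := 0; i < n; i++ {
		x, y := part(as, i), part(bs, i)
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
	}
	return 0
}

// part returns the i-th component as an int. Missing or non-numeric
// components are treated as 0, which is a simplifying assumption.
func part(parts []string, i int) int {
	if i >= len(parts) {
		return 0
	}
	n, err := strconv.Atoi(parts[i])
	if err != nil {
		return 0
	}
	return n
}

func main() {
	// Lexical comparison sorts "0.4.9" after "0.4.22" because '9' > '2'.
	fmt.Println("0.4.9" < "0.4.22")
	// Numeric comparison orders the same pair correctly.
	fmt.Println(compareVersions("0.4.9", "0.4.22"))
}
```

A production version check would more likely rely on a dedicated semantic-versioning library so that pre-release and build metadata are handled as well.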

Documentation

  • add Apple Silicon Metal option to bug report template (#169) (e7689d8)
  • Add CLI option to quick start, keep kubectl as fallback (f6829ee)
  • add community standards and security policy (#92) (e7c9cad)
  • add getting started video to README (#76) (ceb83d7)
  • Add Metal Agent (Apple Silicon) support to README (#151) (3579426)
  • Add release notes for v0.3.2 (177abf8)
  • Add release notes for v0.3.2 (ca1bb12)
  • Add release notes for v0.4.0 (144b960)
  • Add release notes for v0.4.0 (a61321f)
  • Overhaul README and roadmap for public launch (b42c17e)
  • rewrite README for clarity, positioning, and growth (#190) (a7fc152)
  • Update binary download links to version 0.2.1 (fad530a)
  • Update binary download links to version 0.2.1 (63bb0fa)
  • update documentation for v0.4.9 GPU scheduling features (#83) (0934e8f)
  • Update Helm installation to use GitHub Pages repository (477e037)
  • Update MODEL-CACHE.md for per-namespace PVC pattern (0be3f46)
  • update README and Metal Agent guide for remote K8s architecture (#156) (79145b2)

This PR was generated with Release Please. See documentation.

Defilan (Member) commented Mar 5, 2026

Closing stale release-please PR. v0.5.0 was already released. Release-please will create a fresh PR on the next qualifying commit.

Defilan closed this Mar 5, 2026
Defilan deleted the release-please--branches--main branch March 5, 2026 03:54