v0.1.0 — Hanzo Engine

zeekay released this 25 Feb 08:11

08e95f6

Initial release of Hanzo Engine — high-performance cloud inference engine.

Features

60+ model architectures (Llama, Qwen, Phi, Gemma, Mistral, etc.)
CUDA 12+, Metal, and CPU backends
Paged attention and continuous batching
Speculative decoding and tensor parallelism
OpenAI-compatible REST API
Docker and Kubernetes native deployment

Links

Documentation: https://engine.hanzo.ai
Docker: ghcr.io/hanzoai/engine

Assets 2