Hanzo ML is the official ML framework for the Hanzo ecosystem. It is based on Hugging Face's Candle, with optimizations for edge AI and multimodal workloads and first-class integration with Hanzo Engine.
```
~/work/hanzo/
├── ml/       # hanzoai/ml - ML framework (based on HF candle)
├── engine/   # hanzoai/engine - Inference engine (based on mistral-rs)
├── jin/      # Jin multimodal models
└── llm/      # LLM Gateway proxy
```
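If the workspace is not set up yet, the two core repositories can be cloned into this layout. A minimal sketch, assuming the GitHub URLs implied by the Cargo dependencies and issue trackers in this document (`jin/` and `llm/` are omitted because their repo URLs are not given here):

```bash
# Create the workspace root and clone the two core repos
mkdir -p ~/work/hanzo && cd ~/work/hanzo
git clone https://github.com/hanzoai/ml.git ml
git clone https://github.com/hanzoai/engine.git engine
```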
Add to `~/work/hanzo/engine/Cargo.toml`:

```toml
[dependencies]
hanzo-ml = { git = "https://github.com/hanzoai/ml", branch = "main" }
hanzo-nn = { git = "https://github.com/hanzoai/ml", branch = "main" }
hanzo-transformers = { git = "https://github.com/hanzoai/ml", branch = "main" }
```

Both projects expose a consistent set of feature flags:

```toml
[features]
default = ["metal"]
metal = ["hanzo-ml/metal", "hanzo-nn/metal"]
cuda = ["hanzo-ml/cuda"]
mkl = ["hanzo-ml/mkl"]
accelerate = ["hanzo-ml/accelerate"]
```

```rust
use hanzo_ml_core::{Device, Tensor};
use hanzo_ml_transformers::models::llama::LlamaConfig;

// Load the model with Hanzo ML on the first Metal device
let device = Device::new_metal(0)?;
let model = LlamaConfig::load(&device, &config_path)?;

// Hand the loaded model to the mistral-rs pipeline
// (`config_path` and `tokenizer` are assumed to be in scope)
let pipeline = Pipeline::new(model, tokenizer)?;
```
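When the target device is not fixed at compile time, a fallback chain is a common pattern. A minimal sketch, assuming `hanzo_ml_core` mirrors Candle's `Device`/`Tensor` API (the `pick_device` helper is hypothetical, not part of the published API):

```rust
use hanzo_ml_core::{DType, Device, Result, Tensor};

// Hypothetical helper: prefer Metal, then CUDA, then fall back to CPU.
// In Candle, the backend constructors return Err when the corresponding
// feature is disabled, so this compiles under any feature set.
fn pick_device() -> Result<Device> {
    if let Ok(d) = Device::new_metal(0) {
        return Ok(d);
    }
    if let Ok(d) = Device::new_cuda(0) {
        return Ok(d);
    }
    Ok(Device::Cpu)
}

fn main() -> Result<()> {
    let device = pick_device()?;
    // Smoke test: allocate a small tensor on the chosen device
    let x = Tensor::zeros((2, 3), DType::F32, &device)?;
    println!("device ok, tensor shape: {:?}", x.shape());
    Ok(())
}
```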
Both frameworks support:

- AFQ (Affine Quantization) - Optimized for Metal/Apple Silicon
- GGUF/GGML - Universal quantization format
- GPTQ/AWQ - GPU-optimized quantization
- In-Situ Quantization (ISQ) - Runtime quantization
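The sync workflow below assumes an `upstream` remote pointing at HF Candle. A one-time setup sketch, assuming the public Candle repository is the upstream:

```bash
cd ~/work/hanzo/ml
# One-time: track HF candle as `upstream`
git remote add upstream https://github.com/huggingface/candle.git
git fetch upstream
```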
```bash
cd ~/work/hanzo/ml
git fetch upstream
git merge upstream/main  # Merge HF candle updates
cargo test --workspace
git push origorigin main
```

Then pull the updated crates into the engine:

```bash
cd ~/work/hanzo/engine
cargo update hanzo-ml hanzo-nn hanzo-transformers
cargo test
```

And run a quick inference check:

```bash
cd ~/work/hanzo/engine
cargo run --features metal --release -- \
  -i --isq 4 plain -m meta-llama/Llama-3.2-3B-Instruct
```
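Since the engine is based on mistral-rs, loading a pre-quantized GGUF checkpoint should follow the same CLI shape as mistral-rs's `gguf` subcommand. A hedged sketch with placeholder names:

```bash
cd ~/work/hanzo/engine
# Hypothetical invocation mirroring mistral-rs conventions;
# substitute a real repo/directory and quantized file name.
cargo run --features metal --release -- \
  -i gguf -m <hf-repo-or-local-dir> -f <model-file>.gguf
```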
The framework publishes these crates:

- `hanzo-ml` - Core tensor operations
- `hanzo-nn` - Neural network layers
- `hanzo-transformers` - Transformer models
- `hanzo-datasets` - Dataset utilities
- `hanzo-ml-pyo3` - Python bindings
```bash
cd ~/work/hanzo/ml
cargo release --workspace minor
git push --tags
```

Current sync status:

- HF Candle: `a2029da3` (Jan 2025) - features added: SmolLM3, Qwen3 WASM, Mamba2, PaddleOCR-VL
- ✅ Metal backend support
- ✅ AFQ quantization compatibility
- ✅ SIMD optimizations
- ✅ Memory introspection
- 🔄 Jin model integration (in progress)
**Apple Silicon (Metal)**
- Use AFQ4 quantization for best performance
- Enable `--features "metal accelerate"`
- Set group size to 64 for balanced speed/accuracy

**NVIDIA GPUs (CUDA)**
- Use GPTQ or AWQ quantization
- Enable Flash Attention for long sequences
- Use PagedAttention for memory efficiency

**CPU**
- Use GGUF models with appropriate quantization
- Enable the `mkl` feature for Intel optimizations
- Consider `accelerate` on Apple platforms
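These platform choices map onto the feature flags defined earlier. Because `metal` is the default feature, non-Apple builds need `--no-default-features`. A sketch using standard cargo flags:

```bash
# Apple Silicon: Metal plus Accelerate
cargo build --release --features "metal accelerate"

# NVIDIA GPUs: drop the default `metal` feature, enable CUDA
cargo build --release --no-default-features --features cuda

# Intel CPUs: drop defaults, enable MKL
cargo build --release --no-default-features --features mkl
```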
```bash
# Clean and rebuild
cd ~/work/hanzo/ml
cargo clean
cargo build --workspace

# Check feature alignment
cd ~/work/hanzo/engine
cargo tree | grep hanzo-ml
```
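If `cargo tree` shows duplicate or mismatched `hanzo-ml` revisions, the inverse tree pinpoints which dependents pull in each copy (these are standard cargo flags):

```bash
cd ~/work/hanzo/engine
# Which crates depend on hanzo-ml, and at what version?
cargo tree -i hanzo-ml
# Re-resolve hanzo-ml to a single, current revision
cargo update hanzo-ml
```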
```bash
# Metal validation
cd ~/work/hanzo/engine
cargo run --features metal -- --help

# Check device detection
RUST_LOG=debug cargo run --features metal
```

- Model Format Standardization - Universal model interchange
- Joint Training Pipeline - Train models for both frameworks
- Distributed Inference - Multi-device model serving
- WebAssembly Optimization - Browser-based inference
- MCP Integration - Model Context Protocol support
For issues with Hanzo ML integration:
- GitHub Issues: hanzoai/ml
- Engine Issues: hanzoai/engine