Olive-ai 0.9.2
New Features:
- Selective Mixed Precision. (#1898)
- Native GPTQ Implementation with support for Selective Mixed Precision. (#1949)
- Blockwise RTN Quantization for ONNX models. (#1899)
- Ability to add custom metadata in ONNX model. (#1900)
- New simplified `olive optimize` CLI command and `olive.quantize()` Python API for effortless model optimization with minimal developer input. See the CLI usage and Python API docs for more details. (#1996)
- New command line `olive run-pass` gives advanced users the ability to run individual passes. (#1904)
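As a rough illustration of the new commands, the sketch below shows how they might be invoked. The model path, output path, and flag names here are assumptions for illustration only, not taken from the release notes; check `olive optimize --help` and `olive run-pass --help` for the actual interface.

```shell
# Hypothetical usage sketch -- flag names and values are assumptions,
# not documented options. Consult the CLI docs for the real interface.

# Simplified one-shot optimization of a model (new in 0.9.2):
olive optimize -m my-org/my-model -o ./optimized-model

# Advanced: run a single pass directly (new in 0.9.2):
olive run-pass --help
```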
New Integrations
- GPTQModel. (#1999)
- AIMET (#2028). This is a work in progress.
- ONNX model support while targeting OpenVINO. (#2019)
- `QuarkQuantization`: AMD Quark quantization for LLMs. (#2010)
- `VitisGenerateModelLLM`: optimized LLM model generation for the Vitis AI Execution Provider. (#2010)
Improvements
- New graph surgeries, including `dla transformers`, `DecomposeRotaryEmbedding`, and `DecomposeQuickGelu`. (#2018, #1972, #2000)
- Exposed `WorkflowOutput` in the Python API and added unified APIs for CLI commands. (#1907)
- Refactored the Docker system for simplified setup and execution. (#1990)
- ExtractAdapters:
  - Added support for DORA and LoHA adapters. (#1611)
- NVMO quantization:
- OnnxPeepholeOptimizer:
  - Removed `fuse_transpose_qat` and `patch_unsupported_argmax_operator`. (#1976)
Deprecation
Azure ML support will be deprecated in the next release, including:
- Azure ML system
- Azure ML workspace model
- Remote workflow
Recipes Migration
All recipes are being migrated to the olive-recipes repository. New recipes will be added and maintained there going forward.