Olive-ai 0.9.2

@xiaoyu-work xiaoyu-work released this 07 Aug 17:52

New Features

  • Selective Mixed Precision. (#1898)
  • Native GPTQ Implementation with support for Selective Mixed Precision. (#1949)
  • Blockwise RTN Quantization for ONNX models. (#1899)
  • Ability to add custom metadata in ONNX model. (#1900)
  • New simplified olive optimize CLI command and the olive.quantize() Python API for effortless model optimization with minimal developer input. See CLI usage and Python API docs for more details. (#1996)
  • New olive run-pass CLI command gives advanced users the ability to run individual passes. (#1904)
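
For readers unfamiliar with the term, the idea behind blockwise round-to-nearest (RTN) quantization can be sketched in a few lines of plain Python. This is an illustration of the general technique only, not Olive's implementation: the function names, the symmetric signed-integer scheme, and the default block size are all assumptions made for the example.

```python
from typing import List, Tuple


def quantize_rtn_blockwise(
    weights: List[float], block_size: int = 4, bits: int = 4
) -> Tuple[List[int], List[float]]:
    """Symmetric blockwise RTN: one scale per block of `block_size` weights.

    Hypothetical helper for illustration; not part of the Olive API.
    """
    qmax = 2 ** (bits - 1) - 1   # largest signed value, e.g. 7 for 4 bits
    qmin = -(2 ** (bits - 1))    # smallest signed value, e.g. -8 for 4 bits
    q_values: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # Map the largest magnitude in the block to qmax; an all-zero
        # block gets a dummy scale of 1.0 to avoid dividing by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        for w in block:
            # Python's round() rounds to the nearest integer (ties to even).
            q = round(w / scale)
            q_values.append(max(qmin, min(qmax, q)))
    return q_values, scales


def dequantize_blockwise(
    q_values: List[int], scales: List[float], block_size: int = 4
) -> List[float]:
    """Reconstruct approximate weights from integers and per-block scales."""
    return [q * scales[i // block_size] for i, q in enumerate(q_values)]


if __name__ == "__main__":
    w = [0.7, -0.35, 0.1, 0.0, 7.0, -3.5, 1.0, 0.0]
    q, s = quantize_rtn_blockwise(w, block_size=4, bits=4)
    print("quantized:", q)
    print("scales:   ", s)
    print("restored: ", dequantize_blockwise(q, s, block_size=4))
```

Because each block carries its own scale, a block of small weights is not forced onto the coarse grid needed by a block containing large weights, which is the main reason to quantize blockwise rather than with a single scale per tensor.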

New Integrations

  • GPTQModel. (#1999)
  • AIMET. (#2028) This is a work in progress.
  • ONNX model support while targeting OpenVINO. (#2019)
  • QuarkQuantization: AMD Quark quantization for LLMs. (#2010)
  • VitisGenerateModelLLM for optimized LLM model generation for Vitis AI Execution Provider. (#2010)

Improvements

  • New graph surgeries including dla transformers, DecomposeRotaryEmbedding and DecomposeQuickGelu. (#2018, #1972, #2000)
  • Exposed WorkflowOutput in Python API and added unified APIs for CLI commands. (#1907)
  • Refactored Docker system for simplified setup and execution. (#1990)
  • ExtractAdapters:
    • Added support for DoRA and LoHa adapters. (#1611)
  • NVMO quantization:
    • Exposed more configurable parameters (nodes_to_exclude, save_external_data, calibration_params, calibration_providers) and added int4_block_size support. Added the RTN algorithm. (#2004, #1985)
  • OnnxPeepholeOptimizer:
    • Removed fuse_transpose_qat and patch_unsupported_argmax_operator. (#1976)

Deprecation

Azure ML support will be deprecated in the next release. This includes:

  • Azure ML system
  • Azure ML workspace model
  • Remote workflow

Recipes Migration

All recipes are being migrated to the olive-recipes repository. New recipes will be added and maintained there going forward.