Introduce float 8 types#14731

Merged
askhade merged 254 commits into
microsoft:mainfrom
xadupre:f8
May 30, 2023

@xadupre xadupre commented Feb 17, 2023

Member

Description

The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, and FloatE5M2FNUZ as described in PR onnx/onnx#4805. It uses the CUDA API to cast float/half to float8 if CUDA>=11.8, and a custom implementation if CUDA<11.8.

  • It implements Cast, QuantizeLinear, and DequantizeLinear for all four types on CPU, and only for FloatE4M3FN and FloatE5M2 on CUDA.
  • It extends the supported types for Shape, Reshape, Identity, and the control-flow operators If, Loop, and Scan.
  • It implements Equal(19).
  • Cast, QuantizeLinear, and DequantizeLinear now support a saturate parameter that is only valid for float 8 types. It is true by default, in which case any out-of-range value is converted to the maximum float 8 value. If false, an out-of-range value becomes infinite (or NaN for the float 8 types that cannot represent infinity, as the ONNX Cast specification describes).
  • QuantizeLinear and DequantizeLinear now support multiple scales on CUDA (and ROCm by extension): the scale is a 1D tensor with one scale per channel.
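
The saturate and per-channel scale behaviour described in the list above can be illustrated with a small pure-Python sketch. It is not the code from this PR: the function names `round_to_e4m3fn`, `quantize_per_channel`, and `dequantize_per_channel` are hypothetical, float 8 values are held as ordinary Python floats rather than packed bytes, and the overflow rule (round first, then compare against 448) is a simplifying assumption. It also assumes that FloatE4M3FN, which has no infinity encoding, yields NaN when saturate is false.

```python
import math

# Largest finite FloatE4M3FN value (bit pattern S.1111.110 = 1.75 * 2**8).
E4M3FN_MAX = 448.0


def round_to_e4m3fn(x: float, saturate: bool = True) -> float:
    """Return the FloatE4M3FN-representable value nearest to x, as a Python float.

    Illustrative only: assumes overflow maps to +/-448 when saturate is True
    and to NaN otherwise, because E4M3FN cannot encode infinity.
    """
    if math.isnan(x):
        return math.nan
    if x == 0.0:
        return x  # keeps the sign of zero
    sign = math.copysign(1.0, x)
    a = abs(x)
    overflow = sign * E4M3FN_MAX if saturate else math.nan
    if math.isinf(a):
        return overflow
    # frexp gives a = m * 2**ex with 0.5 <= m < 1, so floor(log2(a)) == ex - 1.
    _, ex = math.frexp(a)
    # Exponents below -6 fall in the subnormal range, which shares one spacing.
    e = max(ex - 1, -6)
    step = 2.0 ** (e - 3)  # 3 mantissa bits -> 8 steps per power of two
    q = round(a / step) * step  # Python's round() rounds halves to even
    return overflow if q > E4M3FN_MAX else sign * q


def quantize_per_channel(rows, scales, saturate=True):
    """Divide each row (channel) by its own scale, then round to E4M3FN."""
    return [[round_to_e4m3fn(v / s, saturate) for v in row]
            for row, s in zip(rows, scales)]


def dequantize_per_channel(rows, scales):
    """Multiply each row (channel) back by its own scale."""
    return [[v * s for v in row] for row, s in zip(rows, scales)]


if __name__ == "__main__":
    data = [[1.0, 900.0], [0.5, 3.0]]
    scales = [2.0, 0.5]  # one scale per channel (row)
    q = quantize_per_channel(data, scales)
    print(q)
    print(dequantize_per_channel(q, scales))
```

In this sketch the value 900.0 with scale 2.0 becomes 450.0, which rounds to the maximum 448.0, so dequantizing gives 896.0 rather than the original input. Saturation therefore loses information silently, whereas saturate=False would surface the overflow as a non-finite value.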

Motivation and Context

Supports latest onnx version.

Fixes AB#15395

@xadupre xadupre requested a review from a team as a code owner February 17, 2023 13:49
Comment thread onnxruntime/test/python/onnxruntime_test_float8.py Fixed
edgchen1 previously approved these changes May 26, 2023
Comment thread include/onnxruntime/core/framework/float8.h
@askhade askhade merged commit e726151 into microsoft:main May 30, 2023
snnn commented Jun 1, 2023

Contributor

@xadupre, the "Windows GPU Reduced Ops CI Pipeline" fails since this change. Would you please help fix it?

siweic0 pushed a commit to siweic0/onnxruntime-web that referenced this pull request May 9, 2024
Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
hariharans29 added a commit that referenced this pull request Sep 3, 2025
### Description
onnx/onnx#6318 and
onnx/onnx#6283 added FP4 support to ONNX. This
change introduces the FP4 type in ORT and adds type support to one
relevant operator (`Cast`) as a proof of concept for integrating the
type into ORT. Support for more ops will be added as needed.

This change took inspiration from the following PRs:

#14731
#22228
#20362 

Some notes:

1) Only `tensor` type gets support for FP4 initially. Secondary types
like `seq(tensor)`, `sparse_tensor`, `optional` do not get support (so
as to not introduce unnecessary bloat to the framework without a solid
use-case)

2) Flatbuffer related files receive no updates in this PR

### Motivation and Context
Be able to run FP4 models with ORT
TedThemistokleous pushed a commit to ROCm/onnxruntime that referenced this pull request Oct 1, 2025
TedThemistokleous added a commit to ROCm/onnxruntime that referenced this pull request Oct 17, 2025
* Support fp4 type in ORT  (microsoft#25767)

* [Core] Fix debug node input output compilation after Fp4 support was enabled in ORT (microsoft#25940)

### Description
As title

### Motivation and Context
Follow-up fixes to microsoft#25767

* Link FP4 types between OnnxRT and MIGraphX APIs

Do this so that MIGraphX can take in fp4 types from input/output tensors and then use that to perform an inference via the MIGraphX API.

---------

Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
TedThemistokleous added a commit to ROCm/onnxruntime that referenced this pull request Nov 19, 2025
adrastogi pushed a commit that referenced this pull request Jan 5, 2026