Summary

Update the Rust specification to allow floating-point operations to provide more precision than specified, but not less precision; this allows many safe optimizations. Specify robust mechanisms to disable this behavior.

Motivation

Some platforms provide instructions to run a series of floating-point operations quickly, such as fused multiply-add instructions; using these instructions can provide performance wins of 2x or more. These instructions may provide more precision than required by IEEE floating-point operations, such as by doing multiple operations before rounding or losing precision. Similarly, high-performance floating-point code could perform multiple operations with higher-precision floating-point registers before converting back to a lower-precision format.

In general, providing more precision than required will only bring a calculation closer to the mathematically precise answer, never further away.

This RFC proposes allowing floating-point types to perform intermediate calculations using more precision than the type itself, as long as they provide at least as much precision as the IEEE 754 standard requires.

See the prior art section for precedent in several other languages and compilers.

Guide-level explanation

Floating-point operations in Rust have a guaranteed minimum accuracy, which specifies how far the result may differ from an infinitely accurate, mathematically exact answer. The implementation of Rust for any target platform must provide at least that much accuracy. In some cases, Rust can perform operations with higher accuracy than required, and doing so provides greater performance (such as by removing intermediate rounding steps).
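As a rough illustration of what "removing intermediate rounding steps" means, the sketch below compares an expression written with ordinary operators against the same expression written with the standard library's `f64::mul_add`, which rounds only once. The function names `unfused` and `fused` are hypothetical and chosen only for this example; the inputs suggested afterwards are picked so that the exact answer is known.

```rust
/// Computes `x * x + c` with ordinary operators: the product is rounded
/// to `f64` first, and the sum is rounded again.
fn unfused(x: f64, c: f64) -> f64 {
    x * x + c
}

/// Computes `x * x + c` with a single rounding at the end, using the
/// standard library's fused multiply-add.
fn fused(x: f64, c: f64) -> f64 {
    x.mul_add(x, c)
}
```

Take `eps = 1.0 / (1u64 << 30) as f64` (that is, 2^-30), `x = 1.0 + eps`, and `c = -(1.0 + 2.0 * eps)`. The mathematically exact value of `x * x + c` is then `eps * eps` (2^-60). On a target that rounds to binary64 after each operation, `unfused(x, c)` returns `0.0`, because the 2^-60 term is lost when the product is rounded, while `fused(x, c)` returns the exact answer `eps * eps`.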

A note for users of other languages: this is not the equivalent of the "fast math" option provided by some compilers. Unlike such options, this behavior will never make any floating-point operation less accurate, but it can make floating-point operations more accurate, making the result closer to the mathematically exact answer.

Due to differences in hardware, in platform libm implementations, and various other factors, Rust cannot fully guarantee identical results on all target platforms. (Doing so on all platforms would incur a massive performance loss.) However, with some additional care, applications desiring cross-platform identical results can potentially achieve that on multiple target platforms. In particular, applications prioritizing identical, portable results across two or more target platforms can disable extra floating-point precision entirely.

Reference-level explanation

Currently, Rust's specification for floating-point types states only that:

The IEEE 754-2008 "binary32" and "binary64" floating-point types are f32 and f64, respectively.

This RFC proposes updating that definition as follows:

The f32 and f64 types represent the IEEE 754-2008 "binary32" and "binary64" floating-point types. Operations on those types must provide at least as much precision as the IEEE standard requires; such operations may provide more precision than the standard requires, such as by doing a series of operations with higher precision before storing a value of the desired precision.

rustc should provide a codegen (-C) option to disable this behavior, such as -C extra-fp-precision=off; compiling with this option will disable extra precision in all crates compiled into an application. (Cargo should provide a means of specifying this option.) Rust should also provide attributes to disable this behavior from within code, such as #[extra_fp_precision(off)]; this attribute will disable extra precision within the module or function it is applied to. On platforms that do not currently implement disabling extra precision, the codegen option and attribute should produce an error (not a warning), to avoid surprises.

In addition, because this change makes extra floating-point precision visible on more platforms, the Rust release notes, documentation, and similar channels should explicitly discuss the issue of extra floating-point precision and how to disable it. Furthermore, this change should not become part of a stable Rust release until at least eight stable releases after it first becomes implemented in the nightly compiler.

Drawbacks

If Rust already provided bit-for-bit identical floating-point computations across platforms, then this change could potentially allow floating-point computations to differ (in the amount of additional accuracy beyond the standard's requirements) by platform, enabled target features (e.g. instruction sets), or optimization level.

However, standards-compliant implementations of operations on floating-point values can and do already vary slightly by platform, sufficiently so to produce different binary results; in particular, floating-point operations in Rust can already produce more precise results depending on target platform, optimization level, the target's libm library, and the version of the target libm. As with that existing behavior, this proposal can never make results less accurate; it can only make results more accurate. Nonetheless, this change potentially introduces such variations on target platforms that did not previously have them.

Rationale and alternatives

For the attribute and codegen option, we could allow code to opt in via attribute even if disabled via codegen, and then provide a force-off codegen option to override that. This would have two serious downsides, however: it would perpetuate the perception of extra floating-point precision as an unsafe optimization that requires opting into, and it would make life more difficult for people who wish to opt out of this behavior and attempt to achieve identical results on multiple target platforms. This RFC recommends the simpler approach of not providing an enablement option via attribute, such that the codegen option always force-disables extra precision everywhere.

We could provide an option to enable extra accuracy for the default floating-point types, but disable it by default. This would leave the majority of floating-point code unable to use these optimizations, however; defaults matter, and the majority of code seems likely to use the defaults. In addition, permitting extra floating-point precision by default would match the existing behavior of Rust, and would allow the Rust compiler to assume that code explicitly disabling extra precision has a specific requirement to do so and depends on that behavior. Nonetheless, this alternative would still provide the option to produce more optimized code, making it preferable over doing nothing. This alternative would necessitate respecifying the codegen option and attribute to support enabling it, as well as having a force-off codegen option to override enablement via the attribute.

We could provide a separate set of types and allow extra accuracy in their operations; however, this would create API incompatibilities between floating-point functions, and the longer, less-well-known types seem unlikely to see widespread use. Furthermore, allowing or disallowing extra accuracy seems more closely a property of the calculation than a property of the type.

We could provide additional methods for floating-point operations that allow passing additional flags, including floating-point contraction. The compiler could then fuse and otherwise optimize such operations. However, this would make optimized floating-point code substantially less ergonomic, due to the inability to use operators. To enable operators, we could additionally implement wrapper types, as above, with the same upsides and downsides.

We could do nothing, and require code to use a.mul_add(b, c) for optimization; however, this would not allow for similar future optimizations, and would not allow code to easily enable this optimization without substantial code changes.
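To make the "substantial code changes" concrete, here is a small sketch of the same polynomial evaluation (Horner's rule) written twice: once with operators, and once rewritten by hand to call `mul_add` at every step. The function names and the coefficient ordering are assumptions made for this example only.

```rust
/// Evaluates a polynomial with Horner's rule using ordinary operators.
/// `coeffs` is ordered from the highest-degree term down to the constant term.
fn horner(coeffs: &[f64], x: f64) -> f64 {
    coeffs.iter().fold(0.0_f64, |acc, &c| acc * x + c)
}

/// The same evaluation with every step rewritten by hand as a fused
/// multiply-add via `f64::mul_add`.
fn horner_fused(coeffs: &[f64], x: f64) -> f64 {
    coeffs.iter().fold(0.0_f64, |acc, &c| acc.mul_add(x, c))
}
```

For inputs where no rounding occurs, such as `horner(&[1.0, 2.0, 3.0], 2.0)`, both versions return `11.0`. The point of the comparison is the source-level difference: every `acc * x + c` in existing code would need to be found and rewritten by hand to obtain the fused form.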

We could narrow the scope of optimization opportunities to only include floating-point contraction but not any other precision-increasing operations. See the future possibilities section for further discussion on this point.

Prior art

This has precedent in several other languages and compilers:

  • C11 allows extra floating-point precision with the STDC FP_CONTRACT pragma enabled, and the default state of that pragma is implementation-defined. GCC, ICC, MSVC, and some other C compilers enable this behavior by default; Clang disables it by default, though some downstream users of Clang re-enable it system-wide.

  • The C++ standard states that "The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby."

  • The Fortran standard states that "the processor may evaluate any mathematically equivalent expression", where "Two arithmetic expressions are mathematically equivalent if, for all possible values of their primaries, their mathematical values are equal. However, mathematically equivalent arithmetic expressions may produce different computational results."

Future possibilities

The initial implementation of this RFC can simply enable floating-point contraction within LLVM (and equivalent options in future codegen backends). However, this RFC also allows other precision-increasing optimizations; in particular, this RFC would allow the implementation of f32 or future f16 formats using higher-precision registers, without having to apply rounding after each operation.
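The effect of keeping `f32` intermediates in a wider format can be emulated by hand today. The sketch below is only an illustration under that assumption, not a description of what any compiler backend does: `sum_narrow` and `sum_wide` are hypothetical names, and the `f64` accumulator stands in for a higher-precision register.

```rust
/// Sums `f32` values, rounding to `f32` after every addition.
fn sum_narrow(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0_f32, |acc, &x| acc + x)
}

/// Sums the same values in an `f64` accumulator and rounds to `f32`
/// only once, when the final result is stored.
fn sum_wide(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0_f64, |acc, &x| acc + f64::from(x)) as f32
}
```

With the input `[16_777_216.0, 1.0, 1.0]` (2^24 followed by two ones), the exact sum is 16777218, which `f32` can represent. `sum_narrow` returns `16777216.0`, because each `+ 1.0` lands halfway between two representable `f32` values and is rounded back down, while `sum_wide` returns the exact `16777218.0`.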