
fix: correct subnormal float/double decode exponent #318

Open
rophy wants to merge 1 commit into bersler:master from rophy:fix/subnormal-decode
@rophy rophy commented Mar 20, 2026

Summary

IEEE 754 subnormal (denormalized) values are decoded with the wrong exponent,
producing half the correct value.

For subnormals (exponent field == 0), the effective exponent should be 1 - bias,
not 0 - bias. The current code falls through without adjusting the exponent,
resulting in values shifted by a factor of 2.
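
To make the exponent rule concrete, here is a minimal sketch that decodes raw IEEE 754 binary32 bits by hand and compares the result with the hardware interpretation. The names `decodeBinary32` and `referenceBinary32` are hypothetical helpers written for this illustration and are not part of OLR; infinities and NaNs are deliberately ignored.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: decode raw IEEE 754 binary32 bits by hand.
// subnormalExponent is the effective exponent field used when the stored
// exponent field is 0: pass 1 for the correct rule, 0 for the buggy one.
// Infinities and NaNs (exponent field 0xFF) are not handled in this sketch.
static double decodeBinary32(uint32_t bits, int subnormalExponent) {
    const int bias = 127;
    const bool negative = (bits >> 31) != 0;
    const int exponentField = static_cast<int>((bits >> 23) & 0xFF);
    // Fraction scaled into [0, 1): 23 stored bits divided by 2^23.
    double significand = static_cast<double>(bits & 0x7FFFFFu) / 8388608.0;
    int exponent;
    if (exponentField == 0) {
        // Subnormal: no implicit leading 1, effective exponent is 1 - bias.
        exponent = subnormalExponent - bias;
    } else {
        // Normal: implicit leading 1, exponent is field - bias.
        significand += 1.0;
        exponent = exponentField - bias;
    }
    const double value = std::ldexp(significand, exponent);
    return negative ? -value : value;
}

// Hypothetical helper: let the platform interpret the same bits as a float.
static double referenceBinary32(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return static_cast<double>(f);
}
```

With bits `0x00000001` (the smallest positive subnormal), `decodeBinary32(bits, 1)` should match `referenceBinary32(bits)`, while `decodeBinary32(bits, 0)` yields half of it.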

Example

BINARY_FLOAT 1.401298e-45 (smallest positive subnormal) decodes as 7.006492e-46.

Fix

Set exponent = 1 before subtracting the bias in the subnormal path, for both
positive and negative cases of decodeFloat() and decodeDouble().
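
The shape of the fix can be sketched as follows. This is an illustration only, not the code from Builder.cpp: `decodeFloatSketch` is a hypothetical name, and the byte layout is an assumption (big-endian, with the sign bit flipped for positive values as in the test below, and all bits inverted for negative values). Infinities and NaNs are not handled.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical sketch of a 4-byte float decode with the subnormal fix applied.
static double decodeFloatSketch(const uint8_t* data) {
    // Assemble the four bytes in big-endian order.
    uint32_t bits = (static_cast<uint32_t>(data[0]) << 24) |
                    (static_cast<uint32_t>(data[1]) << 16) |
                    (static_cast<uint32_t>(data[2]) << 8) |
                    static_cast<uint32_t>(data[3]);
    // Assumed encoding: top bit set means a positive value with its sign bit
    // flipped; top bit clear means a negative value with all bits inverted.
    if (bits & 0x80000000u)
        bits ^= 0x80000000u;
    else
        bits = ~bits;

    const bool negative = (bits >> 31) != 0;
    int exponent = static_cast<int>((bits >> 23) & 0xFF);
    double significand = static_cast<double>(bits & 0x7FFFFFu) / 8388608.0;

    if (exponent == 0) {
        // Subnormal path: use exponent 1 before subtracting the bias.
        exponent = 1;
    } else {
        // Normal path: add the implicit leading 1.
        significand += 1.0;
    }
    const double value = std::ldexp(significand, exponent - 127);
    return negative ? -value : value;
}
```

Leaving `exponent` at 0 in the subnormal branch reproduces the halved result described in the summary.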

Test

Standalone test to reproduce and verify:

// Build: g++ -std=c++17 -o test_subnormal test_subnormal.cpp

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <limits>

static double decodeFloat(const uint8_t* data) {
    // ... copy from Builder.cpp ...
}

int main() {
    // Smallest positive subnormal float
    // IEEE 754: sign=0, exp=0x00, sig=0x000001
    // Oracle:   sign flipped -> {0x80, 0x00, 0x00, 0x01}
    const uint8_t data[] = {0x80, 0x00, 0x00, 0x01};
    double result = decodeFloat(data);
    printf("got: %.6e, expected: %.6e\n", result, 1.401298e-45);
    // Without fix: 7.006492e-46 (wrong)
    // With fix:    1.401298e-45 (correct)
}

IEEE 754 subnormals use exponent 1-bias, not 0-bias. OLR used 0-bias,
producing half the correct value (e.g. 7e-46 instead of 1.4e-45).