feat(extensions): add unsigned integer extension types (u8, u16, u32, u64)#953
feat(extensions): add unsigned integer extension types (u8, u16, u32, u64)#953kadinrabo wants to merge 24 commits intosubstrait-io:mainfrom
Conversation
e4584ca to
daccb86
Compare
daccb86 to
d19c87a
Compare
6d28b59 to
4af53f8
Compare
4af53f8 to
7d6f7f9
Compare
8650358 to
10baf81
Compare
1a8fd73 to
856e63d
Compare
…sion-types # Conflicts: # tests/coverage/antlr_parser/FuncTestCaseParser.py
benbellick
left a comment
There was a problem hiding this comment.
Thanks for working on this 🙂
Left a few more comments
extensions/unsigned_integers.yaml
Outdated
| - | ||
| name: "divide" | ||
| description: > | ||
| Divide x by y. Partial values are truncated (rounded towards 0). |
There was a problem hiding this comment.
We should include the divide description from
substrait/extensions/functions_arithmetic.yaml
Lines 177 to 186 in 2705258
There was a problem hiding this comment.
I can go with something like this, thoughts?
Divide x by y. Partial values are truncated (i.e. rounded towards 0).
The `on_division_by_zero` option governs behavior in cases where y is 0.
If either x or y are out of range, behavior will be governed by `on_domain_error`.
There was a problem hiding this comment.
Is it possible to have null cases added to these tests? Might be overkill but can't hurt.
There was a problem hiding this comment.
I'd prefer to keep parity with signed tests here, which don't have null cases for add/subtract/multiply either. Open to adding them if you feel strongly, but might be better as a follow up
There was a problem hiding this comment.
I'm okay with not doing this for now.
vbarua
left a comment
There was a problem hiding this comment.
This looks good overall to me.
The one thing I would like to see is that for the test files, let's map over not just the basic tests but also the overflow, null handling, etc tests that are applicable to usigned integers. I think the floating exception tests are the only ones that don't make apply.
extensions/unsigned_integers.yaml
Outdated
| - | ||
| name: "divide" | ||
| description: > | ||
| Divide x by y. Partial values are truncated (rounded towards 0). |
There was a problem hiding this comment.
We should include the divide description from
substrait/extensions/functions_arithmetic.yaml
Lines 177 to 186 in 2705258
vbarua
left a comment
There was a problem hiding this comment.
Left one comment on the definition of divide, and I think we should pull modulus out of this entirely because the existing definition could use a once over, but otherwise looks good to me.
The core set of functions you've added, along with the tests files, set a good example for how to add both future functions for unsigned integers, and also for adding new types.
extensions/unsigned_integers.yaml
Outdated
| division_type: | ||
| values: [ TRUNCATE, FLOOR ] | ||
| overflow: | ||
| values: [ SILENT, SATURATE, ERROR ] |
extensions/unsigned_integers.yaml
Outdated
| value: u!u8 | ||
| options: | ||
| division_type: | ||
| values: [ TRUNCATE, FLOOR ] |
There was a problem hiding this comment.
Is division_type an option we need here. The signed modulus operator defines 2 tests for them:
substrait/tests/cases/arithmetic/modulus.test
Lines 18 to 20 in e4ce3f8
There was a problem hiding this comment.
I'm okay with not doing this for now.
extensions/unsigned_integers.yaml
Outdated
| value: u!u8 | ||
| options: | ||
| overflow: | ||
| values: [ SILENT, SATURATE, ERROR ] |
There was a problem hiding this comment.
With signed integers, the overflow cases involve a sign change.
# overflow: Examples demonstrating overflow behavior
divide(-9223372036854775808::i64, -1::i64) [overflow:ERROR] = <!ERROR>
divide(-128::i8, -1::i8) [overflow:SATURATE] = 127::i8
With unsigned integers, this case isn't applicable, and in general overflow shouldn't be possible so we can drop this option fully for all of the impls.
vbarua
left a comment
There was a problem hiding this comment.
Changes look good to me, thanks for working on this Kadin.
Looking at this has definitely made me notice some stuff in the existing arithmetic extension that could use improvements as well 🧹
| # Type is "u!" + identifier, e.g., "u!u8" | ||
| type_str = "u!" + ctx.Identifier().getText().lower() |
There was a problem hiding this comment.
does this handle YAML dependency reference or that is out of scope in the test?
There was a problem hiding this comment.
By yaml dependency reference, are you talking about this feature for extensions?
substrait/text/simple_extensions_schema.yaml
Lines 10 to 19 in 00bc3c2
I don't think the test cases support dependency references, yet
Lines 92 to 101 in 00bc3c2
yongchul
left a comment
There was a problem hiding this comment.
+0 using string for representation (because we go at length in defining representation in core types (e.g., decimal)) but overall looks good to me.
|
@yongchul what does
from you mean here? Is that an approval with no vote? |
Description
Adds unsigned integer types (u8, u16, u32, u64) as first-class extension types with arithmetic function support and test coverage.
unsigned_integers.yamlwith type definitions (string structure encoding) and arithmetic function overloads (add, subtract, multiply, divide, modulus, sum, min, max)functions_arithmetic.yamlis untouchedtests/cases/arithmetic_unsigned/, following the arithmetic_decimal conventionCloses #944 and follows up community agreement from Substrait Meeting Notes on 28 Jan 2026 that type variations are not appropriate for unsigned integers due to differing semantics.
This change is