Add UUID conversion to and from 16 byte fixed sequences
UUIDs are often passed around in application code in their canonical
hex-string representation, e.g. "550e8400-e29b-41d4-a716-446655440000".
Encoding a UUID as an Avro "string" takes 37 bytes (a 1 byte length
prefix plus 36 characters), while its binary form fits into a 16 byte
"fixed", saving 21 bytes per encoding.
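The arithmetic can be checked with a short sketch that uses only the
Elixir standard library. It assumes the Avro "string" layout of a
zigzag varint length prefix followed by the UTF-8 bytes; the variable
names are illustrative and not part of this change.

```elixir
uuid = "550e8400-e29b-41d4-a716-446655440000"

# Avro "string": zigzag varint length prefix, then the UTF-8 bytes.
# A length of 36 zigzag-encodes to 72, which fits in one varint byte.
as_string = <<72>> <> uuid

# 16 byte "fixed": the 32 hex digits decoded to raw bytes.
as_fixed =
  uuid
  |> String.replace("-", "")
  |> Base.decode16!(case: :mixed)

IO.inspect(byte_size(as_string), label: "string bytes")
IO.inspect(byte_size(as_fixed), label: "fixed bytes")
IO.inspect(byte_size(as_string) - byte_size(as_fixed), label: "saved")
```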
This change allows application code to keep passing around canonical hex
UUIDs while using the compact encoding. Decoding back to the canonical
form only requires `uuid_format: :canonical_string` in decode options.
The [Java reference implementation][java-implementation] also supports
encoding UUIDs as both strings and 16 byte fixed sequences.
* Encoding is augmented such that a 16 byte fixed schema with
`%{"logicalType" => "uuid"}` converts a hex-string UUID to the 16
byte binary representation.
* Decoding is augmented such that given `uuid_format: :canonical_string`
in decode options, the binary representation is converted to the
canonical hex-string representation.
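
A minimal sketch of the two conversions described above, written with
the Elixir standard library only. The change itself delegates to the
`uniq` library; the `UUIDSketch` module and its function names are
hypothetical and only illustrate the round trip.

```elixir
defmodule UUIDSketch do
  # Hypothetical helper, not part of the library or of `uniq`.

  # Canonical hex string -> 16 byte binary (the "fixed" payload).
  def to_binary(
        <<a::binary-size(8), ?-, b::binary-size(4), ?-, c::binary-size(4), ?-,
          d::binary-size(4), ?-, e::binary-size(12)>>
      ) do
    Base.decode16!(a <> b <> c <> d <> e, case: :mixed)
  end

  # 16 byte binary -> canonical lowercase hex string.
  def to_canonical(<<_::binary-size(16)>> = raw) do
    <<a::binary-size(8), b::binary-size(4), c::binary-size(4), d::binary-size(4),
      e::binary-size(12)>> = Base.encode16(raw, case: :lower)

    Enum.join([a, b, c, d, e], "-")
  end
end

# Round trip: the string survives encoding to 16 bytes and back.
uuid = "550e8400-e29b-41d4-a716-446655440000"
raw = UUIDSketch.to_binary(uuid)
IO.inspect(byte_size(raw), label: "raw bytes")
IO.inspect(UUIDSketch.to_canonical(raw) == uuid, label: "round trip ok")
```

Both functions raise on input of the wrong shape or size instead of
guessing, which keeps the sketch short; real code may prefer tagged
tuples.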
The encoding change is nearly backwards-compatible: previously, when
given an incorrectly sized "fixed" with `{"logicalType": "uuid"}`, an
error was raised, while now conversion is attempted.
The decoding change is fully backwards-compatible, as `uuid_format`
defaults to `:binary`.
For the UUID codec, the `uniq` library was added (it has no transitive
dependencies).
[java-implementation]: https://github.com/apache/avro/blob/230414abbb68e63e68f3b55bfc0cbca94f2737f6/lang/java/avro/src/main/java/org/apache/avro/LogicalTypes.java#L291-L309