-
Notifications
You must be signed in to change notification settings - Fork 85
fix(arrow/avro-reader): bunch of types that didn't work #416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks for this, can you add a test for it? |
Will do when I get around to it! |
@zeroshade I've added a test with some sample data based on the pre-existing avro schema that was in the repo. It encodes the data to avro & to json, and validates when the avro is decoded that it matches the json. For the json encoder I aligned it to what I saw the arrow record as json produced. I'm wondering what happened here, as quite a few types were not working. Maybe hamba had some breaking changes and due to no tests they were not detected? What do you think about the reader_type switch cases that do not seem to be applicable anymore; are there valid cases where these are still needed or does it make sense to delete them? |
Was not working because array does not type match as a slice. The solution seems to be not possible without reflection, because a type match would require each possible array size to match [0]byte [1]byte etc.
|
Which cases do you see are no longer applicable and not working? the tests that are being removed seem to be types that should be supported |
|
Tests that are being removed? I think no tests are being removed? As an example, the cases that I'm referring to is this: Not sure if these are legacy or still relevant in cases that I cannot oversee! |
|
Seems like my changes do not work on windows, I added the license info to the avro schema which seems a bit weird. Is there a better way to deal with the license for this file? |
|
You can add the file to the |
|
My example was fixed binary, notice it shows up twice in above screenshot. I observed the fixed binary to return an array, as is described in the screenshot as well. The case didn't apply because when I use arrow reader it converts to array, not to map and not to slice. I don't know if there's a scenario where it would actually be used, that's my question! I suppose if hamba doesn't decode into slice or map for fixed bytes, is there any other decoder that users can use that do hit these code paths? If not than maybe these are only applicable to hamba v1 or so, and not needed anymore. If you're unsure we can also just leave it. The most important part for me is to have the working cases added, rather than the not working cases removed. And fixed binary was an example, the same applies to timestamps and the other types I fixed. |
|
A |
|
I added working fixes for all the types already, including for fixed! |
|
Added the schema file to the rat exclude, that should fix the windows tests. Any other feedback I'd need to look at? Thanks! |
zeroshade
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!


Rationale for this change
#415
What changes are included in this PR?
Fix decoding of avro types that no longer worked: fixed bytes, timestamp, time, date & decimals
Are these changes tested?
Added tests to the avro/reader_test.go
Are there any user-facing changes?
No API changes, just broader compatability.