-
Notifications
You must be signed in to change notification settings - Fork 85
fix(arrow/array): update timestamp json format #450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
zeroshade
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@efd6 does this solve your issue?
amoeba
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
We could also test that the values round-trip back as their correct time.Date which should exercise the OP's original issue. Thoughts?
diff --git a/arrow/array/timestamp_test.go b/arrow/array/timestamp_test.go
index a421765c..5ed18877 100644
--- a/arrow/array/timestamp_test.go
+++ b/arrow/array/timestamp_test.go
@@ -17,6 +17,7 @@
package array_test
import (
+ "encoding/json"
"testing"
"time"
@@ -24,6 +25,7 @@ import (
"github.com/apache/arrow-go/v18/arrow/array"
"github.com/apache/arrow-go/v18/arrow/memory"
"github.com/stretchr/testify/assert"
+ "github.com/stretchr/testify/require"
)
func TestTimestampStringRoundTrip(t *testing.T) {
@@ -298,3 +300,41 @@ func TestTimestampEquality(t *testing.T) {
assert.Equal(t, arrs[0].Value(1), arrs[2].Value(1))
assert.Equal(t, arrs[1].Value(1), arrs[2].Value(1))
}
+
+func TestTimestampArrayJSONRoundTrip(t *testing.T) {
+ mem := memory.NewCheckedAllocator(memory.NewGoAllocator())
+ defer mem.AssertSize(t, 0)
+
+ tz, _ := time.LoadLocation("America/Phoenix")
+ dt := &arrow.TimestampType{Unit: arrow.Second, TimeZone: tz.String()}
+ b := array.NewTimestampBuilder(mem, dt)
+ defer b.Release()
+
+ b.Append(-34226955)
+ b.Append(1456767743)
+
+ arr := b.NewArray()
+ defer arr.Release()
+
+ assert.Equal(t, "1968-11-30T13:30:45-07:00", arr.ValueStr(0))
+ assert.Equal(t, "2016-02-29T10:42:23-07:00", arr.ValueStr(1))
+ assert.Equal(t, 2, arr.Len())
+ assert.Equal(t, 0, arr.NullN())
+
+ json_bytes, err := arr.MarshalJSON()
+ require.NoError(t, err)
+
+ expectedJSON := `["1968-11-30T13:30:45-07:00","2016-02-29T10:42:23-07:00"]`
+ require.Equal(t, expectedJSON, string(json_bytes))
+
+ var timestamps []time.Time
+ err = json.Unmarshal(json_bytes, ×tamps)
+ require.NoError(t, err)
+ require.Len(t, timestamps, 2)
+
+ expectedTime1 := time.Date(1968, time.November, 30, 13, 30, 45, 0, tz)
+ expectedTime2 := time.Date(2016, time.February, 29, 10, 42, 23, 0, tz)
+
+ assert.Equal(t, expectedTime1.Unix(), timestamps[0].Unix())
+ assert.Equal(t, expectedTime2.Unix(), timestamps[1].Unix())
+}|
Yeah, this change is essentially what I was hoping for. Thanks. Thanks @amoeba for the test suggestion; this will prevent future skew away from stdlib JSON behaviour. |
|
Hi all 👋 Thanks for the fix, the new format makes sense. Shouldn't this have been marked as a breaking change? Consumers of the JSON might expect the old format. Maybe put it behind an option so it's opt-in? |
|
Personally I consider this a bug fix, that the format should have been this way originally and would only cause an issue if the consumer of the JSON was using a very strict parsing which didn't follow RFC3339. |
|
@zeroshade completely agree this is bug fix and this is how the format should have been implemented. However, if someone is using this module as a middleware to generate JSONs for others to consume, they can't control the parsing down the line and it's impossible for them to know how this will impact their consumers. I'm not sure that SemVer differentiates between a "big" breaking change and a "small" breaking change, nor if the initial implementation was correct or not. |
|
@zeroshade Are you open to adding a configuration to control the format? I'm happy to submit a PR for it. Below some places I found that this change can be breaking (used this search)
|
|
@erezrokah What would be the default? At this stage any choice would be a breaking change. |
|
I'd be open to adding a configuration to control the format, but given the current release the default should be the current format used in the most recent release. |
Technically what you would usually do is deprecate v18.4.1 (maybe also make it a pre-release so automated tools like renovate/dependabot don't pick it up), then release v18.4.2 with the old behavior, then do v18.5.0 to control the format (since it's a new feature). But I get that this could create more issues. We can keep the new format as the default if that sounds good to everyone |
So far no one has filed an issue or otherwise complained about the format change in v18.4.1. As a result, my preference is to keep the new format as the default unless it comes up as an issue. That work for you @erezrokah ? @efd6 ? |
|
That works for me thanks @zeroshade |
|
Thanks, @zeroshade. Yes, that works for me. We have just merged a change that depends on the new behaviour and would be sad if that went away via a retraction (even is this is really just an inconvenience to update a to-be-created knob). It's unfortunate that the old behaviour sits right at the boundary of bug/feature. If it were more clearly a bug (I do believe that it sits on that side of the line), there'd be no argument here that a breaking change would be justified. |
|
Opened #510 as draft for feedback, tried to do it in a way that doesn't breaking existing code
|
…#510) ### Rationale for this change See discussion in #450 ### What changes are included in this PR? Allow setting the formatting of timestamps when using `ValueStr` (or JSON marshalling) ### Are these changes tested? Yes - re-added the original test that was modified in #450 ### Are there any user-facing changes? Yes - users will now be able to revert back to the old format before #450, to avoid unintentional breakage
Rationale for this change
resolves #445
What changes are included in this PR?
Use
time.RFC3339Nanofor the format to output the time valueAre these changes tested?
Yes
Are there any user-facing changes?
Changes the format of timestamp strings for
ValueStrand marshalling to JSON to match JSON and Go conventions.