Skip to content

Conversation

@zeroshade
Copy link
Member

Rationale for this change

resolves #445

What changes are included in this PR?

Use time.RFC3339Nano for the format to output the time value

Are these changes tested?

Yes

Are there any user-facing changes?

Changes the format of timestamp strings for ValueStr and marshalling to JSON to match JSON and Go conventions.

@zeroshade zeroshade requested review from amoeba, kou and lidavidm July 25, 2025 15:27
Copy link
Member Author

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@efd6 does this solve your issue?

Copy link
Member

@amoeba amoeba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

We could also test that the values round-trip back as their correct time.Date which should exercise the OP's original issue. Thoughts?

diff --git a/arrow/array/timestamp_test.go b/arrow/array/timestamp_test.go
index a421765c..5ed18877 100644
--- a/arrow/array/timestamp_test.go
+++ b/arrow/array/timestamp_test.go
@@ -17,6 +17,7 @@
 package array_test
 
 import (
+	"encoding/json"
 	"testing"
 	"time"
 
@@ -24,6 +25,7 @@ import (
 	"github.com/apache/arrow-go/v18/arrow/array"
 	"github.com/apache/arrow-go/v18/arrow/memory"
 	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
 )
 
 func TestTimestampStringRoundTrip(t *testing.T) {
@@ -298,3 +300,41 @@ func TestTimestampEquality(t *testing.T) {
 	assert.Equal(t, arrs[0].Value(1), arrs[2].Value(1))
 	assert.Equal(t, arrs[1].Value(1), arrs[2].Value(1))
 }
+
+func TestTimestampArrayJSONRoundTrip(t *testing.T) {
+	mem := memory.NewCheckedAllocator(memory.NewGoAllocator())
+	defer mem.AssertSize(t, 0)
+
+	tz, _ := time.LoadLocation("America/Phoenix")
+	dt := &arrow.TimestampType{Unit: arrow.Second, TimeZone: tz.String()}
+	b := array.NewTimestampBuilder(mem, dt)
+	defer b.Release()
+
+	b.Append(-34226955)
+	b.Append(1456767743)
+
+	arr := b.NewArray()
+	defer arr.Release()
+
+	assert.Equal(t, "1968-11-30T13:30:45-07:00", arr.ValueStr(0))
+	assert.Equal(t, "2016-02-29T10:42:23-07:00", arr.ValueStr(1))
+	assert.Equal(t, 2, arr.Len())
+	assert.Equal(t, 0, arr.NullN())
+
+	json_bytes, err := arr.MarshalJSON()
+	require.NoError(t, err)
+
+	expectedJSON := `["1968-11-30T13:30:45-07:00","2016-02-29T10:42:23-07:00"]`
+	require.Equal(t, expectedJSON, string(json_bytes))
+
+	var timestamps []time.Time
+	err = json.Unmarshal(json_bytes, &timestamps)
+	require.NoError(t, err)
+	require.Len(t, timestamps, 2)
+
+	expectedTime1 := time.Date(1968, time.November, 30, 13, 30, 45, 0, tz)
+	expectedTime2 := time.Date(2016, time.February, 29, 10, 42, 23, 0, tz)
+
+	assert.Equal(t, expectedTime1.Unix(), timestamps[0].Unix())
+	assert.Equal(t, expectedTime2.Unix(), timestamps[1].Unix())
+}

@efd6
Copy link

efd6 commented Jul 27, 2025

Yeah, this change is essentially what I was hoping for. Thanks.

Thanks @amoeba for the test suggestion; this will prevent future skew away from stdlib JSON behaviour.

@zeroshade zeroshade merged commit 2900ed6 into apache:main Jul 27, 2025
23 checks passed
@erezrokah
Copy link
Contributor

Hi all 👋 Thanks for the fix, the new format makes sense. Shouldn't this have been marked as a breaking change? Consumers of the JSON might expect the old format. Maybe put it behind an option so it's opt-in?

@zeroshade
Copy link
Member Author

Personally I consider this a bug fix, that the format should have been this way originally and would only cause an issue if the consumer of the JSON was using a very strict parsing which didn't follow RFC3339.

@erezrokah
Copy link
Contributor

erezrokah commented Sep 8, 2025

@zeroshade completely agree this is bug fix and this is how the format should have been implemented.

However, if someone is using this module as a middleware to generate JSONs for others to consume, they can't control the parsing down the line and it's impossible for them to know how this will impact their consumers.

I'm not sure that SemVer differentiates between a "big" breaking change and a "small" breaking change, nor if the initial implementation was correct or not.

@efd6
Copy link

efd6 commented Sep 9, 2025

@erezrokah What would be the default? At this stage any choice would be a breaking change.

@zeroshade
Copy link
Member Author

I'd be open to adding a configuration to control the format, but given the current release the default should be the current format used in the most recent release.

@erezrokah
Copy link
Contributor

@erezrokah What would be the default? At this stage any choice would be a breaking change.

Technically what you would usually do is deprecate v18.4.1 (maybe also make it a pre-release so automated tools like renovate/dependabot don't pick it up), then release v18.4.2 with the old behavior, then do v18.5.0 to control the format (since it's a new feature).

But I get that this could create more issues.

We can keep the new format as the default if that sounds good to everyone

@zeroshade
Copy link
Member Author

Technically what you would usually do is deprecate v18.4.1 (maybe also make it a pre-release so automated tools like renovate/dependabot don't pick it up), then release v18.4.2 with the old behavior, then do v18.5.0 to control the format (since it's a new feature).

So far no one has filed an issue or otherwise complained about the format change in v18.4.1. As a result, my preference is to keep the new format as the default unless it comes up as an issue.

That work for you @erezrokah ? @efd6 ?

@erezrokah
Copy link
Contributor

That works for me thanks @zeroshade

@efd6
Copy link

efd6 commented Sep 10, 2025

Thanks, @zeroshade. Yes, that works for me. We have just merged a change that depends on the new behaviour and would be sad if that went away via a retraction (even is this is really just an inconvenience to update a to-be-created knob).

It's unfortunate that the old behaviour sits right at the boundary of bug/feature. If it were more clearly a bug (I do believe that it sits on that side of the line), there'd be no argument here that a breaking change would be justified.

@erezrokah
Copy link
Contributor

erezrokah commented Sep 16, 2025

Opened #510 as draft for feedback, tried to do it in a way that doesn't breaking existing code

BTW another way to go about this, is to release the breaking change under v19.0.0

zeroshade pushed a commit that referenced this pull request Sep 18, 2025
…#510)

### Rationale for this change

See discussion in #450

### What changes are included in this PR?

Allow setting the formatting of timestamps when using `ValueStr` (or
JSON marshalling)

### Are these changes tested?

Yes - re-added the original test that was modified in
#450

### Are there any user-facing changes?

Yes - users will now be able to revert back to the old format before
#450, to avoid unintentional
breakage
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

timestamps are marshaled as non-RFC3339 format despite Go convention of using that format

5 participants