Moonblade date functions#893

Open
solesby wants to merge 6 commits into medialab:master from solesby:date-functions

Conversation

@solesby
@solesby solesby commented Jan 19, 2026

Add basic date functions to moonblade. This enables doing things like:

xan map '"days( date_hire, now() ) / 365" as years_tenure'

xan map '"date_add( date_married, "1 year" )" as first_anniversary'

now() — returns the current date/time

date_add( dt, span ) — return the dt plus the time span

date_sub( dt, span ) — return the dt minus the time span

days( dt1, dt2 ) — return the number of days between the two dates (positive if dt1 is before dt2)

A span is given either in ISO 8601 duration format, e.g. "P5DT8H1M", or in a friendly format, e.g. "10 days". See more in the Span documentation.
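For readers who want to reason about the intended semantics outside of xan, here is a minimal Python sketch of the three functions described above. It is not the PR's implementation: spans are modelled with `timedelta`, which has no "year" unit, so friendly spans such as "1 year" are out of scope here, and the sample dates are made up for illustration.

```python
from datetime import datetime, timedelta


def date_add(dt: datetime, span: timedelta) -> datetime:
    """Return dt plus the time span."""
    return dt + span


def date_sub(dt: datetime, span: timedelta) -> datetime:
    """Return dt minus the time span."""
    return dt - span


def days(dt1: datetime, dt2: datetime) -> float:
    """Days from dt1 to dt2 (positive if dt1 is before dt2)."""
    return (dt2 - dt1) / timedelta(days=1)


# Hypothetical sample values; `today` stands in for now().
hired = datetime(2020, 1, 1)
today = datetime(2026, 1, 19)

print(days(hired, today))        # 2210.0
print(days(hired, today) / 365)  # roughly 6.05 years of tenure
print(date_add(datetime(2024, 6, 1), timedelta(days=10)))  # 2024-06-11 00:00:00
```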

@Yomguithereal
Member

@solesby I totally agree those additions should happen, one way or the other. Let me just check with @bmaz if we agree about the proposition, API-wise.

@mvanleeu

@solesby I totally agree those additions should happen, one way or the other. Let me just check with @bmaz if we agree about the proposition, API-wise.

Hello Guillaume,
Did you have a chance to discuss this with Béatrice? Having the ability to calculate, for example, the difference between dates would be very welcome, I believe, and a good complement to what is already available in xan.

@Yomguithereal
Member

The crux of the discussion is the following: should we go the Python way and keep dates that always have timezone info, albeit often an implicit one (usually the timezone of your computer), or should we go the Rust/jiff way and distinguish between datetimes without a timezone and datetimes with a timezone?

The first solution has less friction but more potential footguns; the second one has no footguns but more friction.

For now I think I will distinguish the types internally, but keep solution one, which is the current behaviour. Any opinion about this?

@mvanleeu

Let's first all agree that proper time handling is really hard!

If I followed your question correctly, Guillaume, I lean more towards the jiff logic than the Python one. Implicit behaviour with time only leads to issues that are hard to spot and reproduce: processing via xan may yield different outcomes depending on the timezone of the computer processing the data, because of "TZ assumptions" that pick the "local" one.

Rationale:
Given xan's use case, which is (I'm simplifying) consuming and transforming pre-existing data, the user and the usage are largely dictated by the precision available in the data. If times or datetimes are present in the data without proper timezone information, for example, xan users need to accept that the resulting time arithmetic will not be perfectly accurate. For instance, "how many days between two TZ-less timestamps?" may not be accurate, as it has to neglect daylight saving information that, if present, might have influenced the outcome: 2024-03-10 in America/New_York was only 23 hours long (the region entered daylight saving time), while 2024-11-03 was 25 hours long.
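The two day lengths quoted above can be checked with plain offset arithmetic. The sketch below hard-codes the New York offsets on either side of each transition (UTC-5 for standard time, UTC-4 for daylight time) rather than consulting a timezone database, so it only illustrates the arithmetic; the names `EST`, `EDT` and `hours_between` are introduced here for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for America/New_York around its 2024 transitions.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def hours_between(start: datetime, end: datetime) -> float:
    """Elapsed hours between two timezone-aware datetimes."""
    return (end - start) / timedelta(hours=1)


# Midnight to midnight across the start of daylight saving time.
spring = hours_between(datetime(2024, 3, 10, tzinfo=EST),
                       datetime(2024, 3, 11, tzinfo=EDT))
# Midnight to midnight across the end of daylight saving time.
autumn = hours_between(datetime(2024, 11, 3, tzinfo=EDT),
                       datetime(2024, 11, 4, tzinfo=EST))

print(spring)  # 23.0
print(autumn)  # 25.0
```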

I would thus argue that a "simple variant" is perfectly valid when no TZ is present, but ideally xan should be able to be fully accurate when provided with all the needed information. It would be perfectly acceptable for xan to refuse to bridge the no-TZ / with-TZ divide, unless the API offers a way to explicitly provide a timezone for TZ-less data.

To illustrate with days( dt1, dt2 ):

  1. with both dt1 and dt2 parsed from data like "2023-10-25T14:30:00" (neither has a timezone), the call should be accepted.
  2. with both dt1 and dt2 parsed from data like "2023-10-25T14:30:00Z[atimezone]" (both with a timezone, but not necessarily the same one), the call should be accepted and should use the TZ information to calculate accurately.
  3. with dt1 having TZ info but dt2 not having it, the call should lead to an error (there should be a way to explicitly specify a TZ for a TZ-less datetime to put us back in the second scenario).
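The three cases above can be sketched in Python as follows. This only models the proposed rule, it is not how xan or jiff implement it; `is_aware` and `with_timezone` are hypothetical helpers, the latter standing in for the "explicitly specify a TZ" escape hatch.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def is_aware(dt: datetime) -> bool:
    """True if dt carries usable timezone information."""
    return dt.tzinfo is not None and dt.utcoffset() is not None


def with_timezone(dt: datetime, tz: tzinfo) -> datetime:
    """Explicitly attach a timezone to a TZ-less datetime (hypothetical helper)."""
    if is_aware(dt):
        raise ValueError("datetime already has a timezone")
    return dt.replace(tzinfo=tz)


def days(dt1: datetime, dt2: datetime) -> float:
    """Days from dt1 to dt2; refuses to mix TZ-less and TZ-aware inputs."""
    if is_aware(dt1) != is_aware(dt2):
        raise ValueError("cannot mix TZ-less and TZ-aware datetimes")
    return (dt2 - dt1) / timedelta(days=1)


# Case 1: both TZ-less.
naive_a = datetime.fromisoformat("2023-10-25T14:30:00")
naive_b = datetime.fromisoformat("2023-10-27T14:30:00")
print(days(naive_a, naive_b))  # 2.0

# Case 2: both aware, in different zones (the same instant one day apart).
aware_a = datetime(2023, 10, 25, 14, 30, tzinfo=timezone.utc)
aware_b = datetime(2023, 10, 26, 16, 30, tzinfo=timezone(timedelta(hours=2)))
print(days(aware_a, aware_b))  # 1.0

# Case 3: mixed inputs are rejected until a timezone is supplied explicitly.
try:
    days(naive_a, aware_b)
except ValueError as err:
    print(err)
print(days(with_timezone(naive_a, timezone.utc), aware_b))  # 1.0
```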

Not sure if all of the above is clear :-)

@Yomguithereal Yomguithereal force-pushed the master branch 2 times, most recently from 8d639a3 to 99422cd Compare February 16, 2026 11:03
@mvanleeu

Just a note on the PR proposed by @solesby. When I was testing the days() function, I got some results I didn't understand at first:

xan eval "days('2019-08-08','2019-12-16')"

returns 130.04166666666666 instead of the plain 130 I was expecting.

Without diving deep into the code, I suspect it has to do with the math done by jiff operating on units more granular than "days", combined with a daylight saving change kicking in during that period.
From an API perspective, I am no longer sure what the "right" answer would be here, but getting a non-integer result from a function called days() when passing dates without time or timezone information feels weird.
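The fractional part is consistent with that suspicion: 0.041666... is exactly 1/24, i.e. one extra hour. The Python sketch below reproduces the number under the assumption that both dates are read as local midnights in a zone whose daylight saving time ended between them; the UTC-4 and UTC-5 offsets are an arbitrary example of such a zone, not a claim about the machine used for the test.

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets of a local zone before and after its DST ended in autumn 2019.
SUMMER = timezone(timedelta(hours=-4))
WINTER = timezone(timedelta(hours=-5))

start = datetime(2019, 8, 8, tzinfo=SUMMER)   # local midnight, summer time
end = datetime(2019, 12, 16, tzinfo=WINTER)   # local midnight, winter time

# Elapsed time divided by fixed 24-hour days: 130 days plus one extra hour.
elapsed_days = (end - start) / timedelta(hours=24)

# Difference between the calendar dates, ignoring clock time entirely.
calendar_days = (end.date() - start.date()).days

print(elapsed_days)   # approximately 130.0417, i.e. 130 + 1/24
print(calendar_days)  # 130
```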

@solesby
Author

solesby commented Feb 17, 2026

I agree with @mvanleeu on both comments. I think the key is to keep it very easy to use on the command line without being "wrong". Dates will generally either be timezone-naive or have timezones. Operating on columns where there is a mismatch should throw an exception.

When/if necessary, the default timezone should be UTC so that a command/script will always behave the same on any machine. There should be an optional parameter to set/override this default for naive dates, e.g. --timezone US/Eastern.

days() and similar functions should return integer days by default. Note: this currently uses jiff's days_are_24_hours, so that may be something to factor in.
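One way to read the last two paragraphs is sketched below in Python: naive inputs are interpreted in a default timezone (UTC unless overridden), and the result is truncated to whole days. The `default_tz` parameter is a hypothetical stand-in for the proposed --timezone option, and truncating toward zero is just one possible rounding choice.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def days(dt1: datetime, dt2: datetime, default_tz: tzinfo = timezone.utc) -> int:
    """Whole days from dt1 to dt2; naive inputs are read in default_tz (assumed rule)."""
    def localize(dt: datetime) -> datetime:
        return dt if dt.tzinfo is not None else dt.replace(tzinfo=default_tz)

    delta = localize(dt2) - localize(dt1)
    # Truncate toward zero so that days(a, b) == -days(b, a).
    return int(delta / timedelta(days=1))


# Naive dates give the same answer on any machine.
print(days(datetime(2019, 8, 8), datetime(2019, 12, 16)))  # 130

# Overriding the default timezone changes how the naive value is interpreted.
naive = datetime(2024, 1, 1, 23, 0)
aware = datetime(2024, 1, 3, 0, 0, tzinfo=timezone.utc)
print(days(naive, aware))                                             # 1 (25 hours apart)
print(days(naive, aware, default_tz=timezone(timedelta(hours=-5))))   # 0 (20 hours apart)
```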

While I wouldn't oppose additional functions for more precise or granular time math, I would keep them as separate functions, or behind defaults, so that the basic "about how long was that" stays easy without reading documentation. The naive result of days() divided by 365 or 7 or 30 is often good enough to answer the question, without falling into the details of leap seconds, timezones, or the historical question of which rules applied in which country.

Balancing ease against precision might simply mean splitting the API into simple functions like days() and a very powerful, precise time delta function, with parameters to fully configure precision and the expectation that times are fully specified.
