Skip to content

feat!: require window function bounds and support multiple order-by columns#932

Open
benbellick wants to merge 3 commits intomainfrom
benbellick/allow-multiple-orderby-window
Open

feat!: require window function bounds and support multiple order-by columns#932
benbellick wants to merge 3 commits intomainfrom
benbellick/allow-multiple-orderby-window

Conversation

@benbellick
Copy link
Copy Markdown
Member

@benbellick benbellick commented Jan 8, 2026

Closes #930

BREAKING CHANGE: Window function bindings no longer have default values.


This change is Reviewable

@benbellick benbellick marked this pull request as ready for review January 8, 2026 14:37
Copy link
Copy Markdown

@asolimando asolimando left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM mostly but there are a couple of open questions

| Lower Bound | Bound Following(int64), Bound Trailing(int64) or CurrentRow. | False, defaults to start of partition |
| Upper Bound | Bound Following(int64), Bound Trailing(int64) or CurrentRow. | False, defaults to end of partition |
| Order By | A list of ordering expressions with sort directions. | False, defaults to unordered |
| Bounds Type | ROWS or RANGE. ROWS bounds count physical rows. RANGE bounds consider value equivalence based on ordering columns. | False, defaults to RANGE |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also GROUP, even if not implemented at the moment, would it be worth mentioning it and specifying is not supported yet?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asolimando could you cite the reference of GROUP you are referring to?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's described in "Section 4.15.14 Windowed tables" of the "Part 2: Foundation (SQL/Foundation)" in my SQL standard 2011 (draft version):

A window ordering group is a maximal set of rows in a window partition that are peers according to the window
ordering. Although the ordering of rows within a window ordering group is implementation-dependent, it is
possible to totally order the window ordering groups of a window partition, as follows: if WOG1 and WOG2
are two window ordering groups contained in the same window partition P, then WOG1 precedes WOG2 if
and only if some row of WOG1 precedes some row of WOG2.

Optionally, a window may define a window frame for each row R. A window frame is always defined relative
to the current row. A window frame is specified by up to four syntactic elements:

- The choice of RANGE, to indicate a logical definition of the window frame by offsetting forward or
backward from the current row by an increment or decrement to the value of the <sort key>; GROUPS,
to indicate a logical definition of the window frame by counting forward or backward a number of window
ordering groups (as defined by the <sort key>s) from the window ordering group containing the current
row; or ROWS, to indicate a physical definition of the window frame, by counting rows forward or backward
from the current row.
- ...
[...]

For what I could check, and up to my best understanding, Postgres' follows the standard in that respect, so its documentation can be referred to.

Please note that this is an optional feature, citing from conformance rules from the SQL:2011 standard:

Without Feature T620, “WINDOW clause: GROUPS option”, conforming SQL language shall not contain
<window frame units> that specifies GROUPS.

Comment on lines +23 to +24
| Lower Bound | Preceding(int64), Following(int64), CurrentRow, or Unbounded. | False, defaults to Unbounded (start of partition) |
| Upper Bound | Preceding(int64), Following(int64), CurrentRow, or Unbounded. | False, defaults to Unbounded (end of partition) |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For defaults, it actually depends on a few additional factors:

Quoting Postgres doc:

The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With ORDER BY, this sets the frame to be all rows from the partition start up through the current row's last ORDER BY peer. Without ORDER BY, this means all rows of the partition are included in the window frame, since all rows become peers of the current row.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to match Postgres default behaviours here, especially given that we've already assigned default behaviours for these and changing these would have implications for systems that currently implement the defaults correctly as given.

Not for this PR, but my actual inclination for window functions would be to get rid of all of the defaults. This stuff is already complicated enough, and trying to reason about it in the presence of defaults just makes it harder. Forcing plan producers to be explicit about this seems easier than trying to pick the correct defaults for everyone to use.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to remove default.

Comment on lines +23 to +24
| Lower Bound | Preceding(int64), Following(int64), CurrentRow, or Unbounded. | False, defaults to Unbounded (start of partition) |
| Upper Bound | Preceding(int64), Following(int64), CurrentRow, or Unbounded. | False, defaults to Unbounded (end of partition) |
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to match Postgres default behaviours here, especially given that we've already assigned default behaviours for these and changing these would have implications for systems that currently implement the defaults correctly as given.

Not for this PR, but my actual inclination for window functions would be to get rid of all of the defaults. This stuff is already complicated enough, and trying to reason about it in the presence of defaults just makes it harder. Forcing plan producers to be explicit about this seems easier than trying to pick the correct defaults for everyone to use.

@vbarua vbarua changed the title feat: allow multiple order-by in number-less window fns feat: RANGE bounds type MAY permit multiple ordering columns Jan 13, 2026
@github-actions
Copy link
Copy Markdown

ACTION NEEDED

Substrait follows the Conventional Commits
specification
for
release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@vbarua
Copy link
Copy Markdown
Member

vbarua commented Jan 13, 2026

I would normally finagle the title to pass the conventional commit CI check, but in this case it's complaining about the capitalization of RANGE and I think that actually is needed / useful for clarity.

@benbellick benbellick changed the title feat: RANGE bounds type MAY permit multiple ordering columns feat: permit multiple ordering columns in RANGE bounds type Jan 14, 2026
@benbellick benbellick changed the title feat: permit multiple ordering columns in RANGE bounds type feat!: require explicit window function properties and support multiple order-by columns Jan 14, 2026
@benbellick benbellick changed the title feat!: require explicit window function properties and support multiple order-by columns feat!: require explicit window function properties and support multiple order-by columns Jan 14, 2026
@benbellick benbellick changed the title feat!: require explicit window function properties and support multiple order-by columns feat!: require window function bounds and support multiple order-by columns Jan 14, 2026
@github-actions
Copy link
Copy Markdown

This PR has been automatically marked as stale because it has not had
recent activity. It will be closed in 7 days if no further activity occurs.

@github-actions github-actions bot added the Stale label Mar 18, 2026
@benbellick
Copy link
Copy Markdown
Member Author

This is still relevant, not stale

@github-actions github-actions bot removed the Stale label Mar 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RANGE bounds spec is overly restrictive for UNBOUNDED/CURRENT ROW

4 participants