feat(extensions): add subscript_operator and index_of functions#1020

Open
benbellick wants to merge 6 commits into main from benbellick/element-at

@benbellick benbellick commented Mar 23, 2026

Adding two common list functions to the core spec.

subscript_operator

Returns the element at a 1-based index, or NULL for out-of-bounds access. Matches PostgreSQL and CockroachDB semantics.

In an earlier iteration, I tried to cover more databases (e.g. Trino), but it was turning into a hodgepodge of configuration parameters, so I opted for the simple implementation for now. We can add more complexity later if necessary.
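
To make the intended semantics concrete, here is a minimal reference sketch in Python. It is only an illustration of the behavior described above, not the extension definition itself. The function signature and the NULL-propagation rule for NULL inputs are assumptions on my part; the description only pins down the 1-based index and the NULL result for out-of-bounds access.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def subscript_operator(values: Optional[Sequence[T]], index: Optional[int]) -> Optional[T]:
    """Hypothetical model of subscript_operator: 1-based, NULL when out of bounds."""
    # Assumption (not stated in the description): a NULL list or NULL index yields NULL.
    if values is None or index is None:
        return None
    # Index 0, negative indexes, and indexes past the end are all out of bounds.
    if index < 1 or index > len(values):
        return None
    # Convert the 1-based index to Python's 0-based indexing.
    return values[index - 1]


# Mirrors the PostgreSQL / CockroachDB checks: [2], [0], [5], [-1]
print([subscript_operator([1, 2, 3], i) for i in (2, 0, 5, -1)])
# prints [2, None, None, None]
```

Note that Python's own negative indexing is deliberately not used here: -1 is treated as out of bounds, matching the PostgreSQL and CockroachDB output in the verification script.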

index_of

Returns the 1-based index of the first occurrence of a value in a list, or NULL if not found. Matches PostgreSQL, DuckDB, DataFusion, and CockroachDB.

Trino returns 0 instead of NULL when the value is not found, which is... strange. I'm not going to represent Trino's behavior here.
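
As a rough illustration of the behavior described for index_of, here is a small Python sketch. The signature is hypothetical, and returning NULL for a NULL list is an assumption. How a NULL search value should compare is not covered by the description, so this sketch does not try to model it.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def index_of(values: Optional[Sequence[T]], target: T) -> Optional[int]:
    """Hypothetical model of index_of: 1-based position of the first match, else NULL."""
    # Assumption (not stated in the description): a NULL list yields NULL.
    if values is None:
        return None
    # enumerate(..., start=1) produces 1-based positions directly.
    for position, item in enumerate(values, start=1):
        if item == target:
            return position  # first occurrence wins
    # Not found: NULL rather than Trino's 0.
    return None


# Mirrors the array_position / list_position checks in the verification script.
print(index_of([1, 2, 3], 2), index_of([1, 2, 3], 5))
# prints 2 None
```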

Verification script
# subscript_operator - PostgreSQL
docker run --rm postgres:latest su postgres -c \
  "initdb -D /tmp/d >/dev/null 2>&1 && pg_ctl start -D /tmp/d -l /tmp/l -o '-k /tmp' >/dev/null 2>&1 && sleep 2 && psql -h /tmp -t -c \"
    SELECT (ARRAY[1,2,3])[2], (ARRAY[1,2,3])[0], (ARRAY[1,2,3])[5], (ARRAY[1,2,3])[-1];
  \""
# Output: 2 | | |
# (empty = NULL)

# subscript_operator - CockroachDB
docker run --rm cockroachdb/cockroach:latest demo --no-example-database --insecure -e "
  SELECT (ARRAY[1,2,3])[2], (ARRAY[1,2,3])[0], (ARRAY[1,2,3])[5], (ARRAY[1,2,3])[-1];
" 2>&1 | tail -3
# Output: 2 | NULL | NULL | NULL

# index_of - PostgreSQL
docker run --rm postgres:latest su postgres -c \
  "initdb -D /tmp/d >/dev/null 2>&1 && pg_ctl start -D /tmp/d -l /tmp/l -o '-k /tmp' >/dev/null 2>&1 && sleep 2 && psql -h /tmp -t -c \"
    SELECT array_position(ARRAY[1,2,3], 2), array_position(ARRAY[1,2,3], 5);
  \""
# Output: 2 |
# (empty = NULL for not found)

# index_of - DuckDB
docker run --rm duckdb/duckdb:latest duckdb -noheader -list -c "
  SELECT list_position([1,2,3], 2), list_position([1,2,3], 5);
"
# Output: 2|NULL

# index_of - CockroachDB  
docker run --rm cockroachdb/cockroach:latest demo --no-example-database --insecure -e "
  SELECT array_position(ARRAY[1,2,3], 2), array_position(ARRAY[1,2,3], 5);
" 2>&1 | tail -3
# Output: 2 | NULL

Closes #967


@benbellick benbellick added the 1.0 Tracking work we consider required before releasing 1.0 label Mar 23, 2026
@benbellick benbellick force-pushed the benbellick/element-at branch from 45af49b to 32319f4 Compare March 23, 2026 21:48
@benbellick benbellick removed the 1.0 Tracking work we consider required before releasing 1.0 label Mar 23, 2026
@benbellick benbellick changed the title feat(extensions): add element_at list function feat(extensions): add list_subscript function Mar 24, 2026
@benbellick benbellick changed the title feat(extensions): add list_subscript function feat(extensions): add subscript_operator function Mar 24, 2026
@benbellick benbellick marked this pull request as ready for review March 24, 2026 16:27
@benbellick benbellick changed the title feat(extensions): add subscript_operator function feat(extensions): add subscript_operator and index_of functions Mar 24, 2026
Comment on lines +137 to +139
to the common array subscript operator [].

Index is 1-based (i.e., the first element is at index 1). Out-of-bounds
Contributor

I believe #917 proposes a 0-based index 😆 I believe I commented on whether we should offer an option for either zero-based or one-based indexing... So we'd better be consistent.

Development

Successfully merging this pull request may close these issues.

feat: add common list functions to core substrait

2 participants