Skip to content

Adding _slice metadata to ES|QL#152212

Open
benwtrent wants to merge 7 commits into
elastic:mainfrom
benwtrent:feature/esql-slice-metadata
Open

Adding _slice metadata to ES|QL#152212
benwtrent wants to merge 7 commits into
elastic:mainfrom
benwtrent:feature/esql-slice-metadata

Conversation

@benwtrent

Copy link
Copy Markdown
Member

This allows for METADATA _slice for ESQL. This doesn't provide optimized routing or vector search for _slice yet.

Only basic filtering and access to the values.

i admittedly don't know all the tests I should add, but the field aliasing done in search seems to work just fine.

@benwtrent benwtrent requested a review from ioanatia June 25, 2026 18:49
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jun 25, 2026
@elasticsearchmachine

Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

if (Build.current().isSnapshot()) {
entries.addAll(snapshotOnlyAttributes);
}
if (SliceIndexing.SLICE_FEATURE_FLAG.isEnabled()) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check is a bit fragile, in case we ever want to add more metadata attributes that depend of feature flags in the near feature - which might not be the case and we'll enable _slice soon enough.
If it was me - I would actually refactor this whole bit.

I wonder if we could either define ATTRIBUTES_MAP in a static block, or remove the arguments from
createMetadataAttributes so we don't pass these buckets of snapshot-only and feature flag enabled metadata attributes.

And then in that static block or createMetadataAttributes

     // all the metadata attributes that are always enabled
      List<Map.Entry<String, MetadataAttributeConfiguration>> entries = List.of(
            Map.entry("_version", new MetadataAttributeConfiguration(DataType.LONG, false)),
           ...);

     // add every metadata attribute depending on whether its Cap is enabled
        if (EsqlCapabilities.Cap.METADATA_TIER_FIELD.isEnabled()) {
            entries.add(Map.entry(DataTierFieldMapper.NAME, new MetadataAttributeConfiguration(DataType.KEYWORD, true)));
        }

        if (EsqlCapabilities.Cap.METADATA_SLICE.enabled()) {
          entries.add(Map.entry(SliceIndexing.PARAM_NAME, new MetadataAttributeConfiguration(DataType.KEYWORD, true)));
       }

The other aspect is that the pattern we usually have when we deal with feature flag enabled features, is to define a capability and use that instead of the feature flag in the esql implementation.

* (with Lucene pushdown) against indices with {@code index.slice.enabled: true}.
*/
@ThreadLeakFilters(filters = TestClustersThreadFilter.class)
public class MetadataSliceIT extends ESRestTestCase {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good for now - and we can also use this test in the future to get the profile of the query to check how _slice is being pushed down (similar to PushQueriesIT).
But as a follow up we need to add CSV tests with a new dataset that actually uses _slice.
CSV tests allow us to test in multiple scenarios - multi-node, multi-cluster, bwc tests when the versions of nodes/cluster differ etc.

To add a dataset, I'd start in:
x-pack/plugin/esql/qa/testFixtures/src/main/java/org/elasticsearch/xpack/esql/CsvTestsDataLoader.java

there's a withRequiredCapabilities that's going to prove useful, because we don't want to create an index using _slice when the capability (feature flag) for _slice is not enabled.

But that's something we can definitely do in a separate PR.

@ioanatia

Copy link
Copy Markdown
Member

We have some failing tests:

org.elasticsearch.xpack.esql.datasources.ExternalMetadataColumnsTests

java.lang.AssertionError: metadata name [_slice] is bindable on external relations but missing from STANDARD_NAMES

I think we just need to add _slice in ExternalMetadataColumns. we can follow up on the same pattern that we have for the DataTierFieldMapper.NAME metadata attribute that sits in the same boat -> it's an Elasticsearch specific construct and for now it's not always enabled, but behind a cap check.

@ioanatia ioanatia left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going to approve this one.
we don't need the CSV dataset + tests - let's add that separately.
The small refactor in MetadataAttribute and fixing the tests in ExternalMetadataColumnsTests should be easy and don't require another review IMO.
but let me know if there's something I should still be looking into after you made those changes.

Slice-enabled index shards require DiskBBQPlugin.IndexSettingProvider to
inject index.slice.validated=true at creation time. Without it the IndexSettings
constructor throws during shard init, leaving shards permanently unassigned.
Some CI distributions (e.g. the serverless stateful check) do not bundle diskbbq,
causing all 10 tests to fail with no_shard_available_action_exception.

Add a @before guard that inspects /_nodes/plugins for the diskbbq module and
calls assumeTrue to skip the whole class gracefully when it is absent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>non-issue :Search Relevance/ES|QL Search functionality in ES|QL Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.5.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants