Skip to content

Return the _source of inner hit nested as is without wrapping it into its full path context#26982

Merged
martijnvg merged 1 commit into
elastic:masterfrom
martijnvg:nested_inner_hit_source
Oct 19, 2017
Merged

Return the _source of inner hit nested as is without wrapping it into its full path context#26982
martijnvg merged 1 commit into
elastic:masterfrom
martijnvg:nested_inner_hit_source

Conversation

@martijnvg

Copy link
Copy Markdown
Member

Due to a change happened via #26102 to make the nested source consistent
with or without source filtering, the _source of a nested inner hit was
always wrapped in the parent path. This turned out to be not ideal for
users relying on the nested source, as it would require additional parsing
on the client side. This change fixes this, the _source of nested inner hits
is now no longer wrapped by parent json objects, irregardless of whether
the _source is included as is or source filtering is used.

Internally source filtering and highlighting relies on the fact that the
_source of nested inner hits are accessible by its full field path, so
in order to now break this, the conversion of the _source into its binary
form is performed in FetchSourceSubPhase, after any potential source filtering
is performed to make sure the structure of _source of the nested inner hit
is consistent irregardless if source filtering is performed.

PR for #26944

@colings86 colings86 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check source.internalSourceRef() == null before returning to ensure that for non-root hits its actually been set by the end of this method?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. In case of regular hits the source has already been set in the FetchPhase and I guess this check exists to ensure that the binary representation of the _source has really been set.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source.internalSourceRef() could be null here now?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is null in the case rootHit is false, which can't be the case here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call this nestedHit and make it final

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me like it duplicates the logic of creating a XContentBuilder in a given type and then write the filtered source as map. Could it be something like this?

...
Object value = source.filter(fetchSourceContext);
try {
    if (nestedHit) {
        value = getNestedSource((Map<String, Object>) value, hitContext);
    }
    final int initialCapacity = Math.min(1024, source.internalSourceRef().length()); // deal with null here
    try (BytesStreamOutput streamOutput = new BytesStreamOutput(initialCapacity)) {
        XContentBuilder builder = new XContentBuilder(source.sourceContentType().xContent(), streamOutput);
        builder.value(value);
        hitContext.hit().sourceRef(builder.bytes());
    }
...

@tlrx tlrx left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the changes

… its full path context

Due to a change happened via elastic#26102 to make the nested source consistent
with or without source filtering, the _source of a nested inner hit was
always wrapped in the parent path. This turned out to be not ideal for
users relying on the nested source, as it would require additional parsing
on the client side. This change fixes this, the _source of nested inner hits
is now no longer wrapped by parent json objects, irregardless of whether
the _source is included as is or source filtering is used.

Internally source filtering and highlighting relies on the fact that the
_source of nested inner hits are accessible by its full field path, so
in order to now break this, the conversion of the _source into its binary
form is performed in FetchSourceSubPhase, after any potential source filtering
is performed to make sure the structure of _source of the nested inner hit
is consistent irregardless if source filtering is performed.

PR for elastic#26944

Closes elastic#26944
@martijnvg martijnvg force-pushed the nested_inner_hit_source branch from ec316e9 to 87c9b79 Compare October 19, 2017 10:05
@martijnvg martijnvg merged commit 87c9b79 into elastic:master Oct 19, 2017
@lcawl lcawl added v6.0.0-rc2 and removed v6.0.0 labels Oct 30, 2017
@clintongormley clintongormley added :Search/Search DO NOT USE DEPRECATED - DO NOT USE and removed :Inner Hits labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants