Describe the bug
Terms aggregations with match_all queries on high-cardinality fields have regressed significantly since PR #11643 introduced the tryCollectFromTermFrequencies optimization in GlobalOrdinalsStringTermsAggregator. The regression went unnoticed until recent ClickBench query performance testing revealed severe slowdowns.
Related component
Search:Performance
To Reproduce
Affected Query Pattern
```json
{
  "size": 0,
  "timeout": "1m",
  "aggregations": {
    "URL": {
      "terms": {
        "field": "URL",
        "size": 10
      }
    }
  }
}
```

Where the field has high cardinality (e.g., 10 million unique values).
Root Cause
The tryCollectFromTermFrequencies method introduced in #11643 uses a leapfrogging algorithm that advances through all global ordinals to match them against segment terms:
Location: GlobalOrdinalsStringTermsAggregator.java:
```java
TermsEnum indexTermsEnum = segmentTerms.iterator();
BytesRef indexTerm = indexTermsEnum.next();
final SortedSetDocValues globalOrds = this.getGlobalOrds(ctx);
TermsEnum globalOrdinalTermsEnum = globalOrds.termsEnum();
BytesRef ordinalTerm = globalOrdinalTermsEnum.next();
while (indexTerm != null && ordinalTerm != null) {
    int compare = indexTerm.compareTo(ordinalTerm);
    if (compare == 0) {
        if (acceptedGlobalOrdinals.test(globalOrdinalTermsEnum.ord())) {
            ordCountConsumer.accept(globalOrdinalTermsEnum.ord(), indexTermsEnum.docFreq());
        }
        indexTerm = indexTermsEnum.next();
        ordinalTerm = globalOrdinalTermsEnum.next();
    } else if (compare < 0) {
        indexTerm = indexTermsEnum.next();
    } else {
        ordinalTerm = globalOrdinalTermsEnum.next(); // iterates through millions of non-matching ordinals
    }
}
```

Performance Impact
For an index with:
- 10 million unique terms globally
- 10 segments
- ~2 million unique terms per segment

Current behavior:
- Iterates through 10M global ordinals per segment
- Total: 10 segments × 10M iterations = 100M+ global ordinal iterations
- Each iteration involves a BytesRef fetch and comparison
- The compare > 0 branch skips millions of global ordinals that don't exist in the current segment

Expected behavior:
- Should only iterate through segment terms (~2M per segment)
- Total: 10 segments × 2M lookups = 20M operations
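The asymmetry above can be illustrated with a scaled-down, self-contained sketch. Plain sorted long arrays stand in for the segment and global term dictionaries (all names here are illustrative, not OpenSearch code); it counts leapfrog iterations versus the number of per-term lookups, mirroring the accounting used in this issue:

```java
import java.util.Arrays;

public class LeapfrogVsLookup {
    // Count loop iterations for a leapfrog merge of two sorted arrays
    // (stand-in for the TermsEnum merge in tryCollectFromTermFrequencies).
    static long leapfrogSteps(long[] segment, long[] global) {
        long steps = 0;
        int i = 0, g = 0;
        while (i < segment.length && g < global.length) {
            steps++;
            int cmp = Long.compare(segment[i], global[g]);
            if (cmp == 0) { i++; g++; }       // term present in both
            else if (cmp < 0) { i++; }        // segment-only term
            else { g++; }                     // skip a global ordinal absent from this segment
        }
        return steps;
    }

    // Count lookups for the direct approach: one binary search per segment
    // term (each search itself costs ~log2(|global|) comparisons).
    static long lookupCount(long[] segment, long[] global) {
        long lookups = 0;
        for (long term : segment) {
            if (Arrays.binarySearch(global, term) >= 0) lookups++;
        }
        return lookups;
    }

    public static void main(String[] args) {
        int globalSize = 1_000_000;   // scaled-down stand-in for 10M global ordinals
        int stride = 5;               // every 5th global term appears in the segment
        long[] global = new long[globalSize];
        for (int k = 0; k < globalSize; k++) global[k] = k;
        long[] segment = new long[globalSize / stride];
        for (int k = 0; k < segment.length; k++) segment[k] = (long) k * stride;

        System.out.println("leapfrog steps: " + leapfrogSteps(segment, global)); // ~|global|
        System.out.println("direct lookups: " + lookupCount(segment, global));   // |segment|
    }
}
```

With these sizes the leapfrog merge does roughly one million iterations while the direct approach performs 200,000 lookups, matching the shape of the 100M-vs-20M estimate above.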
Why This Happens
The leapfrogging approach assumes both iterators need to be advanced together, but this is inefficient because the global ordinal dictionary contains terms from all segments, i.e. a superset of any single segment's terms (which appears to always be the case, even with deleted docs?).
Solution
Replace the leapfrogging with a direct term lookup?
```java
TermsEnum indexTermsEnum = segmentTerms.iterator();
final SortedSetDocValues globalOrds = this.getGlobalOrds(ctx);
BytesRef indexTerm;
while ((indexTerm = indexTermsEnum.next()) != null) {
    long globalOrd = globalOrds.lookupTerm(indexTerm);
    if (globalOrd >= 0 && acceptedGlobalOrdinals.test(globalOrd)) {
        ordCountConsumer.accept(globalOrd, indexTermsEnum.docFreq());
    }
}
```
- Only iterates through segment terms (not all global ordinals)
- Eliminates unnecessary BytesRef comparisons
- Scales better with high-cardinality fields
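For reference, Lucene's SortedSetDocValues.lookupTerm returns the term's global ordinal when the term exists, and a negative value (-(insertionPoint) - 1) otherwise; that is, it behaves like a binary search over the sorted ordinal dictionary, so each lookup costs roughly log2(|global|) comparisons rather than a linear scan. A minimal stand-in using Arrays.binarySearch (illustrative only, not the Lucene implementation):

```java
import java.util.Arrays;

public class LookupTermModel {
    // Illustrative stand-in for SortedSetDocValues.lookupTerm: the global term
    // dictionary is sorted, so a term resolves to its ordinal via binary search;
    // a missing term yields (-(insertionPoint) - 1), which the caller filters
    // out with the globalOrd >= 0 check.
    static long lookupTerm(String[] sortedGlobalTerms, String term) {
        return Arrays.binarySearch(sortedGlobalTerms, term);
    }

    public static void main(String[] args) {
        String[] globalTerms = { "apple", "banana", "cherry", "date" };
        System.out.println(lookupTerm(globalTerms, "cherry"));    // 2: the global ordinal
        System.out.println(lookupTerm(globalTerms, "blueberry")); // -3: not in the dictionary
    }
}
```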
Additionally, add a cardinality threshold, which @bowenlan-amzn is addressing as part of #20623.
Thanks @rishabh6788 for sharing the CPU profile and bringing this issue to attention.
Additional Details
Screenshots

streaming-profile-q34 (1).html
Host/Environment (please complete the following information):
- Version: OS 2.13+