Is your feature request related to a problem?
While working on aggregation performance, I found a gap in the workloads.
At present, none of the workloads include a single term aggregation request in their runs. This is a common use case that we should definitely be benchmarking.
Example of a search request that is missing:
GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "response_codes": {
      "terms": {
        "field": "response_code"
      }
    }
  }
}
Task I used temporarily in a custom workload:
{
  "name": "country_term_aggregation",
  "operation-type": "search",
  "body": {
    "size": 0,
    "aggs": {
      "country_population": {
        "terms": {
          "field": "country_code.raw"
        }
      }
    }
  }
}
What solution would you like?
- Identify the workloads where it makes sense to include term aggregations. Two obvious candidates are geonames and http_logs.
- Rename the existing workload tasks that have term in their name to term_query, to distinguish term queries from term aggregations.
- Include single term aggregations in the identified workloads.
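The rename step could be scripted roughly as follows. This is only a sketch: it assumes a workload's operations are available as a list of JSON objects with a "name" key, and the helper name rename_term_tasks and the rule for skipping names that already mention query or aggregation are my own assumptions, not anything the workloads repository defines.

```python
import re

# Matches "term" only as a whole token, e.g. in "term" or "country_term",
# but not inside "terms" or "termination".
_TERM_TOKEN = re.compile(r"(?<![A-Za-z0-9])term(?![A-Za-z0-9])")


def rename_term_tasks(operations):
    """Return a copy of the operations with bare 'term' names changed to 'term_query'.

    Hypothetical helper: names that already contain a 'query', 'aggregation'
    or 'agg' token are left untouched so term aggregations keep their names.
    """
    renamed = []
    for op in operations:
        op = dict(op)  # shallow copy so the caller's list is not mutated
        name = op.get("name", "")
        tokens = set(re.split(r"[_-]", name))
        if "term" in tokens and not tokens & {"query", "aggregation", "agg"}:
            op["name"] = _TERM_TOKEN.sub("term_query", name)
        renamed.append(op)
    return renamed


example_ops = [
    {"name": "term", "operation-type": "search"},
    {"name": "country_term_aggregation", "operation-type": "search"},
    {"name": "terms", "operation-type": "search"},
]
renamed_ops = rename_term_tasks(example_ops)
print([op["name"] for op in renamed_ops])
```

Any test procedures that refer to the old task names would need the same rename applied.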
What alternatives have you considered?
None.
Do you have any additional context?
Term aggregations should be benchmarked separately depending on whether the fielddata is indexed or not. For example, with the geonames workload, running the above query with "field": "country_code.raw" triggers the low cardinality workflow, whereas running it with "field": "country_code" triggers the regular dense cardinality workflow.
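To cover both cases, the two task definitions could be generated from one template. A minimal sketch, assuming the task shape shown earlier in this issue; the helper terms_agg_task, the task names, and the aggregation name are hypothetical, and which code path each field triggers is as described above, not something this snippet verifies.

```python
import json


def terms_agg_task(name, field, agg_name="country_population"):
    """Build a search task that runs a single terms aggregation on `field`.

    Hypothetical helper; mirrors the custom task from this issue.
    """
    return {
        "name": name,
        "operation-type": "search",
        "body": {
            "size": 0,  # only the aggregation buckets are needed, not hits
            "aggs": {agg_name: {"terms": {"field": field}}},
        },
    }


# One task per field variant so the two paths are reported separately.
tasks = [
    terms_agg_task("country_term_aggregation_raw", "country_code.raw"),
    terms_agg_task("country_term_aggregation", "country_code"),
]
print(json.dumps(tasks, indent=2))
```

Keeping the two variants as separately named tasks means their latencies show up as separate rows in the benchmark results.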