Summary
Weaviate v1.36 added query profiling, which returns per-shard execution timing breakdowns for search queries. The client needs to support enabling profiling and exposing the returned data.
API surface
Enabling profiling
- gRPC: Set
MetadataRequest.query_profile = true (field 11 in search_get.proto)
- GraphQL: Request
_additional { queryProfile } in the field selector
Response structure
SearchReply.query_profile: QueryProfile (optional, field 6)
└── shards: []ShardProfile
├── name: string # shard identifier
├── node: string # cluster node that executed this shard's search
└── searches: map<string, SearchProfile>
│ # key is search type (see below)
└── details: map<string, string>
# metric name → value as string
Search types (map keys in searches)
| Key |
When |
"vector" |
Vector/nearText/nearObject searches |
"keyword" |
BM25 keyword searches |
"object" |
Object lookups (Get with filters, no ranking) |
Hybrid searches produce both "vector" and "keyword" entries per shard.
Metric details
The details map contains timing and diagnostic metrics as string values. Keys are dynamic and should not be hardcoded. Treat this as an opaque map<string, string>. Common keys include:
Always present:
total_took: total time for this shard/search-type (Go duration format, e.g. "15.234ms")
Vector searches:
vector_search_took: time in the vector index
hnsw_flat_search: "true" / "false", whether flat search was used
knn_search_layer_N_took: per-layer HNSW traversal time
knn_search_rescore_took: PQ/BQ rescore time
flat_search_iteration_took, flat_search_rescore_took
Keyword (BM25) searches:
kwd_method, kwd_time, kwd_filter_size
kwd_1_tok_time: tokenization time
kwd_3_term_time: term lookup time
kwd_4_wand_time / kwd_4_bmw_time: scoring algorithm time
kwd_5_objects_time: object fetch time
kwd_6_res_count: result count
Filtered searches (any type):
filters_build_allow_list_took: filter evaluation time
filters_ids_matched: number of IDs matching the filter
sort_took, objects_took
New metrics may be added in future versions without a breaking change.
gRPC proto definition
// In MetadataRequest
bool query_profile = 11;
// In SearchReply
optional QueryProfile query_profile = 6;
message QueryProfile {
message SearchProfile {
map<string, string> details = 1;
}
message ShardProfile {
string name = 1;
string node = 2;
map<string, SearchProfile> searches = 3;
}
repeated ShardProfile shards = 1;
}
Expected behavior
- Vector search:
searches has "vector" key with total_took, vector_search_took, and index-specific metrics
- BM25 keyword search:
searches has "keyword" key with total_took and BM25-specific metrics
- Hybrid search:
searches has both "vector" and "keyword" keys per shard
- Filtered search: adds
filters_build_allow_list_took and filters_ids_matched to the relevant search profile
- Multi-node cluster: coordinator aggregates profiles from all shards across all nodes; each
ShardProfile has a distinct node value
- Not requested:
query_profile is nil/absent, no performance overhead
Client implementation guidance
- Add a
query_profile boolean option to the search/query metadata request
- Parse the
QueryProfile response into a structured type with shards -> searches -> details
- Do not hardcode metric keys. Expose
details as a string-to-string map/dictionary. The set of keys varies by search type, index type, and server version
- Metric values are human-readable strings (Go duration format for timings, integers/booleans as strings)
References
Summary
Weaviate v1.36 added query profiling, which returns per-shard execution timing breakdowns for search queries. The client needs to support enabling profiling and exposing the returned data.
API surface
Enabling profiling
MetadataRequest.query_profile = true(field 11 insearch_get.proto)_additional { queryProfile }in the field selectorResponse structure
Search types (map keys in
searches)"vector""keyword""object"Hybrid searches produce both
"vector"and"keyword"entries per shard.Metric details
The
detailsmap contains timing and diagnostic metrics as string values. Keys are dynamic and should not be hardcoded. Treat this as an opaquemap<string, string>. Common keys include:Always present:
total_took: total time for this shard/search-type (Go duration format, e.g."15.234ms")Vector searches:
vector_search_took: time in the vector indexhnsw_flat_search:"true"/"false", whether flat search was usedknn_search_layer_N_took: per-layer HNSW traversal timeknn_search_rescore_took: PQ/BQ rescore timeflat_search_iteration_took,flat_search_rescore_tookKeyword (BM25) searches:
kwd_method,kwd_time,kwd_filter_sizekwd_1_tok_time: tokenization timekwd_3_term_time: term lookup timekwd_4_wand_time/kwd_4_bmw_time: scoring algorithm timekwd_5_objects_time: object fetch timekwd_6_res_count: result countFiltered searches (any type):
filters_build_allow_list_took: filter evaluation timefilters_ids_matched: number of IDs matching the filtersort_took,objects_tookNew metrics may be added in future versions without a breaking change.
gRPC proto definition
Expected behavior
searcheshas"vector"key withtotal_took,vector_search_took, and index-specific metricssearcheshas"keyword"key withtotal_tookand BM25-specific metricssearcheshas both"vector"and"keyword"keys per shardfilters_build_allow_list_tookandfilters_ids_matchedto the relevant search profileShardProfilehas a distinctnodevaluequery_profileisnil/absent, no performance overheadClient implementation guidance
query_profileboolean option to the search/query metadata requestQueryProfileresponse into a structured type withshards -> searches -> detailsdetailsas a string-to-string map/dictionary. The set of keys varies by search type, index type, and server versionReferences