Expose max_distance as an optional setting in multi vector queries#478
Expose max_distance as an optional setting in multi vector queries#478justin-cechmanek wants to merge 4 commits intomainfrom
Conversation
|
Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset. In case there are security findings, they will be communicated to you as a comment inside the PR. Hope you’ll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us. |
There was a problem hiding this comment.
Pull request overview
This pull request adds support for configurable max_distance parameters in multi-vector queries, allowing users to specify different distance thresholds for each vector field. Previously, the max_distance was hardcoded to 2.0.
Changes:
- Added
max_distancefield to theVectorclass with default value of 2.0 and validation range [0.0, 2.0] - Modified
MultiVectorQueryto use per-vector max_distance values in VECTOR_RANGE queries - Changed query operator from
|(OR) toANDfor combining multiple vector range queries - Added comprehensive unit and integration tests for the new max_distance functionality
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| redisvl/query/aggregate.py | Added max_distance field to Vector class with validation; updated query building to use per-vector max_distance values; changed operator from OR to AND |
| tests/unit/test_aggregation_types.py | Added test coverage for max_distance parameter including validation tests and query string generation tests |
| tests/integration/test_aggregation.py | Added parameterized integration test to verify max_distance filtering behavior with various distance combinations |
| tests/integration/test_search_index.py | Minor formatting: added blank line for code style consistency |
| tests/integration/test_redis_cluster_support.py | Minor formatting: added blank lines for code style consistency |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if not isinstance(max_distance, (float, int)): | ||
| raise ValueError("max_distance must be a value between 0.0 and 2.0") |
There was a problem hiding this comment.
The type check isinstance(max_distance, (float, int)) on line 43 is redundant because Pydantic's field validation already ensures the field is of type float. Additionally, this check will always pass since the field is declared as max_distance: float = 2.0. The more important validation is the range check on line 45, which should remain.
| if not isinstance(max_distance, (float, int)): | |
| raise ValueError("max_distance must be a value between 0.0 and 2.0") |
| field_name: The name of the vector field to search | ||
| dtype: The data type of the vector (default: "float32") | ||
| weight: The weight for this vector in the combined score (default: 1.0) | ||
| max_distance: The maximum distance for vector range search (default: 2.0, range: [0.0, 2.0]) |
There was a problem hiding this comment.
Thanks for providing the range!
vishal-bala
left a comment
There was a problem hiding this comment.
LGTM! Just one note where you can use built-in Pydantic validation instead of custom
| field_name: str | ||
| dtype: str = "float32" | ||
| weight: float = 1.0 | ||
| max_distance: float = 2.0 |
There was a problem hiding this comment.
Instead of a custom validation method, you could validate this with Pydantic's built-in Field (you might have to from pydantic import Field)
| max_distance: float = 2.0 | |
| max_distance: float = Field(default=2.0, ge=0.0, le=2.0) |
No description provided.