Add MUVERA processors for multi-vector search by praveenMprasad · Pull Request #3164 · opensearch-project/k-NN

praveenMprasad · 2026-03-13T17:02:13Z

Description

Add MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings) ingest and search request processors to enable ANN prefetch for multi-vector models like ColBERT and ColPali.

The MUVERA algorithm converts variable-length multi-vector embeddings into a single fixed-dimensional encoding (FDE) vector using SimHash clustering and random projections. This enables approximate nearest neighbor search on multi-vector representations, which previously required brute-force scoring via lateInteractionScore.

Two new processors:

muvera ingest processor — reads multi-vectors from a source field, produces an FDE vector, writes it to a knn_vector target field. Original multi-vectors remain in _source for reranking.
muvera_query search request processor — intercepts script_score queries containing query_vectors in script params, MUVERA-encodes them, and replaces the inner match_all with a knn query on the FDE field. The lateInteractionScore script wrapper stays intact for MaxSim reranking.

Key design decisions:

All hyperparameters have sensible defaults (k_sim=4, dim_proj=8, r_reps=20, seed=42)
FDE dimension is computed and logged at pipeline creation time
Optional fde_dimension parameter validates against computed value
Tested end-to-end on a live cluster with correct MaxSim reranking scores

Reference: MUVERA paper, fastembed implementation

Related Issues

Resolves #3163

Check List

New functionality includes testing.
- 34 unit tests across 3 test classes, stable across 10 iterations with random seeds
New functionality has been documented.
- Documentation PR to follow after initial review
API changes companion pull request created.
- No REST API changes — processors are configured via existing ingest/search pipeline APIs
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.
- Will create after initial review

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Introduces MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings) as ingest and search request processors in the k-NN plugin. Documents are encoded into fixed-size FDE vectors at ingest time for ANN prefetch, while the original multi-vectors are preserved for MaxSim reranking via lateInteractionScore. Resolves opensearch-project#3163 Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Signed-off-by: Navneet Verma <navneev@amazon.com>

navneet1v · 2026-03-14T18:54:56Z

@praveenMprasad please add the IT for this new Processor.

codecov · 2026-03-14T21:04:37Z

Codecov Report

❌ Patch coverage is 86.45418% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.59%. Comparing base (5419381) to head (eb2b695).

Files with missing lines	Patch %	Lines
...processor/muvera/MuveraSearchRequestProcessor.java	78.94%	8 Missing and 12 partials ⚠️
...ch/knn/processor/muvera/MuveraIngestProcessor.java	83.78%	7 Missing and 5 partials ⚠️
...opensearch/knn/processor/muvera/MuveraEncoder.java	97.50%	0 Missing and 2 partials ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #3164      +/-   ##
============================================
+ Coverage     82.52%   82.59%   +0.06%     
- Complexity     3935     4001      +66     
============================================
  Files           426      429       +3     
  Lines         14646    14897     +251     
  Branches       1869     1935      +66     
============================================
+ Hits          12087    12304     +217     
- Misses         1796     1811      +15     
- Partials        763      782      +19

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Add MuveraProcessorIT with 7 REST-based integration tests covering the end-to-end MUVERA ingest and search pipeline flow. Tests verify pipeline creation, document indexing with FDE encoding, ANN prefetch via search pipeline, passthrough for non-script_score queries, and error handling for dimension mismatches and invalid fde_dimension config. Signed-off-by: Praveen M Prasad <prasadnu@amazon.com> Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

praveenMprasad · 2026-03-14T21:45:56Z

@praveenMprasad please add the IT for this new Processor.

@navneet1v Added MuveraProcessorIT with 7 integration tests covering the end-to-end flow - ingest pipeline creation, FDE encoding on indexed documents, search pipeline with ANN prefetch via muvera_query, passthrough for non-script_score queries, and error handling for dimension mismatches and invalid fde_dimension config. Please take a look.

navneet1v · 2026-03-14T22:49:38Z

Thank you @praveenMprasad . Will start reviewing the PR. I would need some to review it, since I had to first understand what MUVERA is.

navneet1v · 2026-03-17T03:41:59Z

@praveenMprasad before we can kickoff the review lets ensure that all CIs are passing.

cc: @Gankris96 and @kotwanikunal

The deleteIndex(String) method in MuveraProcessorIT conflicted with the static deleteIndex(String) in OpenSearchRestTestCase. Renamed to deleteTestIndex to avoid the override error. Also applied spotless formatting. Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

praveenMprasad · 2026-03-17T16:09:17Z

@navneet1v I've fixed the compilation issue. The CI workflows need maintainer approval to run, could you please approve them ?

Use Processor import instead of full package path in KNNPlugin.getProcessors(). Use ArrayList import instead of inline java.util.ArrayList in MuveraIngestProcessor. Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Add upper bound check on FDE dimension (rReps * 2^kSim * dimProj) against the k-NN engine max dimension limit of 16,000. This catches invalid kSim values early instead of failing at index time. Add oversample_factor > 0 validation in the search request processor factory. Add unit tests for both validations. Signed-off-by: Praveen Prasad <praveenMprasad@users.noreply.github.com> Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

mingshl · 2026-03-25T20:58:59Z

+     * Maximum allowed FDE dimension. Matches the k-NN engine max dimension limit (16,000)
+     * to ensure the FDE vector can be indexed in any supported engine.
+     */
+    static final int MAX_FDE_DIMENSION = 16_000;


is there chances people want to configure the max dimension? this can be a configuration setting in the processor ?

@mingshl The 16,000 cap mirrors the k-NN engine's max vector dimension, going above it would fail at indexing regardless. Users already control the actual FDE size through the MUVERA parameters (dim, k_sim, dim_proj, r_reps). Adding a configurable max would let users set a tighter bound, but it feels like an edge case since the engine enforces the real ceiling. Happy to add it as an optional max_fde_dimension processor config if you think it's worth it though.

praveenMprasad · 2026-03-27T00:13:47Z

OpenSearch MUVERA Benchmark Results

MUVERA params: k_sim=5, dim_proj=16, r_reps=20, FDE dim=10,240
Index: faiss HNSW, m=16, ef_construction=512, ef_search=512, 1 shard, 1 segment

Text retrieval

nfcorpus — ColBERTv2 (3,633 docs, 323 queries)

Approach	NDCG@1	NDCG@5	NDCG@10	% of Exact
Exact MaxSim (brute-force)	0.483	0.386	0.344	100%
MUVERA + MaxSim rerank (4x)	0.449	0.349	0.311	90.3%
Mean pool + MaxSim rerank	0.251	0.184	0.144	42.0%

SciFact — ColBERTv2 (5,183 docs, 300 queries)

Approach	NDCG@1	NDCG@5	NDCG@10	% of Exact
Exact MaxSim (brute-force)	0.597	0.674	0.692	100%
MUVERA + MaxSim rerank (4x)	0.590	0.655	0.671	97.1%
Mean pool + MaxSim rerank	0.350	0.359	0.363	52.4%

Multimodal retrieval

IRPAPERS — ColModernVBERT (3,230 pages, 180 queries, ~1,011 vectors/page)

Approach	R@1	R@5	R@10	R@20	% of Exact (R@1)
Exact MaxSim (brute-force)	40.6%	76.7%	83.9%	89.4%	100%
MUVERA + MaxSim rerank (4x)	37.2%	66.7%	73.3%	78.3%	91.7%

MUVERA+rerank recovers 90-97% of exact MaxSim quality across all three datasets, including visual documents with ~1,000 multi-vectors per page. Mean pooling retains only 42-52%, demonstrating why MUVERA is needed.

Full results: https://github.com/praveenMprasad/OpenSearch-k-NN/blob/feature/muvera-benchmark/benchmarks/muvera/BENCHMARK_RESULTS.md

navneet1v · 2026-03-27T00:39:05Z

@0ctopus13prime , @Gankris96 , @mingshl can we please review this PR.

mingshl · 2026-04-01T23:30:46Z

+        Map<String, Object> config = new HashMap<>();
+        config.put("source_field", SOURCE_FIELD);
+        config.put("target_field", TARGET_FIELD);
+        config.put("dim", 4);


every processor has ignoreMissing and ignoreFailure flag, would you add the UT and verify the edge cases?

for example,
in two documents, one has the source field, and the the second document doesn't have field.

when setting muvera pipeline when ignoreMissing is true, the pipeline should success even when one document has missing source field. For two documents, one document has muvera embedding target field, the second document doesn't have the embedding target field

when setting muvera pipeline when ignoreMissing is false, the second document should fail.

you might want to write down these four combinations down for ingest and search pipeline in the PR description. so that it make sense for us to review in these different configurations. And I would love to see the logic verified in the Unit tests.

@mingshl Thanks for the review! Added ignore_missing support to the ingest processor with unit tests covering the combinations.

Ingest Pipeline (muvera processor):

ignore_missing Source field present Behavior

false (default) Yes Produces FDE in target field

false No / null Throws IllegalArgumentException

true Yes Produces FDE in target field

true No / null Skips silently, no target field added

Two-document scenario is also tested: with ignore_missing=true, doc1 (has source) gets FDE, doc2 (missing source) is skipped. With ignore_missing=false, doc2 throws.

Search Pipeline (muvera_query processor):

ignore_failure Query has script_score with query_vectors Behavior

false (default) Yes Rewrites query with FDE KNN

false No Returns request unchanged (no error)

true Yes Rewrites query with FDE KNN

true No / malformed Exception caught by framework, original request passed through

The search processor already handles ignore_failure via OpenSearch's AbstractProcessor base class — the pipeline framework catches exceptions when ignore_failure=true. The processor itself returns the request unchanged if the query isn't a script_score type, so missing/non-matching queries don't error.

Two-document scenario (unit tested + integration tested):

ignore_missing=true: Doc1 (has colbert_vectors) → gets muvera_fde target field. Doc2 (missing colbert_vectors) → indexed successfully, no muvera_fde field added.

ignore_missing=false: Doc1 → gets muvera_fde. Doc2 → throws IllegalArgumentException, indexing fails.

Unit tests: testTwoDocuments_IgnoreMissingTrue_OneWithSourceOneWithout, testTwoDocuments_IgnoreMissingFalse_SecondDocFails

Integration tests: testMuveraIngest_whenIgnoreMissingTrue_thenMixedDocsSucceed, testMuveraIngest_whenIgnoreMissingFalse_thenMissingSourceFails

See commit: e1e4046

kotwanikunal · 2026-04-01T23:12:21Z

+    private final int dimProj;
+    private final int rReps;
+    private final int numPartitions;
+    private final double[][] simhashVectors;


Any particular reason to use double instead of float? The return type signatures are all floats.
The operations for similarity match can benefit with SIMD/Panama calculations and the memory consumption can also reduce by half.

Good point on the memory savings. The internal arrays use double for numerical precision during accumulation — with ~1000 vectors per document, the cluster center sums can get large and float accumulation could introduce meaningful rounding errors. The output is cast to float at the end since that's what the knn_vector field stores.

That said, for the simhash and projection matrices (which are generated once at init), float would be fine. I'll convert those to float in a follow-up to reduce the encoder's memory footprint. The cluster center accumulation should stay double to avoid precision loss during summation.

Open to converting everything to float if you think the precision tradeoff is acceptable — happy to benchmark the quality impact.

for the simhash and projection matrices (which are generated once at init), float would be fine. I'll convert those to float in a follow-up to reduce the encoder's memory footprint. The cluster center accumulation should stay double to avoid precision loss during summation.

I think this makes sense. Let's reduce the footprint as much as we can

Converted simhash and projection matrices to float. Cluster center accumulation stays double to avoid precision loss during summation. See latest commit.

kotwanikunal · 2026-04-01T23:20:38Z

+
+        // Determine result size from the request
+        int resultSize = request.source().size() > 0 ? request.source().size() : 10;
+        int prefetchK = resultSize * oversampleFactor;


Can you limit this to the some considerable size?

int prefetchK = Math.min(resultSize * oversampleFactor, 10_000);

Good catch — added the cap. int prefetchK = Math.min(resultSize * oversampleFactor, 10_000);

kotwanikunal · 2026-04-01T23:24:51Z

+
+    @Override
+    @SuppressWarnings("unchecked")
+    public SearchRequest processRequest(SearchRequest request) throws Exception {


Please break this method down into smaller methods

validateUserRequest() extractRequestParams() createKnnRequest()

Refactored into three methods: validateUserRequest(), extractRequestParams(), createKnnRequest().

kotwanikunal · 2026-04-01T23:25:26Z

+            throw new IllegalArgumentException("[" + QUERY_VECTORS_PARAM + "] in script params must be a list of vectors");
+        }
+
+        List<List<Number>> queryVectorsList = (List<List<Number>>) queryVectorsObj;


This is unsafe. This will lead to a cast exception - can you add validations before raw conversions?

Fixed — now validates each element type before casting: outer list → inner list → Number. No more unchecked cast.

kotwanikunal · 2026-04-01T23:26:37Z

+            return new MuveraIngestProcessor(tag, description, sourceField, targetField, encoder, dim, computedDimension);
+        }
+
+        private static long readLongProperty(


This method seems to be duplicated across classes. Move this to a utility class.

Moved to MuveraProcessorUtils utility class, both processors now reference it.

kotwanikunal · 2026-04-01T23:32:16Z

+     * Creates pipelines, indexes documents with multi-vectors, and searches
+     * using script_score with lateInteractionScore reranking over ANN prefetch.
+     */
+    public void testMuveraEndToEnd_whenIngestAndSearch_thenReturnsResults() throws Exception {


Test cases are lacking. Can you create an at scale test and ensure the complete functionality (ordering etc) is appropriate?

Added two new integration tests:

testMuveraEndToEnd_whenSearchWithKnownVectors_thenOrderingIsCorrect: indexes 3 docs with known vectors (close/medium/far), queries, verifies scores are in descending order and the closest doc ranks first.

testMuveraIngest_whenIgnoreMissingTrue_thenMixedDocsSucceed: indexes 2 docs through pipeline with ignore_missing=true — one with vectors (gets FDE), one without (skipped silently). Verifies both are indexed.

Vikasht34 · 2026-04-02T02:30:13Z

I will start taking deep look on PR from tomorrow

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

…/createKnnRequest, add type validation before casts Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

…eMissing integration tests Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

kotwanikunal · 2026-04-15T22:06:49Z

+    private final int dimProj;
+    private final int rReps;
+    private final int numPartitions;
+    private final double[][] simhashVectors;


for the simhash and projection matrices (which are generated once at init), float would be fine. I'll convert those to float in a follow-up to reduce the encoder's memory footprint. The cluster center accumulation should stay double to avoid precision loss during summation.

I think this makes sense. Let's reduce the footprint as much as we can

kotwanikunal · 2026-04-15T22:08:19Z

+        double scale = 1.0 / Math.sqrt(dimProj);
+
+        for (int r = 0; r < rReps; r++) {
+            double[][] centers = new double[numPartitions][dim];


This seems too heavy. Can we optimize this by reusing the arrays somehow? Allocation for every loop can get expensive.

Pre-allocated centers, counts, and clusterVecIndices arrays outside the loop and reset them per repetition. No more per-loop allocation.

kotwanikunal · 2026-04-15T22:09:30Z

+        }
+
+        int resultSize = request.source().size() > 0 ? request.source().size() : 10;
+        int prefetchK = Math.min(resultSize * oversampleFactor, 10_000);


Thanks for adding this check in. Please add comments explaining why it's restricted.

Added comment explaining the cap

…d prefetchK cap comment Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

…iables Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

…directly Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Removes the trace logging that was added to verify template query processing flow. The factory-level FDE dimension log is kept since it helps users diagnose mapping mismatches. Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Rewrites unit tests to cover the new template query approach where the processor extracts query_vectors from a TemplateQueryBuilder and sets the encoded FDE as a PipelineProcessingContext attribute, instead of mutating the inner query of a ScriptScoreQueryBuilder. Covers: context attribute set, template without query_vectors passes through, non-template query passes through, dimension mismatch, empty vectors, non-list query_vectors, missing context overload, and factory validation (required dim, fde_dimension mismatch). Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Rewrites the IT to use the template query search format the processor now expects. The test body is now a 'template' query wrapping a script_score with a knn inner query that uses ${muvera_fde} as the vector placeholder. The search request processor resolves the placeholder at rewrite time using the PipelineProcessingContext attribute set during request processing. Also removes references to the removed oversample_factor parameter and relaxes the ignore_failure assertion to focus on whether the processor itself propagates its own dimension-mismatch failure. Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

praveenMprasad · 2026-04-19T18:22:19Z

Thanks for the suggestion @mingshl / @vamshin. I've switched the search processor to use a template query instead of the script_score + match_all pattern. The user now sends:

{
  "query": {
    "template": {
      "script_score": {
        "query": { "knn": { "muvera_fde": { "vector": "${muvera_fde}", "k": 40 } } },
        "script": {
          "source": "lateInteractionScore(params.query_vectors, 'colbert_vectors', params._source, params.space_type)",
          "params": { "query_vectors": [[...]], "space_type": "innerproduct" }
        }
      }
    }
  }
}

The processor extracts query_vectors from the template content, encodes the FDE, and sets it as a PipelineProcessingContext attribute keyed on target_field. The template query resolves ${muvera_fde} during doRewrite using those context attributes, so the KNN query runs with the encoded FDE and script_score reranks the prefetched candidates.

Unit tests (MuveraSearchRequestProcessorTests) and integration tests (MuveraProcessorIT) have been updated for the new format and are passing.

praveenMprasad requested review from 0ctopus13prime, VijayanB, Vikasht34, heemin32, jmazanec15, junqiu-lei, luyuncheng, martin-gaievski, naveentatikonda, navneet1v, ryanbogan, shatejas and vamshin as code owners March 13, 2026 17:02

Merge branch 'main' into feature/muvera-processors

eb2b695

Signed-off-by: Navneet Verma <navneev@amazon.com>

navneet1v and others added 2 commits March 16, 2026 20:42

Merge branch 'main' into feature/muvera-processors

29f46f3

0ctopus13prime reviewed Mar 17, 2026

View reviewed changes

Comment thread src/main/java/org/opensearch/knn/plugin/KNNPlugin.java Outdated

Comment thread src/main/java/org/opensearch/knn/processor/muvera/MuveraIngestProcessor.java Outdated

Address review nits: clean up imports

bab7366

Use Processor import instead of full package path in KNNPlugin.getProcessors(). Use ArrayList import instead of inline java.util.ArrayList in MuveraIngestProcessor. Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Gankris96 reviewed Mar 20, 2026

View reviewed changes

Comment thread src/main/java/org/opensearch/knn/processor/muvera/MuveraEncoder.java

Comment thread src/main/java/org/opensearch/knn/processor/muvera/MuveraSearchRequestProcessor.java Outdated

prasadnu added 2 commits March 20, 2026 08:49

Fix spotless formatting in MuveraEncoderTests

f0e3b4b

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

mingshl reviewed Mar 25, 2026

View reviewed changes

Merge branch 'main' into feature/muvera-processors

3e75b93

mingshl reviewed Apr 1, 2026

View reviewed changes

kotwanikunal reviewed Apr 1, 2026

View reviewed changes

prasadnu added 5 commits April 9, 2026 22:50

Add ignoreMissing support to MuveraIngestProcessor with unit tests

8483eee

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Cap prefetchK to 10,000 in MuveraSearchRequestProcessor

8d1a2a4

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Refactor processRequest into validateUserRequest/extractRequestParams…

beb6241

…/createKnnRequest, add type validation before casts Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Move readLongProperty to MuveraProcessorUtils, add ordering and ignor…

13917a3

…eMissing integration tests Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Add ignore_failure search pipeline IT and ignore_missing=false ingest IT

14ec585

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

praveenMprasad force-pushed the feature/muvera-processors branch from d2c36eb to 14ec585 Compare April 9, 2026 21:51

kotwanikunal reviewed Apr 15, 2026

View reviewed changes

prasadnu added 9 commits April 17, 2026 17:07

Convert simhash/projection to float, reuse arrays in process loop, ad…

6ea3388

…d prefetchK cap comment Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Refactor search processor to template query approach with context var…

8a3073b

…iables Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Fix template query extraction: use TemplateQueryBuilder.getContent() …

1379664

…directly Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Add debug logging to trace template query processing

fdfb796

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Fix missing closing brace in MuveraIngestProcessor

f9cc137

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Add MuveraProcessorUtils with shared readLongProperty

979eb1b

Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

Remove debug logging from search request processor

dd2b92f

Removes the trace logging that was added to verify template query processing flow. The factory-level FDE dimension log is kept since it helps users diagnose mapping mismatches. Signed-off-by: Praveen Mohan Prasad <prasadnu@amazon.com>

`ignore_missing`	Source field present	Behavior
`false` (default)	Yes	Produces FDE in target field
`false`	No / null	Throws `IllegalArgumentException`
`true`	Yes	Produces FDE in target field
`true`	No / null	Skips silently, no target field added

`ignore_failure`	Query has `script_score` with `query_vectors`	Behavior
`false` (default)	Yes	Rewrites query with FDE KNN
`false`	No	Returns request unchanged (no error)
`true`	Yes	Rewrites query with FDE KNN
`true`	No / malformed	Exception caught by framework, original request passed through

Conversation

praveenMprasad commented Mar 13, 2026

Description

Related Issues

Check List

Uh oh!

navneet1v commented Mar 14, 2026

Uh oh!

codecov Bot commented Mar 14, 2026

Codecov Report

Uh oh!

praveenMprasad commented Mar 14, 2026

Uh oh!

navneet1v commented Mar 14, 2026

Uh oh!

navneet1v commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

praveenMprasad commented Mar 17, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

praveenMprasad commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

OpenSearch MUVERA Benchmark Results

Text retrieval

nfcorpus — ColBERTv2 (3,633 docs, 323 queries)

SciFact — ColBERTv2 (5,183 docs, 300 queries)

Multimodal retrieval

IRPAPERS — ColModernVBERT (3,230 pages, 180 queries, ~1,011 vectors/page)

Uh oh!

navneet1v commented Mar 27, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

praveenMprasad Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

praveenMprasad Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Vikasht34 commented Apr 2, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

navneet1v commented Mar 17, 2026 •

edited

Loading

praveenMprasad commented Mar 27, 2026 •

edited

Loading

praveenMprasad Apr 2, 2026 •

edited

Loading

praveenMprasad Apr 2, 2026 •

edited

Loading