
Import diskann-garnet and vectorset#800

Merged
metajack merged 9 commits into main from metajack/diskann-garnet
Mar 10, 2026


@metajack
Contributor

This imports the diskann-garnet provider and a benchmarking utility for Garnet vector set workloads, vectorset, to the repo.

diskann-garnet: A DataProvider implementation for the Garnet cache service. It currently supports u8 and f32 full-precision vector sets using cosine distance. Quantization and other features will be added soon.

vectorset: A benchmarking tool for Redis vector sets. Because Garnet speaks the Redis protocol, it can test both Garnet and Redis vector workloads.

Both of these were previously developed in their own repos in their early stages.
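
For readers unfamiliar with the metric, the following is a minimal sketch of cosine distance over the two element types named above. It is not the diskann-garnet implementation; the function names are hypothetical, and treating zero-norm vectors as an error (`None`) is an assumption made for illustration.

```rust
/// Cosine distance (1 - cosine similarity) between two f32 vectors.
/// Returns None if the lengths differ or either vector has zero norm
/// (an illustrative choice, not taken from the PR).
fn cosine_distance_f32(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let mut dot = 0.0f32;
    let mut norm_a = 0.0f32;
    let mut norm_b = 0.0f32;
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a.sqrt() * norm_b.sqrt()))
}

/// Same metric for u8 vectors: widen each component to f32, then reuse
/// the f32 routine. A production kernel would avoid the temporary buffers.
fn cosine_distance_u8(a: &[u8], b: &[u8]) -> Option<f32> {
    let a: Vec<f32> = a.iter().map(|&x| f32::from(x)).collect();
    let b: Vec<f32> = b.iter().map(|&x| f32::from(x)).collect();
    cosine_distance_f32(&a, &b)
}
```

Parallel vectors give a distance near 0 and orthogonal vectors give a distance near 1, regardless of magnitude.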

Contributor

Copilot AI left a comment

Pull request overview

This PR imports two new crates into the diskann workspace:

  • diskann-garnet: A DataProvider implementation for the Garnet cache service with FFI bindings
  • vectorset: A benchmarking CLI tool for Redis/Garnet vector set workloads

Changes:

  • Adds FFI-based integration between DiskANN and Garnet cache service supporting u8 and f32 full-precision vector sets with cosine distance
  • Implements free space map for ID management, label filtering via bitmaps, and comprehensive test coverage
  • Provides vectorset benchmarking utility for ingesting vectors, running similarity queries, and measuring recall
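
As an illustration of the recall measurement mentioned in the last bullet, here is a minimal sketch of recall@k. It is not taken from vectorset; the function name, the `u32` ID type, and the handling of empty inputs are assumptions.

```rust
use std::collections::HashSet;

/// recall@k: the fraction of the true top-k neighbor IDs that also appear
/// among the first k returned IDs. Returns 0.0 when k is 0 or no ground
/// truth is available (an illustrative convention).
fn recall_at_k(ground_truth: &[u32], results: &[u32], k: usize) -> f64 {
    let truth: HashSet<u32> = ground_truth.iter().take(k).copied().collect();
    if truth.is_empty() {
        return 0.0;
    }
    // Deduplicate the returned IDs so a repeated ID is not counted twice.
    let returned: HashSet<u32> = results.iter().take(k).copied().collect();
    let hits = truth.intersection(&returned).count();
    hits as f64 / truth.len() as f64
}
```

For example, if the true top-4 IDs are 1, 2, 3, 4 and a query returns 1, 2, 9, 4, three of the four true neighbors were found, so recall@4 is 0.75.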

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| Cargo.toml | Adds diskann-garnet and vectorset as workspace members |
| diskann-garnet/Cargo.toml | Defines diskann-garnet crate as cdylib with dependencies |
| diskann-garnet/src/lib.rs | Main FFI interface with unsafe C exports for index creation, insertion, search, and deletion |
| diskann-garnet/src/provider.rs | GarnetProvider implementing DataProvider trait with accessor patterns |
| diskann-garnet/src/garnet.rs | Garnet callback wrappers and types for read/write/delete/RMW operations |
| diskann-garnet/src/fsm.rs | Free space map for ID allocation and reuse tracking |
| diskann-garnet/src/labels.rs | Bitmap-based label filtering for filtered vector search |
| diskann-garnet/src/dyn_index.rs | Type-erased DynIndex trait for runtime polymorphism |
| diskann-garnet/src/alloc.rs | Custom 8-byte aligned allocator for Garnet data |
| diskann-garnet/src/test_utils.rs | Test utilities providing in-memory storage callbacks |
| diskann-garnet/src/ffi_tests.rs | Comprehensive FFI tests for all operations |
| diskann-garnet/diskann-garnet.nuspec | NuGet package specification |
| vectorset/Cargo.toml | Defines vectorset CLI tool with Redis and Azure dependencies |
| vectorset/src/main.rs | CLI application with ping, ingest, delete, and query commands |
| vectorset/src/loader.rs | Dataset loading utilities supporting batched iteration and multiple files |
| vectorset/config.toml.example | Example configuration file |
| Cargo.lock | Dependency resolution with many new transitive dependencies |
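
To make the "ID allocation and reuse tracking" role of a free space map concrete, here is a minimal sketch. It is not the code in fsm.rs; the type name, method names, and the in-memory `Vec` free list are all hypothetical.

```rust
/// Hypothetical free space map: hands out u32 IDs and prefers IDs that
/// were previously released over never-used ones.
#[derive(Default)]
struct FreeSpaceMap {
    /// Smallest ID that has never been handed out.
    next_unused: u32,
    /// IDs that were released and can be reused.
    free: Vec<u32>,
}

impl FreeSpaceMap {
    /// Returns a reusable ID if one exists, otherwise a fresh one.
    /// Returns None once the fresh ID space is exhausted; in this sketch
    /// u32::MAX itself is never handed out.
    fn allocate(&mut self) -> Option<u32> {
        if let Some(id) = self.free.pop() {
            return Some(id);
        }
        let id = self.next_unused;
        self.next_unused = self.next_unused.checked_add(1)?;
        Some(id)
    }

    /// Marks an ID as reusable. Returns false for IDs that were never
    /// allocated or are already free, so double frees are rejected.
    fn release(&mut self, id: u32) -> bool {
        if id >= self.next_unused || self.free.contains(&id) {
            return false;
        }
        self.free.push(id);
        true
    }
}
```

The linear `contains` check keeps the sketch short; a bitmap or set would be the more likely choice when many IDs are free at once.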


@codecov-commenter

codecov-commenter commented Feb 25, 2026

Codecov Report

❌ Patch coverage is 68.02811% with 728 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.96%. Comparing base (4db2797) to head (667c981).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| vectorset/src/main.rs | 0.00% | 332 Missing ⚠️ |
| diskann-garnet/src/lib.rs | 66.82% | 140 Missing ⚠️ |
| diskann-garnet/src/provider.rs | 80.66% | 105 Missing ⚠️ |
| vectorset/src/loader.rs | 0.00% | 79 Missing ⚠️ |
| diskann-garnet/src/fsm.rs | 86.92% | 40 Missing ⚠️ |
| diskann-garnet/src/dyn_index.rs | 64.78% | 25 Missing ⚠️ |
| diskann-garnet/src/garnet.rs | 97.64% | 6 Missing ⚠️ |
| diskann-garnet/src/test_utils.rs | 99.31% | 1 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (68.02%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

@@            Coverage Diff             @@
##             main     #800      +/-   ##
==========================================
- Coverage   90.64%   88.96%   -1.69%     
==========================================
  Files         432      442      +10     
  Lines       79629    81906    +2277     
==========================================
+ Hits        72182    72868     +686     
- Misses       7447     9038    +1591     
| Flag | Coverage Δ |
| --- | --- |
| miri | 88.96% <68.02%> (-1.69%) ⬇️ |
| unittests | 88.82% <68.02%> (-1.80%) ⬇️ |
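
The headline percentages can be reproduced from the hit and line counts in the coverage diff. The sketch below rests on two assumptions inferred from the numbers rather than stated in the report: that the patch consists of the 2277 added lines, of which 728 are missing coverage, and that the report truncates percentages to two decimals instead of rounding. The function names are hypothetical.

```rust
/// Coverage in basis points (hundredths of a percent), truncated toward
/// zero. Returns None when there are no lines to measure.
fn coverage_bp(hits: u64, lines: u64) -> Option<u64> {
    if lines == 0 {
        return None;
    }
    Some(hits * 10_000 / lines)
}

/// Formats basis points as a two-decimal percentage, e.g. 9064 -> "90.64%".
fn format_pct(bp: u64) -> String {
    format!("{}.{:02}%", bp / 100, bp % 100)
}
```

With these helpers, 72182 hits over 79629 lines gives the base figure of 90.64%, 72868 over 81906 gives the head figure of 88.96%, and (2277 - 728) over 2277 gives the patch figure of 68.02%.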


| Files with missing lines | Coverage Δ |
| --- | --- |
| diskann-garnet/src/alloc.rs | 100.00% <100.00%> (ø) |
| diskann-garnet/src/labels.rs | 100.00% <100.00%> (ø) |
| diskann-garnet/src/test_utils.rs | 99.31% <99.31%> (ø) |
| diskann-garnet/src/garnet.rs | 97.64% <97.64%> (ø) |
| diskann-garnet/src/dyn_index.rs | 64.78% <64.78%> (ø) |
| diskann-garnet/src/fsm.rs | 86.92% <86.92%> (ø) |
| vectorset/src/loader.rs | 0.00% <0.00%> (ø) |
| diskann-garnet/src/provider.rs | 80.66% <80.66%> (ø) |
| diskann-garnet/src/lib.rs | 66.82% <66.82%> (ø) |
| vectorset/src/main.rs | 0.00% <0.00%> (ø) |

... and 43 files with indirect coverage changes


@hildebrandmw
Contributor

Hey @metajack - one quick question for you. Are we publishing vectorset to crates.io? If not - can it live as a binary in diskann-garnet? If it's not getting published, it kind of breaks the cargo publish --workspace flow. Granted, excluding packages from cargo publish is pretty easy, but it does add another step to the process.

@metajack
Contributor Author

> Hey @metajack - one quick question for you. Are we publishing vectorset to crates.io? If not - can it live as a binary in diskann-garnet? If it's not getting published, it kind of breaks the cargo publish --workspace flow. Granted, excluding packages from cargo publish is pretty easy, but it does add another step to the process.

That is what publish = false is for. I'll add that.
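
For context, `publish` is a field in a package's Cargo.toml; the Cargo manifest reference documents `publish = false` as preventing the package from being published to any registry. A sketch of how the vectorset manifest might carry it is shown here. Only the `publish` key comes from the discussion; the version and edition values are placeholders, not the crate's real metadata.

```toml
# vectorset/Cargo.toml (sketch; values other than `publish` are placeholders)
[package]
name = "vectorset"
version = "0.1.0"   # hypothetical
edition = "2021"    # hypothetical
publish = false     # never upload this crate to a registry
```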

metajack force-pushed the metajack/diskann-garnet branch 2 times, most recently from 66c9ebc to 4b23e22, on February 26, 2026 16:03
@metajack
Contributor Author

@hailangx Can you answer Mark's questions and add a commit to this PR that addresses the issues?

@hailangx
Member

hailangx commented Mar 2, 2026

I have another PR, #808, that resolves some of the comments.

@metajack
Contributor Author

metajack commented Mar 5, 2026

I think this is ready for final review. I believe we've addressed all the comments, and I have added a bunch of documentation.

metajack force-pushed the metajack/diskann-garnet branch 2 times, most recently from 2511c0a to 03a8a56, on March 9, 2026 19:57
Jack Moffitt and others added 6 commits March 9, 2026 18:46
### Removal of Obsolete Wrapper

* Removed the `FilteredSearchResults` wrapper, which previously handled
conversion between internal and external IDs for search results. It is
no longer needed after the previous change that exposed a generic
external ID in the beta search strategy.

---------

Co-authored-by: Haiyang Xu <haixu@microsoft.com>
metajack force-pushed the metajack/diskann-garnet branch from 4a7ef84 to ce74a82 on March 9, 2026 23:47
metajack merged commit 5e0e49d into main on Mar 10, 2026
24 checks passed
metajack deleted the metajack/diskann-garnet branch on March 10, 2026 00:30
arkrishn94 mentioned this pull request on Mar 11, 2026
arkrishn94 added a commit that referenced this pull request on Mar 11, 2026
## What's Changed
* [quantization] 8bit distance kernels and ZipUnzip by @arkrishn94 in #798
* bumping up bf-tree version by @backurs in #819
* Import diskann-garnet and vectorset by @metajack in #800
* Saving and loading in-memory btrees to disk by @backurs in #820

## New Contributors
* @metajack made their first contribution in #800

**Full Changelog**: v0.48.0...v0.49.0

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
6 participants