test(analyzer): add unit tests for analyzer module#1726
Hidden character warning
test(analyzer): add unit tests for analyzer module#1726LHT129 wants to merge 2 commits intoantgroup:mainfrom
Conversation
- Add analyzer_test.cpp to test CreateAnalyzer factory method - Add hgraph_analyzer_test.cpp to test HGraphAnalyzer functionality - Update CMakeLists.txt to enable test compilation - Test coverage includes: - CreateAnalyzer with null index (error handling) - CreateAnalyzer with HGraph index (normal case) - HGraphAnalyzer creation - HGraphAnalyzer GetComponentCount - HGraphAnalyzer GetDuplicateRatio - HGraphAnalyzer GetBaseAvgDistance - HGraphAnalyzer GetNeighborRecall Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the test coverage for the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds valuable unit test coverage for the analyzer module, which is a great improvement. The tests cover error handling and normal functionality for CreateAnalyzer and various methods of HGraphAnalyzer. My review focuses on improving the maintainability of the new test code by reducing duplication and removing hardcoded values. I've suggested refactoring the tests in hgraph_analyzer_test.cpp to use Catch2's SECTION feature to eliminate repeated setup code, and using fmt::format in analyzer_test.cpp to avoid hardcoding parameters.
src/analyzer/analyzer_test.cpp
Outdated
| auto params = vsag::JsonType::Parse(R"( | ||
| { | ||
| "dtype": "float32", | ||
| "metric_type": "l2", | ||
| "dim": 64, | ||
| "hgraph": { | ||
| "max_degree": 16, | ||
| "ef_construction": 100, | ||
| "base_codes": { | ||
| "codes_type": "flatten_codes", | ||
| "io_params": { "type": "block_memory_io" }, | ||
| "quantization_params": { "type": "fp32" } | ||
| }, | ||
| "graph": { | ||
| "io_params": { "type": "block_memory_io" } | ||
| } | ||
| } | ||
| } | ||
| )"); |
There was a problem hiding this comment.
The dim value is hardcoded in the JSON string. This can lead to inconsistencies if the dim variable at line 32 is changed without updating the string. It's better to use fmt::format to construct the JSON string, ensuring the dimension value is always in sync. This will improve maintainability.
| auto params = vsag::JsonType::Parse(R"( | |
| { | |
| "dtype": "float32", | |
| "metric_type": "l2", | |
| "dim": 64, | |
| "hgraph": { | |
| "max_degree": 16, | |
| "ef_construction": 100, | |
| "base_codes": { | |
| "codes_type": "flatten_codes", | |
| "io_params": { "type": "block_memory_io" }, | |
| "quantization_params": { "type": "fp32" } | |
| }, | |
| "graph": { | |
| "io_params": { "type": "block_memory_io" } | |
| } | |
| } | |
| } | |
| )"); | |
| auto params = vsag::JsonType::Parse(fmt::format(R"( | |
| {{ | |
| "dtype": "float32", | |
| "metric_type": "l2", | |
| "dim": {}, | |
| "hgraph": {{ | |
| "max_degree": 16, | |
| "ef_construction": 100, | |
| "base_codes": {{ | |
| "codes_type": "flatten_codes", | |
| "io_params": {{ "type": "block_memory_io" }}, | |
| "quantization_params": {{ "type": "fp32" }} | |
| }}, | |
| "graph": {{ | |
| "io_params": {{ "type": "block_memory_io" }} | |
| }} | |
| }} | |
| }} | |
| )", dim)); |
| TEST_CASE("HGraphAnalyzer creation", "[ut][hgraph_analyzer]") { | ||
| int64_t dim = 32; | ||
| int64_t num_vectors = 50; | ||
|
|
||
| auto index = create_test_hgraph_index(dim, num_vectors); | ||
| auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index); | ||
| REQUIRE(inner_index != nullptr); | ||
|
|
||
| auto allocator = vsag::Engine::CreateDefaultAllocator(); | ||
| vsag::AnalyzerParam param(allocator.get()); | ||
| param.topk = 10; | ||
| param.base_sample_size = 5; | ||
|
|
||
| auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>( | ||
| vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param)); | ||
| REQUIRE(hgraph_analyzer != nullptr); | ||
| } | ||
|
|
||
| TEST_CASE("HGraphAnalyzer GetComponentCount", "[ut][hgraph_analyzer]") { | ||
| int64_t dim = 32; | ||
| int64_t num_vectors = 50; | ||
|
|
||
| auto index = create_test_hgraph_index(dim, num_vectors); | ||
| auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index); | ||
| REQUIRE(inner_index != nullptr); | ||
|
|
||
| auto allocator = vsag::Engine::CreateDefaultAllocator(); | ||
| vsag::AnalyzerParam param(allocator.get()); | ||
| param.topk = 10; | ||
| param.base_sample_size = 5; | ||
|
|
||
| auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>( | ||
| vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param)); | ||
| REQUIRE(hgraph_analyzer != nullptr); | ||
|
|
||
| auto components = hgraph_analyzer->GetComponentCount(); | ||
|
|
||
| REQUIRE_FALSE(components.empty()); | ||
| REQUIRE(*std::max_element(components.begin(), components.end()) > 0); | ||
| } | ||
|
|
||
| TEST_CASE("HGraphAnalyzer GetDuplicateRatio", "[ut][hgraph_analyzer]") { | ||
| int64_t dim = 32; | ||
| int64_t num_vectors = 50; | ||
|
|
||
| auto index = create_test_hgraph_index(dim, num_vectors); | ||
| auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index); | ||
| REQUIRE(inner_index != nullptr); | ||
|
|
||
| auto allocator = vsag::Engine::CreateDefaultAllocator(); | ||
| vsag::AnalyzerParam param(allocator.get()); | ||
| param.topk = 10; | ||
| param.base_sample_size = 5; | ||
|
|
||
| auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>( | ||
| vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param)); | ||
| REQUIRE(hgraph_analyzer != nullptr); | ||
|
|
||
| auto ratio = hgraph_analyzer->GetDuplicateRatio(); | ||
| REQUIRE(ratio >= 0.0F); | ||
| REQUIRE(ratio <= 1.0F); | ||
| } | ||
|
|
||
| TEST_CASE("HGraphAnalyzer GetBaseAvgDistance", "[ut][hgraph_analyzer]") { | ||
| int64_t dim = 32; | ||
| int64_t num_vectors = 50; | ||
|
|
||
| auto index = create_test_hgraph_index(dim, num_vectors); | ||
| auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index); | ||
| REQUIRE(inner_index != nullptr); | ||
|
|
||
| auto allocator = vsag::Engine::CreateDefaultAllocator(); | ||
| vsag::AnalyzerParam param(allocator.get()); | ||
| param.topk = 10; | ||
| param.base_sample_size = 5; | ||
|
|
||
| auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>( | ||
| vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param)); | ||
| REQUIRE(hgraph_analyzer != nullptr); | ||
|
|
||
| auto avg_distance = hgraph_analyzer->GetBaseAvgDistance(); | ||
| REQUIRE(avg_distance >= 0.0F); | ||
| } | ||
|
|
||
| TEST_CASE("HGraphAnalyzer GetNeighborRecall", "[ut][hgraph_analyzer]") { | ||
| int64_t dim = 32; | ||
| int64_t num_vectors = 50; | ||
|
|
||
| auto index = create_test_hgraph_index(dim, num_vectors); | ||
| auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index); | ||
| REQUIRE(inner_index != nullptr); | ||
|
|
||
| auto allocator = vsag::Engine::CreateDefaultAllocator(); | ||
| vsag::AnalyzerParam param(allocator.get()); | ||
| param.topk = 10; | ||
| param.base_sample_size = 5; | ||
|
|
||
| auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>( | ||
| vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param)); | ||
| REQUIRE(hgraph_analyzer != nullptr); | ||
|
|
||
| auto recall = hgraph_analyzer->GetNeighborRecall(); | ||
| REQUIRE(recall >= 0.0F); | ||
| REQUIRE(recall <= 1.0F); | ||
| } No newline at end of file |
There was a problem hiding this comment.
There is significant code duplication across the test cases in this file. The setup for creating the index and the HGraphAnalyzer is repeated in each test. This can be refactored by combining them into a single TEST_CASE and using SECTIONs for each specific test. This will make the tests more concise and easier to maintain.
TEST_CASE("HGraphAnalyzer functionality", "[ut][hgraph_analyzer]") {
int64_t dim = 32;
int64_t num_vectors = 50;
auto index = create_test_hgraph_index(dim, num_vectors);
auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
REQUIRE(inner_index != nullptr);
auto allocator = vsag::Engine::CreateDefaultAllocator();
vsag::AnalyzerParam param(allocator.get());
param.topk = 10;
param.base_sample_size = 5;
auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
REQUIRE(hgraph_analyzer != nullptr);
SECTION("GetComponentCount") {
auto components = hgraph_analyzer->GetComponentCount();
REQUIRE_FALSE(components.empty());
REQUIRE(*std::max_element(components.begin(), components.end()) > 0);
}
SECTION("GetDuplicateRatio") {
auto ratio = hgraph_analyzer->GetDuplicateRatio();
REQUIRE(ratio >= 0.0F);
REQUIRE(ratio <= 1.0F);
}
SECTION("GetBaseAvgDistance") {
auto avg_distance = hgraph_analyzer->GetBaseAvgDistance();
REQUIRE(avg_distance >= 0.0F);
}
SECTION("GetNeighborRecall") {
auto recall = hgraph_analyzer->GetNeighborRecall();
REQUIRE(recall >= 0.0F);
REQUIRE(recall <= 1.0F);
}
}There was a problem hiding this comment.
Pull request overview
Adds Catch2 unit tests for src/analyzer/ and wires them into the test build so the analyzer module gets coverage.
Changes:
- Added analyzer factory tests (
CreateAnalyzer) including error handling + happy path. - Added
HGraphAnalyzerunit tests for several metric/accessor methods. - Updated CMake to build/link the new
analyzer_testlibrary into theunitteststarget.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| tests/CMakeLists.txt | Links the new analyzer_test static lib into the unit test executable (including macOS force_load). |
| src/analyzer/CMakeLists.txt | Builds analyzer_test when ENABLE_TESTS is on, and links required dependencies. |
| src/analyzer/analyzer_test.cpp | New unit tests for CreateAnalyzer null-index error and HGraph index creation. |
| src/analyzer/hgraph_analyzer_test.cpp | New unit tests validating HGraphAnalyzer creation and several analyzer metric methods. |
| #include "hgraph_analyzer.h" | ||
|
|
||
| #include <catch2/catch_test_macros.hpp> | ||
|
|
||
| #include "algorithm/hgraph.h" | ||
| #include "fixtures.h" | ||
| #include "index/index_impl.h" | ||
| #include "vsag/vsag.h" |
|
|
||
| static vsag::IndexPtr | ||
| create_test_hgraph_index(int64_t dim, int64_t num_vectors, int64_t max_degree = 16) { | ||
| auto params = vsag::JsonType::Parse(fmt::format(R"( |
| auto components = hgraph_analyzer->GetComponentCount(); | ||
|
|
||
| REQUIRE_FALSE(components.empty()); | ||
| REQUIRE(*std::max_element(components.begin(), components.end()) > 0); |
src/analyzer/analyzer_test.cpp
Outdated
| int64_t dim = 64; | ||
| int64_t num_vectors = 100; | ||
|
|
||
| auto params = vsag::JsonType::Parse(R"( |
src/analyzer/analyzer_test.cpp
Outdated
| auto params = vsag::JsonType::Parse(R"( | ||
| { | ||
| "dtype": "float32", | ||
| "metric_type": "l2", | ||
| "dim": 64, | ||
| "hgraph": { | ||
| "max_degree": 16, | ||
| "ef_construction": 100, | ||
| "base_codes": { | ||
| "codes_type": "flatten_codes", | ||
| "io_params": { "type": "block_memory_io" }, | ||
| "quantization_params": { "type": "fp32" } | ||
| }, | ||
| "graph": { | ||
| "io_params": { "type": "block_memory_io" } | ||
| } | ||
| } | ||
| } | ||
| )"); |
| REQUIRE(hgraph_analyzer != nullptr); | ||
| } | ||
|
|
||
| TEST_CASE("HGraphAnalyzer GetComponentCount", "[ut][hgraph_analyzer]") { |
| int64_t dim = 32; | ||
| int64_t num_vectors = 50; | ||
|
|
||
| auto index = create_test_hgraph_index(dim, num_vectors); |
| auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index); | ||
| REQUIRE(inner_index != nullptr); | ||
|
|
||
| auto allocator = vsag::Engine::CreateDefaultAllocator(); |
| auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>( | ||
| vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param)); |
| add_library (analyzer OBJECT ${ANALYZER_SRC}) | ||
|
|
||
| if (ENABLE_TESTS) | ||
| file (GLOB ANALYZER_TESTS "*_test.cpp") |
- Add missing headers (#include <algorithm>, #include <fmt/format.h>) - Use fmt::format to avoid hardcoded dimension values - Rename variables to avoid shadowing (allocator -> analyzer_allocator) - Consolidate multiple TEST_CASEs into one with SECTIONs for better performance Signed-off-by: LHT129 <tianlan.lht@antgroup.com> Co-authored-by: Kimi-K2.5 <assistant@example.com>
Summary
Add unit tests for the
src/analyzer/module which previously had no test coverage.Changes
analyzer_test.cpp: TestCreateAnalyzerfactory method with error handling and normal casehgraph_analyzer_test.cpp: TestHGraphAnalyzerfunctionalityCMakeLists.txtto enable test compilation for analyzer moduleTest Coverage
CreateAnalyzerwith null index (error handling)CreateAnalyzerwith HGraph index (normal case)HGraphAnalyzer::GetComponentCount()HGraphAnalyzer::GetDuplicateRatio()HGraphAnalyzer::GetBaseAvgDistance()HGraphAnalyzer::GetNeighborRecall()Testing
All 7 test cases pass (32 assertions).
Files Changed
src/analyzer/CMakeLists.txtsrc/analyzer/analyzer_test.cpp(new)src/analyzer/hgraph_analyzer_test.cpp(new)tests/CMakeLists.txtChecklist