Skip to content

test(analyzer): add unit tests for analyzer module#1726

Open
LHT129 wants to merge 2 commits intoantgroup:mainfrom
LHT129:2026-03-13-添加-analyzer-模块测试

Hidden character warning

The head ref may contain hidden characters: "2026-03-13-\u6dfb\u52a0-analyzer-\u6a21\u5757\u6d4b\u8bd5"
Open

test(analyzer): add unit tests for analyzer module#1726
LHT129 wants to merge 2 commits intoantgroup:mainfrom
LHT129:2026-03-13-添加-analyzer-模块测试

Conversation

@LHT129
Copy link
Copy Markdown
Collaborator

@LHT129 LHT129 commented Mar 19, 2026

Summary

Add unit tests for the src/analyzer/ module which previously had no test coverage.

Changes

  • Add analyzer_test.cpp: Test CreateAnalyzer factory method with error handling and normal case
  • Add hgraph_analyzer_test.cpp: Test HGraphAnalyzer functionality
  • Update CMakeLists.txt to enable test compilation for analyzer module

Test Coverage

  • CreateAnalyzer with null index (error handling)
  • CreateAnalyzer with HGraph index (normal case)
  • HGraphAnalyzer::GetComponentCount()
  • HGraphAnalyzer::GetDuplicateRatio()
  • HGraphAnalyzer::GetBaseAvgDistance()
  • HGraphAnalyzer::GetNeighborRecall()

Testing

All 7 test cases pass (32 assertions).

Files Changed

  • src/analyzer/CMakeLists.txt
  • src/analyzer/analyzer_test.cpp (new)
  • src/analyzer/hgraph_analyzer_test.cpp (new)
  • tests/CMakeLists.txt

Checklist

  • Code follows VSAG coding style
  • All tests pass
  • PR description is clear

- Add analyzer_test.cpp to test CreateAnalyzer factory method
- Add hgraph_analyzer_test.cpp to test HGraphAnalyzer functionality
- Update CMakeLists.txt to enable test compilation
- Test coverage includes:
  - CreateAnalyzer with null index (error handling)
  - CreateAnalyzer with HGraph index (normal case)
  - HGraphAnalyzer creation
  - HGraphAnalyzer GetComponentCount
  - HGraphAnalyzer GetDuplicateRatio
  - HGraphAnalyzer GetBaseAvgDistance
  - HGraphAnalyzer GetNeighborRecall

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
@LHT129 LHT129 self-assigned this Mar 19, 2026
Copilot AI review requested due to automatic review settings March 19, 2026 09:44
@LHT129 LHT129 added kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) version/1.0 labels Mar 19, 2026
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the test coverage for the src/analyzer/ module by introducing a comprehensive suite of unit tests. The changes ensure the robustness of the CreateAnalyzer factory method and validate the analytical capabilities of the HGraphAnalyzer. This addition improves code quality and provides confidence in the analyzer's behavior across various scenarios.

Highlights

  • New Unit Tests for Analyzer Module: Added analyzer_test.cpp to test the CreateAnalyzer factory method, including error handling for null indices and successful creation with HGraph indices.
  • HGraph Analyzer Functionality Tests: Introduced hgraph_analyzer_test.cpp to cover the core functionalities of HGraphAnalyzer, such as GetComponentCount, GetDuplicateRatio, GetBaseAvgDistance, and GetNeighborRecall.
  • CMake Build System Updates: Modified CMakeLists.txt files to enable the compilation and linking of the newly added analyzer unit tests, ensuring they are part of the test suite when ENABLE_TESTS is active.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds valuable unit test coverage for the analyzer module, which is a great improvement. The tests cover error handling and normal functionality for CreateAnalyzer and various methods of HGraphAnalyzer. My review focuses on improving the maintainability of the new test code by reducing duplication and removing hardcoded values. I've suggested refactoring the tests in hgraph_analyzer_test.cpp to use Catch2's SECTION feature to eliminate repeated setup code, and using fmt::format in analyzer_test.cpp to avoid hardcoding parameters.

Comment on lines +35 to +53
auto params = vsag::JsonType::Parse(R"(
{
"dtype": "float32",
"metric_type": "l2",
"dim": 64,
"hgraph": {
"max_degree": 16,
"ef_construction": 100,
"base_codes": {
"codes_type": "flatten_codes",
"io_params": { "type": "block_memory_io" },
"quantization_params": { "type": "fp32" }
},
"graph": {
"io_params": { "type": "block_memory_io" }
}
}
}
)");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The dim value is hardcoded in the JSON string. This can lead to inconsistencies if the dim variable at line 32 is changed without updating the string. It's better to use fmt::format to construct the JSON string, ensuring the dimension value is always in sync. This will improve maintainability.

Suggested change
auto params = vsag::JsonType::Parse(R"(
{
"dtype": "float32",
"metric_type": "l2",
"dim": 64,
"hgraph": {
"max_degree": 16,
"ef_construction": 100,
"base_codes": {
"codes_type": "flatten_codes",
"io_params": { "type": "block_memory_io" },
"quantization_params": { "type": "fp32" }
},
"graph": {
"io_params": { "type": "block_memory_io" }
}
}
}
)");
auto params = vsag::JsonType::Parse(fmt::format(R"(
{{
"dtype": "float32",
"metric_type": "l2",
"dim": {},
"hgraph": {{
"max_degree": 16,
"ef_construction": 100,
"base_codes": {{
"codes_type": "flatten_codes",
"io_params": {{ "type": "block_memory_io" }},
"quantization_params": {{ "type": "fp32" }}
}},
"graph": {{
"io_params": {{ "type": "block_memory_io" }}
}}
}}
}}
)", dim));

Comment on lines +66 to +170
TEST_CASE("HGraphAnalyzer creation", "[ut][hgraph_analyzer]") {
int64_t dim = 32;
int64_t num_vectors = 50;

auto index = create_test_hgraph_index(dim, num_vectors);
auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
REQUIRE(inner_index != nullptr);

auto allocator = vsag::Engine::CreateDefaultAllocator();
vsag::AnalyzerParam param(allocator.get());
param.topk = 10;
param.base_sample_size = 5;

auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
REQUIRE(hgraph_analyzer != nullptr);
}

TEST_CASE("HGraphAnalyzer GetComponentCount", "[ut][hgraph_analyzer]") {
int64_t dim = 32;
int64_t num_vectors = 50;

auto index = create_test_hgraph_index(dim, num_vectors);
auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
REQUIRE(inner_index != nullptr);

auto allocator = vsag::Engine::CreateDefaultAllocator();
vsag::AnalyzerParam param(allocator.get());
param.topk = 10;
param.base_sample_size = 5;

auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
REQUIRE(hgraph_analyzer != nullptr);

auto components = hgraph_analyzer->GetComponentCount();

REQUIRE_FALSE(components.empty());
REQUIRE(*std::max_element(components.begin(), components.end()) > 0);
}

TEST_CASE("HGraphAnalyzer GetDuplicateRatio", "[ut][hgraph_analyzer]") {
int64_t dim = 32;
int64_t num_vectors = 50;

auto index = create_test_hgraph_index(dim, num_vectors);
auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
REQUIRE(inner_index != nullptr);

auto allocator = vsag::Engine::CreateDefaultAllocator();
vsag::AnalyzerParam param(allocator.get());
param.topk = 10;
param.base_sample_size = 5;

auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
REQUIRE(hgraph_analyzer != nullptr);

auto ratio = hgraph_analyzer->GetDuplicateRatio();
REQUIRE(ratio >= 0.0F);
REQUIRE(ratio <= 1.0F);
}

TEST_CASE("HGraphAnalyzer GetBaseAvgDistance", "[ut][hgraph_analyzer]") {
int64_t dim = 32;
int64_t num_vectors = 50;

auto index = create_test_hgraph_index(dim, num_vectors);
auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
REQUIRE(inner_index != nullptr);

auto allocator = vsag::Engine::CreateDefaultAllocator();
vsag::AnalyzerParam param(allocator.get());
param.topk = 10;
param.base_sample_size = 5;

auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
REQUIRE(hgraph_analyzer != nullptr);

auto avg_distance = hgraph_analyzer->GetBaseAvgDistance();
REQUIRE(avg_distance >= 0.0F);
}

TEST_CASE("HGraphAnalyzer GetNeighborRecall", "[ut][hgraph_analyzer]") {
int64_t dim = 32;
int64_t num_vectors = 50;

auto index = create_test_hgraph_index(dim, num_vectors);
auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
REQUIRE(inner_index != nullptr);

auto allocator = vsag::Engine::CreateDefaultAllocator();
vsag::AnalyzerParam param(allocator.get());
param.topk = 10;
param.base_sample_size = 5;

auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
REQUIRE(hgraph_analyzer != nullptr);

auto recall = hgraph_analyzer->GetNeighborRecall();
REQUIRE(recall >= 0.0F);
REQUIRE(recall <= 1.0F);
} No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There is significant code duplication across the test cases in this file. The setup for creating the index and the HGraphAnalyzer is repeated in each test. This can be refactored by combining them into a single TEST_CASE and using SECTIONs for each specific test. This will make the tests more concise and easier to maintain.

TEST_CASE("HGraphAnalyzer functionality", "[ut][hgraph_analyzer]") {
    int64_t dim = 32;
    int64_t num_vectors = 50;

    auto index = create_test_hgraph_index(dim, num_vectors);
    auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
    REQUIRE(inner_index != nullptr);

    auto allocator = vsag::Engine::CreateDefaultAllocator();
    vsag::AnalyzerParam param(allocator.get());
    param.topk = 10;
    param.base_sample_size = 5;

    auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
        vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
    REQUIRE(hgraph_analyzer != nullptr);

    SECTION("GetComponentCount") {
        auto components = hgraph_analyzer->GetComponentCount();

        REQUIRE_FALSE(components.empty());
        REQUIRE(*std::max_element(components.begin(), components.end()) > 0);
    }

    SECTION("GetDuplicateRatio") {
        auto ratio = hgraph_analyzer->GetDuplicateRatio();
        REQUIRE(ratio >= 0.0F);
        REQUIRE(ratio <= 1.0F);
    }

    SECTION("GetBaseAvgDistance") {
        auto avg_distance = hgraph_analyzer->GetBaseAvgDistance();
        REQUIRE(avg_distance >= 0.0F);
    }

    SECTION("GetNeighborRecall") {
        auto recall = hgraph_analyzer->GetNeighborRecall();
        REQUIRE(recall >= 0.0F);
        REQUIRE(recall <= 1.0F);
    }
}

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Catch2 unit tests for src/analyzer/ and wires them into the test build so the analyzer module gets coverage.

Changes:

  • Added analyzer factory tests (CreateAnalyzer) including error handling + happy path.
  • Added HGraphAnalyzer unit tests for several metric/accessor methods.
  • Updated CMake to build/link the new analyzer_test library into the unittests target.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.

File Description
tests/CMakeLists.txt Links the new analyzer_test static lib into the unit test executable (including macOS force_load).
src/analyzer/CMakeLists.txt Builds analyzer_test when ENABLE_TESTS is on, and links required dependencies.
src/analyzer/analyzer_test.cpp New unit tests for CreateAnalyzer null-index error and HGraph index creation.
src/analyzer/hgraph_analyzer_test.cpp New unit tests validating HGraphAnalyzer creation and several analyzer metric methods.

Comment on lines +15 to +22
#include "hgraph_analyzer.h"

#include <catch2/catch_test_macros.hpp>

#include "algorithm/hgraph.h"
#include "fixtures.h"
#include "index/index_impl.h"
#include "vsag/vsag.h"

static vsag::IndexPtr
create_test_hgraph_index(int64_t dim, int64_t num_vectors, int64_t max_degree = 16) {
auto params = vsag::JsonType::Parse(fmt::format(R"(
auto components = hgraph_analyzer->GetComponentCount();

REQUIRE_FALSE(components.empty());
REQUIRE(*std::max_element(components.begin(), components.end()) > 0);
Comment on lines +32 to +35
int64_t dim = 64;
int64_t num_vectors = 100;

auto params = vsag::JsonType::Parse(R"(
Comment on lines +35 to +53
auto params = vsag::JsonType::Parse(R"(
{
"dtype": "float32",
"metric_type": "l2",
"dim": 64,
"hgraph": {
"max_degree": 16,
"ef_construction": 100,
"base_codes": {
"codes_type": "flatten_codes",
"io_params": { "type": "block_memory_io" },
"quantization_params": { "type": "fp32" }
},
"graph": {
"io_params": { "type": "block_memory_io" }
}
}
}
)");
REQUIRE(hgraph_analyzer != nullptr);
}

TEST_CASE("HGraphAnalyzer GetComponentCount", "[ut][hgraph_analyzer]") {
int64_t dim = 32;
int64_t num_vectors = 50;

auto index = create_test_hgraph_index(dim, num_vectors);
auto inner_index = std::dynamic_pointer_cast<vsag::IndexImpl<vsag::HGraph>>(index);
REQUIRE(inner_index != nullptr);

auto allocator = vsag::Engine::CreateDefaultAllocator();
Comment on lines +97 to +98
auto hgraph_analyzer = std::dynamic_pointer_cast<vsag::HGraphAnalyzer>(
vsag::CreateAnalyzer(inner_index->GetInnerIndex().get(), param));
add_library (analyzer OBJECT ${ANALYZER_SRC})

if (ENABLE_TESTS)
file (GLOB ANALYZER_TESTS "*_test.cpp")
- Add missing headers (#include <algorithm>, #include <fmt/format.h>)
- Use fmt::format to avoid hardcoded dimension values
- Rename variables to avoid shadowing (allocator -> analyzer_allocator)
- Consolidate multiple TEST_CASEs into one with SECTIONs for better performance

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
Co-authored-by: Kimi-K2.5 <assistant@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/improvement Code improvements (variable/function renaming, refactoring, etc. ) module/testing size/L version/1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants