Skip to content

fix(hgraph): add null check in GetVectorByInnerId to prevent crash during Tune#1761

Open
LHT129 wants to merge 1 commit intoantgroup:mainfrom
LHT129:2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃

Hidden character warning

The head ref may contain hidden characters: "2026-03-26-\u4fee\u590d-hgraph-tune-\u65f6-getvectorbyinnerid-\u5d29\u6e83"
Open

fix(hgraph): add null check in GetVectorByInnerId to prevent crash during Tune#1761
LHT129 wants to merge 1 commit intoantgroup:mainfrom
LHT129:2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃

Conversation

@LHT129
Copy link
Copy Markdown
Collaborator

@LHT129 LHT129 commented Mar 26, 2026

Summary

This PR fixes a crash bug in HGraph::GetVectorByInnerId when Tune transitions from no-reorder to reorder mode. The crash occurs because use_reorder_ is set to true before high_precise_codes_ is populated, leading to a null pointer dereference.

Changes

  • Added null check for high_precise_codes_ before using it in GetVectorByInnerId
  • Added null check for raw_vector_ before using it in GetVectorByInnerId
  • Added unit test for Tune with ignore_reorder scenario

Root Cause

When Tune transitions from no-reorder to reorder mode:

  1. use_reorder_ is set to true (line 576)
  2. GetVectorByInnerId is called to collect training data (line 594)
  3. GetVectorByInnerId uses use_reorder_ to select codes (line 2072)
  4. Since high_precise_codes_ is still nullptr, accessing it causes SIGSEGV

Testing

  • Added new test case: "(PR) HGraph Tune with ignore_reorder"
  • All existing HGraph PR tests pass

Related Issues

Fixes #1760

Checklist

  • Code follows VSAG coding style
  • All tests pass
  • PR description is clear

Copilot AI review requested due to automatic review settings March 26, 2026 07:33
@LHT129 LHT129 self-assigned this Mar 26, 2026
@LHT129 LHT129 added kind/bug Something isn't working version/1.0 labels Mar 26, 2026
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical segmentation fault in the HGraph algorithm. The issue stemmed from an incorrect sequence of operations during the Tune process, where the system attempted to use reordered codes before they were properly initialized, leading to a crash. The solution ensures that data access is safe by verifying pointer validity, thereby enhancing the robustness of the HGraph tuning mechanism.

Highlights

  • Crash Fix: Addressed a crash bug in HGraph::GetVectorByInnerId that occurred during Tune when transitioning from no-reorder to reorder mode, caused by accessing an uninitialized high_precise_codes_.
  • Null Checks: Implemented null checks for high_precise_codes_ and raw_vector_ within GetVectorByInnerId to prevent null pointer dereferences.
  • Unit Testing: Introduced a new unit test case specifically for HGraph Tune with ignore_reorder to validate the fix and ensure stability in this scenario.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a SIGSEGV crash in HGraph::Tune by adding null checks for high_precise_codes_ and raw_vector_ within the HGraph::GetVectorByInnerId function. This prevents dereferencing null pointers when use_reorder_ is true but high_precise_codes_ has not yet been initialized. A new test case, HGraph Tune with ignore_reorder, has been added to validate the fix. The reviewer suggests improving the readability of the codes selection logic in GetVectorByInnerId by using an if-else if-else structure and recommends applying similar null checks to CalcDistanceById to prevent potential crashes in related code paths.

Comment on lines +2072 to +2074
auto codes = (use_reorder_ and high_precise_codes_ != nullptr) ? high_precise_codes_
: basic_flatten_codes_;
codes = (create_new_raw_vector_ and raw_vector_ != nullptr) ? raw_vector_ : codes;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While the added null checks fix the crash, the logic for selecting codes can be made more readable and less error-prone by avoiding the overwrite pattern. A single if-else if-else block would clarify the priority of which codes implementation to use.

This makes the code easier to understand and maintain.

Additionally, similar potentially unsafe logic exists in CalcDistanceById (lines 1486-1492 and 1500-1506) where high_precise_codes_ or raw_vector_ could be assigned without a null check. It would be beneficial to apply a similar fix in those locations as well to prevent potential crashes.

    FlattenInterfacePtr codes;
    if (create_new_raw_vector_ and raw_vector_ != nullptr) {
        codes = raw_vector_;
    } else if (use_reorder_ and high_precise_codes_ != nullptr) {
        codes = high_precise_codes_;
    } else {
        codes = basic_flatten_codes_;
    }

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a crash in HGraph::GetVectorByInnerId during Tune when transitioning to reorder mode before high_precise_codes_ is initialized, and adds a regression test for the ignore_reorder Tune scenario.

Changes:

  • Add nullptr-aware selection of code storage in GetVectorByInnerId to avoid dereferencing high_precise_codes_ / raw_vector_ when unset.
  • Add a unit test covering Tune with use_reorder=true + ignore_reorder=true.
  • Add planning/task markdown files documenting the fix approach.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/algorithm/hgraph.cpp Prevents null dereference by falling back to basic codes/raw vectors only when available.
tests/test_hgraph.cpp Adds a regression test for Tune with ignore_reorder and validates search behavior afterwards.
plan.md Documents analysis and implementation plan for the crash fix.
TASK.md Points to a local task file path used during development.
Comments suppressed due to low confidence (1)

TASK.md:1

  • This file hard-codes an absolute local filesystem path, which is not portable and can leak developer environment details into the repo. Suggest removing TASK.md from the PR, or replacing it with a repo-relative reference (or a short description) that works for all contributors.

Comment on lines +998 to +1000
auto origin_size = vsag::Options::Instance().block_size_limit();
auto size = 1024 * 1024 * 2;
vsag::Options::Instance().set_block_size_limit(size);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_block_size_limit(origin_size) will not run if a REQUIRE(...) fails (Catch2 aborts the test case immediately), which can leak global state into subsequent tests and cause cascading failures. Use an RAII guard (e.g., a small local struct with destructor, or a scope-exit helper) to restore block_size_limit regardless of assertion outcomes.

Copilot uses AI. Check for mistakes.
Comment on lines +1032 to +1034
auto tune_result = index->Tune(param2, true);
REQUIRE(tune_result.has_value());
REQUIRE(tune_result.value());
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_block_size_limit(origin_size) will not run if a REQUIRE(...) fails (Catch2 aborts the test case immediately), which can leak global state into subsequent tests and cause cascading failures. Use an RAII guard (e.g., a small local struct with destructor, or a scope-exit helper) to restore block_size_limit regardless of assertion outcomes.

Copilot uses AI. Check for mistakes.
auto search_param = fmt::format(fixtures::search_param_tmp, 200, false);
HGraphTestIndex::TestGeneral(index, dataset, search_param, 0.7);

vsag::Options::Instance().set_block_size_limit(origin_size);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_block_size_limit(origin_size) will not run if a REQUIRE(...) fails (Catch2 aborts the test case immediately), which can leak global state into subsequent tests and cause cascading failures. Use an RAII guard (e.g., a small local struct with destructor, or a scope-exit helper) to restore block_size_limit regardless of assertion outcomes.

Copilot uses AI. Check for mistakes.
plan.md Outdated
- hgraph.cpp:520-630 (Tune 函数)
- hgraph.cpp:2071-2080 (GetVectorByInnerId 函数)
- **关联Issue**: https://github.com/antgroup/vsag/issues/1760
- **原始任务**: /home/tianlan.lht/code/workspace/agent-hive/tasks/in_progress/2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃.md No newline at end of file
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plan.md includes an absolute local path under “原始任务”, which has the same portability and information disclosure concerns as TASK.md. Remove the absolute path or replace it with a repo-relative location (or omit it entirely).

Suggested change
- **原始任务**: /home/tianlan.lht/code/workspace/agent-hive/tasks/in_progress/2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃.md
- **原始任务**: tasks/in_progress/2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃.md

Copilot uses AI. Check for mistakes.
Comment on lines +2072 to +2074
auto codes = (use_reorder_ and high_precise_codes_ != nullptr) ? high_precise_codes_
: basic_flatten_codes_;
codes = (create_new_raw_vector_ and raw_vector_ != nullptr) ? raw_vector_ : codes;
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code uses the alternative token and instead of the more commonly used && in C++ codebases; if the surrounding file/project consistently uses &&, switching to && here will improve consistency and reduce surprise for readers/tools.

Suggested change
auto codes = (use_reorder_ and high_precise_codes_ != nullptr) ? high_precise_codes_
: basic_flatten_codes_;
codes = (create_new_raw_vector_ and raw_vector_ != nullptr) ? raw_vector_ : codes;
auto codes = (use_reorder_ && high_precise_codes_ != nullptr) ? high_precise_codes_
: basic_flatten_codes_;
codes = (create_new_raw_vector_ && raw_vector_ != nullptr) ? raw_vector_ : codes;

Copilot uses AI. Check for mistakes.
@LHT129 LHT129 force-pushed the 2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃 branch from a66a9ff to bc1ccf6 Compare March 27, 2026 02:47
@pull-request-size pull-request-size bot added size/M and removed size/L labels Mar 27, 2026
Copilot AI review requested due to automatic review settings March 27, 2026 02:51
@LHT129 LHT129 force-pushed the 2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃 branch 2 times, most recently from 2480a76 to 5c881b4 Compare March 27, 2026 02:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comment on lines +998 to +1001
auto origin_size = vsag::Options::Instance().block_size_limit();
auto size = 1024 * 1024 * 2;
vsag::Options::Instance().set_block_size_limit(size);

Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test mutates a global singleton option (block_size_limit) and restores it only at the end of the test. If any REQUIRE(...) before line 1062 fails, the restore line will not run, potentially leaking global state into subsequent tests and causing flaky failures. Prefer an RAII guard (scope-exit / destructor-based restorer) that resets block_size_limit even on early returns/assert failures.

Copilot uses AI. Check for mistakes.
Comment on lines +574 to +577
bool need_enable_reorder = false;
if (not use_reorder_ and inner_parameter->use_reorder) {
// [case 4] assign new precise_code
use_reorder_ = true;
need_enable_reorder = true;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says null checks were added in GetVectorByInnerId for high_precise_codes_ and raw_vector_, but the provided diff only shows deferring use_reorder_ enabling in Tune(...) (plus a test). If null checks are part of the intended fix, they appear to be missing from this change set; otherwise, please update the PR title/description to match the actual fix.

Copilot uses AI. Check for mistakes.
Comment on lines +1060 to +1061
REQUIRE(search_result.value()->GetDim() > 0);

Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The final assertion GetDim() > 0 is very weak for validating the post-tune search path, and it may pass even if the result shape/top-k is incorrect. Consider asserting more specific invariants such as returned topk == 5, NumElements == 1 (if applicable in this API), and/or that returned IDs fall within GetMinAndMaxId() range. This makes the regression test more robust and easier to diagnose when it fails.

Suggested change
REQUIRE(search_result.value()->GetDim() > 0);
auto result = search_result.value();
// Validate that the search result has the expected shape and IDs.
REQUIRE(result->GetDim() == 5);
REQUIRE(result->GetNumElements() == 1);
auto id_range = base_range.value();
auto* ids = result->GetIds();
REQUIRE(ids != nullptr);
for (int64_t i = 0; i < result->GetDim(); ++i) {
REQUIRE(ids[i] >= id_range.first);
REQUIRE(ids[i] <= id_range.second);
}

Copilot uses AI. Check for mistakes.
@LHT129 LHT129 force-pushed the 2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃 branch from 5c881b4 to fead75d Compare March 27, 2026 06:11
@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 27, 2026
Copilot AI review requested due to automatic review settings March 27, 2026 06:12
@LHT129 LHT129 force-pushed the 2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃 branch from fead75d to 2b29f12 Compare March 27, 2026 06:12
- Delay use_reorder_ assignment until after high_precise_codes_ is populated
- Add null check for high_precise_codes_ and raw_vector_ in GetVectorByInnerId
- Add unit test for Tune with ignore_reorder scenario
- Sync param->use_reorder when enabling reorder in Tune

Fixes antgroup#1760

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
@LHT129 LHT129 force-pushed the 2026-03-26-修复-hgraph-tune-时-getvectorbyinnerid-崩溃 branch from 2b29f12 to af7f4cc Compare March 27, 2026 06:14
@pull-request-size pull-request-size bot added size/M and removed size/L labels Mar 27, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment on lines 574 to +623
@@ -616,6 +617,11 @@ HGraph::Tune(const std::string& parameters, bool disable_future_tuning) {
high_precise_codes_ =
tune_and_rebuild(is_tune_precise_code, high_precise_codes_, new_precise_code);

if (need_enable_reorder) {
use_reorder_ = true;
param->use_reorder = true;
}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions adding null checks in HGraph::GetVectorByInnerId for high_precise_codes_/raw_vector_, but the change here only defers setting use_reorder_ until after high_precise_codes_ is rebuilt. If the intent is defensive hardening beyond this Tune transition, please either (a) add the promised null checks in GetVectorByInnerId (and define the expected fallback/error behavior), or (b) update the PR description to match the actual fix.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

crash when Tune index to rabitq

2 participants