
feat(dataset): add FP16/BF16 input vector support #1731

Open
LHT129 wants to merge 1 commit into antgroup:main from LHT129:2026-03-20-支持-fp16-bf16-输入向量格式

Conversation

LHT129 (Collaborator) commented Mar 20, 2026

Summary

Add support for FP16 and BF16 input vector formats to the VSAG library, enabling direct indexing of half-precision vectors without application-level conversion. Since each vector component takes 2 bytes instead of 4, this halves input-vector memory usage compared to FP32.

Changes

Infrastructure

  • Add DATATYPE_BFLOAT16 and FLOAT16_VECTORS constants in constants.h/cpp
  • Add DATA_TYPE_BF16 enum value and ToString support in data_type.h
  • Add Float16Vectors/GetFloat16Vectors Dataset API (shared for FP16/BF16)
  • Extend DatasetImpl variant to support uint16_t* data type
  • Support float16/bfloat16 dtype parameters in index_common_param.cpp
  • Extend QuantizerAdapter to support uint16_t data type with FP16/BF16 conversion

HGraph Index Support

  • Add get_query_data() function to handle query vectors (always FP32)
  • Add get_data() support for FP16/BF16 base vectors
  • Fix build_by_odescent to use correct pointer arithmetic for 16-bit data

IVF Index Support

  • Add get_data() function to handle FP16/BF16 data retrieval
  • Add convert_to_fp32() helper function for partition strategy
  • Modify Train() and Add() to convert FP16/BF16 data for partition classification
  • Fix factory_router_index to use FP32 data type for router index creation

Technical Details

  • Data carrier: uint16_t* (shared for FP16 and BF16)
  • Index support: HGraph, IVF
  • Distance metrics: L2, IP, Cosine
  • Query vectors: Always FP32
  • Quantizer: Wrapper layer converts FP16/BF16 to FP32 before processing
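As a concrete illustration of the data-carrier convention above, here is a minimal usage sketch modeled on the functional tests in this PR. The Dataset builder calls (Make, NumElements, Dim, Ids, Float16Vectors, Owner) are the API added or used by this PR; the surrounding function is illustrative only.

#include <cstdint>
#include <vector>
#include "vsag/dataset.h"

// Build a dataset whose vectors are raw FP16 bit patterns carried as uint16_t.
// The same carrier is used for BF16; the index's "dtype" parameter
// ("float16" or "bfloat16") decides how the bits are interpreted.
vsag::DatasetPtr
make_half_dataset(int64_t dim, int64_t count,
                  const std::vector<uint16_t>& half_bits,
                  const std::vector<int64_t>& ids) {
    auto dataset = vsag::Dataset::Make();
    dataset->NumElements(count)
        ->Dim(dim)
        ->Ids(ids.data())
        ->Float16Vectors(half_bits.data())
        ->Owner(false);  // caller keeps ownership; buffers must outlive the dataset
    return dataset;      // can then be passed to index->Build(dataset) / index->Add(dataset)
}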

Testing

  • Dataset unit tests: 98330 assertions ✅
  • FP16 functional tests: 12 assertions in 4 test cases ✅
  • BF16 functional tests: 6 assertions in 2 test cases ✅
  • Code format check passes with clang-format-15 ✅

Related Issues

Checklist

  • Code follows VSAG coding style
  • All tests pass
  • PR description is clear

Copilot AI review requested due to automatic review settings March 20, 2026 04:46
@LHT129 LHT129 self-assigned this Mar 20, 2026
gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the VSAG library by integrating native support for FP16 and BF16 input vector formats. This allows for more memory-efficient storage and processing of vector data, particularly beneficial for large-scale applications. The changes span across core components, including the Dataset API, data type definitions, and the quantization pipeline, ensuring seamless handling of half-precision vectors from input to index processing.

Highlights

  • Half-Precision Vector Support: Added comprehensive support for FP16 (half-precision float) and BF16 (bfloat16) input vector formats within the VSAG library, enabling direct processing of these data types.
  • Memory Efficiency: Enabled a 50% reduction in memory usage for half-precision vectors compared to FP32, as they can now be directly indexed without application-level conversion.
  • Dataset API Extension: Introduced new Float16Vectors and GetFloat16Vectors APIs to the Dataset class, allowing uint16_t* to serve as the data carrier for both FP16 and BF16 vectors.
  • Quantizer Adapter Enhancement: Extended the QuantizerAdapter to support uint16_t data types, including logic for converting FP16/BF16 to FP32 before internal quantization processing.
  • Configuration and Constants: Added new constants (DATATYPE_BFLOAT16, FLOAT16_VECTORS) and updated data type enumeration (DATA_TYPE_BF16) to integrate the new formats into the system's configuration.


@mergify mergify bot added the module/api label Mar 20, 2026
gemini-code-assist bot left a comment


Code Review

The pull request introduces support for FP16 and BF16 input vector formats by adding new constants, extending the Dataset API to handle uint16_t vectors, and updating the QuantizerAdapter to process these new data types, including conversions between uint16_t (FP16/BF16) and float. The review highlights a critical bug in the QuantizerAdapter's DecodeOneImpl (and implicitly DecodeBatchImpl): the conversion from float back to uint16_t for FP16/BF16 is incorrect and can corrupt data, so dedicated conversion functions should be used instead. It also recommends refactoring the duplicated uint16_t-to-float conversion logic across multiple QuantizerAdapter methods into a shared helper function to improve maintainability.

Comment on lines +143 to +151
} else if constexpr (std::is_same_v<DataT, uint16_t>) {
Vector<DataType> vec(this->dim_, this->allocator_);
if (!this->inner_quantizer_->DecodeOneImpl(codes, vec.data())) {
return false;
}
for (int64_t i = 0; i < this->dim_; i++) {
reinterpret_cast<DataT*>(data)[i] = static_cast<DataT>(std::round(vec[i]));
}
return true;

Severity: high

The conversion from the decoded float vector back to uint16_t (for FP16/BF16) is incorrect. static_cast<DataT>(std::round(vec[i])) does not correctly convert a float to the FP16 or BF16 bit-level representation. This will result in corrupted data upon decoding. You should use the FloatToFP16() and FloatToBF16() utility functions based on the data_type_ member to perform the correct conversion.

Suggested change (- current, + proposed):
- } else if constexpr (std::is_same_v<DataT, uint16_t>) {
- Vector<DataType> vec(this->dim_, this->allocator_);
- if (!this->inner_quantizer_->DecodeOneImpl(codes, vec.data())) {
- return false;
- }
- for (int64_t i = 0; i < this->dim_; i++) {
- reinterpret_cast<DataT*>(data)[i] = static_cast<DataT>(std::round(vec[i]));
- }
- return true;
+ } else if constexpr (std::is_same_v<DataT, uint16_t>) {
+ Vector<DataType> vec(this->dim_, this->allocator_);
+ if (!this->inner_quantizer_->DecodeOneImpl(codes, vec.data())) {
+ return false;
+ }
+ auto* out_data = reinterpret_cast<uint16_t*>(data);
+ for (int64_t i = 0; i < this->dim_; i++) {
+ if (data_type_ == DataTypes::DATA_TYPE_FP16) {
+ out_data[i] = FloatToFP16(vec[i]);
+ } else {
+ out_data[i] = FloatToBF16(vec[i]);
+ }
+ }
+ return true;
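To see why the static_cast is wrong at the bit level: BF16 is by definition the top 16 bits of an IEEE-754 binary32 value, so the round trip is pure bit manipulation. A self-contained sketch follows; the names mirror, but are not, VSAG's BF16ToFloat/FloatToBF16 utilities, and NaN handling is omitted. FP16 additionally requires exponent re-biasing, which is exactly why dedicated conversion functions are needed.

#include <cstdint>
#include <cstring>

// BF16 -> FP32: place the 16 stored bits in the high half of a binary32.
inline float
BF16ToFloatSketch(uint16_t v) {
    uint32_t bits = static_cast<uint32_t>(v) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// FP32 -> BF16: keep the high 16 bits, rounding half to even.
inline uint16_t
FloatToBF16Sketch(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<uint16_t>(bits >> 16);
}

// Contrast: static_cast<uint16_t>(std::round(1.5f)) yields the integer 2
// (0x0002), while FloatToBF16Sketch(1.5f) yields the bit pattern 0x3FC0.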

Comment on lines +53 to +63
} else if constexpr (std::is_same_v<DataT, uint16_t>) {
const auto* data_fp16 = reinterpret_cast<const uint16_t*>(data);
Vector<DataType> vec(this->dim_ * count, this->allocator_);
for (int64_t i = 0; i < this->dim_ * count; ++i) {
if (data_type_ == DataTypes::DATA_TYPE_FP16) {
vec[i] = FP16ToFloat(data_fp16[i]);
} else {
vec[i] = BF16ToFloat(data_fp16[i]);
}
}
return this->inner_quantizer_->TrainImpl(vec.data(), count);

Severity: medium

The logic to convert uint16_t data (representing FP16/BF16) to float is duplicated in TrainImpl, EncodeOneImpl, EncodeBatchImpl, and ProcessQueryImpl. To improve maintainability and reduce redundancy, consider extracting this conversion logic into a private helper function. For example, you could create a function void convert_from_half(const uint16_t* src, float* dst, int64_t size) that encapsulates this loop.
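A sketch of the helper this comment proposes, lifted from the conversion loop quoted above (convert_from_half is the reviewer's suggested name, not existing code):

// Shared FP16/BF16 -> FP32 conversion, factored out of TrainImpl,
// EncodeOneImpl, EncodeBatchImpl, and ProcessQueryImpl.
void
convert_from_half(const uint16_t* src, float* dst, int64_t size) const {
    for (int64_t i = 0; i < size; ++i) {
        dst[i] = (data_type_ == DataTypes::DATA_TYPE_FP16) ? FP16ToFloat(src[i])
                                                           : BF16ToFloat(src[i]);
    }
}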

Copilot AI left a comment

Pull request overview

Adds support for ingesting FP16/BF16 (stored as uint16_t) vectors into VSAG datasets and quantization flow, avoiding app-level conversion to FP32.

Changes:

  • Extends dataset APIs and internal storage to carry FP16/BF16 vectors via uint16_t*.
  • Adds BF16 datatype plumbing (DATATYPE_BFLOAT16, enum value, parsing and ToString).
  • Extends QuantizerAdapter to accept uint16_t inputs and convert FP16/BF16 → FP32 for inner quantizer operations.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/quantization/quantizer_adapter.h Allows uint16_t as adapter input type and stores runtime data_type_.
src/quantization/quantizer_adapter.cpp Implements FP16/BF16 → FP32 conversion for train/encode/query processing and instantiates adapter for uint16_t.
src/index_common_param.cpp Parses "float16" / "bfloat16" dtype strings and updates validation error list.
src/dataset_impl.h Adds uint16_t* to variant and implements Float16Vectors getters/setters.
src/dataset_impl.cpp Adds deep copy / append / destructor handling for float16 vectors.
src/data_type.h Adds DATA_TYPE_BF16 and ToString mapping.
src/constants.cpp Defines FLOAT16_VECTORS and DATATYPE_BFLOAT16.
include/vsag/dataset.h Adds public Dataset API for Float16Vectors / GetFloat16Vectors.
include/vsag/constants.h Declares new constants for BF16 and float16 vectors key.
plan.md Implementation plan document added.
TASK.md Adds a local absolute path reference (likely accidental).
Comments suppressed due to low confidence (1)

TASK.md:1

  • This commits a machine-local absolute path into the repository, which can leak internal environment details and is not usable for other developers/CI. Recommended: remove TASK.md from the PR, or replace it with a repo-relative link (or an issue/PR reference) that works across environments.

Comment on lines +143 to +151
} else if constexpr (std::is_same_v<DataT, uint16_t>) {
Vector<DataType> vec(this->dim_, this->allocator_);
if (!this->inner_quantizer_->DecodeOneImpl(codes, vec.data())) {
return false;
}
for (int64_t i = 0; i < this->dim_; i++) {
reinterpret_cast<DataT*>(data)[i] = static_cast<DataT>(std::round(vec[i]));
}
return true;
Copilot AI Mar 20, 2026

For the uint16_t path, DecodeOneImpl/DecodeBatchImpl currently rounds floats and casts to uint16_t, which produces integer magnitudes (0–65535) rather than FP16/BF16 bit patterns. If these APIs are intended to return FP16/BF16 vectors as uint16_t storage, this needs float→FP16 / float→BF16 conversion (selected by data_type_). If decode is not meant to support FP16/BF16 output, consider disallowing Decode* for uint16_t (return false / throw) to avoid silently returning invalid data.

Comment on lines +174 to +182
} else if constexpr (std::is_same_v<DataT, uint16_t>) {
Vector<DataType> vec(this->dim_ * count, this->allocator_);
if (!this->inner_quantizer_->DecodeBatchImpl(codes, vec.data(), count)) {
return false;
}
for (int64_t i = 0; i < this->dim_ * count; i++) {
reinterpret_cast<DataT*>(data)[i] = static_cast<DataT>(std::round(vec[i]));
}
return true;
Copilot AI Mar 20, 2026

For the uint16_t path, DecodeOneImpl/DecodeBatchImpl currently rounds floats and casts to uint16_t, which produces integer magnitudes (0–65535) rather than FP16/BF16 bit patterns. If these APIs are intended to return FP16/BF16 vectors as uint16_t storage, this needs float→FP16 / float→BF16 conversion (selected by data_type_). If decode is not meant to support FP16/BF16 output, consider disallowing Decode* for uint16_t (return false / throw) to avoid silently returning invalid data.

Comment on lines +220 to +231
} else if constexpr (std::is_same_v<DataT, uint16_t>) {
const auto* query_fp16 = reinterpret_cast<const uint16_t*>(query);
Vector<DataType> vec(this->dim_, this->allocator_);
for (int64_t i = 0; i < this->dim_; i++) {
if (data_type_ == DataTypes::DATA_TYPE_FP16) {
vec[i] = FP16ToFloat(query_fp16[i]);
} else {
vec[i] = BF16ToFloat(query_fp16[i]);
}
}
auto& inner_computer = reinterpret_cast<Computer<QuantT>&>(computer);
this->inner_quantizer_->ProcessQueryImpl(vec.data(), inner_computer);
Copilot AI Mar 20, 2026

PR description states query vectors are always FP32, but the uint16_t specialization reinterprets query as uint16_t* and converts it as FP16/BF16. If queries are indeed FP32, this will misinterpret FP32 memory and produce incorrect results/UB. Suggested fix: for the uint16_t adapter, keep ProcessQueryImpl consuming FP32 (forward query directly to inner_quantizer_ without reinterpret/copy), and only convert for base vectors (train/encode). If the intent is to support FP16/BF16 queries as well, the PR description should be updated and the API should make the query dtype explicit.

Comment on lines +57 to +61
if (data_type_ == DataTypes::DATA_TYPE_FP16) {
vec[i] = FP16ToFloat(data_fp16[i]);
} else {
vec[i] = BF16ToFloat(data_fp16[i]);
}
Copilot AI Mar 20, 2026

In the uint16_t conversion path, the else branch treats any non-FP16 data_type_ as BF16. If data_type_ is accidentally DATA_TYPE_FLOAT, DATA_TYPE_INT8, etc., this will silently decode the payload as BF16. Consider making the branch explicit (if FP16 ... else if BF16 ... else error) to fail fast and avoid hard-to-debug incorrect indexing.

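An explicit version of that branch, as the comment suggests. This is a sketch: the fail-fast path reuses ErrorType::INVALID_ARGUMENT from elsewhere in this codebase, and the message text is illustrative.

if (data_type_ == DataTypes::DATA_TYPE_FP16) {
    vec[i] = FP16ToFloat(data_fp16[i]);
} else if (data_type_ == DataTypes::DATA_TYPE_BF16) {
    vec[i] = BF16ToFloat(data_fp16[i]);
} else {
    // fail fast instead of silently decoding an unexpected dtype as BF16
    throw VsagException(ErrorType::INVALID_ARGUMENT,
                        "QuantizerAdapter<uint16_t> expects FP16 or BF16 data");
}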
Comment on lines +44 to 51
fmt::format("parameters[{}] must in [{}, {}, {}, {}, {}], now is {}",
PARAMETER_DTYPE,
DATATYPE_FLOAT32,
DATATYPE_FLOAT16,
DATATYPE_BFLOAT16,
DATATYPE_INT8,
DATATYPE_SPARSE,
datatype));
Copilot AI Mar 20, 2026

The validation message is grammatically incorrect/unclear: "must in" → "must be in" (or "must be one of"). Updating this improves usability when users pass an unsupported dtype.

Comment on lines +248 to +259
* @brief Sets the float16/bfloat16 vector array for the dataset.
*
* @param vectors Pointer to the array of float16/bfloat16 vectors (uint16_t*).
* @return DatasetPtr A shared pointer to the dataset with updated vectors.
*/
virtual DatasetPtr
Float16Vectors(const uint16_t* vectors) = 0;

/**
* @brief Retrieves the float16/bfloat16 vector array of the dataset.
*
* @return const uint16_t* Pointer to the array of float16/bfloat16 vectors.
Copilot AI Mar 20, 2026

The API name Float16Vectors suggests FP16 only, but the comment says it also carries BF16. Since this is new public API, consider either (a) renaming to a format-neutral name (e.g., HalfVectors/Uint16Vectors) or (b) explicitly documenting that these are raw FP16 or BF16 bit-patterns and that the interpretation is controlled by dtype/DataTypes elsewhere. Without that clarification, consumers may pass BF16 data but assume FP16 semantics (or vice versa).

Suggested change (- current, + proposed):
- * @brief Sets the float16/bfloat16 vector array for the dataset.
- *
- * @param vectors Pointer to the array of float16/bfloat16 vectors (uint16_t*).
- * @return DatasetPtr A shared pointer to the dataset with updated vectors.
- */
- virtual DatasetPtr
- Float16Vectors(const uint16_t* vectors) = 0;
- /**
- * @brief Retrieves the float16/bfloat16 vector array of the dataset.
- *
- * @return const uint16_t* Pointer to the array of float16/bfloat16 vectors.
+ * @brief Sets the 16-bit floating-point vector array for the dataset.
+ *
+ * @param vectors Pointer to the array of 16-bit floating-point encodings (uint16_t*).
+ *
+ * @details
+ * The memory pointed to by @p vectors is treated as an array of raw 16-bit
+ * floating-point bit patterns (either IEEE 754 half-precision FP16 or
+ * bfloat16 BF16). This method does not itself distinguish between FP16
+ * and BF16; the actual interpretation of these values is controlled by
+ * the dataset's data-type configuration (for example via a dtype/DataTypes
+ * setting elsewhere in the API).
+ *
+ * @return DatasetPtr A shared pointer to the dataset with updated vectors.
+ */
+ virtual DatasetPtr
+ Float16Vectors(const uint16_t* vectors) = 0;
+ /**
+ * @brief Retrieves the 16-bit floating-point vector array of the dataset.
+ *
+ * @return const uint16_t* Pointer to the array of raw 16-bit floating-point
+ * bit patterns (FP16 or BF16), whose interpretation is determined by the
+ * dataset's data-type configuration (e.g., dtype/DataTypes).

Copilot AI review requested due to automatic review settings March 20, 2026 06:41
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch 2 times, most recently from 1b45efd to 40b229c on March 20, 2026 06:43
Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 7 comments.

Comments suppressed due to low confidence (1)

TASK.md:1

  • This file hard-codes an absolute local filesystem path that won’t exist for other developers/CI and is not useful as repository content. Suggest removing TASK.md from the PR (or replacing it with a repo-relative link / brief task description).

Comment on lines +26 to +30
CreateFP16Dataset(int64_t dim, int64_t count, const std::vector<float>& fp32_data) {
std::vector<uint16_t> fp16_data(dim * count);
for (int64_t i = 0; i < dim * count; ++i) {
fp16_data[i] = generic::FloatToFP16(fp32_data[i]);
}
Copilot AI Mar 20, 2026

CreateFP16Dataset (and the BF16 equivalent) returns a Dataset that points at fp16_data owned by a local std::vector. With Owner(false), the dataset will retain a dangling pointer after the function returns, leading to use-after-free during Build()/Add(). Fix: ensure the vector memory outlives the dataset (e.g., allocate buffers on heap and set Owner(true) appropriately, or return a struct that holds both the dataset and backing storage).

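One way to fix the lifetime issue: return the backing storage together with the dataset so Owner(false) no longer leaves a dangling pointer. The holder struct and function shape below are illustrative; the Dataset builder calls and generic::FloatToFP16 are from the quoted helper. Moving the holder is safe because moving a std::vector does not relocate its heap buffer.

#include <cstdint>
#include <numeric>
#include <vector>

struct FP16DatasetHolder {
    std::vector<int64_t> ids;
    std::vector<uint16_t> fp16_data;
    vsag::DatasetPtr dataset;
};

inline FP16DatasetHolder
CreateFP16Dataset(int64_t dim, int64_t count, const std::vector<float>& fp32_data) {
    FP16DatasetHolder holder;
    holder.ids.resize(count);
    std::iota(holder.ids.begin(), holder.ids.end(), 0);
    holder.fp16_data.resize(dim * count);
    for (int64_t i = 0; i < dim * count; ++i) {
        holder.fp16_data[i] = generic::FloatToFP16(fp32_data[i]);
    }
    holder.dataset = vsag::Dataset::Make();
    holder.dataset->NumElements(count)
        ->Dim(dim)
        ->Ids(holder.ids.data())
        ->Float16Vectors(holder.fp16_data.data())
        ->Owner(false);  // safe now: the holder owns both buffers
    return holder;
}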
Comment on lines +37 to +43
auto dataset = Dataset::Make();
dataset->NumElements(count)
->Dim(dim)
->Ids(ids.data())
->Float16Vectors(fp16_data.data())
->Owner(false);
return dataset;
Copilot AI Mar 20, 2026

CreateFP16Dataset (and the BF16 equivalent) returns a Dataset that points at fp16_data owned by a local std::vector. With Owner(false), the dataset will retain a dangling pointer after the function returns, leading to use-after-free during Build()/Add(). Fix: ensure the vector memory outlives the dataset (e.g., allocate buffers on heap and set Owner(true) appropriately, or return a struct that holds both the dataset and backing storage).

Comment on lines +316 to +324
vsag::Vector<uint16_t> sampled_fp16_buffer(sample_count * dim, safe_allocator);
for (int64_t i = 0; i < sample_count; ++i) {
std::copy(original_fp16_data + sampled_indices[i] * dim,
original_fp16_data + (sampled_indices[i] + 1) * dim,
sampled_fp16_buffer.data() + i * dim);
}
auto* new_data_buffer =
static_cast<uint16_t*>(safe_allocator->Allocate(sample_count * dim * sizeof(uint16_t)));
std::copy(sampled_fp16_buffer.begin(), sampled_fp16_buffer.end(), new_data_buffer);
Copilot AI Mar 20, 2026

This does an extra full-buffer copy: first into sampled_fp16_buffer, then into new_data_buffer. You can copy directly into new_data_buffer inside the loop, avoiding an additional allocation and memory pass (same applies to the FP32 branch below). This matters because sampling can run on large training sets.

Suggested change (- current, + proposed):
- vsag::Vector<uint16_t> sampled_fp16_buffer(sample_count * dim, safe_allocator);
- for (int64_t i = 0; i < sample_count; ++i) {
- std::copy(original_fp16_data + sampled_indices[i] * dim,
- original_fp16_data + (sampled_indices[i] + 1) * dim,
- sampled_fp16_buffer.data() + i * dim);
- }
- auto* new_data_buffer =
- static_cast<uint16_t*>(safe_allocator->Allocate(sample_count * dim * sizeof(uint16_t)));
- std::copy(sampled_fp16_buffer.begin(), sampled_fp16_buffer.end(), new_data_buffer);
+ auto* new_data_buffer =
+ static_cast<uint16_t*>(safe_allocator->Allocate(sample_count * dim * sizeof(uint16_t)));
+ for (int64_t i = 0; i < sample_count; ++i) {
+ std::copy(original_fp16_data + sampled_indices[i] * dim,
+ original_fp16_data + (sampled_indices[i] + 1) * dim,
+ new_data_buffer + i * dim);
+ }

// limitations under the License.

#include <catch2/catch_test_macros.hpp>
#include <cstdint>
Copilot AI Mar 20, 2026

This uses std::abs on floating-point values but doesn’t include <cmath>. Depending on platform/standard library, this can fail to compile or pick an unintended overload. Add #include <cmath>.

Suggested change (- current, + proposed):
- #include <cstdint>
+ #include <cstdint>
+ #include <cmath>


uint16_t fp16 = generic::FloatToFP16(test_val);
float back_fp16 = generic::FP16ToFloat(fp16);
REQUIRE(std::abs(back_fp16 - test_val) < 0.01f);
Copilot AI Mar 20, 2026

This uses std::abs on floating-point values but doesn’t include <cmath>. Depending on platform/standard library, this can fail to compile or pick an unintended overload. Add #include <cmath>.


uint16_t bf16 = generic::FloatToBF16(test_val);
float back_bf16 = generic::BF16ToFloat(bf16);
REQUIRE(std::abs(back_bf16 - test_val) < 0.01f);
Copilot AI Mar 20, 2026

This uses std::abs on floating-point values but doesn’t include <cmath>. Depending on platform/standard library, this can fail to compile or pick an unintended overload. Add #include <cmath>.

} else {
throw VsagException(ErrorType::INVALID_ARGUMENT,
fmt::format("parameters[{}] must in [{}, {}, {}], now is {}",
fmt::format("parameters[{}] must in [{}, {}, {}, {}, {}], now is {}",
Copilot AI Mar 20, 2026

The new validation message is grammatically unclear ('must in'). Consider changing it to something like 'must be one of' to make the error easier to understand for API consumers.

@LHT129 LHT129 added the kind/feature New feature or request label Mar 20, 2026
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 20, 2026
The DecodeOneImpl and DecodeBatchImpl for uint16_t path incorrectly
used static_cast<uint16_t>(std::round(float)), which produces integer
magnitudes (0-65535) instead of FP16/BF16 bit patterns.

Fixed by using FloatToFP16/FloatToBF16 for proper conversion based on
data_type_.

Addresses review feedback from gemini-code-assist and Copilot on PR antgroup#1731.

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
Copilot AI review requested due to automatic review settings March 20, 2026 08:53
Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 9 comments.

Comments suppressed due to low confidence (1)

TASK.md:1

  • This file (and similar content in plan.md) embeds an absolute local filesystem path that likely won’t exist for other developers and may leak internal environment details. Suggested fix: remove TASK.md from the repo (or replace it with a relative link / issue reference), and avoid committing machine-specific absolute paths in documentation.

Comment on lines 57 to +64
FP16Quantizer<metric>::EncodeOneImpl(const DataType* data, uint8_t* codes) const {
auto* codes_fp16 = reinterpret_cast<uint16_t*>(codes);

if (input_data_type_ == DataTypes::DATA_TYPE_FP16) {
const auto* input_fp16 = reinterpret_cast<const uint16_t*>(data);
std::memcpy(codes_fp16, input_fp16, this->dim_ * sizeof(uint16_t));
return true;
}
Copilot AI Mar 20, 2026

The FP16 fast-path copies raw FP16 inputs directly into codes, bypassing cosine normalization entirely (and any other preprocessing in the float path). This will produce incorrect results for METRIC_TYPE_COSINE, where the encoded representation must reflect a normalized vector. Suggested fix: only take the memcpy fast-path when metric != METRIC_TYPE_COSINE (and when the input is known to already be in the expected encoded space), otherwise convert FP16→FP32, normalize if needed, then encode.

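A sketch of the gated fast path this comment describes. Since metric is a template parameter of the quoted quantizer, the guard can be evaluated at compile time; everything else reuses names from the snippet above, and the control flow is illustrative rather than the shipped fix.

if (input_data_type_ == DataTypes::DATA_TYPE_FP16) {
    if constexpr (metric != MetricType::METRIC_TYPE_COSINE) {
        // raw copy is only valid when no normalization is required
        std::memcpy(codes_fp16,
                    reinterpret_cast<const uint16_t*>(data),
                    this->dim_ * sizeof(uint16_t));
        return true;
    }
    // cosine: fall through to convert FP16 -> FP32, Normalize(), then
    // encode the normalized values back into FP16 codes
}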
Comment on lines 66 to 72
const DataType* cur = data;
Vector<float> tmp(this->allocator_);
if constexpr (metric == MetricType::METRIC_TYPE_COSINE) {
tmp.resize(this->dim_);
Normalize(data, tmp.data(), this->dim_);
cur = tmp.data();
}
Copilot AI Mar 20, 2026

When input_data_type_ is not FP16, this code assumes data points to FP32 (DataType) and will normalize/dereference it as such. If the dataset dtype is BF16 but the quantization type is configured as FP16 (i.e., FP16Quantizer constructed with common_param.data_type_ == DATA_TYPE_BF16), data will actually point to uint16_t storage and this path will read invalid floats. Suggested fix: explicitly handle DATA_TYPE_BF16 here by reinterpreting data as uint16_t*, converting BF16→FP32 into a temp buffer (and normalizing if cosine), then encoding to FP16 codes.

Comment on lines 57 to +64
BF16Quantizer<metric>::EncodeOneImpl(const DataType* data, uint8_t* codes) const {
auto* codes_bf16 = reinterpret_cast<uint16_t*>(codes);

if (input_data_type_ == DataTypes::DATA_TYPE_BF16) {
const auto* input_bf16 = reinterpret_cast<const uint16_t*>(data);
std::memcpy(codes_bf16, input_bf16, this->dim_ * sizeof(uint16_t));
return true;
}
Copilot AI Mar 20, 2026

Same issue as FP16Quantizer: this BF16 memcpy fast-path bypasses cosine normalization, which will yield incorrect encoded vectors for METRIC_TYPE_COSINE. Suggested fix: gate the memcpy optimization to non-cosine metrics (or only when inputs are guaranteed normalized), and otherwise convert to FP32, normalize if needed, then encode to BF16.

Comment on lines +412 to +416
// Use original data format for bucket quantizer
void* data_ptr = nullptr;
uint64_t data_size = 0;
get_vectors(this->data_type_, dim, train_data, &data_ptr, &data_size);
this->bucket_->Train(reinterpret_cast<const float*>(data_ptr), sample_count);
Copilot AI Mar 20, 2026

For FP16/BF16 datasets get_vectors() returns a uint16_t* buffer, but this code passes it to bucket_->Train() by reinterpreting it as const float*. Unless bucket_->Train() is explicitly designed to treat the pointer as untyped storage (and never dereference as float), this will break training (wrong reads / UB). Suggested fix: either (1) pass a real FP32 buffer (e.g., the fp32_train_data already created above) to bucket_->Train(), or (2) update the bucket quantizer interface to accept (const void* data, DataTypes dtype, ...) (or (const uint8_t* data, size_t stride, ...)) so FP16/BF16 can be handled without type-punning.

Suggested change (- current, + proposed):
- // Use original data format for bucket quantizer
- void* data_ptr = nullptr;
- uint64_t data_size = 0;
- get_vectors(this->data_type_, dim, train_data, &data_ptr, &data_size);
- this->bucket_->Train(reinterpret_cast<const float*>(data_ptr), sample_count);
+ // Use FP32 data for bucket quantizer training to avoid type-punning issues
+ this->bucket_->Train(fp32_train_data.data(), sample_count);


#include <catch2/catch_test_macros.hpp>
#include <cstdint>
#include <random>
Copilot AI Mar 20, 2026

This new test file uses std::vector, std::copy, and std::abs but does not include the corresponding standard headers (<vector>, <algorithm>, <cmath>). Relying on transitive includes can cause non-deterministic build failures across toolchains; please add the direct includes.

Suggested change (- current, + proposed):
- #include <random>
+ #include <random>
+ #include <vector>
+ #include <algorithm>
+ #include <cmath>

*vectors_ptr = (void*)base->GetFloat16Vectors();
*data_size_ptr = dim * sizeof(uint16_t);
} else {
throw std::invalid_argument(fmt::format("no support for this metric: {}", (int)type));
Copilot AI Mar 20, 2026

This error is thrown from get_vectors(DataTypes type, ...), but the message says "metric" instead of "type". Updating it to "no support for this type" (and ideally printing ToString(type) if available) would make debugging dtype issues clearer.

Suggested change (- current, + proposed):
- throw std::invalid_argument(fmt::format("no support for this metric: {}", (int)type));
+ throw std::invalid_argument(fmt::format("no support for this type: {}", (int)type));

Comment on lines +38 to +40
: FP16Quantizer<metric>(common_param.dim_, common_param.allocator_.get()) {
this->input_data_type_ = common_param.data_type_;
};
Copilot AI Mar 20, 2026

The trailing semicolon after the constructor definition (};) is extraneous at namespace scope and should be removed to match typical C++ style and avoid confusion.

Comment on lines +38 to +40
: BF16Quantizer<metric>(common_param.dim_, common_param.allocator_.get()) {
this->input_data_type_ = common_param.data_type_;
};
Copilot AI Mar 20, 2026

Same as FP16Quantizer: the constructor ends with an extraneous trailing semicolon (};). Please remove it for consistency and readability.

Comment on lines +113 to +115
TEST_CASE("HGraph with FP16 Test", "[ft][hgraph][fp16]") {
constexpr int64_t TEST_DIM = 32;
constexpr int64_t TEST_COUNT = 100;
Copilot AI Mar 20, 2026

The new FP16/BF16 functional tests exercise only "metric_type": "l2", but the PR description claims FP16/BF16 support for L2/IP/Cosine. Consider adding at least one FP16 or BF16 functional test that builds and searches with "metric_type": "cosine" (and/or "ip") to cover the normalization-sensitive paths and prevent regressions.

LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 23, 2026
The DecodeOneImpl and DecodeBatchImpl for uint16_t path incorrectly
used static_cast<uint16_t>(std::round(float)), which produces integer
magnitudes (0-65535) instead of FP16/BF16 bit patterns.

Fixed by using FloatToFP16/FloatToBF16 for proper conversion based on
data_type_.

Addresses review feedback from gemini-code-assist and Copilot on PR antgroup#1731.

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch from 76c790e to 53814b1 on March 23, 2026 03:30
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 23, 2026
The DecodeOneImpl and DecodeBatchImpl for uint16_t path incorrectly
used static_cast<uint16_t>(std::round(float)), which produces integer
magnitudes (0-65535) instead of FP16/BF16 bit patterns.

Fixed by using FloatToFP16/FloatToBF16 for proper conversion based on
data_type_.

Addresses review feedback from gemini-code-assist and Copilot on PR antgroup#1731.

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
Copilot AI review requested due to automatic review settings March 23, 2026 11:30
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch from 53814b1 to 3fff8b8 on March 23, 2026 11:30
Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (1)

TASK.md:1

  • This file embeds an absolute local filesystem path (including a username). This is not portable and can leak developer environment details. Prefer removing TASK.md from source control or replacing it with a repo-relative link/issue reference (e.g., Fixes #1730) or a short task description.

Comment on lines +461 to +477
// Get original data pointer for bucket insertion
void* base_vectors = nullptr;
uint64_t base_data_size = 0;
get_vectors(this->data_type_, dim_, base, &base_vectors, &base_data_size);
const auto* raw_vectors = reinterpret_cast<const char*>(base_vectors);

auto add_func = [&](int64_t i) -> void {
for (int64_t j = 0; j < buckets_per_data_; ++j) {
const auto* data_ptr = vectors + i * dim_;
const auto* data_ptr =
raw_vectors + i * dim_ *
((this->data_type_ == DataTypes::DATA_TYPE_FLOAT)
? sizeof(float)
: sizeof(uint16_t));
auto idx = i * buckets_per_data_ + j;
InnerIdType offset_id = bucket_->InsertVector(
data_ptr, buckets[idx], idx + current_num * buckets_per_data_);
InnerIdType offset_id = bucket_->InsertVector(reinterpret_cast<const float*>(data_ptr),
buckets[idx],
idx + current_num * buckets_per_data_);
Copilot AI Mar 23, 2026

Same issue as training: for FP16/BF16 data_ptr is a byte pointer into uint16_t data, but it is passed as const float*. This is prone to misalignment (2-byte aligned data cast to float*) and UB if any code uses it as float before converting. A safer approach is to avoid char* + reinterpret_cast<const float*> and instead compute offsets on the correct element type (const uint16_t* vs const float*) or change InsertVector to accept const void*/const uint8_t* with an element-size/type.

Comment on lines +313 to +315
const auto* fp32_data = data->GetFloat32Vectors();
memcpy(result.data(), fp32_data, num_elements * dim * sizeof(float));
}
Copilot AI Mar 23, 2026

memcpy is used in newly added code without an explicit <cstring> include and without std::memcpy. This can become a portability/build issue depending on transitive includes. Include <cstring> in this file (or use an existing project header that guarantees it) and call std::memcpy.

Comment on lines 124 to 126
} catch (const std::bad_alloc& e) {
throw VsagException(ErrorType::NO_ENOUGH_MEMORY, "bad alloc when init computer buf");
}
Copilot AI Mar 23, 2026

The caught exception variable e is unused and the rethrown VsagException drops the original error detail (e.what()). Either remove the variable name (catch (const std::bad_alloc&)) or include e.what() (or wrap it) in the thrown exception for better diagnostics.

plan.md Outdated
Comment on lines +149 to +150
- **Requirements doc**: `/home/tianlan.lht/code/workspace/fp16_and_bf16_support.md`
- **Task file**: `/home/tianlan.lht/code/workspace/agent-hive/tasks/2026-03-20-支持-fp16-bf16-输入向量格式.md`
Copilot AI Mar 23, 2026

The plan document includes absolute local paths. This is not usable for other contributors/CI and may disclose environment details. Replace these with repo-relative paths (if the docs exist in-repo), links to issues/PRs, or remove the local-path section.

Suggested change (- current, + proposed):
- - **Requirements doc**: `/home/tianlan.lht/code/workspace/fp16_and_bf16_support.md`
- - **Task file**: `/home/tianlan.lht/code/workspace/agent-hive/tasks/2026-03-20-支持-fp16-bf16-输入向量格式.md`
+ - **Requirements doc**: see the project document "FP16/BF16 Support Requirements"
+ - **Task file**: `tasks/2026-03-20-支持-fp16-bf16-输入向量格式.md`

auto param = R"(
{
"dtype": "float16",
"metric_type": "l2",
Copilot AI Mar 23, 2026

The PR description claims FP16/BF16 support for L2, IP, and Cosine, but the newly added functional tests only exercise metric_type: "l2". Add at least one FP16 and one BF16 functional test covering ip and/or cosine (especially cosine, since it exercises normalization paths in quantizers) to prevent regressions.

auto param = R"(
{
"dtype": "bfloat16",
"metric_type": "l2",
Copilot AI Mar 23, 2026

The PR description claims FP16/BF16 support for L2, IP, and Cosine, but the newly added functional tests only exercise metric_type: "l2". Add at least one FP16 and one BF16 functional test covering ip and/or cosine (especially cosine, since it exercises normalization paths in quantizers) to prevent regressions.

auto param = R"(
{
"dtype": "float16",
"metric_type": "l2",
Copilot AI Mar 23, 2026

The PR description claims FP16/BF16 support for L2, IP, and Cosine, but the newly added functional tests only exercise metric_type: "l2". Add at least one FP16 and one BF16 functional test covering ip and/or cosine (especially cosine, since it exercises normalization paths in quantizers) to prevent regressions.

auto param = R"(
{
"dtype": "bfloat16",
"metric_type": "l2",
Copilot AI Mar 23, 2026

The PR description claims FP16/BF16 support for L2, IP, and Cosine, but the newly added functional tests only exercise metric_type: "l2". Add at least one FP16 and one BF16 functional test covering ip and/or cosine (especially cosine, since it exercises normalization paths in quantizers) to prevent regressions.

LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 25, 2026
- Fix get_query_data() to support INT8 query vectors (was returning nullptr)
- Fix FP16/BF16 quantizers to normalize for COSINE metric even with FP16/BF16 input
- Add FP16/BF16 data type handling in flatten_interface.cpp to create QuantizerAdapter<ProductQuantizer, uint16_t>

Fixes:
- Test Example Aarch64 SEGV in FP32ComputeL2Sqr (null query pointer)
- Test Aarch64 functests failures
- Review comment about COSINE normalization bypass in FP16/BF16 fast-path

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 26, 2026
The DecodeOneImpl and DecodeBatchImpl for uint16_t path incorrectly
used static_cast<uint16_t>(std::round(float)), which produces integer
magnitudes (0-65535) instead of FP16/BF16 bit patterns.

Fixed by using FloatToFP16/FloatToBF16 for proper conversion based on
data_type_.

Addresses review feedback from gemini-code-assist and Copilot on PR antgroup#1731.

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 26, 2026
- Fix get_query_data() to support INT8 query vectors (was returning nullptr)
- Fix FP16/BF16 quantizers to normalize for COSINE metric even with FP16/BF16 input
- Add FP16/BF16 data type handling in flatten_interface.cpp to create QuantizerAdapter<ProductQuantizer, uint16_t>

Fixes:
- Test Example Aarch64 SEGV in FP32ComputeL2Sqr (null query pointer)
- Test Aarch64 functests failures
- Review comment about COSINE normalization bypass in FP16/BF16 fast-path

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
Copilot AI review requested due to automatic review settings March 26, 2026 07:19
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch from df15224 to 1f5cfee on March 26, 2026 07:19
Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

TASK.md:1

  • This file contains an absolute path to a local machine, which won’t exist for other developers/CI and may leak internal environment details. Consider removing TASK.md from the repo or replacing it with a repository-relative link (or an issue/PR reference).

"QuantizerAdapter::ProcessQueryImpl only supports int8_t data type");
}
auto& inner_computer = reinterpret_cast<Computer<QuantT>&>(computer);
this->inner_quantizer_->ProcessQueryImpl(query, inner_computer);
Copilot AI Mar 26, 2026

This change removes the int8→FP32 query conversion that existed previously. For QuantizerAdapter<..., int8_t>, callers may still supply int8 query buffers (e.g., HGraph’s get_query_data() returns GetInt8Vectors() for DATA_TYPE_INT8), which would cause inner_quantizer_->ProcessQueryImpl() to interpret int8 bytes as floats. Fix by keeping passthrough only for FP32 queries, but retaining the conversion path when DataT is int8_t (or otherwise ensuring queries are always FP32 for int8 indexes).

Suggested change (- current, + proposed):
- this->inner_quantizer_->ProcessQueryImpl(query, inner_computer);
+ if constexpr (std::is_same_v<DataT, int8_t>) {
+ // For int8 indexes, callers may pass int8 query buffers. Convert to FP32
+ // before forwarding to the inner quantizer, which expects float queries.
+ auto* buf = static_cast<float*>(this->allocator_->Allocate(sizeof(float) * this->dim_));
+ const auto* query_i8 = reinterpret_cast<const int8_t*>(query);
+ for (size_t i = 0; i < this->dim_; ++i) {
+ buf[i] = static_cast<float>(query_i8[i]);
+ }
+ this->inner_quantizer_->ProcessQueryImpl(buf, inner_computer);
+ this->allocator_->Deallocate(buf);
+ } else {
+ // For FP32 (and other supported DataT where query is already correctly typed),
+ // keep the direct passthrough behavior.
+ this->inner_quantizer_->ProcessQueryImpl(query, inner_computer);
+ }

Comment on lines +314 to +315
memcpy(result.data(), fp32_data, num_elements * dim * sizeof(float));
}
Copilot AI Mar 26, 2026

memcpy is used here but this file’s includes (in the provided diff) don’t show <cstring>. To avoid relying on transitive includes and namespace quirks, explicitly include <cstring> and prefer std::memcpy.

Comment on lines +1107 to +1111
if (search_param.ep == INVALID_ENTRY_POINT) {
return make_empty_dataset_with_stats();
}

const auto* raw_query = get_data(query);
const auto* raw_query = get_query_data(query);
Copilot AI Mar 26, 2026

The if statement indentation appears to have been lost in this hunk, which reduces readability and is inconsistent with surrounding style. Please run clang-format (or restore indentation) for this block to keep formatting consistent.

Comment on lines +250 to +254
* @param vectors Pointer to the array of float16/bfloat16 vectors (uint16_t*).
* @return DatasetPtr A shared pointer to the dataset with updated vectors.
*/
virtual DatasetPtr
Float16Vectors(const uint16_t* vectors) = 0;
Copilot AI Mar 26, 2026

The doc comment says the parameter is uint16_t*, but the API takes const uint16_t*. Update the comment to match the signature (and emphasize that the buffer is read-only unless ownership is transferred via Owner(...)).

LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 26, 2026
The DecodeOneImpl and DecodeBatchImpl for uint16_t path incorrectly
used static_cast<uint16_t>(std::round(float)), which produces integer
magnitudes (0-65535) instead of FP16/BF16 bit patterns.

Fixed by using FloatToFP16/FloatToBF16 for proper conversion based on
data_type_.

Addresses review feedback from gemini-code-assist and Copilot on PR antgroup#1731.

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 26, 2026
- Fix get_query_data() to support INT8 query vectors (was returning nullptr)
- Fix FP16/BF16 quantizers to normalize for COSINE metric even with FP16/BF16 input
- Add FP16/BF16 data type handling in flatten_interface.cpp to create QuantizerAdapter<ProductQuantizer, uint16_t>

Fixes:
- Test Example Aarch64 SEGV in FP32ComputeL2Sqr (null query pointer)
- Test Aarch64 functests failures
- Review comment about COSINE normalization bypass in FP16/BF16 fast-path

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch from 1f5cfee to f28f3f1 on March 26, 2026 08:21
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch from f28f3f1 to 17d55a8 on March 26, 2026 08:25
Copilot AI review requested due to automatic review settings March 26, 2026 09:34
Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.

Comment on lines +86 to +100
if (common_param.data_type_ == DataTypes::DATA_TYPE_FP16 ||
common_param.data_type_ == DataTypes::DATA_TYPE_BF16) {
if (quantization_string == QUANTIZATION_TYPE_VALUE_FP16) {
return make_instance_flatten<FP16Quantizer<metric>, IOTemp>(param, common_param);
}
if (quantization_string == QUANTIZATION_TYPE_VALUE_BF16) {
return make_instance_flatten<BF16Quantizer<metric>, IOTemp>(param, common_param);
}
if (quantization_string == QUANTIZATION_TYPE_VALUE_PQ) {
return make_instance_flatten<QuantizerAdapter<ProductQuantizer<metric>, uint16_t>,
IOTemp>(param, common_param);
}
throw VsagException(ErrorType::INVALID_ARGUMENT,
fmt::format("FP16/BF16 data type does not support {} quantization",
quantization_string));
Copilot AI Mar 26, 2026

This logic allows mismatched configurations like dtype=bfloat16 with base_quantization_type=fp16 (and vice versa). In that case FP16Quantizer is constructed with input_data_type_ = BF16, but FP16Quantizer::EncodeOneImpl only treats input as FP16 when input_data_type_ == DATA_TYPE_FP16; otherwise it assumes the input pointer is FP32—leading to incorrect reads of BF16 payload. Fix by either (mandatory) rejecting mismatched dtype vs quantizer type for FP16/BF16 quantizers, or enhancing FP16/BF16 quantizers to correctly decode both DATA_TYPE_FP16 and DATA_TYPE_BF16 inputs before encoding.

Suggested change (- current, + proposed):
- if (common_param.data_type_ == DataTypes::DATA_TYPE_FP16 ||
- common_param.data_type_ == DataTypes::DATA_TYPE_BF16) {
- if (quantization_string == QUANTIZATION_TYPE_VALUE_FP16) {
- return make_instance_flatten<FP16Quantizer<metric>, IOTemp>(param, common_param);
- }
- if (quantization_string == QUANTIZATION_TYPE_VALUE_BF16) {
- return make_instance_flatten<BF16Quantizer<metric>, IOTemp>(param, common_param);
- }
- if (quantization_string == QUANTIZATION_TYPE_VALUE_PQ) {
- return make_instance_flatten<QuantizerAdapter<ProductQuantizer<metric>, uint16_t>,
- IOTemp>(param, common_param);
- }
- throw VsagException(ErrorType::INVALID_ARGUMENT,
- fmt::format("FP16/BF16 data type does not support {} quantization",
- quantization_string));
+ if (common_param.data_type_ == DataTypes::DATA_TYPE_FP16) {
+ if (quantization_string == QUANTIZATION_TYPE_VALUE_FP16) {
+ return make_instance_flatten<FP16Quantizer<metric>, IOTemp>(param, common_param);
+ }
+ if (quantization_string == QUANTIZATION_TYPE_VALUE_PQ) {
+ return make_instance_flatten<QuantizerAdapter<ProductQuantizer<metric>, uint16_t>,
+ IOTemp>(param, common_param);
+ }
+ if (quantization_string == QUANTIZATION_TYPE_VALUE_BF16) {
+ throw VsagException(
+ ErrorType::INVALID_ARGUMENT,
+ fmt::format("FP16 data type does not support {} quantization", quantization_string));
+ }
+ throw VsagException(
+ ErrorType::INVALID_ARGUMENT,
+ fmt::format("FP16 data type does not support {} quantization", quantization_string));
+ }
+ if (common_param.data_type_ == DataTypes::DATA_TYPE_BF16) {
+ if (quantization_string == QUANTIZATION_TYPE_VALUE_BF16) {
+ return make_instance_flatten<BF16Quantizer<metric>, IOTemp>(param, common_param);
+ }
+ if (quantization_string == QUANTIZATION_TYPE_VALUE_PQ) {
+ return make_instance_flatten<QuantizerAdapter<ProductQuantizer<metric>, uint16_t>,
+ IOTemp>(param, common_param);
+ }
+ if (quantization_string == QUANTIZATION_TYPE_VALUE_FP16) {
+ throw VsagException(
+ ErrorType::INVALID_ARGUMENT,
+ fmt::format("BF16 data type does not support {} quantization", quantization_string));
+ }
+ throw VsagException(
+ ErrorType::INVALID_ARGUMENT,
+ fmt::format("BF16 data type does not support {} quantization", quantization_string));

Comment on lines +469 to +473
const auto* data_ptr =
raw_vectors + i * dim_ *
((this->data_type_ == DataTypes::DATA_TYPE_FLOAT)
? sizeof(float)
: sizeof(uint16_t));
Copilot AI Mar 26, 2026

Pointer arithmetic assumes any non-FP32 type uses 16-bit elements, which is incorrect for DATA_TYPE_INT8 (and would compute wrong offsets). Fix by using a switch/lookup for element size based on data_type_ (float=4, int8=1, fp16/bf16=2), or reuse the data_size_ptr/element-size information already computed by get_vectors() to avoid duplicating/incorrectly encoding this mapping.

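A sketch of the element-size lookup this comment asks for; the function name is illustrative, while the enum values and the VsagException pattern are taken from elsewhere in this PR:

static inline uint64_t
element_size_of(DataTypes t) {
    switch (t) {
        case DataTypes::DATA_TYPE_FLOAT:
            return sizeof(float);
        case DataTypes::DATA_TYPE_INT8:
            return sizeof(int8_t);
        case DataTypes::DATA_TYPE_FP16:
        case DataTypes::DATA_TYPE_BF16:
            return sizeof(uint16_t);
        default:
            throw VsagException(ErrorType::INVALID_ARGUMENT,
                                "unsupported data type for dense vectors");
    }
}

// usage: const auto* data_ptr = raw_vectors + i * dim_ * element_size_of(this->data_type_);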
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch 2 times, most recently from 755e822 to 185762a on March 27, 2026 02:57
Copilot AI review requested due to automatic review settings March 27, 2026 06:05
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch from 185762a to 3e517c4 on March 27, 2026 06:05
- Add FP16/BF16 data types (DATA_TYPE_FP16, DATA_TYPE_BF16)
- Extend Dataset API with Float16Vectors/GetFloat16Vectors
- Add FP16Quantizer and BF16Quantizer for half-precision vectors
- Support FP16/BF16 in HGraph and IVF indexes
- Handle SPARSE data type in get_query_data for proper query processing
- Add QuantizerAdapter support for uint16_t data type
- Add FP16/BF16 HGraph examples

Fixes antgroup#1730

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
Co-authored-by: Kimi-K2.5 <assistant@example.com>
@LHT129 LHT129 force-pushed the 2026-03-20-支持-fp16-bf16-输入向量格式 branch from 3e517c4 to 1430589 on March 27, 2026 06:09
Copilot AI left a comment

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.

Comment on lines +44 to +45
inline const void*
get_data(const DatasetPtr& data, int64_t index) const;
Copilot AI Mar 27, 2026

The new private method IVF::get_data(...) is declared/defined but not used anywhere in IVF (only the definition exists). This adds dead code and can confuse future maintenance; either use it where you currently compute raw_vectors + ... or remove it until needed.

Suggested change (the suggestion removes these lines):
- inline const void*
- get_data(const DatasetPtr& data, int64_t index) const;

Comment on lines +642 to +647
size_t element_size = sizeof(float);
if (data_type_ == DataTypes::DATA_TYPE_FP16 || data_type_ == DataTypes::DATA_TYPE_BF16) {
element_size = sizeof(uint16_t);
} else if (data_type_ == DataTypes::DATA_TYPE_INT8) {
element_size = sizeof(int8_t);
}
Copilot AI Mar 27, 2026

This introduces size_t element_size in new code. Repo guidance asks to prefer uint64_t over size_t in code changes to avoid potential macOS compile issues (CLAUDE.md:103). Consider switching element_size to uint64_t (or reusing an existing uint64_t size variable) here.

} else if (type == DataTypes::DATA_TYPE_INT8) {
base->Int8Vectors((int8_t*)vectors_ptr)->Dim(dim)->Owner(false)->NumElements(num_element);
} else if (type == DataTypes::DATA_TYPE_FP16 || type == DataTypes::DATA_TYPE_BF16) {
base->Float16Vectors((uint16_t*)vectors_ptr)
Copilot AI Mar 27, 2026

set_dataset casts vectors_ptr to uint16_t* when calling Float16Vectors(...), which discards const from the input pointer. Since the Dataset API takes const uint16_t*, prefer a static_cast<const uint16_t*> (or equivalent) to keep const-correctness and avoid accidental writes through this pointer later.

Suggested change (- current, + proposed):
- base->Float16Vectors((uint16_t*)vectors_ptr)
+ base->Float16Vectors(static_cast<const uint16_t*>(vectors_ptr))



Development

Successfully merging this pull request may close these issues.

[Feature] Add FP16 and BF16 input vector format support
