Skip to content

fix(io): check mremap return value to prevent potential crash#1725

Open
LHT129 wants to merge 1 commit intoantgroup:mainfrom
LHT129:2026-03-18-修复-mmapio-mremap-返回值未检查

Hidden character warning

The head ref may contain hidden characters: "2026-03-18-\u4fee\u590d-mmapio-mremap-\u8fd4\u56de\u503c\u672a\u68c0\u67e5"
Open

fix(io): check mremap return value to prevent potential crash#1725
LHT129 wants to merge 1 commit intoantgroup:mainfrom
LHT129:2026-03-18-修复-mmapio-mremap-返回值未检查

Conversation

@LHT129
Copy link
Copy Markdown
Collaborator

@LHT129 LHT129 commented Mar 19, 2026

Summary

Add return value check for mremap calls in MMapIO on Linux platform to prevent potential segmentation faults when memory remapping fails.

Background

In src/io/mmap_io.cpp, the Linux code path has three mremap calls that do not check the return value. When mremap fails, it returns MAP_FAILED. Directly assigning this to this->start_ causes subsequent memory operations to access invalid addresses, leading to segmentation faults.

The macOS branch already correctly checks for MAP_FAILED and throws an exception. This PR aligns Linux error handling with macOS.

Changes

  • Add return value check for mremap in WriteImpl (line 104)
  • Add return value check for mremap in ResizeImpl expansion path (line 142)
  • Add return value check for mremap in ResizeImpl shrink path (line 165)
  • Throw VsagException with meaningful error message using std::system_category() for proper OS error code handling
  • Save errno immediately after syscall failure to avoid accidental overwrite
  • Also fixed open() and mmap() error handling to use std::system_category()

Implementation Details

Why std::system_category() instead of std::generic_category()?

When constructing std::error_code from errno values, std::system_category() is the correct category for OS error codes (like errno from system calls). std::generic_category() is for portable error codes.

Why save errno immediately?

errno can be overwritten by subsequent function calls (including the error message formatting). We save errno into a local variable immediately after the failing syscall.

Files Changed

  • src/io/mmap_io.cpp

Testing

  • ✅ Build: make release passed
  • ✅ Unit tests: All 299 test cases passed

Related Issues

Checklist

  • Code follows VSAG coding style
  • All tests pass
  • PR description is clear

@LHT129 LHT129 self-assigned this Mar 19, 2026
Copilot AI review requested due to automatic review settings March 19, 2026 06:58
@LHT129 LHT129 added kind/bug Something isn't working version/1.0 labels Mar 19, 2026
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the robustness of memory-mapped I/O operations on Linux by introducing critical error checking for mremap system calls. By verifying the return value of mremap and throwing specific exceptions upon failure, the changes prevent potential segmentation faults and improve the overall stability of the application, bringing Linux behavior in line with macOS.

Highlights

  • Error Handling: Added return value checks for mremap calls within the WriteImpl function to prevent potential crashes on Linux.
  • Memory Resizing: Implemented return value checks for mremap during both expansion and shrinking paths in the ResizeImpl function.
  • Exception Handling: Introduced VsagException with detailed error messages, including strerror(errno), to be thrown when mremap operations fail.
  • Platform Alignment: Aligned Linux error handling for mremap with the existing robust error checking present in the macOS implementation.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly addresses a potential crash by adding return value checks for mremap calls on Linux. This prevents assigning MAP_FAILED to the memory pointer, which could lead to segmentation faults. The change is well-implemented and aligns the Linux error handling with the existing macOS implementation. I have one suggestion to improve code maintainability by refactoring the duplicated error handling logic.

Comment on lines +104 to +109
void* new_addr = mremap(this->start_, old_size, new_size, MREMAP_MAYMOVE);
if (new_addr == MAP_FAILED) {
throw VsagException(ErrorType::INTERNAL_ERROR,
fmt::format("mremap failed: {}", strerror(errno)));
}
this->start_ = static_cast<uint8_t*>(new_addr);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This error handling logic for mremap is duplicated in three places in this file (WriteImpl and twice in ResizeImpl). To improve maintainability and reduce code duplication, consider extracting this logic into a private helper function or a free function in an anonymous namespace.

For example:

namespace {
void* checked_mremap(void* old_address, size_t old_size, size_t new_size, int flags) {
    void* new_addr = mremap(old_address, old_size, new_size, flags);
    if (new_addr == MAP_FAILED) {
        throw VsagException(ErrorType::INTERNAL_ERROR,
                            fmt::format("mremap failed: {}", strerror(errno)));
    }
    return new_addr;
}
} // anonymous namespace

Then you could simplify the call sites to:

this->start_ = static_cast<uint8_t*>(checked_mremap(this->start_, old_size, new_size, MREMAP_MAYMOVE));

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Linux-side error handling for mremap in MMapIO to prevent crashes when remapping fails, aligning behavior with the existing macOS path.

Changes:

  • Check mremap return values in WriteImpl and ResizeImpl (grow/shrink) Linux branches
  • Throw VsagException on MAP_FAILED with an errno-derived message

Comment on lines +106 to +107
throw VsagException(ErrorType::INTERNAL_ERROR,
fmt::format("mremap failed: {}", strerror(errno)));
void* new_addr = mremap(this->start_, old_size, new_size, MREMAP_MAYMOVE);
if (new_addr == MAP_FAILED) {
throw VsagException(ErrorType::INTERNAL_ERROR,
fmt::format("mremap failed: {}", strerror(errno)));
@LHT129 LHT129 force-pushed the 2026-03-18-修复-mmapio-mremap-返回值未检查 branch from 27d4d74 to 4922946 Compare March 19, 2026 11:40
@pull-request-size pull-request-size bot added size/M and removed size/S labels Mar 19, 2026
@LHT129
Copy link
Copy Markdown
Collaborator Author

LHT129 commented Mar 19, 2026

Review Response

I've addressed the feedback:

  1. Error message context (line 107): Added old_size, new_size, and errno to all mremap error messages for better debugging. Also added size to mmap error message.

  2. Thread-safe strerror: Replaced strerror(errno) with std::error_code(errno, std::generic_category()).message() which is thread-safe.

All changes are in commit 4922946.

Thanks for the suggestions!

Copilot AI review requested due to automatic review settings March 19, 2026 11:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds robust Linux error handling for mremap in MMapIO to avoid crashes when remapping fails, aligning behavior with the existing macOS checks.

Changes:

  • Add MAP_FAILED checks for Linux mremap calls in WriteImpl and ResizeImpl
  • Improve error messages for open, mmap, and mremap to include errno details via std::error_code

Comment on lines +42 to +45
fmt::format("open file {} failed (errno={}): {}",
this->filepath_,
errno,
std::error_code(errno, std::generic_category()).message()));
Comment on lines +102 to +110
void* new_addr = mremap(this->start_, old_size, new_size, MREMAP_MAYMOVE);
if (new_addr == MAP_FAILED) {
throw VsagException(
ErrorType::INTERNAL_ERROR,
fmt::format("mremap(old_size={}, new_size={}) failed (errno={}): {}",
old_size,
new_size,
errno,
std::error_code(errno, std::generic_category()).message()));
fmt::format("mmap(size={}) failed (errno={}): {}",
mmap_size,
errno,
std::error_code(errno, std::generic_category()).message()));
Comment on lines +140 to +149
void* new_addr = mremap(this->start_, old_size, new_size, MREMAP_MAYMOVE);
if (new_addr == MAP_FAILED) {
throw VsagException(
ErrorType::INTERNAL_ERROR,
fmt::format("mremap(old_size={}, new_size={}) failed (errno={}): {}",
old_size,
new_size,
errno,
std::error_code(errno, std::generic_category()).message()));
}
@LHT129 LHT129 force-pushed the 2026-03-18-修复-mmapio-mremap-返回值未检查 branch from ad11b5b to 83a87eb Compare March 20, 2026 02:58
Copy link
Copy Markdown
Collaborator Author

@LHT129 LHT129 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gemini-code-assist 感谢详细的 review feedback。

我已经修复了所有指出的问题:

  1. std::system_category(): 将所有 std::generic_category() 替换为 std::system_category(),因为对于系统调用返回的 errno,system_category 是正确的错误类别。

  2. 立即保存 errno: 在所有系统调用失败后,立即将 errno 保存到局部变量 saved_errno,避免被后续操作覆盖。

  3. PR 描述更新: 已更新 PR 描述,明确说明使用 std::system_category() 和立即保存 errno 的设计。

关于错误处理逻辑重复的问题,虽然代码模式相似,但每处调用的上下文和错误信息略有不同(如不同的 old_size/new_size 值),提取为通用函数可能会增加复杂度。当前实现已经清晰且易于理解。

请查看更新后的代码和 PR 描述,如有进一步建议欢迎提出。

Copilot AI review requested due to automatic review settings March 20, 2026 05:46
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens MMapIO’s Linux mremap paths by checking for MAP_FAILED and surfacing syscall failures via VsagException, aligning behavior with the existing macOS error-handling approach to prevent crashes on remap failure.

Changes:

  • Add mremap return-value checks for Linux in WriteImpl and both expand/shrink branches of ResizeImpl.
  • Improve open() and mmap() failure messages by capturing errno immediately and formatting via std::error_code(..., std::system_category()).
  • Add <system_error> include to support the new error-code formatting.

Comment on lines 39 to +44
this->fd_ = open(filepath_.c_str(), O_CREAT | O_RDWR, 0644);
if (this->fd_ < 0) {
throw VsagException(ErrorType::INTERNAL_ERROR,
fmt::format("open file {} error {}", this->filepath_, strerror(errno)));
auto saved_errno = errno;
throw VsagException(
ErrorType::INTERNAL_ERROR,
fmt::format("open file {} failed (errno={}): {}",
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errno is used in this constructor (and in the new mremap checks), but this file does not include <cerrno>/<errno.h>. Relying on transitive includes for errno is non-portable and can break builds on some platforms/toolchains; please include <cerrno> explicitly.

Copilot uses AI. Check for mistakes.
Comment on lines 57 to +61
void* addr = mmap(nullptr, mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, this->fd_, 0);
if (addr == MAP_FAILED) {
throw VsagException(ErrorType::INTERNAL_ERROR,
fmt::format("mmap failed: {}", strerror(errno)));
auto saved_errno = errno;
throw VsagException(
ErrorType::INTERNAL_ERROR,
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If mmap fails in the constructor, the file descriptor opened earlier is leaked because the constructor throws before ~MMapIO() can run. Consider closing fd_ (and setting it to -1) before throwing, or using an RAII wrapper for the descriptor so it is reliably released on constructor failure paths.

Copilot uses AI. Check for mistakes.
@LHT129 LHT129 force-pushed the 2026-03-18-修复-mmapio-mremap-返回值未检查 branch from 25af136 to 3b4cc24 Compare March 20, 2026 05:51
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 20, 2026
- Add #include <cerrno> for explicit errno usage
- Extract ThrowMremapError() helper to eliminate duplicated code
- Fix fd leak when mmap/ftruncate fails in constructor

Addresses review feedback from PR antgroup#1725:
- gemini-code-assist: extract mremap error handling to helper function
- Copilot: include <cerrno> explicitly, close fd on error paths

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
Copilot AI review requested due to automatic review settings March 20, 2026 08:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment on lines +71 to 72
close(this->fd_);
throw VsagException(ErrorType::INTERNAL_ERROR, "ftruncate failed");
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new open()/mmap() error paths include errno and an OS-provided message, but the ftruncate failure still throws a generic "ftruncate failed" without the error code/message. Capture errno before close() and include it (and ideally the target mmap_size) in the exception message using std::error_code(saved_errno, std::system_category()).message() for consistent and actionable diagnostics.

Suggested change
close(this->fd_);
throw VsagException(ErrorType::INTERNAL_ERROR, "ftruncate failed");
auto saved_errno = errno;
close(this->fd_);
throw VsagException(
ErrorType::INTERNAL_ERROR,
fmt::format("ftruncate(size={}) failed (errno={}): {}",
mmap_size,
saved_errno,
std::error_code(saved_errno, std::system_category()).message()));

Copilot uses AI. Check for mistakes.
Comment on lines +35 to +43
ThrowMremapError(uint64_t old_size, uint64_t new_size) {
const int saved_errno = errno;
throw VsagException(
ErrorType::INTERNAL_ERROR,
fmt::format("mremap(old_size={}, new_size={}) failed (errno={}): {}",
old_size,
new_size,
saved_errno,
std::error_code(saved_errno, std::system_category()).message()));
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ThrowMremapError implicitly relies on global errno. To better match the stated intent of “save errno immediately after syscall failure” and make the helper robust if any code is inserted between the failed syscall and the throw, capture errno at the call site and pass it into ThrowMremapError (e.g., ThrowMremapError(old_size, new_size, saved_errno)). This avoids any accidental errno clobbering from future changes before the helper reads it.

Copilot uses AI. Check for mistakes.
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 23, 2026
- Add #include <cerrno> for explicit errno usage
- Extract ThrowMremapError() helper to eliminate duplicated code
- Fix fd leak when mmap/ftruncate fails in constructor

Addresses review feedback from PR antgroup#1725:
- gemini-code-assist: extract mremap error handling to helper function
- Copilot: include <cerrno> explicitly, close fd on error paths

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
@LHT129 LHT129 force-pushed the 2026-03-18-修复-mmapio-mremap-返回值未检查 branch from ad3323b to 725f883 Compare March 23, 2026 03:18
Copilot AI review requested due to automatic review settings March 23, 2026 06:09
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

Comment on lines +125 to +129
void* new_addr = mremap(this->start_, old_size, new_size, MREMAP_MAYMOVE);
if (new_addr == MAP_FAILED) {
throw_mremap_error(old_size, new_size);
}
this->start_ = static_cast<uint8_t*>(new_addr);
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says errno is saved “immediately after syscall failure”, but for mremap() it’s currently captured inside throw_mremap_error() rather than at the failure site. Even though the call is currently immediate, it’s fragile if any code is added between the failure check and the throw later. Consider saving errno in the caller right after mremap() returns MAP_FAILED, and pass the saved value into throw_mremap_error(...) so the intended invariant is explicit and matches the description.

Copilot uses AI. Check for mistakes.
LHT129 added a commit to LHT129/vsag that referenced this pull request Mar 26, 2026
- Add #include <cerrno> for explicit errno usage
- Extract ThrowMremapError() helper to eliminate duplicated code
- Fix fd leak when mmap/ftruncate fails in constructor

Addresses review feedback from PR antgroup#1725:
- gemini-code-assist: extract mremap error handling to helper function
- Copilot: include <cerrno> explicitly, close fd on error paths

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: Kimi-K2.5 <assistant@example.com>
@LHT129 LHT129 force-pushed the 2026-03-18-修复-mmapio-mremap-返回值未检查 branch from a726e0b to d255e4a Compare March 26, 2026 07:15
Add proper return value check for mremap syscall in MMapIO.
Extract error handling into ThrowMremapError helper function.

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>
@LHT129 LHT129 force-pushed the 2026-03-18-修复-mmapio-mremap-返回值未检查 branch from d255e4a to c880cf0 Compare March 26, 2026 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working size/M version/1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Check mremap return value to prevent potential crash in MMapIO

2 participants