Skip to content

GH-49626: [C++][Parquet] Fix encoding fuzzing failure#49627

Open
pitrou wants to merge 1 commit intoapache:mainfrom
pitrou:gh49626-pq-encoding-fuzz
Open

GH-49626: [C++][Parquet] Fix encoding fuzzing failure#49627
pitrou wants to merge 1 commit intoapache:mainfrom
pitrou:gh49626-pq-encoding-fuzz

Conversation

@pitrou
Copy link
Copy Markdown
Member

@pitrou pitrou commented Mar 31, 2026

Rationale for this change

The DecodeArrow decoder functions are expected to always return the number of requested values or error out, but the DELTA_BINARY_PACKED would not always satisfy this guarantee.

Issue found by OSS-Fuzz: https://issues.oss-fuzz.com/issues/492457225

What changes are included in this PR?

  1. Make sure DecodeArrow functions for DELTA_BINARY_PACKED error out if there are not enough values
  2. Improve DecodeArrow docstrings to clarify the meaning of the parameters and return value

Are these changes tested?

Only by fuzz regression file.

Are there any user-facing changes?

No.

The DecodeArrow decoder functions are expected to always return the number of requested values or error out, but the DELTA_BINARY_PACKED would not always satisfy this guarantee.
@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Mar 31, 2026

@github-actions crossbow submit fuzz

@github-actions github-actions bot added the awaiting review Awaiting review label Mar 31, 2026
@github-actions
Copy link
Copy Markdown

Revision: 307a685

Submitted crossbow builds: ursacomputing/crossbow @ actions-4dc12a1f5e

Task Status
test-build-cpp-fuzz GitHub Actions

@pitrou pitrou marked this pull request as ready for review March 31, 2026 17:14
@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Mar 31, 2026

@wgtmac Would you like to review this?

Copy link
Copy Markdown
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this! I've left some minor comments.

}
std::vector<T> values(num_values);
int decoded_count = GetInternal(values.data(), num_values);
if (decoded_count < num_values) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add ARROW_PREDICT_FALSE?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, because 1) this is not in a critical loop and 2) compilers tend to assume that exception-raising paths are unlikely.

By the way, here's an interesting presentation on the topic: https://cppcon.digital-medium.co.uk/wp-content/uploads/2022/09/C20-likely-Attribute-Optimizations-Pessimizations-and-unlikely-Consequences.pdf

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting committer review Awaiting committer review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants