GH-49918: [C++][Parquet] Catch std::vector allocation errors in encoding fuzzer #49919

Merged
pitrou merged 2 commits into apache:main from pitrou:gh49918-pq-fuzz-encoding on May 6, 2026

pitrou commented May 5, 2026

Rationale for this change

The Parquet encoding fuzzer can allocate a std::vector of arbitrary size, which can make the fuzzer run out of memory (OOM).

Issue found by OSS-Fuzz: https://issues.oss-fuzz.com/issues/506741109

What changes are included in this PR?

  1. Use arrow::stl::allocator to delegate std::vector allocations to the fuzzing memory pool
  2. Catch any std::vector allocation exceptions and convert them to regular Status errors
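
The two steps above can be illustrated with a minimal, self-contained sketch. It does not use Arrow: LimitedPool, LimitedAllocator, SimpleStatus and FillValues are hypothetical stand-ins for the fuzzing memory pool, arrow::stl::allocator, Status and the fuzzer code. The actual change may differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a size-limited fuzzing memory pool.
class LimitedPool {
 public:
  explicit LimitedPool(std::size_t limit) : limit_(limit) {}

  void* Allocate(std::size_t nbytes) {
    // Refuse any request that would exceed the configured limit.
    if (nbytes > limit_ - used_) throw std::bad_alloc();
    void* p = ::operator new(nbytes);
    used_ += nbytes;
    return p;
  }

  void Free(void* p, std::size_t nbytes) noexcept {
    assert(used_ >= nbytes);
    used_ -= nbytes;
    ::operator delete(p);
  }

 private:
  std::size_t limit_;
  std::size_t used_ = 0;
};

// Hypothetical allocator that forwards std::vector allocations to the pool.
template <typename T>
class LimitedAllocator {
 public:
  using value_type = T;

  explicit LimitedAllocator(LimitedPool* pool) noexcept : pool_(pool) {}
  template <typename U>
  LimitedAllocator(const LimitedAllocator<U>& other) noexcept
      : pool_(other.pool()) {}

  T* allocate(std::size_t n) {
    // Guard against overflow when computing the byte count.
    if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
    return static_cast<T*>(pool_->Allocate(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t n) noexcept { pool_->Free(p, n * sizeof(T)); }

  LimitedPool* pool() const noexcept { return pool_; }

 private:
  LimitedPool* pool_;
};

template <typename T, typename U>
bool operator==(const LimitedAllocator<T>& a, const LimitedAllocator<U>& b) noexcept {
  return a.pool() == b.pool();
}
template <typename T, typename U>
bool operator!=(const LimitedAllocator<T>& a, const LimitedAllocator<U>& b) noexcept {
  return !(a == b);
}

// Hypothetical status type; it only records success or an error message.
struct SimpleStatus {
  bool ok;
  std::string message;
};

// Sizes a pool-backed vector and turns allocation exceptions into a status.
SimpleStatus FillValues(LimitedPool* pool, std::size_t num_values) {
  try {
    LimitedAllocator<int> alloc(pool);
    std::vector<int, LimitedAllocator<int>> values(alloc);
    values.resize(num_values);  // may throw std::bad_alloc or std::length_error
    return {true, ""};
  } catch (const std::bad_alloc&) {
    return {false, "out of memory"};
  } catch (const std::length_error&) {
    return {false, "requested size too large"};
  }
}
```

With a 1024-byte pool, FillValues(&pool, 16) succeeds, while a request for 1 << 20 values is rejected by the pool and reported as a failed status instead of terminating the process.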

Are these changes tested?

Yes, by a new regression file.

Are there any user-facing changes?

No.

pitrou requested a review from wgtmac as a code owner May 5, 2026 09:20

pitrou commented May 5, 2026

@github-actions crossbow submit -g cpp

github-actions Bot commented May 5, 2026

Revision: 7c9ea2f

Submitted crossbow builds: ursacomputing/crossbow @ actions-679913723d

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-debian-13-cpp-amd64 GitHub Actions
test-debian-13-cpp-i386 GitHub Actions
test-debian-experimental-cpp-gcc-15 GitHub Actions
test-fedora-42-cpp GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

pitrou requested a review from adamreeve May 5, 2026 10:28
  allocator() noexcept : pool_(default_memory_pool()) {}
  /// \brief Construct an allocator from the given MemoryPool
- explicit allocator(MemoryPool* pool) noexcept : pool_(pool) {}
+ allocator(MemoryPool* pool) noexcept : pool_(pool) {}  // NOLINT: runtime/explicit
Member

Why is explicit removed?

Member Author

Because it allows writing PoolVector<c_type> chunk_values(pool()); without having to spell out the allocator instantiation explicitly.
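
To make the reply concrete, here is a small sketch of how a non-explicit converting constructor lets a pool pointer be passed straight to a std::vector constructor. CountingPool, PoolAllocator and BytesUsedFor are hypothetical names chosen for illustration. Only the PoolVector alias name comes from the comment above, and its definition here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-in for a memory pool; it only counts live bytes.
struct CountingPool {
  std::size_t bytes_in_use = 0;
};

// Minimal pool-backed allocator. The constructor taking a pool pointer is
// deliberately not marked explicit, mirroring the change under review.
template <typename T>
class PoolAllocator {
 public:
  using value_type = T;

  PoolAllocator(CountingPool* pool) noexcept : pool_(pool) {}  // implicit on purpose
  template <typename U>
  PoolAllocator(const PoolAllocator<U>& other) noexcept : pool_(other.pool()) {}

  T* allocate(std::size_t n) {
    // Guard against overflow when computing the byte count.
    if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
    T* p = static_cast<T*>(::operator new(n * sizeof(T)));
    pool_->bytes_in_use += n * sizeof(T);
    return p;
  }
  void deallocate(T* p, std::size_t n) noexcept {
    assert(pool_->bytes_in_use >= n * sizeof(T));
    pool_->bytes_in_use -= n * sizeof(T);
    ::operator delete(p);
  }

  CountingPool* pool() const noexcept { return pool_; }

 private:
  CountingPool* pool_;
};

template <typename T, typename U>
bool operator==(const PoolAllocator<T>& a, const PoolAllocator<U>& b) noexcept {
  return a.pool() == b.pool();
}
template <typename T, typename U>
bool operator!=(const PoolAllocator<T>& a, const PoolAllocator<U>& b) noexcept {
  return !(a == b);
}

// Assumed shape of the alias mentioned in the review reply.
template <typename T>
using PoolVector = std::vector<T, PoolAllocator<T>>;

// Returns the bytes the pool reports while a vector of `n` ints is alive.
std::size_t BytesUsedFor(std::size_t n) {
  CountingPool pool;
  // The pool pointer converts implicitly to PoolAllocator<int>.
  PoolVector<int> values(&pool);
  // With an explicit constructor, the allocator would have to be spelled out:
  //   PoolVector<int> values{PoolAllocator<int>(&pool)};
  values.resize(n);
  return pool.bytes_in_use;
}
```

The trade-off is the usual one for implicit conversions: the call site is shorter, but a raw pool pointer is now silently accepted wherever an allocator is expected, which is presumably why the added line carries a lint suppression.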

github-actions Bot added the "awaiting committer review" label and removed the "awaiting review" label May 5, 2026
pitrou merged commit 9d545fb into apache:main May 6, 2026
55 of 57 checks passed
pitrou removed the "awaiting committer review" label May 6, 2026
pitrou deleted the gh49918-pq-fuzz-encoding branch May 6, 2026 07:07