Skip to content

GH-32381: [C++] Improve error handling for hash table merges#49512

Open
kris-gaudel wants to merge 3 commits intoapache:mainfrom
kris-gaudel:kris-gaudel/hash-table-merge-errors
Open

GH-32381: [C++] Improve error handling for hash table merges#49512
kris-gaudel wants to merge 3 commits intoapache:mainfrom
kris-gaudel:kris-gaudel/hash-table-merge-errors

Conversation

@kris-gaudel
Copy link

@kris-gaudel kris-gaudel commented Mar 14, 2026

Rationale for this change

Fixes #32381

What changes are included in this PR?

Proper error handling in util/hashing.h and unit test for said error handling in util/hashing_test.cc

Are these changes tested?

Yes, via included unit test

Are there any user-facing changes?

Yes, error handling in hashing utilities

@github-actions
Copy link

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@kris-gaudel kris-gaudel changed the title Kris gaudel/hash table merge errors GH-32381: [C++] Improve error handling for hash table merges Mar 14, 2026
Copy link
Contributor

@zanmato1984 zanmato1984 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — the fix direction looks right to me overall, but I think there are two follow-ups worth addressing inline.

ASSERT_EQ(table.size(), map.size());
}

TEST(ScalarMemoTable, MergeTablePropagatesInsertError) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR changes both ScalarMemoTable::MergeTable() and BinaryMemoTable::MergeTable(), could we add a matching regression test for the binary path too? Right now the new coverage only protects the scalar merge case.

Status status = Status::OK();

other_hashtable.VisitEntries([this](const HashTableEntry* other_entry) {
other_hashtable.VisitEntries([this, &status](const HashTableEntry* other_entry) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have VisitEntries(...) propagate callback failures, so this can just be RETURN_NOT_OK(other_hashtable.VisitEntries(...))? That would avoid the manual Status status plumbing here. If changing VisitEntries is too broad, a status-aware variant/helper could work too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[C++] Improve error handling for hash table merges

2 participants