-
Notifications
You must be signed in to change notification settings - Fork 76
feat: add PendingUpdate interface for table changes #334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add PendingUpdate<T> template class that provides a builder pattern API for constructing and committing table metadata changes. This interface follows the Java PendingUpdate design with C++ idioms: - Apply() returns Result<T> with uncommitted changes for validation - Commit() returns Status and atomically commits changes to the table - Template parameter T allows type-safe results (Snapshot, Schema, etc.) The interface is designed for future implementations like AppendFiles, UpdateSchema, UpdateProperties, and other table update operations.
| /// | ||
| /// \return Status::OK if the commit was successful, or an error status if | ||
| /// validation failed or the commit encountered conflicts | ||
| virtual Status Commit() = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define explicit Status codes for commit failure categories (validation failed, conflicts, commit-state-unknown) so callers can distinguish the same cases Java documents ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have already added some. If not, we can extend it: https://github.com/apache/iceberg-cpp/blob/main/src/iceberg/result.h#L31. Is this what you expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My meaning is that the status code returned by the Commit interface is in what scenario, and if the status code is clearly named, it can follow the definition in result.h.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense
| /// | ||
| /// \tparam T The type of result returned by Apply() | ||
| template <typename T> | ||
| class ICEBERG_EXPORT PendingUpdate { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both PendingUpdateBase/PendingUpdateTyped<T> and PendingUpdate<T> have advantages and disadvantages for the caller. Does iceberg-java have scenarios where PendingUpdate is used directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to use the same approach in the Transaction, we need to make PendingUpdate non-templated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we need to collect a list of PendingUpdate, we need PendingUpdate without template parameters. We can obtain the specific implementation class through type promotion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, make sense
bb42e85 to
b840e7a
Compare
Deleted member functions should be public for better error messages and to follow C++ best practices. Matches the pattern used in TableMetadataBuilder.
3c74603 to
9f8b0da
Compare
Jinchul81
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't forget to add unit tests for PendingUpdate in the follow up PRs.
| PendingUpdate() = default; | ||
|
|
||
| // Non-copyable, movable | ||
| PendingUpdate(const PendingUpdate&) = delete; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move it to the public space.
|
|
||
| // Non-copyable, movable | ||
| PendingUpdate(const PendingUpdate&) = delete; | ||
| PendingUpdate& operator=(const PendingUpdate&) = delete; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move it to the public space.
src/iceberg/pending_update.h
Outdated
|
|
||
| namespace iceberg { | ||
|
|
||
| /// \brief Base class for table metadata changes using builder pattern |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't prefer you to delete the context that you're using builder pattern. This is the reason why the ctors of this class are in the protected region, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
Just added |
50d119f to
55044b3
Compare
src/iceberg/pending_update.h
Outdated
| /// | ||
| /// This matches the Java Iceberg pattern where BaseTransaction stores a | ||
| /// List<PendingUpdate> without type parameters. | ||
| class ICEBERG_EXPORT PendingUpdateBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| class ICEBERG_EXPORT PendingUpdateBase { | |
| class ICEBERG_EXPORT PendingUpdate { |
Should we use PendingUpdate as the base name? std::vector<std::unique_ptr<PendingUpdate>> looks nicer than std::vector<std::unique_ptr<PendingUpdateBase>> IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
src/iceberg/pending_update.h
Outdated
| /// | ||
| /// \tparam T The type of result returned by Apply() | ||
| template <typename T> | ||
| class ICEBERG_EXPORT PendingUpdate : public PendingUpdateBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can rename it to BasePendingUpdate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename it ti PendingUpdateTyped
e1de414 to
4852fcb
Compare
Added comprehensive unit tests for the PendingUpdate interface including: - Basic Apply() success and validation failure scenarios - Commit() success and failure scenarios - Base class polymorphism to verify PendingUpdateBase works correctly These tests verify the interface compiles correctly and can be properly implemented by subclasses.
The builder pattern context explains why constructors are protected - subclasses should be created through their builder methods, not directly instantiated.
Renamed: - PendingUpdateBase -> PendingUpdate (non-template base class) - PendingUpdate<T> -> PendingUpdateTyped<T> (template class) This makes the API more intuitive: std::vector<std::unique_ptr<PendingUpdate>> is cleaner than std::vector<std::unique_ptr<PendingUpdateBase>>
4852fcb to
4d560bd
Compare
wgtmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM. I've left some minor comments. Thanks @shangxinli!
@gty404 @Jinchul81 Do you have more comments?
src/iceberg/result.h
Outdated
| kDecompressError, | ||
| kInvalid, // For general invalid errors | ||
| kInvalidArgument, | ||
| kValidationFailed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please sort it alphabetically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
src/iceberg/result.h
Outdated
| DEFINE_ERROR_FUNCTION(DecompressError) | ||
| DEFINE_ERROR_FUNCTION(Invalid) | ||
| DEFINE_ERROR_FUNCTION(InvalidArgument) | ||
| DEFINE_ERROR_FUNCTION(ValidationFailed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
src/iceberg/test/CMakeLists.txt
Outdated
| schema_json_test.cc | ||
| table_metadata_builder_test.cc) | ||
| table_metadata_builder_test.cc | ||
| pending_update_test.cc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These files are unfortunately not sorted alphabetically. Could you help fix it as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
HeartLinked
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I only have a minor question and comment.
| if (should_fail_commit_) { | ||
| return CommitFailed("Mock commit failed"); | ||
| } | ||
| apply_called_ = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove apply_called_ = true here? I don't think we should assume that Apply has already been called by the user when Commit is invoked.I haven't seen any content in the comments that requires users to guarantee this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
- Move kValidationFailed to the end of ErrorKind enum after kUnknownError - Reorder kInvalidSchema to come after kInvalidManifestList - Sort DEFINE_ERROR_FUNCTION calls to match enum order - Sort table_test source files alphabetically in CMakeLists.txt - Fix apply_called_ tracking to be set in Apply() method instead of Commit()

Add PendingUpdate template class that provides a builder pattern API for constructing and committing table metadata changes. This interface follows the Java PendingUpdate design with C++ idioms:
The interface is designed for future implementations like AppendFiles, UpdateSchema, UpdateProperties, and other table update operations.