feat(table): Add commit pipeline with SnapshotCommit abstraction#233
Merged
JingsongLi merged 3 commits intoapache:mainfrom Apr 10, 2026
Merged
feat(table): Add commit pipeline with SnapshotCommit abstraction#233JingsongLi merged 3 commits intoapache:mainfrom
JingsongLi merged 3 commits intoapache:mainfrom
Conversation
982f180 to
6b120a5
Compare
Implement the table write and commit infrastructure, including: - SnapshotCommit trait with RenamingSnapshotCommit (filesystem) and RESTSnapshotCommit (REST catalog) implementations - TableCommit with retry logic, append/overwrite/truncate support, partition statistics generation, and row tracking - WriteBuilder as the entry point for creating TableCommit instances, with overwrite mode configured at construction time - RESTEnv to hold REST catalog context (API client, identifier, uuid) - CommitMessage, PartitionStatistics, and ManifestList types - SnapshotManager extensions for atomic snapshot commit and latest hint - BinaryRow write_datum and datums_to_binary_row utilities - CoreOptions accessors for bucket, commit retry, and row tracking Reference: pypaimon write/commit pipeline Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jerry-024
reviewed
Apr 10, 2026
jerry-024
left a comment
There was a problem hiding this comment.
commit_snapshot rename path silently overwrites on local filesystem
opendal's rename uses POSIX semantics — it silently overwrites the target if it already exists. Java Paimon's FileIO.rename() returns false when the target exists, and additionally wraps the operation in lock.runWithLock(() -> !fileIO.exists(newSnapshotPath) && callable.call()).
paimon-rust does neither check. On local filesystem with concurrent writers:
- Writer A writes temp-A, renames to
snapshot-1→ succeeds - Writer B writes temp-B, renames to
snapshot-1→ also succeeds (POSIX overwrite) - Both return
Ok(true),try_commitstops retrying for both - Writer A's snapshot content is gone, its manifest lists are orphaned — silent data loss
Suggested fix: Add an exists() check before rename, matching Java's pattern:
// Check before rename to avoid silent overwrite (opendal uses POSIX rename semantics)
if self.file_io.exists(&target_path).await? {
let _ = self.file_io.delete_file(&tmp_path).await;
return Ok(false);
}
match self.file_io.rename(&tmp_path, &target_path).await {
...
}This narrows the race window significantly. For full mutual exclusion on backends without atomic rename, an external lock mechanism (like Java's Lock interface) would be needed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Purpose
Subtask of #232
Implement the table write and commit infrastructure, including:
Reference: pypaimon write/commit pipeline
Brief change log
Tests
API and Format
Documentation