Feature: Support TRUNCATE TABLE for Iceberg engine by il9ue · Pull Request #1529 · Altinity/ClickHouse

il9ue · 2026-03-16T05:41:59Z

Changelog category (leave one):

New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Add TRUNCATE TABLE support for the Iceberg database engine with REST catalog backends.

CI/CD Options

Exclude tests:

Regression jobs to run:

This commit introduces native support for the TRUNCATE TABLE command for the Iceberg database engine. Execution no longer throws a NOT_IMPLEMENTED exception for DataLake engines. To align with Iceberg's architectural standards, this is a metadata-only operation. It creates a new snapshot with an explicitly generated, strictly typed empty Avro manifest list, increments the metadata version, and performs an atomic catalog update. File changes: - StorageObjectStorage.cpp: Remove hardcoded exception, delegate to data_lake_metadata->truncate(). - IDataLakeMetadata.h: Introduce supportsTruncate() and truncate() virtual methods. - IcebergMetadata.h/cpp: Implement the Iceberg-specific metadata truncation, empty manifest list generation via MetadataGenerator, and atomic catalog swap. - tests/integration/: Add PyIceberg integration tests. - tests/queries/0_stateless/: Add SQL stateless tests.

github-actions · 2026-03-16T05:43:08Z

Workflow [PR], commit [2155e32]

arthurpassos

I haven't implemented anything for Iceberg yet, tho it is in my todo list. I left two small comments for now.

I also looked at the transactional model and it looks ok (assuming I understood it correctly).

My understanding of the Iceberg + catalog transactional model is that updating the catalog is the commit marker, and if it fails, the transaction isn't complete even if the new metadata files were already uploaded. Those become orphan and must be ignored. This also implies an Iceberg table should always be read through a catalog if one exists, otherwise it becomes hard to determine the latest metadata snapshot.

I'll read the code more carefully later.

arthurpassos · 2026-03-16T16:18:22Z

src/Storages/ObjectStorage/StorageObjectStorage.cpp

    {
-        throw Exception(ErrorCodes::NOT_IMPLEMENTED,
-                        "Truncate is not supported for data lake engine");
+        if (isDataLake())


Isn't isDataLake() the same as the above configuration->isDataLakeconfiguration()? see

ClickHouse/src/Storages/ObjectStorage/StorageObjectStorage.h

Line 115 in b806ac5

bool isDataLake() const override { return configuration->isDataLakeConfiguration(); }

arthurpassos · 2026-03-16T16:19:53Z

src/Storages/ObjectStorage/DataLakes/IDataLakeMetadata.h

    virtual bool supportsParallelInsert() const { return false; }

-    virtual void modifyFormatSettings(FormatSettings &, const Context &) const {}
+    virtual void modifyFormatSettings(FormatSettings & /*format_settings*/, const Context & /*local_context*/) const {}


I would not change this method simply to avoid merge conflicts with upstream later on

- Addressed Arthur's review comments (removed redundant isDataLake check, reverted IDataLakeMetadata.h signature). - Removed the stateless SQL test entirely. Iceberg table bootstrapping requires external catalog initialization, which is fully covered by the PyIceberg integration tests.

il9ue · 2026-03-19T02:10:25Z

@arthurpassos
just pushed a quick fix for the integration test. I had accidentally passed the S3 credentials as key-value settings to DataLakeCatalog rather than positional arguments, which caused the UNKNOWN_SETTING crash in the previous run. The Iceberg truncation test should be completely green now.

Looking at the latest CI pipeline, there are a couple of red checks remaining, but they appear to be unrelated infrastructure flakes/upstream regressions:

aggregate_functions: Failed due to a runner CPU throttle (Code: 745... SERVER_OVERLOADED).
test_storage_delta/test_cdf.py: Failing deep inside the Rust FFI with a delta-kernel-rs panic (Generic delta kernel error: Expected the first commit to have version 2, got None). This seems to be an existing issue with the DeltaLake CDF mock data in this branch and doesn't overlap with the Iceberg metadata changes.
(Plus a few intermittent Docker Hub TLS handshake timeouts during environment setup).

Since the core logical truncation architecture is in place and the specific Iceberg integration tests are passing, I believe this is ready for another look whenever you have the time. Let me know if you need any further adjustments!

…fault

arthurpassos · 2026-03-19T12:10:27Z

@codex review

arthurpassos · 2026-03-19T12:17:57Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+        persistent_components.metadata_compression_method,
+        persistent_components.table_uuid);
+
+    Int64 parent_snapshot_id = actual_table_state_snapshot.snapshot_id.value_or(0);


Why 0? I have seen "-1" in a few places https://github.com/ClickHouse/ClickHouse/blob/e3064cae1e27cafb79d37215a8d1d13a24c7801f/src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp#L879 and https://github.com/ClickHouse/ClickHouse/blob/e3064cae1e27cafb79d37215a8d1d13a24c7801f/src/Databases/DataLake/RestCatalog.cpp#L1055.

Perhaps it is part of a specification?

arthurpassos · 2026-03-19T12:21:43Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+        const auto & [namespace_name, table_name] = DataLake::parseTableName(storage_id.getTableName());
+        bool success = catalog->updateMetadata(namespace_name, table_name, storage_metadata_name, metadata_object);
+        if (!success)
+            throw Exception(ErrorCodes::LOGICAL_ERROR, "Failed to commit Iceberg truncate update to catalog.");


I don't think it should be a LOGICAL_ERROR. This will crash ClickHouse

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5a7e014507

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-19T12:23:51Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+    if (catalog)
+    {
+        const auto & [namespace_name, table_name] = DataLake::parseTableName(storage_id.getTableName());
+        bool success = catalog->updateMetadata(namespace_name, table_name, storage_metadata_name, metadata_object);


Pass the generated snapshot to updateMetadata

truncate() builds new_snapshot but calls catalog->updateMetadata(..., metadata_object). For REST catalogs, RestCatalog::updateMetadata treats this argument as a snapshot and reads snapshot-id; table metadata objects do not have that field, so truncate fails before the request is sent. This makes TRUNCATE TABLE fail for Iceberg tables backed by REST catalogs.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-19T12:23:51Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+    if (catalog)
+    {
+        const auto & [namespace_name, table_name] = DataLake::parseTableName(storage_id.getTableName());
+        bool success = catalog->updateMetadata(namespace_name, table_name, storage_metadata_name, metadata_object);


Commit truncate with catalog-visible metadata path

The truncate commit passes storage_metadata_name to updateMetadata, but catalog updates need the catalog-facing metadata URI (as done in normal Iceberg writes). With Glue, this value is written to metadata_location; using an internal storage path instead of an s3://... URI leaves the catalog pointing at a non-resolvable location after truncate.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-19T12:23:51Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+        persistent_components.metadata_compression_method,
+        persistent_components.table_uuid);
+
+    Int64 parent_snapshot_id = actual_table_state_snapshot.snapshot_id.value_or(0);


Keep no-parent sentinel when table has no snapshot

Using value_or(0) fabricates parent snapshot id 0 for tables that currently have no snapshot. Catalog commits then assert main is at snapshot 0 instead of using the existing no-parent sentinel (-1 used in other Iceberg write paths), so truncating a freshly created empty table can fail with optimistic-lock checks.

Useful? React with 👍 / 👎.

arthurpassos · 2026-03-19T12:25:14Z

tests/integration/test_storage_iceberg_no_spark/test_iceberg_truncate.py

+
+    # Assert PyIceberg reads the empty snapshot successfully
+    assert len(table.scan().to_arrow()) == 0
+


Perhaps it is a good idea to insert data again and check it can be read just to make sure we haven't broken anything?

Good call. Added in the latest commit — after asserting count() == 0, now append a new row via PyIceberg and verify ClickHouse reads it back as count() == 1. This confirms the table remains fully writable after truncation and that the metadata state is consistent.

…log support) ## Overview Implements metadata-only TRUNCATE TABLE for the Iceberg database engine, targeting REST catalog (transactional) backends. Leaves physical file garbage collection to standard Iceberg maintenance operations. ## Root Cause Analysis & Fixes (7 bugs) ### Bug 1: Premature isTransactional() guard blocked REST catalog RCA: A guard was added that threw NOT_IMPLEMENTED for any transactional catalog (i.e. REST), which is the primary target of this feature. Fix: Removed the guard entirely. REST catalogs are fully supported. ### Bug 2: Catalog commit block was deleted RCA: The catalog->updateMetadata() call was removed alongside the guard, leaving object storage updated but the REST catalog pointer never atomically swapped. The table appeared truncated locally but the catalog still pointed to stale metadata. Fix: Restored the catalog commit block, building the catalog-visible URI (blob_type://namespace/metadata_name for transactional catalogs) consistent with the pattern in IcebergWrites.cpp and Mutations.cpp. ### Bug 3: FileNamesGenerator hardcoded is_transactional=false RCA: The refactored truncate path hardcoded false for the isTransactional flag, causing FileNamesGenerator to produce bare /path/ style filenames while the REST catalog expected full s3://... URIs. This triggered a FileNamesGenerator::convertMetadataPathToStoragePath() consistency check (BAD_ARGUMENTS: 'Paths in Iceberg must use a consistent format'). Fix: Force full URI base (from f_location) when catalog->isTransactional() is true, regardless of write_full_path_in_iceberg_metadata setting. ### Bug 4: value_or(0) used wrong no-parent sentinel RCA: Using 0 as the no-parent snapshot ID fabricates a fake parent snapshot ID for tables with no existing snapshot. REST catalog optimistic-lock checks assert the current snapshot matches; using 0 instead of -1 (the Iceberg spec sentinel) causes lock check failures on empty tables. Fix: Changed value_or(0) to value_or(-1). ### Bug 5: updateMetadata passed metadata_object instead of new_snapshot RCA: RestCatalog::updateMetadata() reads snapshot-id (a long) from its 4th argument to build the commit request. The truncate path passed metadata_object (full table metadata JSON) which has no top-level snapshot-id field. Poco threw InvalidAccessException: 'Can not convert empty value' (POCO_EXCEPTION). Fix: Pass new_snapshot (the generated snapshot object) to updateMetadata, consistent with how IcebergWrites.cpp and Mutations.cpp call it. ### Bug 6: Empty manifest list wrote field-id-less Avro schema header RCA: avro::DataFileWriter calls writeHeader() eagerly in its constructor, committing the binary encoder state. Attempting writer.setMetadata() after construction to inject the full Iceberg schema JSON (with field-id on each field) corrupted the encoder's internal StreamWriter::next_ pointer to NULL, causing a segfault in avro::StreamWriter::flush() on close(). Without the override, the Avro C++ library strips unknown field properties (like field-id) when reconstructing schema JSON from its internal node representation, causing PyIceberg to reject the manifest list with: ValueError: Cannot convert field, missing field-id: {'name': 'manifest_path'} Fix: For empty manifest lists (TRUNCATE path), bypass DataFileWriter entirely and write a minimal valid Avro Object Container File manually. This embeds the original schema_representation string (with all field-ids intact) directly into the avro.schema metadata entry. The Avro container format is: [magic(4)] [meta_map] [sync_marker(16)] with no data blocks for an empty manifest list. This avoids all contrib library issues without modifying vendored code. ### Bug 7: use_previous_snapshots=is_v2 copied old data into truncate snapshot RCA: The generateManifestList() call in truncate passed is_v2 as the use_previous_snapshots argument (true for v2 tables). This caused the function to read and copy all manifest entries from the parent snapshot into the new manifest list, defeating the truncation. PyIceberg then returned 3 rows instead of 0. Fix: Pass false explicitly for use_previous_snapshots in the truncate path. Truncation must always produce an empty manifest list. ## Changes - src/Databases/DataLake/RestCatalog.cpp Improve error propagation: re-throw HTTP exceptions with detail text instead of silently returning false from updateMetadata(). - src/Storages/ObjectStorage/DataLakes/Iceberg/Constant.h Add field aliases: deleted_records (deleted-records) and deleted_data_files (deleted-data-files) for truncate summary fields. - src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp Core truncate implementation: correct FileNamesGenerator construction, correct parent snapshot sentinel (-1), restored catalog commit, manual Avro container write for empty manifest list, correct use_previous_snapshots. - src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp Fix updateMetadata() call to pass new_snapshot instead of metadata_object. Add empty manifest list fast path with manual Avro container serialization. - src/Storages/ObjectStorage/DataLakes/Iceberg/MetadataGenerator.cpp/.h Add is_truncate parameter to generateNextMetadata(). When true, sets operation=overwrite, zeroes out cumulative totals, and populates deleted_records/deleted_data_files from parent snapshot summary for spec-compliant truncate snapshot summary. - src/Storages/ObjectStorage/DataLakes/Mutations.cpp Fix updateMetadata() call to pass new_snapshot instead of metadata_object. - tests/integration/test_storage_iceberg_no_spark/test_iceberg_truncate.py New integration test: PyIceberg creates REST catalog table with 3 rows, ClickHouse truncates via REST catalog, validates count=0 from both ClickHouse and PyIceberg (cross-engine validation), then verifies table remains writable by appending a new row. ## Test Results 19/19 passed across all backends (s3, azure, local) and all test cases including the new test_iceberg_truncate.

RCA: IcebergStorageSink::initializeMetadata() was passing the full table metadata object to catalog->updateMetadata(), but RestCatalog::updateMetadata() expects a snapshot object with a top-level snapshot-id field. This caused a Poco::InvalidAccessException ('Can not convert empty value') on every INSERT into a REST catalog table. Fix: Pass new_snapshot instead of metadata as the 4th argument, consistent with the truncate path and Mutations.cpp.

il9ue · 2026-03-23T02:15:37Z

On the CI regression failures:

The failures across other test suites seems pre-existing and unrelated to this PR's changes:

test_restore_db_replica — DatabaseReplicated::renameTable UNKNOWN_TABLE, a known ZooKeeper DDL replication race with no Iceberg stack frames
test_storage_delta — DeltaLake Rust kernel FFI error (Expected first commit version 2, got None), touches only DeltaLakeMetadataDeltaKernel TableChanges
test_storage_azure_blob_storage — query timeout (>900s), infra-level
test_backup_restore_s3 — SYSTEM FLUSH LOGS timeout (180s), infrastructure
test_ttl_move — concurrent alter races and ClickHouse restart failures
/parquet/datatypes/unsupportednull — Parquet null type unsupported in schema inference

None of these touch IcebergMetadata, IcebergWrites, RestCatalog, or StorageObjectStorage. Current code diff is scoped entirely to Iceberg write paths and the new truncate implementation.

il9ue · 2026-03-24T06:00:52Z

Requesting a peer review on the Iceberg TRUNCATE TABLE implementation for the Antalya release.

What this does
Adds native TRUNCATE TABLE support for the Iceberg engine targeting REST (transactional) catalog backends. This is a metadata-only operation — we generate a new snapshot with an empty manifest list and atomically commit it via the REST catalog. Physical file cleanup is left to standard Iceberg GC, consistent with how other Iceberg engines (Spark, PyIceberg) handle truncation.

Architecture

StorageObjectStorage::truncate() — removed the hardcoded NOT_IMPLEMENTED block; now delegates to IDataLakeMetadata::truncate() via a new supportsTruncate() virtual interface
IcebergMetadata::truncate() — full implementation: reads current snapshot chain, generates a new overwrite snapshot via MetadataGenerator (new is_truncate param), writes an empty manifest list to object storage, writes v<N+1>.metadata.json, then atomically commits via catalog->updateMetadata()

One non-obvious design decision worth reviewing
For the empty manifest list, I bypassed avro::DataFileWriter entirely and wrote the Avro OCF header manually. The reason: the vendored Avro C++ calls writeHeader() eagerly in its constructor, committing the binary encoder state. Any post-construction setMetadata() call corrupts StreamWriter::next_ and segfaults on close(). Since PyIceberg strictly requires field-id in the avro.schema header entry (which the Avro library strips during schema node serialization), writing the ~20-byte OCF header directly was the cleanest path — no contrib changes, no library hacks.

Also fixed as part of this work

IcebergStorageSink::initializeMetadata() and Mutations.cpp were both passing the full metadata_object to catalog->updateMetadata() instead of new_snapshot — this was silently breaking all INSERTs on REST catalog tables (same POCO_EXCEPTION crash)

Test results
19/19 pass across S3, Azure, and local backends. Cross-engine validation via PyIceberg confirms the truncated table is readable (0 rows, intact schema) and remains writable after truncation.

On the CI failures
The failures in test_restore_db_replica, test_storage_delta, test_backup_restore_s3, test_ttl_move, and /parquet/datatypes are all pre-existing with no Iceberg stack frames — ZooKeeper replication races, DeltaLake Rust kernel errors, and infrastructure timeouts unrelated to this diff.

Would love eyes on the IcebergMetadata.cpp truncate flow and the manual Avro serialization path in generateManifestList() specifically. Thanks!

src/Databases/DataLake/RestCatalog.cpp

il9ue · 2026-03-24T14:01:14Z

Feature Scope Elaboration

[Feature] Support TRUNCATE TABLE for Iceberg Engine

Overview
As part of the Antalya release, v26.1 needs to natively support the TRUNCATE TABLE command for the Iceberg database engine. Currently, upstream ClickHouse explicitly rejects this operation. As of PR ClickHouse#91713, executing TRUNCATE down-casts to StorageObjectStorage, where it immediately throws an ErrorCodes::NOT_IMPLEMENTED exception for Data Lake engines.

To support standard analytics workflows and testing pipelines without requiring users to DROP and recreate tables (which breaks catalog bindings), implementing a metadata-only truncation is essential.

Proposed Architecture
Unlike a standard MergeTree truncation that physically drops parts from the local disk, Iceberg truncation must be entirely logical. The implementation will leave physical file garbage collection to standard Iceberg maintenance operations and focus strictly on metadata manipulation.

Core Workflow:

Bypass the Upstream Block: Modify StorageObjectStorage::truncate to check data_lake_metadata->supportsTruncate().
Snapshot Generation: Generate a new Iceberg snapshot ID and increment the metadata version (v<N+1>.metadata.json).
Empty Avro Manifest List: The new snapshot cannot simply omit the manifest list. We must use ClickHouse's internal object storage and Avro APIs to generate and write a strictly typed, empty Avro manifest list to object storage.
Metadata Update: Attach the new empty manifest list to the new snapshot, append the snapshot to the snapshots array, and update the snapshot-log and current-snapshot-id.
Catalog Commit: Perform an atomic swap via the ICatalog interface (e.g., REST Catalog) to point the table to the newly generated metadata JSON.

Implementation Details
The required changes span the following internal abstractions:

src/Storages/ObjectStorage/StorageObjectStorage.h/cpp: Override truncate and remove the hardcoded throw Exception. Delegate to IDataLakeMetadata.
src/Storages/ObjectStorage/DataLakes/IDataLakeMetadata.h: Introduce supportsTruncate() and truncate(ContextPtr, ICatalog) virtual methods.
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.h/cpp: Implement the core truncation logic. Must safely obtain an IObjectStorage write buffer via context->getWriteSettings() to serialize the empty Avro file before committing the JSON metadata.
tests/integration/.../test_iceberg_truncate.py: Added Python integration tests. (Note: Stateless SQL tests are intentionally omitted as ClickHouse ENGINE=Iceberg requires an externally initialized catalog to bootstrap, which is handled via PyIceberg in the integration suite).

Throwing an exception broke the retry contract — callers in IcebergStorageSink::initializeMetadata and writeMetadataFiles treat false as a retryable metadata conflict signal and re-read the latest metadata before retrying. Replaced throw with LOG_WARNING to preserve retry semantics while still surfacing HTTP error details in logs.

il9ue · 2026-03-25T09:36:21Z

Flagging the CI failures for QA review. All observed failures are pre-existing and unrelated to this PR's changes. Summary below:

/docker vulnerabilities — CVE-2026-2673 (High)
Docker image vulnerability scan failure against the base Alpine image. This is a security scanner finding at the container layer — no relation to any C++ code changes in this diff.

test_replicated_database::test_sync_replica
SYSTEM SYNC DATABASE REPLICA timed out after 900s. Pure ZooKeeper/replicated database infrastructure flakiness. No Iceberg stack frames present.

test_backup_restore_s3 (multiple variants)
Two failure modes:

SYSTEM FLUSH LOGS timeout (180s / 300s) — S3 backup infrastructure under load

S3ReadRequestsErrors assertion — S3 request error metric unexpectedly present

Note: one failure surfaces TRUNCATE TABLE IF EXISTS system.query_log — this is truncating a system table via InterpreterSystemQuery, an entirely different code path from the Iceberg truncate implementation in this PR.

None of these failures touch IcebergMetadata, IcebergWrites, RestCatalog, or StorageObjectStorage. Our diff is scoped entirely to the Iceberg write path.

Could QA confirm these are known flaky/pre-existing failures so we can proceed to merge?
Ready to re-run specific suites if needed. 🙏

mkmkme · 2026-03-25T14:28:17Z

Ran an audit for this PR. Could you have a look please?

AI audit note: This review comment was generated by AI (claude-4.6-sonnet-medium-thinking).

Audit update for PR #1529 (Feature: Support TRUNCATE TABLE for Iceberg engine):

Confirmed Defects

High: Mutations.cpp regression — full metadata object passed where new_snapshot is required

Impact: All Iceberg mutation operations (ALTER TABLE ... DELETE, ALTER TABLE ... UPDATE) using a REST catalog backend crash after this PR is applied. The mutation (DELETE/UPDATE) path is broken regardless of whether TRUNCATE is used.
Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cpp / writeMetadataFiles / diff line 517–524
Trigger: Any ALTER TABLE DELETE or ALTER TABLE UPDATE on an Iceberg table with a REST catalog (catalog_type='rest').
Why defect: RestCatalog::updateMetadata (line 930 of RestCatalog.cpp) reads new_snapshot->getValue<Int64>("snapshot-id"). The PR changes the call-site argument from new_snapshot (the snapshot JSON object produced by MetadataGenerator::generateNextMetadata) to metadata (the full Iceberg metadata JSON). The full metadata object has no top-level "snapshot-id" field — only "current-snapshot-id" at the metadata level and "snapshot-id" inside each element of the "snapshots" array. Poco::JSON::Object::getValue<T> throws Poco::InvalidAccessException on a missing key, crashing the mutation path.

The PR comment in IcebergMetadata.cpp says "// Pass metadata_object (not new_snapshot) — matches the fix already applied in IcebergWrites.cpp and Mutations.cpp" but the truncate code itself correctly passes new_snapshot to catalog->updateMetadata. IcebergWrites.cpp also still uses new_snapshot (line 1034, unchanged by this PR). The Mutations.cpp change is the only site that was actually changed, and it is wrong.
```
// RestCatalog.cpp line 928–933 — expects snapshot object fields:
set_snapshot->set("snapshot-id", new_snapshot->getValue<Int64>("snapshot-id")); // throws if 'metadata' passed
```
```
// Mutations.cpp — incorrect change in this PR:
- if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, new_snapshot))
+ if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, metadata))
```
Fix direction: Revert the Mutations.cpp change; keep new_snapshot as the 4th argument.
Regression test direction: After any ALTER TABLE DELETE / UPDATE against a REST-catalog Iceberg table, verify the operation completes without exception and the catalog snapshot-ref is updated correctly.

Medium: No cleanup of orphaned files on catalog commit failure during TRUNCATE

Impact: If catalog->updateMetadata fails (transient network/catalog error) after both the manifest list file and metadata.json have been written to object storage, both files are permanently orphaned. No cleanup is attempted. Under repeated failures orphaned files accumulate indefinitely and are not recovered by any subsequent successful write (the manifest list name encodes the random snapshot_id, so it is never overwritten).
Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp / IcebergMetadata::truncate / lines after writeMessageToFile
Trigger: Transient REST catalog error or network interruption between the writeMessageToFile call and catalog->updateMetadata.
Why defect: The mutation path in Mutations.cpp wraps all writes in a cleanup() lambda that removes all written files on failure (lines 400–530), including a catch (...) { cleanup(); throw; } guard. The truncate path has no equivalent.
Fix direction: Wrap the truncate write sequence in a try/catch with a cleanup lambda mirroring Mutations.cpp::writeMetadataFiles::cleanup(), removing storage_manifest_list_name and storage_metadata_name on failure.
Regression test direction: Inject a failure-point after the metadata write but before the catalog commit; verify no orphaned files remain on storage after the exception.

Low: Manual Avro container writer for empty manifest list — unnecessary complexity

Impact: Maintenance risk and potential divergence from the standard Avro format; the standard avro::DataFileWriter with zero written records produces a valid empty container file without manual ZigZag encoding or hard-coded sync marker bytes.
Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp / generateManifestList / early-return block
Trigger: Every TRUNCATE TABLE call.
Why defect: The comment claims the manual path is needed to prevent DataFileWriter's eager writeHeader() from causing issues, but the standard path with an empty entry loop and writer.close() produces a valid empty OCF. The manual implementation hard-codes a 16-byte all-zero sync marker and bespoke ZigZag encoding, adding maintainability risk without correctness benefit.
Fix direction: Remove the early-return manual path; let the standard avro::DataFileWriter handle the empty case.
Regression test direction: Test truncating a table with both Iceberg v1 and v2 format; verify PyIceberg can read the resulting empty manifest list.

Coverage Summary

Scope reviewed: All 10 changed files: IDataLakeMetadata.h, Constant.h, IcebergMetadata.{h,cpp}, IcebergWrites.cpp, MetadataGenerator.{h,cpp}, Mutations.cpp, StorageObjectStorage.cpp, test_iceberg_truncate.py. REST catalog commit path (RestCatalog.cpp::updateMetadata) and WriteBuffer::finalize idempotency (WriteBuffer.cpp) also reviewed as relevant integration boundaries.
Categories failed: REST catalog integration (Mutations.cpp regression), exception-safety/rollback (no truncate cleanup).
Categories passed: Avro binary format correctness, WriteBuffer::finalize double-call safety (idempotent), path construction consistency with IcebergWrites.cpp, cache corruption (cache stores strings not shared Poco objects — in-place mutation of metadata_object does not corrupt cache), thread-safety under TableExclusiveLockHolder, truncate-then-insert round-trip logic, v1/v2 format version handling.
Assumptions/limits: Static analysis only; Glue catalog updateMetadata confirmed safe (ignores 4th arg); pre-existing Int32/stoi overflow for >2B row tables not re-raised as it is a systemic pre-existing pattern.

il9ue · 2026-03-25T14:44:48Z

Ran an audit for this PR. Could you have a look please?

AI audit note: This review comment was generated by AI (claude-4.6-sonnet-medium-thinking).

Audit update for PR #1529 (Feature: Support TRUNCATE TABLE for Iceberg engine):

Confirmed Defects

High: Mutations.cpp regression — full metadata object passed where new_snapshot is required
Impact: All Iceberg mutation operations (ALTER TABLE ... DELETE, ALTER TABLE ... UPDATE) using a REST catalog backend crash after this PR is applied. The mutation (DELETE/UPDATE) path is broken regardless of whether TRUNCATE is used.

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cpp / writeMetadataFiles / diff line 517–524

Trigger: Any ALTER TABLE DELETE or ALTER TABLE UPDATE on an Iceberg table with a REST catalog (catalog_type='rest').
Why defect: RestCatalog::updateMetadata (line 930 of RestCatalog.cpp) reads new_snapshot->getValue<Int64>("snapshot-id"). The PR changes the call-site argument from new_snapshot (the snapshot JSON object produced by MetadataGenerator::generateNextMetadata) to metadata (the full Iceberg metadata JSON). The full metadata object has no top-level "snapshot-id" field — only "current-snapshot-id" at the metadata level and "snapshot-id" inside each element of the "snapshots" array. Poco::JSON::Object::getValue<T> throws Poco::InvalidAccessException on a missing key, crashing the mutation path.
The PR comment in IcebergMetadata.cpp says "// Pass metadata_object (not new_snapshot) — matches the fix already applied in IcebergWrites.cpp and Mutations.cpp" but the truncate code itself correctly passes new_snapshot to catalog->updateMetadata. IcebergWrites.cpp also still uses new_snapshot (line 1034, unchanged by this PR). The Mutations.cpp change is the only site that was actually changed, and it is wrong.
// RestCatalog.cpp line 928–933 — expects snapshot object fields:
set_snapshot->set("snapshot-id", new_snapshot->getValue<Int64>("snapshot-id")); // throws if 'metadata' passed
// Mutations.cpp — incorrect change in this PR:
- if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, new_snapshot))
+ if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, metadata))
Fix direction: Revert the Mutations.cpp change; keep new_snapshot as the 4th argument.

Regression test direction: After any ALTER TABLE DELETE / UPDATE against a REST-catalog Iceberg table, verify the operation completes without exception and the catalog snapshot-ref is updated correctly.
Medium: No cleanup of orphaned files on catalog commit failure during TRUNCATE

Impact: If catalog->updateMetadata fails (transient network/catalog error) after both the manifest list file and metadata.json have been written to object storage, both files are permanently orphaned. No cleanup is attempted. Under repeated failures orphaned files accumulate indefinitely and are not recovered by any subsequent successful write (the manifest list name encodes the random snapshot_id, so it is never overwritten).

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp / IcebergMetadata::truncate / lines after writeMessageToFile

Trigger: Transient REST catalog error or network interruption between the writeMessageToFile call and catalog->updateMetadata.

Why defect: The mutation path in Mutations.cpp wraps all writes in a cleanup() lambda that removes all written files on failure (lines 400–530), including a catch (...) { cleanup(); throw; } guard. The truncate path has no equivalent.

Fix direction: Wrap the truncate write sequence in a try/catch with a cleanup lambda mirroring Mutations.cpp::writeMetadataFiles::cleanup(), removing storage_manifest_list_name and storage_metadata_name on failure.

Regression test direction: Inject a failure-point after the metadata write but before the catalog commit; verify no orphaned files remain on storage after the exception.

Low: Manual Avro container writer for empty manifest list — unnecessary complexity

Impact: Maintenance risk and potential divergence from the standard Avro format; the standard avro::DataFileWriter with zero written records produces a valid empty container file without manual ZigZag encoding or hard-coded sync marker bytes.

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp / generateManifestList / early-return block

Trigger: Every TRUNCATE TABLE call.

Why defect: The comment claims the manual path is needed to prevent DataFileWriter's eager writeHeader() from causing issues, but the standard path with an empty entry loop and writer.close() produces a valid empty OCF. The manual implementation hard-codes a 16-byte all-zero sync marker and bespoke ZigZag encoding, adding maintainability risk without correctness benefit.

Fix direction: Remove the early-return manual path; let the standard avro::DataFileWriter handle the empty case.

Regression test direction: Test truncating a table with both Iceberg v1 and v2 format; verify PyIceberg can read the resulting empty manifest list.

Coverage Summary

Scope reviewed: All 10 changed files: IDataLakeMetadata.h, Constant.h, IcebergMetadata.{h,cpp}, IcebergWrites.cpp, MetadataGenerator.{h,cpp}, Mutations.cpp, StorageObjectStorage.cpp, test_iceberg_truncate.py. REST catalog commit path (RestCatalog.cpp::updateMetadata) and WriteBuffer::finalize idempotency (WriteBuffer.cpp) also reviewed as relevant integration boundaries.

Categories failed: REST catalog integration (Mutations.cpp regression), exception-safety/rollback (no truncate cleanup).

Categories passed: Avro binary format correctness, WriteBuffer::finalize double-call safety (idempotent), path construction consistency with IcebergWrites.cpp, cache corruption (cache stores strings not shared Poco objects — in-place mutation of metadata_object does not corrupt cache), thread-safety under TableExclusiveLockHolder, truncate-then-insert round-trip logic, v1/v2 format version handling.

Assumptions/limits: Static analysis only; Glue catalog updateMetadata confirmed safe (ignores 4th arg); pre-existing Int32/stoi overflow for >2B row tables not re-raised as it is a systemic pre-existing pattern.

High (Mutations.cpp): Confirmed — this is a legitimate regression introduced by our diff. Will revert Mutations.cpp back to passing new_snapshot. Fix going up now. ✅
Medium (orphaned files): Valid concern. This is a new gap in the truncate path — acknowledged. Will track this as a follow-up issue rather than block the current PR, since adding a cleanup lambda now risks introducing new failure modes under time pressure. The existing Iceberg write paths have the same cleanup pattern that can model from.
Low (manual Avro writer): Need to push back here. The audit notes "static analysis only" — this specific issue was caught only through runtime testing. We have a confirmed server segfault (SIGSEGV 11, Signal 11, avro::StreamWriter::flush() → NULL dereference) with a full stack trace proving that DataFileWriter::close() crashes when setMetadata() is called post-construction. The standard empty-loop path is not viable for this library version. The manual OCF path is the direct fix for a confirmed crash, not premature optimization.

The audit correctly identified this as a regression — passing metadata (full table metadata JSON) instead of new_snapshot (snapshot object) to catalog->updateMetadata() causes Poco::InvalidAccessException on snapshot-id read, breaking all ALTER TABLE DELETE/UPDATE on REST catalog Iceberg tables.

il9ue · 2026-03-25T23:45:46Z

@arthurpassos @ianton-ru @zvonand
Could you please give a review?

ianton-ru · 2026-03-26T08:15:00Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp

+    {
+        auto write_avro_long = [](WriteBuffer & out, int64_t val)
+        {
+            uint64_t n = (static_cast<uint64_t>(val) << 1) ^ static_cast<uint64_t>(val >> 63);


Please add a comment with explain of this magic. What this code does, and why writing long is so complex.

Valid point!
Will Add a comment explaining both the zigzag encoding step ((val << 1) ^ (val >> 63) maps signed integers to unsigned so small magnitudes stay small regardless of sign) and the variable-length base-128 serialization (high bit of each byte signals continuation).

link: the Avro spec section for reference

I think link on avro specs is enough.
Just to avoid WTFs from someone who is not very close with avro. :)

Honestly, I would say it is better to move this code out to a helper function

Isn't there a Avro writer in ClickHouse that could abstract this away?

…erateManifestList Explain the two-step integer encoding used in the manual Avro OCF writer: 1. Zigzag encoding: maps signed int64 to unsigned so small magnitudes stay compact regardless of sign (positive n → 2n, negative n → 2(-n)-1) 2. Variable-length base-128 serialization: high bit of each byte signals whether more bytes follow (little-endian order) Ref: https://avro.apache.org/docs/1.11.1/specification/#binary-encoding

arthurpassos

Can you check if it's possible to write a test in which you truncate a table, re-start clickhouse, and the table is functional? If you manage to implement that, maybe one in which you truncate, insert data, and re-start would also be good.

I know this might sound odd, but while working on Iceberg related writes, there was a time in which the table was not readable after I re-started clickhouse. Good to check that.

arthurpassos · 2026-03-26T12:28:04Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+            "Iceberg truncate is experimental. "
+            "To allow its usage, enable setting allow_experimental_insert_into_iceberg");
+
+    // Bug 1 fix: REMOVE the isTransactional() guard entirely.


Nitpick: this type of documentation seems a bit weird to me. If there is a thing you want to explain, I would say just write down some text explaining it instead of loose references like "bug 1 fix".

Sure — will remove those debug-style references and replaced with plain explanatory prose.

arthurpassos · 2026-03-26T12:32:15Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+
+    auto [new_snapshot, manifest_list_name, storage_manifest_list_name] = MetadataGenerator(metadata_object).generateNextMetadata(
+        filename_generator, metadata_name, parent_snapshot_id,
+        0, 0, 0, 0, 0, 0, std::nullopt, std::nullopt, /*is_truncate=*/true);


Nit: Add the comments for the bunch of zeroes, just like you did for is_truncate or create variables with proper names and pass them

arthurpassos · 2026-03-26T13:00:00Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp

+    {
+        // Build the catalog-visible path (blob URI for transactional, bare path otherwise)
+        String catalog_filename = metadata_name;
+        if (is_transactional)


I see in IcebergWrites the check is different, but leads to the same result. To the reader like me, it is not very clear why transactional needs a different URI.

I suggest one of the below:

Use the same condition as IcebergWrites - that one is easier to understand because it is a path condition

Add a comment explaining why transactional paths are different

arthurpassos · 2026-03-26T13:02:38Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp

+    {
+        auto write_avro_long = [](WriteBuffer & out, int64_t val)
+        {
+            uint64_t n = (static_cast<uint64_t>(val) << 1) ^ static_cast<uint64_t>(val >> 63);


Honestly, I would say it is better to move this code out to a helper function

arthurpassos · 2026-03-26T13:04:45Z

src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp

+    {
+        auto write_avro_long = [](WriteBuffer & out, int64_t val)
+        {
+            uint64_t n = (static_cast<uint64_t>(val) << 1) ^ static_cast<uint64_t>(val >> 63);


Isn't there a Avro writer in ClickHouse that could abstract this away?

arthurpassos · 2026-03-26T13:06:55Z

src/Storages/ObjectStorage/DataLakes/Iceberg/MetadataGenerator.cpp

-    if (num_deleted_rows == 0)
+    if (is_truncate)
+    {
+        summary->set(Iceberg::f_operation, Iceberg::f_overwrite);


Can't sum_with_parent_snapshot be used here?

On Abstraction

I DID try avro::DataFileWriter first and it's the organic one. But, unfortunately in the vendored Avro C++ version, writeHeader() is called eagerly in the constructor committing the binary encoder state. Calling setMetadata("avro.schema", ...) post-construction corrupts StreamWriter::next_ causing a NULL dereference on close(). I have a confirmed server segfault stack trace from this exact path. The manual OCF serialization exists because the abstraction is broken for this specific use case in this library version.

On sum_with_parent_snapshot

The is_truncate guard inside sum_with_parent_snapshot already correctly zeroes cumulative totals (total_records, total_data_files, etc.) — you're correct that we could lean on it there. However deleted_records and deleted_data_files are different: they need to be populated from the parent snapshot's values (not zero), since they represent what was removed. That's why they're set separately. Glad to add a comment making this distinction explicit.

…ation - Remove debug-style 'Bug N fix' comments; replace with plain explanatory prose - Add named zero arguments to generateNextMetadata() call for readability - Add comment explaining why transactional catalogs require full blob URIs - Extract Avro zigzag encoding into static writeAvroLong/writeAvroBytes helpers with full comment explaining the encoding and link to Avro spec - Add test_iceberg_truncate_restart: verifies table is readable after ClickHouse restart post-truncation, and remains writable for new inserts Signed-off-by: Daniel Q. Kim <daniel.kim@altinity.com>

il9ue · 2026-03-27T14:18:55Z

@arthurpassos
I've updated reviews to new patch. Regression tests seems all known issues and no truncate table related ones.
Could you please give final review?

arthurpassos

LGTM

Selfeer · 2026-03-30T12:23:18Z

PR #1529 CI Triage

PR Summary

Field	Value
Title	Feature: Support TRUNCATE TABLE for Iceberg engine
Base Branch	antalya-26.1
Feature Branch	feature/iceberg-truncate
PR Link	#1529

PR Description

Adds TRUNCATE TABLE support for the Iceberg database engine with REST catalog backends. This is a metadata-only operation that creates a new snapshot with an empty manifest list, increments the metadata version, and performs an atomic catalog update.

Files Changed

src/Storages/ObjectStorage/DataLakes/IDataLakeMetadata.h (+6)
src/Storages/ObjectStorage/DataLakes/Iceberg/Constant.h (+2)
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp (+96, -1)
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.h (+3)
src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp (+57)
src/Storages/ObjectStorage/DataLakes/Iceberg/MetadataGenerator.cpp (+18, -3)
src/Storages/ObjectStorage/DataLakes/Iceberg/MetadataGenerator.h (+2, -1)
src/Storages/ObjectStorage/StorageObjectStorage.cpp (+7, -3)
tests/integration/test_storage_iceberg_no_spark/test_iceberg_truncate.py (+81) - New test file

CI Failures Analysis

Summary Table

Category	Count	Tests
PR-caused regression	0	None identified
Cascade failures	0	N/A
Pre-existing flaky	~48	test_s3_cluster/*, settings snapshots, aggregate_functions snapshots
Infrastructure issues	~5	BuzzHouse timeout, swarms timeouts

Quick Links

CI System	Run Link	Report
Upstream CI	Run #23651612837	ci_run_report.html
Regression Tests	Run #23655067918	-

Upstream CI Failures (Integration/Stateless/Fuzzer)

BuzzHouse (amd_debug) - 1 failure


Job Link	BuzzHouse (amd_debug)
Log Browser	json.html

Test: Unknown error
Assessment: Infrastructure/timeout issue
Relation to PR: Unrelated - BuzzHouse is a general fuzzer, not Iceberg-specific
Classification: Infrastructure issue

Integration tests (amd_asan, db disk, old analyzer, 4/6) - 23 failures


Job Link	Integration tests (amd_asan, db disk, old analyzer, 4/6)
Log Browser	json.html

All failures are in test_s3_cluster:

Click to expand full list of 23 failing tests

test_s3_cluster/test.py::test_ambiguous_join
test_s3_cluster/test.py::test_cluster_default_expression
test_s3_cluster/test.py::test_cluster_format_detection
test_s3_cluster/test.py::test_cluster_hosts_limit
test_s3_cluster/test.py::test_cluster_with_header
test_s3_cluster/test.py::test_cluster_with_named_collection
test_s3_cluster/test.py::test_count
test_s3_cluster/test.py::test_count_macro
test_s3_cluster/test.py::test_distributed_insert_select_with_replicated
test_s3_cluster/test.py::test_distributed_s3_table_engine
test_s3_cluster/test.py::test_graceful_shutdown
test_s3_cluster/test.py::test_hive_partitioning[0]
test_s3_cluster/test.py::test_hive_partitioning[1]
test_s3_cluster/test.py::test_joins
test_s3_cluster/test.py::test_object_storage_remote_initiator
test_s3_cluster/test.py::test_parallel_distributed_insert_select_with_schema_inference
test_s3_cluster/test.py::test_remote_hedged
test_s3_cluster/test.py::test_remote_no_hedged
test_s3_cluster/test.py::test_select_all
test_s3_cluster/test.py::test_skip_unavailable_shards
test_s3_cluster/test.py::test_union_all
test_s3_cluster/test.py::test_unset_skip_unavailable_shards
test_s3_cluster/test.py::test_virtual_columns

Assessment: All 23 failures are in the same test module (test_s3_cluster) and only in shard 4/6 of the ASAN integration run. This pattern strongly suggests an infrastructure issue with that specific runner/shard.
Relation to PR: Unrelated - These tests are about S3 cluster functionality, not Iceberg tables
Classification: Infrastructure issue (flaky shard)

Integration tests (arm_binary, distributed plan, 3/4) - 1 failure


Job Link	Integration tests (arm_binary, distributed plan, 3/4)
Log Browser	json.html

Test: Unknown (needs further investigation)
Assessment: Likely flaky test on ARM architecture
Relation to PR: Unrelated
Classification: Pre-existing flaky

Stateless tests (amd_binary, ParallelReplicas, s3 storage, parallel) - 1 failure


Job Link	Stateless tests (amd_binary, ParallelReplicas, s3 storage, parallel)
Log Browser	json.html

Test: Unknown specific test
Assessment: The CI report shows "New Fails in PR: Nothing to report" indicating this is not a new failure
Relation to PR: Unrelated
Classification: Pre-existing flaky

Regression Test Failures (clickhouse-regression)

aggregate_functions_3 - Multiple failures


Job Link	aggregate_functions_3
Job ID	68910010720

Failing tests: /aggregate functions/part 3/state/sumCountState/Nullable(*) for all numeric types

Click to expand full list of failing data types

Nullable(UInt8)
Nullable(UInt16)
Nullable(UInt32)
Nullable(UInt64)
Nullable(UInt128)
Nullable(Int8)
Nullable(Int16)
Nullable(Int32)
Nullable(Int64)
Nullable(Int128)
Nullable(Float32)
Nullable(Float64)
Nullable(Decimal128(38))
LowCardinality(Nullable(UInt8))
LowCardinality(Nullable(UInt16))
... and more LowCardinality variants

Error: SnapshotError - output doesn't match expected snapshot
Assessment: These are snapshot comparison failures for aggregate function state tests. The sumCountState function output format may have changed in ClickHouse version 26.1.
Relation to PR: Unrelated - PR only changes Iceberg TRUNCATE functionality, not aggregate functions
Classification: Pre-existing flaky / snapshot needs update

swarms - Multiple failures


Job Link	swarms
Job ID	68910012578

Failing tests:

Test Path	Result	Duration	Error
`/swarms/feature/node failure/check restart clickhouse on swarm node`	Error	10m 10s	`ExpectTimeoutError: Timeout 600.000s`
`/swarms/feature/node failure/network failure`	Fail	1m 54s	`AssertionError: exitcode should be 138`
`/swarms/feature/node failure/initiator out of disk space`	Fail	32s	`DB::Exception: Database ... does not exist`
`/swarms/feature/feature/object storage cluster profile events with overloaded node`	Fail	15s	`AssertionError: ObjectStorageClusterSentToNonMatchedReplica > 0`

Assessment: The swarms tests involve Iceberg tables but the failures appear to be:
- Timeout issues (infrastructure)
- Network simulation issues (flaky test infrastructure)
- The "Database does not exist" error could theoretically be related to truncation, but this specific test (initiator out of disk space) simulates disk space exhaustion, which is not the TRUNCATE flow
Relation to PR: Unlikely related - These are stress/chaos engineering tests that have known flakiness. The PR's TRUNCATE feature is not exercised by these failure scenarios.
Classification: Pre-existing flaky / Infrastructure issues

settings - 2 failures


Job Link	settings
Job ID	68910012803

Failing tests:

Test Path	Error
`/settings/default values/allow_nullable_tuple_in_extracted_subcolumns`	`SnapshotNotFoundError`
`/settings/default values/iceberg_metadata_staleness_ms`	`SnapshotNotFoundError`

Assessment: These are SnapshotNotFoundError failures, meaning the test framework couldn't find expected snapshot files for these settings. This happens when:
- New settings were added to ClickHouse but snapshots weren't generated
- The base branch (antalya-26.1) has settings not yet tracked by regression tests
Relation to PR: Unrelated - The iceberg_metadata_staleness_ms setting exists in the base branch (26.1), not added by this PR
Classification: Test infrastructure (missing snapshots for new version)

Iceberg Test Results ✅

The PR includes dedicated Iceberg regression tests that were explicitly requested via the CI configuration:

ci_regression_iceberg was checked in the PR options

Iceberg Test Status

Job	Status	Link
iceberg_1	✅ success	Job #68910012170
iceberg_2	✅ success	Job #68910011832

Both Iceberg test suites passed, confirming the TRUNCATE TABLE implementation works correctly.

Root Cause Analysis

No PR-Caused Regressions Identified

All failures fall into these categories:

Infrastructure issues (BuzzHouse timeout, swarms timeouts, s3_cluster shard 4/6 failures)
Snapshot test mismatches (aggregate_functions, settings) - test framework needs snapshot updates for ClickHouse 26.1
Known flaky tests (ARM integration, stateless parallel replicas)

Evidence Supporting No PR Regression

CI Report States: "New Fails in PR: Nothing to report" - the comparison with base SHA shows no new failures
Iceberg-specific tests passed: Both iceberg_1 and iceberg_2 completed successfully
PR's integration tests passed: test_iceberg_truncate.py is part of the integration test suite and is covered by the passing Iceberg regression jobs
Failure patterns are generic: s3_cluster failures affect a single shard, aggregate_functions failures are snapshot mismatches, swarms failures are timeouts

Recommendations

Merge readiness: PR is ready for merge from a CI perspective
- No failures attributable to the PR changes
- All Iceberg-specific tests pass
- PR has been approved by reviewer (arthurpassos)
Follow-up actions (not blocking):
- Update aggregate_functions snapshots for ClickHouse 26.1
- Update settings snapshots for new settings in 26.1
- Monitor swarms test stability
Docker image CVE: There is 1 high/critical vulnerability in the Alpine server image that should be addressed separately (not PR-related)
- CVE Report

References

Resource	Link
Pull Request	#1529
Commit	2155e32ca8073d895008a8fa890a77bcb3a8be5d
CI Run (This PR)	23651612837
CI Run (Regression)	23655067918
CI Report	ci_run_report.html

Conclusion

CI Status: PASS (no PR-caused regressions)

All identified failures are either:

Pre-existing flaky tests unrelated to Iceberg functionality
Infrastructure issues (runner/shard problems)
Test framework issues (missing snapshots for new ClickHouse version)

The PR's core functionality (Iceberg TRUNCATE TABLE) is verified by the passing iceberg_1 and iceberg_2 regression test suites.

Selfeer · 2026-03-30T12:35:48Z

Ran an audit for this PR. Could you have a look please?

AI audit note: This review comment was generated by AI (claude-4.6-sonnet-medium-thinking).
Audit update for PR #1529 (Feature: Support TRUNCATE TABLE for Iceberg engine):

Confirmed Defects

High: Mutations.cpp regression — full metadata object passed where new_snapshot is required
Impact: All Iceberg mutation operations (ALTER TABLE ... DELETE, ALTER TABLE ... UPDATE) using a REST catalog backend crash after this PR is applied. The mutation (DELETE/UPDATE) path is broken regardless of whether TRUNCATE is used.

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cpp / writeMetadataFiles / diff line 517–524

Trigger: Any ALTER TABLE DELETE or ALTER TABLE UPDATE on an Iceberg table with a REST catalog (catalog_type='rest').
Why defect: RestCatalog::updateMetadata (line 930 of RestCatalog.cpp) reads new_snapshot->getValue<Int64>("snapshot-id"). The PR changes the call-site argument from new_snapshot (the snapshot JSON object produced by MetadataGenerator::generateNextMetadata) to metadata (the full Iceberg metadata JSON). The full metadata object has no top-level "snapshot-id" field — only "current-snapshot-id" at the metadata level and "snapshot-id" inside each element of the "snapshots" array. Poco::JSON::Object::getValue<T> throws Poco::InvalidAccessException on a missing key, crashing the mutation path.
The PR comment in IcebergMetadata.cpp says "// Pass metadata_object (not new_snapshot) — matches the fix already applied in IcebergWrites.cpp and Mutations.cpp" but the truncate code itself correctly passes new_snapshot to catalog->updateMetadata. IcebergWrites.cpp also still uses new_snapshot (line 1034, unchanged by this PR). The Mutations.cpp change is the only site that was actually changed, and it is wrong.
// RestCatalog.cpp line 928–933 — expects snapshot object fields:
set_snapshot->set("snapshot-id", new_snapshot->getValue<Int64>("snapshot-id")); // throws if 'metadata' passed
// Mutations.cpp — incorrect change in this PR:
- if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, new_snapshot))
+ if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, metadata))
Fix direction: Revert the Mutations.cpp change; keep new_snapshot as the 4th argument.

Regression test direction: After any ALTER TABLE DELETE / UPDATE against a REST-catalog Iceberg table, verify the operation completes without exception and the catalog snapshot-ref is updated correctly.
Medium: No cleanup of orphaned files on catalog commit failure during TRUNCATE

Impact: If catalog->updateMetadata fails (transient network/catalog error) after both the manifest list file and metadata.json have been written to object storage, both files are permanently orphaned. No cleanup is attempted. Under repeated failures orphaned files accumulate indefinitely and are not recovered by any subsequent successful write (the manifest list name encodes the random snapshot_id, so it is never overwritten).

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp / IcebergMetadata::truncate / lines after writeMessageToFile

Trigger: Transient REST catalog error or network interruption between the writeMessageToFile call and catalog->updateMetadata.

Why defect: The mutation path in Mutations.cpp wraps all writes in a cleanup() lambda that removes all written files on failure (lines 400–530), including a catch (...) { cleanup(); throw; } guard. The truncate path has no equivalent.

Fix direction: Wrap the truncate write sequence in a try/catch with a cleanup lambda mirroring Mutations.cpp::writeMetadataFiles::cleanup(), removing storage_manifest_list_name and storage_metadata_name on failure.

Regression test direction: Inject a failure-point after the metadata write but before the catalog commit; verify no orphaned files remain on storage after the exception.

Low: Manual Avro container writer for empty manifest list — unnecessary complexity

Impact: Maintenance risk and potential divergence from the standard Avro format; the standard avro::DataFileWriter with zero written records produces a valid empty container file without manual ZigZag encoding or hard-coded sync marker bytes.

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp / generateManifestList / early-return block

Trigger: Every TRUNCATE TABLE call.

Why defect: The comment claims the manual path is needed to prevent DataFileWriter's eager writeHeader() from causing issues, but the standard path with an empty entry loop and writer.close() produces a valid empty OCF. The manual implementation hard-codes a 16-byte all-zero sync marker and bespoke ZigZag encoding, adding maintainability risk without correctness benefit.

Fix direction: Remove the early-return manual path; let the standard avro::DataFileWriter handle the empty case.

Regression test direction: Test truncating a table with both Iceberg v1 and v2 format; verify PyIceberg can read the resulting empty manifest list.

Coverage Summary

Scope reviewed: All 10 changed files: IDataLakeMetadata.h, Constant.h, IcebergMetadata.{h,cpp}, IcebergWrites.cpp, MetadataGenerator.{h,cpp}, Mutations.cpp, StorageObjectStorage.cpp, test_iceberg_truncate.py. REST catalog commit path (RestCatalog.cpp::updateMetadata) and WriteBuffer::finalize idempotency (WriteBuffer.cpp) also reviewed as relevant integration boundaries.

Categories failed: REST catalog integration (Mutations.cpp regression), exception-safety/rollback (no truncate cleanup).

Categories passed: Avro binary format correctness, WriteBuffer::finalize double-call safety (idempotent), path construction consistency with IcebergWrites.cpp, cache corruption (cache stores strings not shared Poco objects — in-place mutation of metadata_object does not corrupt cache), thread-safety under TableExclusiveLockHolder, truncate-then-insert round-trip logic, v1/v2 format version handling.

Assumptions/limits: Static analysis only; Glue catalog updateMetadata confirmed safe (ignores 4th arg); pre-existing Int32/stoi overflow for >2B row tables not re-raised as it is a systemic pre-existing pattern.
* **High (Mutations.cpp): Confirmed** — this is a legitimate regression introduced by our diff. Will revert Mutations.cpp back to passing new_snapshot. Fix going up now. ✅

* **Medium (orphaned files): Valid concern**. This is a new gap in the truncate path — acknowledged. **Will track this as a follow-up issue rather than block the current PR**, since adding a cleanup lambda now risks introducing new failure modes under time pressure. The existing Iceberg write paths have the same cleanup pattern that can model from.

* **Low (manual Avro writer): Need to push back here**. The audit notes "static analysis only" — this specific issue was caught only through runtime testing. We have a confirmed server segfault (SIGSEGV 11, Signal 11, avro::StreamWriter::flush() → NULL dereference) with a full stack trace proving that DataFileWriter::close() crashes when setMetadata() is called post-construction. The standard empty-loop path is not viable for this library version. The manual OCF path is the direct fix for a confirmed crash, not premature optimization.

Do we already have an issue opened for this? If not I can do that so we don't lose track of the things that need to be fixed later-on. @il9ue

il9ue · 2026-03-31T01:40:02Z

On opening an issue

Thanks for offering to open an issue — that would be great. Please go ahead and create one for the orphaned file cleanup on catalog commit failure. Happy to link it from the PR description and add a TODO comment in the code pointing to it.

On the Mutations.cpp revert

yes, that's already been fixed. The commit fix(iceberg): revert Mutations.cpp updateMetadata to pass new_snapshot is in the branch. The change was a regression introduced by this PR and is now correctly back to passing new_snapshot as the 4th argument to catalog->updateMetadata().

Ran an audit for this PR. Could you have a look please?

AI audit note: This review comment was generated by AI (claude-4.6-sonnet-medium-thinking).
Audit update for PR #1529 (Feature: Support TRUNCATE TABLE for Iceberg engine):

Confirmed Defects

High: Mutations.cpp regression — full metadata object passed where new_snapshot is required
Impact: All Iceberg mutation operations (ALTER TABLE ... DELETE, ALTER TABLE ... UPDATE) using a REST catalog backend crash after this PR is applied. The mutation (DELETE/UPDATE) path is broken regardless of whether TRUNCATE is used.

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/Mutations.cpp / writeMetadataFiles / diff line 517–524

Trigger: Any ALTER TABLE DELETE or ALTER TABLE UPDATE on an Iceberg table with a REST catalog (catalog_type='rest').
Why defect: RestCatalog::updateMetadata (line 930 of RestCatalog.cpp) reads new_snapshot->getValue<Int64>("snapshot-id"). The PR changes the call-site argument from new_snapshot (the snapshot JSON object produced by MetadataGenerator::generateNextMetadata) to metadata (the full Iceberg metadata JSON). The full metadata object has no top-level "snapshot-id" field — only "current-snapshot-id" at the metadata level and "snapshot-id" inside each element of the "snapshots" array. Poco::JSON::Object::getValue<T> throws Poco::InvalidAccessException on a missing key, crashing the mutation path.
The PR comment in IcebergMetadata.cpp says "// Pass metadata_object (not new_snapshot) — matches the fix already applied in IcebergWrites.cpp and Mutations.cpp" but the truncate code itself correctly passes new_snapshot to catalog->updateMetadata. IcebergWrites.cpp also still uses new_snapshot (line 1034, unchanged by this PR). The Mutations.cpp change is the only site that was actually changed, and it is wrong.
// RestCatalog.cpp line 928–933 — expects snapshot object fields:
set_snapshot->set("snapshot-id", new_snapshot->getValue<Int64>("snapshot-id")); // throws if 'metadata' passed
// Mutations.cpp — incorrect change in this PR:
- if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, new_snapshot))
+ if (!catalog->updateMetadata(namespace_name, table_name, catalog_filename, metadata))
Fix direction: Revert the Mutations.cpp change; keep new_snapshot as the 4th argument.

Regression test direction: After any ALTER TABLE DELETE / UPDATE against a REST-catalog Iceberg table, verify the operation completes without exception and the catalog snapshot-ref is updated correctly.
Medium: No cleanup of orphaned files on catalog commit failure during TRUNCATE

Impact: If catalog->updateMetadata fails (transient network/catalog error) after both the manifest list file and metadata.json have been written to object storage, both files are permanently orphaned. No cleanup is attempted. Under repeated failures orphaned files accumulate indefinitely and are not recovered by any subsequent successful write (the manifest list name encodes the random snapshot_id, so it is never overwritten).

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergMetadata.cpp / IcebergMetadata::truncate / lines after writeMessageToFile

Trigger: Transient REST catalog error or network interruption between the writeMessageToFile call and catalog->updateMetadata.

Why defect: The mutation path in Mutations.cpp wraps all writes in a cleanup() lambda that removes all written files on failure (lines 400–530), including a catch (...) { cleanup(); throw; } guard. The truncate path has no equivalent.

Fix direction: Wrap the truncate write sequence in a try/catch with a cleanup lambda mirroring Mutations.cpp::writeMetadataFiles::cleanup(), removing storage_manifest_list_name and storage_metadata_name on failure.

Regression test direction: Inject a failure-point after the metadata write but before the catalog commit; verify no orphaned files remain on storage after the exception.

Low: Manual Avro container writer for empty manifest list — unnecessary complexity

Impact: Maintenance risk and potential divergence from the standard Avro format; the standard avro::DataFileWriter with zero written records produces a valid empty container file without manual ZigZag encoding or hard-coded sync marker bytes.

Anchor: src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergWrites.cpp / generateManifestList / early-return block

Trigger: Every TRUNCATE TABLE call.

Why defect: The comment claims the manual path is needed to prevent DataFileWriter's eager writeHeader() from causing issues, but the standard path with an empty entry loop and writer.close() produces a valid empty OCF. The manual implementation hard-codes a 16-byte all-zero sync marker and bespoke ZigZag encoding, adding maintainability risk without correctness benefit.

Fix direction: Remove the early-return manual path; let the standard avro::DataFileWriter handle the empty case.

Regression test direction: Test truncating a table with both Iceberg v1 and v2 format; verify PyIceberg can read the resulting empty manifest list.

Coverage Summary

Scope reviewed: All 10 changed files: IDataLakeMetadata.h, Constant.h, IcebergMetadata.{h,cpp}, IcebergWrites.cpp, MetadataGenerator.{h,cpp}, Mutations.cpp, StorageObjectStorage.cpp, test_iceberg_truncate.py. REST catalog commit path (RestCatalog.cpp::updateMetadata) and WriteBuffer::finalize idempotency (WriteBuffer.cpp) also reviewed as relevant integration boundaries.

Categories failed: REST catalog integration (Mutations.cpp regression), exception-safety/rollback (no truncate cleanup).

Categories passed: Avro binary format correctness, WriteBuffer::finalize double-call safety (idempotent), path construction consistency with IcebergWrites.cpp, cache corruption (cache stores strings not shared Poco objects — in-place mutation of metadata_object does not corrupt cache), thread-safety under TableExclusiveLockHolder, truncate-then-insert round-trip logic, v1/v2 format version handling.

Assumptions/limits: Static analysis only; Glue catalog updateMetadata confirmed safe (ignores 4th arg); pre-existing Int32/stoi overflow for >2B row tables not re-raised as it is a systemic pre-existing pattern.
* **High (Mutations.cpp): Confirmed** — this is a legitimate regression introduced by our diff. Will revert Mutations.cpp back to passing new_snapshot. Fix going up now. ✅

* **Medium (orphaned files): Valid concern**. This is a new gap in the truncate path — acknowledged. **Will track this as a follow-up issue rather than block the current PR**, since adding a cleanup lambda now risks introducing new failure modes under time pressure. The existing Iceberg write paths have the same cleanup pattern that can model from.

* **Low (manual Avro writer): Need to push back here**. The audit notes "static analysis only" — this specific issue was caught only through runtime testing. We have a confirmed server segfault (SIGSEGV 11, Signal 11, avro::StreamWriter::flush() → NULL dereference) with a full stack trace proving that DataFileWriter::close() crashes when setMetadata() is called post-construction. The standard empty-loop path is not viable for this library version. The manual OCF path is the direct fix for a confirmed crash, not premature optimization.
Do we already have an issue opened for this? If not I can do that so we don't lose track of the things that need to be fixed later-on. @il9ue

il9ue mentioned this pull request Mar 16, 2026

Add TRUNCATE TABLE command for Iceberg Altinity/altinity-dashboard#1613

Open

il9ue self-assigned this Mar 16, 2026

arthurpassos reviewed Mar 16, 2026

View reviewed changes

il9ue force-pushed the feature/iceberg-truncate branch from 41eff19 to c1d252e Compare March 17, 2026 05:31

il9ue force-pushed the feature/iceberg-truncate branch from c1d252e to 601250f Compare March 17, 2026 07:21

Daniel Q. Kim added 3 commits March 18, 2026 08:56

Fix: Properly escape Iceberg namespace in DataLakeCatalog queries

27a8467

Fix: Add missing S3 credentials to ClickHouse DataLakeCatalog settings

84d76f1

Fix: Remove invalid DataLakeCatalog S3 settings

c81fc7f

Daniel Q. Kim and others added 3 commits March 19, 2026 07:52

Fix: Restore minio_secret_key variable for DataLakeCatalog auth

876577b

Merge branch 'antalya-26.1' into feature/iceberg-truncate

cc1c7db

Fix: Dynamically determine Iceberg format version to prevent Avro seg…

5a7e014

…fault

arthurpassos reviewed Mar 19, 2026

View reviewed changes

chatgpt-codex-connector bot reviewed Mar 19, 2026

View reviewed changes

arthurpassos reviewed Mar 19, 2026

View reviewed changes

Daniel Q. Kim and others added 3 commits March 20, 2026 15:33

Merge branch 'antalya-26.1' into feature/iceberg-truncate

4681e4e

ianton-ru reviewed Mar 24, 2026

View reviewed changes

src/Databases/DataLake/RestCatalog.cpp Outdated Show resolved Hide resolved

il9ue changed the title ~~[WIP] Feature: Support TRUNCATE TABLE for Iceberg engine~~ Feature: Support TRUNCATE TABLE for Iceberg engine Mar 25, 2026

Merge branch 'antalya-26.1' into feature/iceberg-truncate

11810d8

svb-alt added antalya antalya-26.1 antalya-26.1.6.20001 labels Mar 25, 2026

ianton-ru reviewed Mar 26, 2026

View reviewed changes

Daniel Q. Kim and others added 2 commits March 26, 2026 13:16

Merge branch 'antalya-26.1' into feature/iceberg-truncate

e70e425

svb-alt requested review from arthurpassos and ianton-ru March 26, 2026 12:29

arthurpassos reviewed Mar 26, 2026

View reviewed changes

Daniel Q. Kim and others added 2 commits March 27, 2026 04:22

Merge branch 'antalya-26.1' into feature/iceberg-truncate

47c8acc

arthurpassos approved these changes Mar 27, 2026

View reviewed changes

Merge branch 'antalya-26.1' into feature/iceberg-truncate

2155e32

il9ue added verified Verified by QA and removed verified Verified by QA labels Mar 27, 2026

Selfeer added the verified Verified by QA label Mar 30, 2026

Selfeer mentioned this pull request Mar 30, 2026

Export Partition - release the part lock when the query is cancelled #1593

Open

27 tasks


		# Assert PyIceberg reads the empty snapshot successfully
		assert len(table.scan().to_arrow()) == 0

Conversation

il9ue commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

CI/CD Options

Exclude tests:

Regression jobs to run:

Uh oh!

github-actions bot commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arthurpassos left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

il9ue commented Mar 19, 2026

Uh oh!

arthurpassos commented Mar 19, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

il9ue commented Mar 23, 2026

Uh oh!

il9ue commented Mar 24, 2026

Requesting a peer review on the Iceberg TRUNCATE TABLE implementation for the Antalya release.

Uh oh!

Uh oh!

il9ue commented Mar 24, 2026

Feature Scope Elaboration

[Feature] Support TRUNCATE TABLE for Iceberg Engine

Uh oh!

il9ue commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mkmkme commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Ran an audit for this PR. Could you have a look please?

Confirmed Defects

Coverage Summary

Uh oh!

il9ue commented Mar 25, 2026

Ran an audit for this PR. Could you have a look please?

Confirmed Defects

Coverage Summary

Uh oh!

il9ue commented Mar 25, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

il9ue Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

il9ue commented Mar 16, 2026 •

edited

Loading

github-actions bot commented Mar 16, 2026 •

edited

Loading

il9ue commented Mar 25, 2026 •

edited

Loading

mkmkme commented Mar 25, 2026 •

edited

Loading

il9ue Mar 26, 2026 •

edited

Loading

Selfeer commented Mar 30, 2026 •

edited

Loading