
Antalya 26.1 Backport of #98140, #99230, #99231 and #96545 - Parquet metadata cache (upstream impl) and arrow library version bump#1574

Open
arthurpassos wants to merge 15 commits into antalya-26.1 from backport_upstream_parquet_metadata_cache_261

Conversation

@arthurpassos
Collaborator

@arthurpassos arthurpassos commented Mar 24, 2026

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Backport of:

ClickHouse#98140 - Added a new SLRU cache for Parquet metadata to improve read performance by removing the need to re-download files just to read metadata.

ClickHouse#99230 - Add parquet format check to metadata cache

ClickHouse#99231 - Avoid possible crash in Parquet with metadata cache enabled

ClickHouse#96545 - bump arrow version

It also removes the Antalya implementation of Parquet metadata caching.
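For context on the policy named in #98140: an SLRU (segmented LRU) cache keeps a probationary segment for newly inserted entries and a protected segment for entries hit more than once, so a one-off scan cannot evict frequently reused metadata. A minimal sketch of the idea follows; the class and method names are invented for illustration and are not ClickHouse's implementation.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Toy SLRU: new keys enter "probationary"; a second access promotes
// them to "protected". Both segments are ordered MRU-first.
class SLRU
{
    size_t prob_capacity;
    size_t prot_capacity;
    std::list<std::string> probationary;
    std::list<std::string> protected_;
    // Maps key -> (is it in the protected segment?, position in that list).
    std::unordered_map<std::string, std::pair<bool, std::list<std::string>::iterator>> index;

    void evictBack(std::list<std::string> & seg)
    {
        index.erase(seg.back());
        seg.pop_back();
    }

public:
    SLRU(size_t prob, size_t prot) : prob_capacity(prob), prot_capacity(prot) {}

    bool contains(const std::string & key) const { return index.count(key) > 0; }

    void access(const std::string & key)
    {
        auto it = index.find(key);
        if (it == index.end())
        {
            // Miss: insert into the probationary segment only.
            probationary.push_front(key);
            index[key] = {false, probationary.begin()};
            if (probationary.size() > prob_capacity)
                evictBack(probationary);
        }
        else if (!it->second.first)
        {
            // Second hit: promote from probationary to protected.
            probationary.erase(it->second.second);
            protected_.push_front(key);
            it->second = {true, protected_.begin()};
            if (protected_.size() > prot_capacity)
            {
                // Demote the coldest protected entry back to probationary.
                std::string demoted = protected_.back();
                protected_.pop_back();
                probationary.push_front(demoted);
                index[demoted] = {false, probationary.begin()};
                if (probationary.size() > prob_capacity)
                    evictBack(probationary);
            }
        }
        else
        {
            // Hit in protected: just move to MRU position.
            protected_.splice(protected_.begin(), protected_, it->second.second);
            it->second.second = protected_.begin();
        }
    }
};
```

The point of the two segments is that a burst of new files (a scan) churns only the probationary list, while metadata that has proven reuse survives in the protected list.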

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

…t_metadata_cache_v2

Parquet metadata cache v2
@github-actions

github-actions bot commented Mar 24, 2026

Workflow [PR], commit [5de03da]

@arthurpassos arthurpassos changed the title Merge pull request #98140 from grantholly-clickhouse/parquet_metadata… Antalya 26.1 Backport of #98140, #99230, #99231 and #96545 Mar 25, 2026
@arthurpassos arthurpassos changed the title Antalya 26.1 Backport of #98140, #99230, #99231 and #96545 Antalya 26.1 Backport of #98140, #99230, #99231 and #96545 - Parquet metadata cache (upstream impl) and arrow library version bump Mar 25, 2026
@arthurpassos
Collaborator Author

@codex review the backports - compare the changes and conflicts with the original prs if possible

@chatgpt-codex-connector

Codex Review: Something went wrong. Try again later by commenting “@codex review”.

We were unable to download your code in a timely manner.
ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@svb-alt svb-alt requested a review from mkmkme March 26, 2026 12:31
@mkmkme
Collaborator

mkmkme commented Mar 26, 2026

Ran an audit; here's the result.

Side note, here's a comment from AI:

Bottom line: The backport itself is clean — the upstream code was brought over accurately. The only real concern is the backport-specific settings migration: users on existing Antalya 26.1.x builds who have input_format_parquet_use_metadata_cache in their configs or profiles will get "Unknown setting" errors after upgrading. Adding a MAKE_OBSOLETE entry for it would handle that gracefully.

Good job!


AI audit note: This review comment was generated by AI (claude-4.6-opus-high-thinking).

Audit for PR #1574 — Antalya 26.1 backport of ClickHouse#98140, ClickHouse#99230, ClickHouse#99231, ClickHouse#96545 (Parquet metadata cache + Arrow 23):

Confirmed defects

Medium: input_format_parquet_use_metadata_cache removed without MAKE_OBSOLETE (backport-specific)

  • Impact: Users on existing Antalya 26.1.x builds who have input_format_parquet_use_metadata_cache in XML profiles, SET statements, or users.xml will get "Unknown setting" errors after upgrading. The setting was actively shipped and documented in SettingsChangesHistory for version 26.1.
  • Anchor: src/Core/FormatFactorySettings.h line 1530 (removed); src/Core/SettingsChangesHistory.cpp (entry removed)
  • Trigger: Upgrade from any 26.1.x build that shipped input_format_parquet_use_metadata_cache to a build containing this PR.
  • Why defect: The setting was present on the base branch and tracked in SettingsChangesHistory. Removing it without MAKE_OBSOLETE breaks the settings compatibility contract across minor versions. The upstream never had this setting so it has no obligation here — this is purely a backport migration concern.
  • Fix direction: Add MAKE_OBSOLETE(M, Bool, input_format_parquet_use_metadata_cache, false) to OBSOLETE_FORMAT_SETTINGS in FormatFactorySettings.h.
  • Regression test direction: SET input_format_parquet_use_metadata_cache = 0 should not throw.
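The compatibility contract the audit describes can be illustrated with a toy stand-in (the Settings struct below is invented for illustration, not ClickHouse's macro machinery): a MAKE_OBSOLETE-style entry keeps the old name recognized by the parser, so existing configs and SET statements load cleanly, while the value itself is ignored.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical settings registry: "active" settings take effect,
// "obsolete" settings are accepted and silently ignored, and anything
// else raises the "Unknown setting" error the audit warns about.
struct Settings
{
    std::map<std::string, bool> active{{"use_parquet_metadata_cache", true}};
    std::set<std::string> obsolete{"input_format_parquet_use_metadata_cache"};

    void set(const std::string & name, bool value)
    {
        if (active.count(name))
            active[name] = value;              // normal setting: applied
        else if (obsolete.count(name))
            return;                            // obsoleted setting: accepted, no effect
        else
            throw std::runtime_error("Unknown setting " + name);
    }
};
```

With the obsolete entry present, a user upgrading from a 26.1.x build that shipped the old setting no longer gets an exception from their stored profile.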

Low: Log message inside cache-miss loader says "got metadata from cache" (exists in upstream)

  • Impact: Misleading trace log during debugging; no correctness impact.
  • Anchor: src/Processors/Formats/Impl/ParquetMetadataCache.h / getOrSetMetadata — the load_fn_wrapper lambda logs "got metadata from cache" but only executes on a cache miss.
  • Trigger: Any cache miss with trace logging enabled.
  • Why defect: Log contradicts actual semantics; the immediately following branch correctly logs "cache miss" for the same event.
  • Fix direction: Change the in-loader log to "loaded metadata from file" or remove it.
  • Regression test direction: None required.

Low: Misleading argument comments in getInputWithMetadata call (exists in upstream)

  • Impact: No runtime impact (all three arguments are std::nullopt), but the inline comments label the parameter positions incorrectly.
  • Anchor: src/Storages/ObjectStorage/StorageObjectStorageSource.cpp; the getInputWithMetadata call passes std::nullopt /*min_block_size_bytes*/, std::nullopt /*min_block_size_rows*/, std::nullopt /*max_block_size_bytes*/, but the FormatFactory.h signature has them as max_block_size_bytes, min_block_size_rows, min_block_size_bytes.
  • Trigger: Static read of code; future change populating these arguments could use the wrong positions.
  • Why defect: Comments do not match the declared parameter order.
  • Fix direction: Correct the comments to match the signature.
  • Regression test direction: None required.

Coverage summary

  • Scope reviewed: Full diff (2716 lines) — cache core (ParquetMetadataCache.cpp/h), V3 format integration, ReadManager, FormatFactory dispatch, StorageObjectStorageSource wiring, Context lifecycle, server/local-server init, system query handler, parser/AST, settings/history, Arrow 23 API adaptations (Decoding.cpp, Write.cpp, HiveFile.cpp, DeltaLakeMetadata.cpp, ArrowFlightHandler.cpp, arrow-cmake), all test files. Cross-referenced with upstream master.
  • Categories failed: Settings compatibility (backport-specific).
  • Categories passed: Cache SLRU logic and weight accounting; ReadManager::init schema.empty() sentinel guard; S3Queue null-context crash fix (PR 99231); getInputImpl dual-creator dispatch priority; Arrow 23 API migration (OpenFile result-type, RleBitPackedEncoder, header renames, disable_write_page_index polarity flip); bzip2/curl contrib availability; use_parquet_metadata_cache default and use_native_reader_v3 interaction; old Antalya ParquetFileMetaDataCache singleton cleanly removed; test coverage (old tests removed, new upstream tests added, existing tests augmented with use_parquet_metadata_cache=0 for isolation).
  • Assumptions/limits: Static analysis only; Arrow 23 submodule not rebuilt from the PR's pin; integration test behavior not runtime-verified.

Collaborator

@mkmkme mkmkme left a comment


I can't fully cover all of those backports myself, but a brief look seems okay, and the AI review approved it, saying the code was backported accurately. So LGTM.

@Selfeer
Collaborator

Selfeer commented Mar 27, 2026

Here is an audit review after the latest changes. @arthurpassos, I assume these low-risk defects are not related?


AI audit note: This review comment was generated by AI (gpt-5.3-codex).

Audit update for PR #1574 (Parquet metadata cache backport + Arrow 23 bump):

Confirmed defects

Low: Cache-miss loader logs a cache-hit message

Impact: Misleading trace diagnostics during cache troubleshooting; no direct correctness impact, but it can hide real cache-behavior signals during incidents.

Anchor: src/Processors/Formats/Impl/ParquetMetadataCache.h / ParquetMetadataCache::getOrSetMetadata.

Trigger: Any cache miss path that executes the loader lambda.

Affected transition: cache lookup miss -> metadata load -> cache insert -> observability/log trace.

Why defect: The loader lambda runs when cache lookup misses, but the emitted log string says metadata was obtained "from cache", contradicting the actual transition and subsequent miss accounting.

Affected subsystem and blast radius: Parquet metadata cache observability (SYSTEM troubleshooting, trace log interpretation) across all Parquet object-storage reads using metadata cache.

Smallest logical reproduction steps: Enable trace logging, run two reads of a Parquet object with use_parquet_metadata_cache=1; on first read (miss), observe loader emits "got metadata from cache" while miss event is incremented.

Logical fault-injection mapping: Inject cold-cache state (empty cache) to force loader execution and expose incorrect hit/miss log semantics.

Fix direction (short): Change loader log text to "loaded metadata from file" (or equivalent) and keep hit/miss logs only in post-getOrSet branch.

Regression test direction (short): Add a trace-log assertion test on cold-cache read ensuring miss path never emits "from cache".

Code evidence:

auto load_fn_wrapper = [&]()
{
    auto metadata = load_fn();
    LOG_TRACE(log, "got metadata from cache {} | {}", key.file_path, key.etag);
    return std::make_shared<ParquetMetadataCacheCell>(std::move(metadata));
};
auto result = Base::getOrSet(key, load_fn_wrapper);
if (result.second)
    ProfileEvents::increment(ProfileEvents::ParquetMetadataCacheMisses);
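The fix direction above can be sketched as follows. This is a toy stand-in (MiniCache, its log strings, and the hit/miss bookkeeping are invented, not the real CacheBase/getOrSet API): the "from cache" message moves to the hit path, where it is actually true, and the loader logs that it read the file.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy cache demonstrating the corrected log placement: the loader
// runs only on a miss, so it must not claim the data came from cache.
struct MiniCache
{
    std::map<std::string, std::shared_ptr<std::string>> cells;
    std::vector<std::string> logs;
    size_t misses = 0;

    std::shared_ptr<std::string> getOrSet(const std::string & key,
                                          const std::function<std::string()> & load_fn)
    {
        auto it = cells.find(key);
        if (it != cells.end())
        {
            logs.push_back("got metadata from cache " + key);  // hit: message is accurate here
            return it->second;
        }
        ++misses;
        logs.push_back("loaded metadata from file " + key);    // miss: corrected wording
        auto cell = std::make_shared<std::string>(load_fn());
        cells.emplace(key, cell);
        return cell;
    }
};
```

A cold-cache read now emits "loaded metadata from file" while the miss counter increments, and only the warm second read says "from cache", matching the accounting that follows getOrSet.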

Low: Argument comments are mislabeled in getInputWithMetadata call

Impact: No runtime break today (std::nullopt for all three parameters), but misleading comments increase future maintenance risk and can cause incorrect argument population in follow-up edits.

Anchor: src/Storages/ObjectStorage/StorageObjectStorageSource.cpp / StorageObjectStorageSource::createReader.

Trigger: Any future change that replaces these std::nullopt values with concrete limits using current inline labels.

Affected transition: object reader construction -> format factory input creation with optional block-size tuning.

Why defect: FormatFactory::getInputWithMetadata expects (max_block_size_bytes, min_block_size_rows, min_block_size_bytes), but call-site comments label the arguments as (min_block_size_bytes, min_block_size_rows, max_block_size_bytes).

Affected subsystem and blast radius: Object-storage reader initialization for Parquet path; risk is latent misconfiguration in later maintenance changes.

Smallest logical reproduction steps: Compare function signature in FormatFactory with commented call-site argument labels in StorageObjectStorageSource::createReader; parameter labels do not match declared order.

Logical fault-injection mapping: Inject maintenance fault by replacing these std::nullopt placeholders with non-null values according to current comments; this would map values to wrong formal parameters.

Fix direction (short): Correct inline comments to match actual parameter order or remove comments to avoid positional misinformation.

Regression test direction (short): Add a compile-time named-wrapper or helper invocation for these three optional limits to avoid positional mistakes.

Code evidence:

// Signature order in FormatFactory:
// (..., max_block_size_bytes, min_block_size_rows, min_block_size_bytes)
input_format = FormatFactory::instance().getInputWithMetadata(
    ...,
    need_only_count,
    std::nullopt /*min_block_size_bytes*/,
    std::nullopt /*min_block_size_rows*/,
    std::nullopt /*max_block_size_bytes*/);
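The named-wrapper regression direction above could look roughly like this. BlockSizeLimits and effectiveMaxBytes are hypothetical names, not the actual FormatFactory interface; the point is only that named fields make the positional mix-up described in the defect impossible.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical wrapper bundling the three optional limits so a call
// site names each value instead of relying on argument position.
struct BlockSizeLimits
{
    std::optional<size_t> max_block_size_bytes;
    std::optional<size_t> min_block_size_rows;
    std::optional<size_t> min_block_size_bytes;
};

// Stand-in for a consumer such as getInputWithMetadata: reads the
// limit by name, with a fallback when the caller left it unset.
size_t effectiveMaxBytes(const BlockSizeLimits & limits, size_t fallback)
{
    return limits.max_block_size_bytes.value_or(fallback);
}
```

A caller would then write BlockSizeLimits{.max_block_size_bytes = 1 << 20} (C++20 designated initializers), which cannot silently land in the wrong parameter slot the way a mislabeled std::nullopt comment can.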

Coverage summary

Scope reviewed: Full PR delta against origin/antalya-26.1 with functional partitioning across cache core (ParquetMetadataCache), format dispatch (FormatFactory/Parquet registration), object-storage integration (StorageObjectStorageSource), context/server lifecycle (Context, Server, LocalServer), system-query plumbing (AST/Parser/InterpreterSystemQuery), settings/history compatibility (FormatFactorySettings, Settings, SettingsChangesHistory), and regression tests.

Categories failed: Observability/logging consistency; API-callsite comment/signature consistency.

Categories passed: Backport settings compatibility (input_format_parquet_use_metadata_cache obsoleted + history entry), cache keying and miss/hit accounting flow, no-query-context crash path (S3Queue/native reader v3), cache clear command path and access control wiring, cache/state lifecycle initialization, concurrency/locking checks in touched paths (no new race/deadlock trigger confirmed), major C++ memory/lifetime/UB checks in reviewed transitions.

Assumptions/limits: Static audit only (no full CI/runtime execution); Arrow 23 integration and remote-object behavior were validated by code-path reasoning plus in-tree regression-test intent.

@Selfeer
Collaborator

Selfeer commented Mar 30, 2026

PR #1574 CI Verification Report

PR: Antalya 26.1 Backport of #98140, #99230, #99231 and #96545 - Parquet metadata cache (upstream impl) and arrow library version bump

Commit: 5de03dadfe0ee4efeb96bf5e9bd8ad50d61489dc

Base Branch: antalya-26.1

CI Report: ci_run_report.html

Workflow Run: 23644531742

Date: 2026-03-27


Summary

  • PR-caused regressions: 0 (CI report confirms "New Fails in PR (0)")
  • Infrastructure issues: 1 (stateless tests timeout)
  • Pre-existing flaky (Swarms): 5 (node failure scenario tests)
  • CVE scan (unrelated): 1 (Alpine base image high CVE)

Verdict: Safe to merge - No regressions introduced by this PR.


Upstream PRs Backported

  • #98140: Parquet metadata SLRU cache for improved read performance (backported)
  • #99230: Add parquet format check to metadata cache (backported)
  • #99231: Avoid possible crash in Parquet with metadata cache enabled (backported)
  • #96545: Bump Arrow version to 23.0.0 (backported)

Build Status

All builds completed successfully:

Build Status Link
Build (amd_release) ✅ Success json.html
Build (amd_debug) ✅ Success json.html
Build (amd_asan) ✅ Success json.html
Build (amd_binary) ✅ Success json.html
Build (arm_release) ✅ Success json.html
Build (arm_asan) ✅ Success json.html
Build (arm_binary) ✅ Success json.html

Stateless Tests

All stateless test jobs passed with 0 failures:

Job Status Results Link
Fast test ✅ Success 7480 passed, 0 failed json.html
Stateless (amd_debug, parallel) ✅ Success 8521 passed, 0 failed json.html
Stateless (amd_debug, sequential) ✅ Success 651 passed, 0 failed json.html
Stateless (amd_debug, s3 storage, parallel) ✅ Success 8307 passed, 0 failed json.html
Stateless (amd_debug, s3 storage, sequential) ✅ Success 605 passed, 0 failed json.html
Stateless (amd_debug, AsyncInsert, parallel) ✅ Success 8255 passed, 0 failed json.html
Stateless (amd_debug, AsyncInsert, sequential) ✅ Success 596 passed, 0 failed json.html
Stateless (amd_asan, distributed plan, parallel, 1/2) ✅ Success 4239 passed, 0 failed json.html
Stateless (amd_asan, distributed plan, parallel, 2/2) ✅ Success 4253 passed, 0 failed json.html
Stateless (amd_binary, ParallelReplicas, parallel) ✅ Success 7716 passed, 0 failed json.html
Stateless (amd_binary, ParallelReplicas, sequential) ✅ Success 514 passed, 0 failed json.html
Stateless (amd_binary, DatabaseReplicated, parallel) ✅ Success 8205 passed, 0 failed json.html
Stateless (amd_binary, DatabaseReplicated, sequential) ✅ Success 573 passed, 0 failed json.html
Stateless (arm_binary, parallel) ✅ Success 8605 passed, 0 failed json.html
Stateless (arm_binary, sequential) ✅ Success 667 passed, 0 failed json.html
Stateless (arm_asan, targeted) ✅ Success 374 passed, 0 failed json.html

Stateless Tests - Timeout (Infrastructure Issue)

Job Status Issue Link
Stateless (amd_asan, db disk, sequential) ⚠️ Timeout Job timed out (not test failure) json.html

Integration Tests

All integration test jobs passed with 0 failures:

Job Status Results Link
Integration (amd_asan, old analyzer, 1/6) ✅ Success 505 passed, 0 failed json.html
Integration (amd_asan, old analyzer, 2/6) ✅ Success 777 passed, 0 failed json.html
Integration (amd_asan, old analyzer, 3/6) ✅ Success 936 passed, 0 failed json.html
Integration (amd_asan, old analyzer, 4/6) ✅ Success 743 passed, 0 failed json.html
Integration (amd_asan, old analyzer, 5/6) ✅ Success 980 passed, 0 failed json.html
Integration (amd_asan, old analyzer, 6/6) ✅ Success 840 passed, 0 failed json.html
Integration (amd_asan, targeted) ✅ Success 116 passed, 0 failed json.html
Integration (amd_binary, 1/5) ✅ Success 553 passed, 0 failed json.html
Integration (amd_binary, 2/5) ✅ Success 911 passed, 0 failed json.html
Integration (amd_binary, 3/5) ✅ Success 1259 passed, 0 failed json.html
Integration (amd_binary, 4/5) ✅ Success 1007 passed, 0 failed json.html
Integration (amd_binary, 5/5) ✅ Success 1051 passed, 0 failed json.html
Integration (arm_binary, distributed plan, 1/4) ✅ Success 1102 passed, 0 failed json.html
Integration (arm_binary, distributed plan, 2/4) ✅ Success 1129 passed, 0 failed json.html
Integration (arm_binary, distributed plan, 3/4) ✅ Success 1377 passed, 0 failed json.html
Integration (arm_binary, distributed plan, 4/4) ✅ Success 1173 passed, 0 failed json.html

Regression Tests (Parquet/Iceberg/S3)

All PR-relevant regression tests passed:

Iceberg Tests

Job Arch Status Results Link
Regression iceberg_1 x86_64 ✅ Success 476 ok, 9 xfail report.html
Regression iceberg_1 aarch64 ✅ Success 476 ok, 9 xfail report.html
Regression iceberg_2 x86_64 ✅ Success 1569 ok, 5 xfail report.html
Regression iceberg_2 aarch64 ✅ Success 1569 ok, 5 xfail report.html

Parquet Tests

Job Arch Status Results Link
Regression parquet x86_64 ✅ Success 36 ok, 1 xfail report.html
Regression parquet aarch64 ✅ Success 36 ok, 1 xfail report.html
Regression parquet_aws_s3 x86_64 ✅ Success 89 steps ok report.html
Regression parquet_aws_s3 aarch64 ✅ Success 89 steps ok report.html
Regression parquet_minio x86_64 ✅ Success 6 features ok report.html
Regression parquet_minio aarch64 ✅ Success 6 features ok report.html

S3 Export Tests

Job Arch Status Results Link
Regression s3_export_part x86_64 ✅ Success 87 scenarios ok report.html
Regression s3_export_part aarch64 ✅ Success 87 scenarios ok report.html
Regression s3_export_partition x86_64 ✅ Success 7 scenarios ok report.html
Regression s3_export_partition aarch64 ✅ Success 7 scenarios ok report.html

Failed Jobs Analysis

1. Swarms Tests - Pre-existing Flaky (Unrelated to PR)

Arch Test Status Link
x86_64 /swarms/feature/swarm joins/join clause/join 458 of 816480 Fail results
x86_64 /swarms/feature/node failure/check restart clickhouse on swarm node Error results
x86_64 /swarms/feature/node failure/initiator out of disk space Fail results
aarch64 /swarms/feature/node failure/check restart clickhouse on swarm node Error results
aarch64 /swarms/feature/node failure/initiator out of disk space Fail results

Analysis: These tests simulate node failure scenarios (restart, disk space exhaustion), which are inherently flaky due to timing-sensitive infrastructure operations. The fact that they fail on both architectures with the same pattern indicates infrastructure instability rather than a code regression. These failures are unrelated to the Parquet/Arrow changes.

2. CVE Scan - Unrelated to PR

Image Severity CVE Link
altinityinfra/clickhouse-server:1574-26.1.4.20001.altinityantalya-alpine High CVE-2026-2673 results.html

Analysis: This is a pre-existing vulnerability in the Alpine base image, not caused by PR changes.


Other Passing Tests

Job Status Link
Unit tests (asan) ✅ Success (4191 passed) json.html
Stress test (amd_debug) ✅ Success json.html
Stress test (arm_asan) ✅ Success json.html
Stress test (arm_asan, s3) ✅ Success json.html
AST fuzzer (amd_debug) ✅ Success json.html
AST fuzzer (arm_asan) ✅ Success json.html
BuzzHouse (amd_debug) ✅ Success json.html
BuzzHouse (arm_asan) ✅ Success json.html
Compatibility check (amd_release) ✅ Success json.html
Compatibility check (arm_release) ✅ Success json.html

Code Audit Summary

From PR review comments:

Fixed Issues

Severity Issue Status
Medium input_format_parquet_use_metadata_cache removed without MAKE_OBSOLETE ✅ Fixed in commit 691c4a5

Remaining Low-Risk Issues (Upstream)

Severity Issue Location
Low Cache-miss loader logs "got metadata from cache" ParquetMetadataCache.h (cosmetic)
Low Argument comments mislabeled in getInputWithMetadata StorageObjectStorageSource.cpp (cosmetic)

Conclusion

This PR is safe to merge.

All failures are either:

  • ✅ Infrastructure issues (stateless tests timeout)
  • ✅ Pre-existing flaky tests (swarms node failure scenarios)
  • ✅ Base image CVEs (unrelated to code changes)

The CI explicitly confirms "New Fails in PR (0)" when compared against the base SHA (a76c804).

All PR-relevant tests (Parquet, Iceberg, S3, Stateless, Integration) passed successfully.


8 participants