[fix](filecache) clean empty v3 cache dirs#63344
Conversation
af219e4 to
4e6b403
Compare
|
/review |
There was a problem hiding this comment.
Pull request overview
This PR fixes v3 file cache cleanup so removing a cache block also attempts to remove the now-empty v3 key directory, addressing inode leakage from stale empty cache directories.
Changes:
- Tracks v2 and v3 cache paths separately in
FSFileCacheStorage::remove(). - Adds v3 key-directory cleanup after file removal.
- Adds a unit test covering removal of an empty v3 key directory.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
be/src/io/cache/fs_file_cache_storage.cpp |
Updates removal logic to clean v2 and v3 cache directories separately. |
be/test/io/cache/fs_file_cache_storage_leak_cleaner_test.cpp |
Adds coverage for deleting an empty v3 key directory after cache removal. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
I reviewed the PR diff and the existing inline review context. I am not adding duplicate inline comments because the blocking issues are already covered by the existing threads on be/src/io/cache/fs_file_cache_storage.cpp:310.
Critical checkpoint conclusions:
- Goal/test: The PR aims to clean empty v3 cache directories and adds a unit test for deleting
<hash>_0, but it does not fully accomplish the stated behavior because the empty<hash_prefix>parent cleanup described in the PR body is not implemented or tested. - Scope/focus: The code change is small and focused, but the cleanup primitive is too broad for the concurrent cache directory lifecycle.
- Concurrency:
FSFileCacheStorage::remove()can run concurrently with cache writes. The new v3 empty-dir cleanup lists the directory and then callsFileSystem::delete_directory(), which is recursive for local FS, so a file created after the emptiness check can be deleted. This is the same blocking concern already raised in the existing inline thread; it should be changed to non-recursive rmdir-style cleanup that toleratesENOTEMPTY/NOT_FOUND. - Lifecycle/static initialization: No new special lifecycle or static initialization issue found.
- Config/compatibility/protocol: No new config or FE/BE protocol compatibility issue found.
- Parallel paths: v2 compatibility cleanup remains; v3 cleanup is added, but the v3 prefix parent path is a parallel part of the v3 layout and remains unhandled despite the PR description.
- Tests: The added unit test covers key-directory deletion only. It does not cover prefix-directory cleanup or the concurrent-create race introduced by recursive deletion.
- Observability/transactions/data visibility: Not applicable beyond cache file cleanup; no transaction/version visibility path is modified.
- Performance: No separate performance issue found; the concern is correctness/safety of recursive deletion.
User focus: no additional user-provided review focus was supplied.
67bd338 to
f1437cd
Compare
|
run buildall |
f1437cd to
102a46f
Compare
102a46f to
2a1cb8c
Compare
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 31288 ms |
TPC-DS: Total hot run time: 169209 ms |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Issue Number: #63117
Fixes file cache empty directory leakage for v3 cache layout.
In v3 layout, cache files live under
<hash_prefix>/<hash>_0/.FSFileCacheStorage::remove()deletes the v3 cache file first, then switches the local directory variable to the legacy v2 path for compatibility cleanup. The final empty-directory cleanup therefore checks the v2 directory and can leave the actual v3<hash>_0directory behind.This patch keeps v2 and v3 directory paths separately, preserves the existing v2 compatibility cleanup, and additionally removes the v3 key directory when it is empty. The v3 directory cleanup uses non-recursive
std::filesystem::remove()to avoid deleting concurrently created files.Check List