Open
Description
Context
- I found a huge number of WAL files (1M+) in a DB with fairly low traffic and only 22 SSTs. The DB has several column families with active writes and one that holds rarely updated metadata.
- Workload: write batches; auto-compactions disabled; `write_buffer_size=256 MiB`, `max_write_buffer_number=2`, `min_write_buffer_number_to_merge=2`, `level_compaction_dynamic_level_bytes=true`.
- The metadata CF had only a couple of records; it seems that its large memtable never filled, so it was never flushed and the WAL files piled up. After adding a manual flush for that column family every n records, the old WALs were cleaned up.

| EXT  | COUNT   | SIZE       |
|------|---------|------------|
| .sst | 22      | 426398673  |
| .log | 1372126 | 1200701440 |
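
The pile-up described above can be illustrated with a toy model. This is only a sketch, not RocksDB code: it assumes that all column families share one WAL sequence, that each flush rolls over to a new WAL file, and that a WAL can be deleted only once no column family still has unflushed data in it. `WalModel` and its members are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model (hypothetical, not RocksDB internals): one shared WAL
// sequence and several column families.
struct WalModel {
    uint64_t current_wal = 1;  // number of the WAL currently being written
    // Per CF: oldest WAL that still holds unflushed data (0 = none).
    std::map<std::string, uint64_t> oldest_needed;

    // A write lands in the current WAL; the CF starts pinning that WAL
    // if it was not pinning an older one already.
    void Write(const std::string& cf) {
        uint64_t& w = oldest_needed[cf];
        if (w == 0) w = current_wal;
    }

    // Flushing a CF persists its memtable and rolls over to a new WAL.
    void Flush(const std::string& cf) {
        oldest_needed[cf] = 0;
        ++current_wal;
    }

    // Every WAL from the oldest pinned one up to the current one is live.
    uint64_t LiveWalCount() const {
        uint64_t min_needed = current_wal;
        for (const auto& kv : oldest_needed) {
            if (kv.second != 0) min_needed = std::min(min_needed, kv.second);
        }
        assert(min_needed <= current_wal);
        return current_wal - min_needed + 1;
    }
};
```

In this model, one write to a rarely flushed `meta` CF followed by five write-and-flush cycles on a busy `data` CF leaves six live WALs; flushing `meta` once brings the count back to one, which matches the effect of the manual flush workaround.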
Problem
- RocksDB currently gates WAL accumulation by size (`max_total_wal_size`) and by archive size/age (`WAL_size_limit_MB`, `WAL_ttl_seconds`). There is no count-based limit on live WALs, so many small logs can accumulate, burning inodes and potentially leaving you unable to recover. For example, once you have hit the ext4 per-directory file limit, you can change the code to call flush, but you can't create the new SST required for cleanup to make progress.
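
A quick calculation shows why the size gate never fires here. The sketch below makes several assumptions that are not stated in the report: that the SIZE column is in bytes, that `max_total_wal_size` was left at 0, and that the documented default in that case is 4 times the sum of `write_buffer_size * max_write_buffer_number` over all column families. It also assumes every CF uses the same settings. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kMiB = 1024ull * 1024ull;

// Figures from the listing above (SIZE assumed to be bytes).
constexpr uint64_t kWalBytes = 1200701440ull;
constexpr uint64_t kWalFiles = 1372126ull;

// Assumed default size limit when max_total_wal_size == 0:
// 4 * sum over CFs of (write_buffer_size * max_write_buffer_number).
// All CFs are assumed to share the same settings.
constexpr uint64_t DefaultWalSizeLimit(uint64_t write_buffer_size,
                                       uint64_t max_write_buffer_number,
                                       uint64_t num_cfs) {
    return 4 * write_buffer_size * max_write_buffer_number * num_cfs;
}

// Average WAL file size in bytes (integer division).
constexpr uint64_t AverageWalBytes() { return kWalBytes / kWalFiles; }

// The size gate fires only when the total WAL size exceeds the limit.
constexpr bool SizeGateTriggers(uint64_t total_wal_bytes, uint64_t limit) {
    return total_wal_bytes > limit;
}
```

Under these assumptions the average WAL is about 875 bytes, and even with a single column family the limit is 2 GiB, well above the roughly 1.2 GB of WAL data, so a size-based trigger stays silent while the file count keeps growing.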
Proposal
- Add a live-WAL count cap (e.g., `max_live_wal_files`). When the live WAL count exceeds this cap, trigger `FlushReason::kWalFull` on CFs holding the oldest WALs until the count drops below the limit. Honor `atomic_flush` by batching CFs when enabled.
- (Optional) Add an archive-WAL count cap (e.g., `max_archived_wal_files`) to delete oldest archived WALs when the count exceeds the cap, alongside existing TTL/size pruning.
- Telemetry: new stats for count-based flush triggers and count-based archive deletions.
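
The selection step of the proposed live-WAL cap could look roughly like the sketch below. It is a standalone illustration, not a patch against RocksDB: `CfState`, `PickCfsToFlush`, and the convention that a cap of 0 disables the check are all assumptions made for the example, and `max_live_wal_files` is the option name proposed above, not an existing option. The sketch ignores `atomic_flush` batching and the extra WAL created by each flush.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-CF state: the oldest WAL that still contains
// unflushed data for this CF (0 = nothing unflushed).
struct CfState {
    std::string name;
    uint64_t oldest_wal;
};

// Returns the CFs to flush, oldest pinned WAL first, until the projected
// live WAL count is no longer above the cap. A cap of 0 disables the check
// (assumed convention).
std::vector<std::string> PickCfsToFlush(std::vector<CfState> cfs,
                                        uint64_t current_wal,
                                        uint64_t max_live_wal_files) {
    std::vector<std::string> picked;
    if (max_live_wal_files == 0) return picked;

    // Only CFs with unflushed data pin WAL files.
    cfs.erase(std::remove_if(cfs.begin(), cfs.end(),
                             [](const CfState& c) { return c.oldest_wal == 0; }),
              cfs.end());
    std::sort(cfs.begin(), cfs.end(),
              [](const CfState& a, const CfState& b) {
                  return a.oldest_wal < b.oldest_wal;
              });

    for (const CfState& cf : cfs) {
        assert(cf.oldest_wal <= current_wal);
        // Live WALs if every CF picked so far had already been flushed.
        const uint64_t live = current_wal - cf.oldest_wal + 1;
        if (live <= max_live_wal_files) break;
        picked.push_back(cf.name);
    }
    return picked;
}
```

For example, with the current WAL at 100, a cap of 10, and CFs pinning WALs 1, 3, and 95, the sketch selects the first two CFs; once they are flushed only WALs 95 through 100 remain live.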
All the code needed for this seems to be in place already, so if this looks OK to you, I can send a PR.
Rexagon