Skip to content

Conversation

@ZocoLini
Copy link
Collaborator

@ZocoLini ZocoLini commented Dec 16, 2025

This change aims to refactor the background worker that persists the data when receives a command into a time based. SegmentCache keeps in memory the segments evicted to fast load if requested while waiting for someone to make the call to persist them (the new time based worker in this case). Once they are persisted they are dropped from memory.

Using criterion benches this change improved 10% performance when requesting items randomly. It also increases performance when storing items because the eviction is no longer persisting in the moment

Summary by CodeRabbit

  • Refactor

    • Background persistence now runs on a fixed periodic cadence instead of reacting to command signals.
    • Evicted segments are kept in memory for controlled, later persistence rather than being discarded immediately.
    • Segment state and persistence behavior simplified for more predictable saves.
    • Shutdown and worker stop now perform persistence deterministically and synchronously.
  • Tests

    • Reduced an artificial test delay to match the new direct persistence timing.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 16, 2025

Warning

Rate limit exceeded

@ZocoLini has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 15 minutes and 29 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between b095db4 and 256c18c.

📒 Files selected for processing (1)
  • dash-spv/benches/storage.rs

Walkthrough

Replaces a command-driven background worker with a timer-driven loop that calls persist_evicted every 5s; SegmentCache now keeps evicted segments in-memory and persists them via persist_evicted; Segment state simplified to Clean/Dirty; worker-related fields and commands removed; several call signatures adjusted accordingly.

Changes

Cohort / File(s) Summary
Background worker & manager
dash-spv/src/storage/manager.rs
Removed WorkerCommand and mpsc channel; replaced command-driven worker with a 5s ticker that calls persist_evicted; removed worker_tx and last_index_save_count; stop_worker changed to synchronous abort-based stop; startup/stop flow simplified.
Segment cache, eviction & persistence
dash-spv/src/storage/segments.rs
Added evicted: HashMap<u32, Segment<I>>; eviction moves dirty segments into evicted instead of immediate disk removal; added persist_evicted; Persistable gained path metadata helpers and lost make_save_command; narrowed visibility of segment accessors; removed Saving state (now Clean/Dirty).
Storage state & shutdown flow
dash-spv/src/storage/state.rs
shutdown signature changed from &mut self to &self; removed worker-command shutdown coordination; save_dirty now persists headers/filters/index directly (calls save_index_to_disk) rather than relying on background commands.
Header storage callsite
dash-spv/src/storage/headers.rs
store_items_at_height invocations updated to remove passing self/owner argument; explicit early drop(reverse_index) before continue removed so the reverse index lock is held until function end.
Tests (timing)
dash-spv/tests/reverse_index_test.rs
Removed a short tokio::time::sleep after storing headers; test proceeds to shutdown immediately without that delay.

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
participant Caller as API Caller
participant Manager as DiskStorageManager
participant Cache as SegmentCache
participant Evicted as EvictedMap
participant Disk as Disk I/O
Note over Manager,Cache: Startup: timer-based worker (5s) created
Caller->>Manager: store_items_at_height(headers, height)
Manager->>Cache: store_items_at_height(headers, height)
Cache-->>Cache: mark segment Dirty or evict LRU
alt Segment evicted
Cache->>Evicted: move dirty segment -> evicted map
end
Note over Manager: Background ticker fires every 5s
Manager->>Cache: persist_evicted()
Cache->>Evicted: iterate evicted segments
Evicted->>Disk: persist(segment)
Disk-->>Evicted: persist ack
Evicted-->>Cache: remove persisted entries

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Refactor: storage worker reworked' directly corresponds to the main change: refactoring the background storage worker from an on-demand to time-based persistence model, which is the core objective of this PR.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dash-spv/src/storage/segments.rs (1)

474-480: Silent data loss when inserting at out-of-bounds offset.

When offset > self.items.len(), the item is silently discarded with only a log message. This could lead to data corruption that's difficult to debug. Consider returning a Result or using debug_assert! to catch this during development.

🧹 Nitpick comments (4)
dash-spv/src/storage/segments.rs (2)

404-436: Async function uses synchronous file I/O.

Segment::load is marked async but uses synchronous std::fs::File::open and BufReader, which can block the async runtime. Consider using tokio::fs::File with async reads, or spawn the blocking operation on a dedicated thread pool via tokio::task::spawn_blocking.

Per coding guidelines: "Use async/await throughout the codebase, built on tokio runtime".


445-447: Minor: Synchronous create_dir_all in async context.

fs::create_dir_all is synchronous and could briefly block the async runtime. Consider using tokio::fs::create_dir_all for consistency with the async atomic_write that follows.

dash-spv/src/storage/manager.rs (1)

156-164: Worker handle not cleared after abort.

After abort(), worker_handle remains Some(...) with an aborted handle. While start_worker overwrites it, consider clearing it to None for cleaner state management and to avoid holding a reference to an aborted task.

-    pub(super) fn stop_worker(&self) {
+    pub(super) fn stop_worker(&mut self) {
         if let Some(handle) = &self.worker_handle {
             handle.abort();
         }
+        self.worker_handle = None;
     }

Note: This would require changing the signature to &mut self, which may conflict with shutdown(&self) in state.rs.

dash-spv/src/storage/state.rs (1)

436-447: Sequential write lock acquisition in save_dirty.

Each persist() call acquires a write lock on its respective cache. While this is safe, if persistence takes significant time, it could increase latency for concurrent readers. Consider whether the locks need to be held during the entire I/O operation or only for copying data.

This is a performance consideration for the future rather than a correctness issue.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9234fd and a558988.

📒 Files selected for processing (5)
  • dash-spv/src/storage/headers.rs (1 hunks)
  • dash-spv/src/storage/manager.rs (3 hunks)
  • dash-spv/src/storage/segments.rs (12 hunks)
  • dash-spv/src/storage/state.rs (7 hunks)
  • dash-spv/tests/reverse_index_test.rs (0 hunks)
💤 Files with no reviewable changes (1)
  • dash-spv/tests/reverse_index_test.rs
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.rs: Never hardcode network parameters, addresses, or keys in Rust code
Use proper error types with thiserror crate and propagate errors appropriately in Rust code
Use tokio runtime for async operations in Rust code
Use conditional compilation with feature flags for optional features
Write unit tests for new functionality in Rust
Format Rust code with cargo fmt before committing
Ensure clippy passes with all warnings as errors on all Rust code

**/*.rs: Each crate keeps sources in src/; unit tests live alongside code with #[cfg(test)]
Maintain MSRV of 1.89 for Rust code
Format Rust code with rustfmt (see rustfmt.toml); run cargo fmt --all before commits
Lint Rust code with clippy; avoid unwrap()/expect() in library code; use error types (e.g., thiserror)
Use snake_case for function and variable names in Rust
Use UpperCamelCase for types and traits in Rust
Use SCREAMING_SNAKE_CASE for constants in Rust
Follow mixed editions (2021/2024) and crate-specific idioms; prefer async via tokio where applicable
Unit tests should be placed near code with descriptive names (e.g., test_parse_address_mainnet)
Use #[ignore] for network-dependent or long-running tests; run with -- --ignored flag
Never commit secrets or real keys; avoid logging sensitive data in Rust code
Keep test vectors deterministic in Rust test code

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
dash-spv/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

dash-spv/**/*.rs: Use async/await throughout the codebase, built on tokio runtime
Use Arc for trait objects to enable runtime polymorphism for NetworkManager and StorageManager
Use Tokio channels for inter-component message passing between async tasks
Maintain minimum Rust version (MSRV) of 1.89 and use only compatible syntax and features

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
dash-spv/src/storage/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
🧠 Learnings (19)
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/storage/**/*.rs : Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-06-26T15:54:02.509Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T15:54:02.509Z
Learning: The `StorageManager` trait in `dash-spv/src/storage/mod.rs` uses `&mut self` methods but is also `Send + Sync`, and implementations often use interior mutability for concurrency. This can be confusing, so explicit documentation should clarify thread-safety expectations and the rationale for the API design.

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/sync/**/*.rs : Implement sequential phase-based synchronization via SyncManager with phases progressing in order: Headers → Masternode List → Filter Headers → Filters → Blocks

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use async/await throughout the codebase, built on tokio runtime

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:01:18.174Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-01T08:01:18.174Z
Learning: Organize workspace with crates: `dash`, `hashes`, `internals`, `dash-network`, `dash-spv`, `key-wallet`, `rpc-*`, utilities (`fuzz`, `test-utils`), and FFI crates (`*-ffi`)

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Arc<dyn TraitName> for trait objects to enable runtime polymorphism for NetworkManager and StorageManager

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T07:59:58.608Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv-ffi/CLAUDE.md:0-0
Timestamp: 2025-12-01T07:59:58.608Z
Learning: Applies to dash-spv-ffi/src/**/*.rs : Use thread-local storage for error propagation via `dash_spv_ffi_get_last_error()` function

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Tokio channels for inter-component message passing between async tasks

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-06-26T15:51:03.126Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T15:51:03.126Z
Learning: In Rust network code, using unwrap() on SystemTime::now().duration_since(UNIX_EPOCH) can panic if the system clock is before UNIX_EPOCH. Use unwrap_or with a safe fallback (e.g., Duration::from_secs(0)) to prevent panics in edge cases.

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/network/**/*.rs : Implement configurable timeouts with recovery mechanisms for network operations

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Applies to key-wallet/**/*.rs : Never panic in library code; use `Result<T>` for all fallible operations with custom `Error` type variants and provide context in error messages

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-02-25T06:13:52.858Z
Learnt from: QuantumExplorer
Repo: dashpay/rust-dashcore PR: 51
File: dash/src/sml/masternode_list/scores_for_quorum.rs:50-73
Timestamp: 2025-02-25T06:13:52.858Z
Learning: ScoreHash is a cryptographic hash in the rust-dashcore library, and therefore does not need collision handling when used as a key in collections like BTreeMap due to the extremely low probability of collisions.

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Applies to key-wallet/**/managed_account/**/*.rs : Implement atomic state updates in ManagedAccount: when processing transactions, update address usage state, UTXO set, transaction history, and balance calculations together to maintain consistency

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-06-26T16:01:37.609Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T16:01:37.609Z
Learning: The mempool tracking infrastructure (UnconfirmedTransaction, MempoolState, configuration, and mempool_filter.rs) is fully implemented and integrated in the Dash SPV client as of this PR, including client logic, FFI APIs, and tests.

Applied to files:

  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-01T08:00:50.618Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: swift-dash-core-sdk/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:50.618Z
Learning: When making changes to Rust FFI in dash-spv-ffi, rebuild with `cargo build --release` and run Swift tests to verify integration

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/tests/**/*.rs : Organize tests into unit tests (in-module), integration tests (tests/ directory), real network tests (with live Dash Core nodes), and performance benchmarks

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Keep test vectors updated with Dash Core, monitor for new DIPs affecting wallet structure, maintain backward compatibility for serialized wallets, and update derivation paths for Platform features

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:01:18.174Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-01T08:01:18.174Z
Learning: Applies to **/*.rs : Keep test vectors deterministic in Rust test code

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T07:59:58.608Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv-ffi/CLAUDE.md:0-0
Timestamp: 2025-12-01T07:59:58.608Z
Learning: Applies to dash-spv-ffi/tests/unit/**/*.rs : Add corresponding unit tests in `tests/unit/` for each new FFI function

Applied to files:

  • dash-spv/src/storage/state.rs
🧬 Code graph analysis (1)
dash-spv/src/storage/state.rs (1)
dash-spv/src/storage/headers.rs (2)
  • headers (21-21)
  • save_index_to_disk (66-74)
🔇 Additional comments (9)
dash-spv/src/storage/segments.rs (3)

214-225: Eviction logic correctly preserves dirty segments.

The LRU eviction moves dirty segments to the evicted map to ensure data isn't lost before the timer-based persistence runs. This aligns well with the PR objective of decoupling eviction from immediate persistence.


390-390: Segment size differs from documented guideline.

The segment size is set to 50,000 items, but coding guidelines specify 10,000-header segments. If this is an intentional change for performance optimization, consider updating the documentation to reflect the new segment size.

Based on coding guidelines: "Store headers in 10,000-header segments with index files in headers/ directory".


518-549: Good test coverage for eviction semantics.

The test correctly validates the LRU eviction behavior with MAX_SEGMENTS + 1 segments, ensuring segments are properly evicted and reloaded from the eviction map or disk.

dash-spv/src/storage/headers.rs (1)

16-34: Clean API simplification.

The method now directly calls store_items_at_height(headers, height) without passing self, aligning with the new direct persistence model. The reverse index update logic remains correct.

dash-spv/src/storage/manager.rs (1)

144-154: Timer-driven persistence loop looks correct.

The 5-second interval for persist_evicted provides a good balance between durability and performance, aligning with the PR objective of moving eviction persistence out of the immediate store path.

dash-spv/src/storage/state.rs (4)

429-447: Shutdown correctly persists all dirty data.

The shutdown method properly stops the worker first, then calls save_dirty to persist all segment caches and the header index. This ensures data durability on graceful shutdown.


349-392: clear() correctly stops worker before filesystem operations.

Stopping the worker before removing files and clearing in-memory state prevents race conditions. The retry logic for remove_dir_all handles transient filesystem races well.


691-694: Trait implementation correctly adapts signature.

The trait requires &mut self for shutdown, and the implementation correctly delegates to Self::shutdown(&self). This is valid Rust and maintains trait compatibility.


852-874: Excellent test for shutdown durability.

This test verifies that headers stored after save_dirty() are still persisted via shutdown(), confirming the new persistence model works correctly across restarts.

@github-actions github-actions bot added the merge-conflict The PR conflicts with the target branch. label Dec 17, 2025
@github-actions
Copy link

This PR has merge conflicts with the base branch. Please rebase or merge the base branch into your branch to resolve them.

@github-actions github-actions bot removed the merge-conflict The PR conflicts with the target branch. label Dec 18, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
dash-spv/src/storage/manager.rs (1)

137-141: abort() may leave dirty data unpersisted.

This concern was raised in a previous review. Using abort() terminates the worker immediately without a final persist_evicted call. While shutdown() in state.rs calls save_dirty() after stop_worker(), code paths like clear() and clear_filters() that only call stop_worker() could lose data if dirty segments exist.

🧹 Nitpick comments (4)
dash-spv/src/storage/manager.rs (1)

121-130: Worker loop lacks error resilience and could block concurrent access.

The background worker acquires write locks on all three caches sequentially every 5 seconds. Consider two improvements:

  1. If any persist_evicted call panics, the worker task will terminate silently without restarting.
  2. Holding write locks sequentially may cause brief contention during persistence cycles.
🔎 Consider wrapping each persist call for resilience:
 let worker_handle = tokio::spawn(async move {
     let mut ticker = tokio::time::interval(Duration::from_secs(5));

     loop {
         ticker.tick().await;

-        block_headers.write().await.persist_evicted().await;
-        filter_headers.write().await.persist_evicted().await;
-        filters.write().await.persist_evicted().await;
+        if let Err(e) = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| async {
+            block_headers.write().await.persist_evicted().await;
+        })) {
+            tracing::error!("Panic in block_headers persistence: {:?}", e);
+        }
+        // Similar for filter_headers and filters...
     }
 });

Alternatively, the current approach may be acceptable if persistence failures are rare and a restart is the expected recovery path.

dash-spv/src/storage/segments.rs (3)

74-79: Unbounded evicted map could cause memory growth.

The evicted map has no size limit. If segments are evicted faster than the 5-second persistence interval can flush them (e.g., rapid sequential access across many segments), the evicted map could grow unbounded. Consider adding a maximum evicted size or blocking eviction when the evicted queue is too large.


223-233: Use immutable iterator since nothing is mutated.

iter_mut() is used but only last_accessed is read for comparison. Use iter() instead.

🔎 Apply this diff:
-            let key_to_evict =
-                self.segments.iter_mut().min_by_key(|(_, s)| s.last_accessed).map(|(k, v)| (*k, v));
-
-            if let Some((key, _)) = key_to_evict {
+            let key_to_evict =
+                self.segments.iter().min_by_key(|(_, s)| s.last_accessed).map(|(k, _)| *k);
+
+            if let Some(key) = key_to_evict {
                 if let Some(segment) = self.segments.remove(&key) {
                     if segment.state == SegmentState::Dirty {
                         self.evicted.insert(key, segment);
                     }
                 }
             }

480-502: Blocking I/O in async context.

Segment::persist uses synchronous fs::create_dir_all on line 487 within an async function. While atomic_write is async, the directory creation could block the async runtime briefly.

🔎 Consider using tokio::fs for consistency:
-        if let Err(e) = fs::create_dir_all(path.parent().unwrap()) {
+        if let Err(e) = tokio::fs::create_dir_all(path.parent().unwrap()).await {
             return Err(StorageError::WriteFailed(format!("Failed to persist segment: {}", e)));
         }

This is a minor optimization since directory creation is typically fast, but it maintains consistency with the async pattern used elsewhere.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a558988 and 1f168b7.

📒 Files selected for processing (3)
  • dash-spv/src/storage/manager.rs (3 hunks)
  • dash-spv/src/storage/segments.rs (12 hunks)
  • dash-spv/src/storage/state.rs (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • dash-spv/src/storage/state.rs
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.rs: Never hardcode network parameters, addresses, or keys in Rust code
Use proper error types with thiserror crate and propagate errors appropriately in Rust code
Use tokio runtime for async operations in Rust code
Use conditional compilation with feature flags for optional features
Write unit tests for new functionality in Rust
Format Rust code with cargo fmt before committing
Ensure clippy passes with all warnings as errors on all Rust code

**/*.rs: Each crate keeps sources in src/; unit tests live alongside code with #[cfg(test)]
Maintain MSRV of 1.89 for Rust code
Format Rust code with rustfmt (see rustfmt.toml); run cargo fmt --all before commits
Lint Rust code with clippy; avoid unwrap()/expect() in library code; use error types (e.g., thiserror)
Use snake_case for function and variable names in Rust
Use UpperCamelCase for types and traits in Rust
Use SCREAMING_SNAKE_CASE for constants in Rust
Follow mixed editions (2021/2024) and crate-specific idioms; prefer async via tokio where applicable
Unit tests should be placed near code with descriptive names (e.g., test_parse_address_mainnet)
Use #[ignore] for network-dependent or long-running tests; run with -- --ignored flag
Never commit secrets or real keys; avoid logging sensitive data in Rust code
Keep test vectors deterministic in Rust test code

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
dash-spv/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

dash-spv/**/*.rs: Use async/await throughout the codebase, built on tokio runtime
Use Arc for trait objects to enable runtime polymorphism for NetworkManager and StorageManager
Use Tokio channels for inter-component message passing between async tasks
Maintain minimum Rust version (MSRV) of 1.89 and use only compatible syntax and features

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
dash-spv/src/storage/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
🧠 Learnings (12)
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/storage/**/*.rs : Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/sync/**/*.rs : Implement sequential phase-based synchronization via SyncManager with phases progressing in order: Headers → Masternode List → Filter Headers → Filters → Blocks

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-06-26T15:54:02.509Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T15:54:02.509Z
Learning: The `StorageManager` trait in `dash-spv/src/storage/mod.rs` uses `&mut self` methods but is also `Send + Sync`, and implementations often use interior mutability for concurrency. This can be confusing, so explicit documentation should clarify thread-safety expectations and the rationale for the API design.

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use async/await throughout the codebase, built on tokio runtime

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-06-26T16:02:42.390Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T16:02:42.390Z
Learning: Passing exclusive mutable references to network and storage managers into the sync manager in async Rust can lead to borrow checker issues or runtime contention if concurrent access is needed elsewhere. It's advisable to document this architectural tradeoff and consider refactoring to interior mutability or message passing for shared access as the codebase evolves.

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-02-25T06:13:52.858Z
Learnt from: QuantumExplorer
Repo: dashpay/rust-dashcore PR: 51
File: dash/src/sml/masternode_list/scores_for_quorum.rs:50-73
Timestamp: 2025-02-25T06:13:52.858Z
Learning: ScoreHash is a cryptographic hash in the rust-dashcore library, and therefore does not need collision handling when used as a key in collections like BTreeMap due to the extremely low probability of collisions.

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Tokio channels for inter-component message passing between async tasks

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Applies to key-wallet/**/{wallet,account,managed_account}/**/*.rs : Use BTreeMap for ordered data structures (accounts, transactions) and HashMap for quick lookups (address to index mapping) to optimize data structure performance

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-01T07:59:58.608Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv-ffi/CLAUDE.md:0-0
Timestamp: 2025-12-01T07:59:58.608Z
Learning: Applies to dash-spv-ffi/src/**/*.rs : Use thread-local storage for error propagation via `dash_spv_ffi_get_last_error()` function

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Arc<dyn TraitName> for trait objects to enable runtime polymorphism for NetworkManager and StorageManager

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/network/**/*.rs : Implement configurable timeouts with recovery mechanisms for network operations

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-06-26T16:01:37.609Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T16:01:37.609Z
Learning: The mempool tracking infrastructure (UnconfirmedTransaction, MempoolState, configuration, and mempool_filter.rs) is fully implemented and integrated in the Dash SPV client as of this PR, including client logic, FFI APIs, and tests.

Applied to files:

  • dash-spv/src/storage/segments.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: fuzz (dash_outpoint_string)
  • GitHub Check: fuzz (dash_deserialize_address)
  • GitHub Check: fuzz (dash_script_bytes_to_asm_fmt)
  • GitHub Check: fuzz (dash_deserialize_script)
  • GitHub Check: fuzz (dash_deserialize_witness)
  • GitHub Check: fuzz (hashes_sha1)
  • GitHub Check: fuzz (hashes_ripemd160)
  • GitHub Check: fuzz (hashes_sha512_256)
  • GitHub Check: fuzz (hashes_json)
  • GitHub Check: fuzz (hashes_sha256)
  • GitHub Check: fuzz (dash_deserialize_amount)
  • GitHub Check: fuzz (hashes_sha512)
  • GitHub Check: fuzz (hashes_cbor)
  • GitHub Check: fuzz (dash_deser_net_msg)
  • GitHub Check: fuzz (dash_deserialize_block)
  • GitHub Check: RPC Tests (stable, true)
  • GitHub Check: SPV Components Tests
  • GitHub Check: Core Components Tests
  • GitHub Check: Pre-commit (macos-latest)
  • GitHub Check: Pre-commit (ubuntu-latest)
🔇 Additional comments (5)
dash-spv/src/storage/segments.rs (5)

405-406: Verify segment size differs from coding guidelines.

ITEMS_PER_SEGMENT is set to 50,000, but the coding guidelines specify "10,000-header segments". Please confirm this deviation is intentional.

Based on learnings, the guidelines state: "Store headers in 10,000-header segments with index files in headers/ directory".


355-362: LGTM - persist_evicted correctly clears after iteration.

The method properly iterates over evicted segments, persists each one, logs errors without panicking, and clears the evicted map afterward. This addresses the previous review comment about duplicate persistence logic.


389-394: Clean state model simplification.

The simplified SegmentState enum (Clean/Dirty only) removes the previous Saving intermediate state. This is appropriate for the new timer-based persistence model where state transitions are synchronous within the persist call.


313-352: LGTM - store_items delegation and height tracking.

The store_items method correctly delegates to store_items_at_height using next_height(), and store_items_at_height properly updates tip_height after storing all items.


569-599: Test correctly validates eviction and reload behavior.

The test_segment_cache_eviction test properly validates the LRU eviction policy by exceeding MAX_ACTIVE_SEGMENTS and verifying segments can be reloaded with correct data.

@github-actions github-actions bot added the merge-conflict The PR conflicts with the target branch. label Dec 22, 2025
@github-actions
Copy link

This PR has merge conflicts with the base branch. Please rebase or merge the base branch into your branch to resolve them.

@github-actions github-actions bot removed the merge-conflict The PR conflicts with the target branch. label Dec 23, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f168b7 and b095db4.

📒 Files selected for processing (1)
  • dash-spv/src/storage/state.rs
🧰 Additional context used
📓 Path-based instructions (5)
dash-spv/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

dash-spv/**/*.rs: Use async/await throughout the codebase, built on tokio runtime
Use Arc for trait objects to enable runtime polymorphism for NetworkManager and StorageManager
Use Tokio channels for inter-component message passing between async tasks
Maintain minimum Rust version (MSRV) of 1.89 and use only compatible syntax and features

Files:

  • dash-spv/src/storage/state.rs
dash-spv/src/storage/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Files:

  • dash-spv/src/storage/state.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (CLAUDE.md)

Never hardcode network parameters, addresses, or keys

Files:

  • dash-spv/src/storage/state.rs
**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.rs: Use proper error types (thiserror) and propagate errors appropriately
Use tokio runtime for async operations
Use conditional compilation with feature flags for optional features
Write unit tests for new functionality
Use secure random number generation for keys
Never log or expose private keys
Code must target Rust 1.89 minimum supported version (MSRV)
Format code using cargo fmt
Pass clippy linting without warnings

**/*.rs: MSRV (Minimum Supported Rust Version) is 1.89; ensure compatibility with this version for all builds
Unit tests should live alongside code with #[cfg(test)] annotation; integration tests use the tests/ directory
Use snake_case for function and variable names
Use UpperCamelCase for types and traits
Use SCREAMING_SNAKE_CASE for constants
Format code with rustfmt before commits; ensure cargo fmt --all is run
Run cargo clippy --workspace --all-targets -- -D warnings for linting; avoid warnings in CI
Prefer async/await via tokio for asynchronous operations

Files:

  • dash-spv/src/storage/state.rs
**/{dash-network,dash-spv,key-wallet}/**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

Use async/await for async operations in network and wallet modules

Files:

  • dash-spv/src/storage/state.rs
🧠 Learnings (9)
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/storage/**/*.rs : Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-06-26T15:54:02.509Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T15:54:02.509Z
Learning: The `StorageManager` trait in `dash-spv/src/storage/mod.rs` uses `&mut self` methods but is also `Send + Sync`, and implementations often use interior mutability for concurrency. This can be confusing, so explicit documentation should clarify thread-safety expectations and the rationale for the API design.

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-22T17:59:37.849Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-22T17:59:37.849Z
Learning: Applies to **/{dash-network,dash-spv,key-wallet}/**/*.rs : Use async/await for async operations in network and wallet modules

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/sync/**/*.rs : Implement sequential phase-based synchronization via SyncManager with phases progressing in order: Headers → Masternode List → Filter Headers → Filters → Blocks

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use async/await throughout the codebase, built on tokio runtime

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T07:59:58.608Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv-ffi/CLAUDE.md:0-0
Timestamp: 2025-12-01T07:59:58.608Z
Learning: Applies to dash-spv-ffi/src/**/*.rs : Use thread-local storage for error propagation via `dash_spv_ffi_get_last_error()` function

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-02-25T06:13:52.858Z
Learnt from: QuantumExplorer
Repo: dashpay/rust-dashcore PR: 51
File: dash/src/sml/masternode_list/scores_for_quorum.rs:50-73
Timestamp: 2025-02-25T06:13:52.858Z
Learning: ScoreHash is a cryptographic hash in the rust-dashcore library, and therefore does not need collision handling when used as a key in collections like BTreeMap due to the extremely low probability of collisions.

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-19T00:07:22.904Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-19T00:07:22.904Z
Learning: Applies to key-wallet/**/*.rs : Use `BTreeMap` for ordered data (accounts, transactions) and `HashMap` for lookups (address mappings); apply memory management strategies for old transaction data

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/tests/**/*.rs : Organize tests into unit tests (in-module), integration tests (tests/ directory), real network tests (with live Dash Core nodes), and performance benchmarks

Applied to files:

  • dash-spv/src/storage/state.rs
🔇 Additional comments (4)
dash-spv/src/storage/state.rs (4)

350-355: LGTM: Simplified filter header storage.

The change to use store_items directly is a clean simplification that aligns with the new time-based persistence model. The segment cache now handles height calculation and persistence internally.


391-393: LGTM: Appropriate use of store_items_at_height.

Using store_items_at_height for storing a single filter at a specific height is more explicit and semantically correct than the previous approach.


214-256: No issues identified. The stop_worker() method is implemented as a synchronous call to handle.abort() on the worker task, which is a non-blocking operation in tokio. Calling synchronous code from async contexts is correct as long as the synchronous code itself doesn't perform blocking I/O, which stop_worker() does not. The pattern of stopping the worker, performing cleanup operations, and restarting it is sound.


259-264: No actionable issues found. The shutdown sequence is safe.

Tokio's RwLock guarantees proper lock release when tasks are aborted. The stop_worker() call followed immediately by save_dirty() does not create a race condition or deadlock risk—when the worker task is aborted, its locks are properly released. Additionally, save_dirty() is more comprehensive than the worker's persist_evicted() (it persists all segments, not just evicted ones), ensuring all data is persisted even if the worker is interrupted mid-operation. This pattern is used consistently across shutdown(), clear(), and clear_filters() methods, and the test_shutdown_flushes_index test confirms the sequence works correctly.

@ZocoLini ZocoLini changed the base branch from v0.41-dev to v0.42-dev December 30, 2025 22:25
Copy link
Collaborator

@xdustinface xdustinface left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left few comments and also:

  • Can you add a test which tests the actual periodical writes (having the interval configurable would help here to speed it up i guess)?
  • What about the index, we now (and before) write it only on shutdown, would it maybe make sense to have some longer interval for index writes also?
  • And a bit more offtopic: Any idea how we could make the index writes incremental also? Or not worth it? Did you maybe check how long it takes with 2million headers like we have on mainnet?

break;
}
}
let mut ticker = tokio::time::interval(Duration::from_secs(5));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets have a constant somewhere easier visible for the interval seconds. Maybe even make it configurable?

let _ = handle.await;
pub(super) fn stop_worker(&self) {
if let Some(handle) = &self.worker_handle {
handle.abort();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a bit risky to just abort the worker? Wouldn't this lead to corrupted data if its aborted during write? I think we should have a proper shutdown mechanism here still (maybe with a tokio_util::sync::CancellationToken) and await the task to be done like it was before? Also, maybe not that important but before we took the handle out of the option leaving it empty after, now the handle stays in there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants