6 changes: 6 additions & 0 deletions .cargo/config.toml
@@ -0,0 +1,6 @@
[env]
# QMDB 2026.2.0's into_merkleized() (MMR computation) requires >2 MiB of stack
# when called multiple times on the same database instance. The default Rust test
# thread stack size is 2 MiB, which is insufficient. Set 16 MiB to give QMDB
# tests adequate headroom without affecting production code.
RUST_MIN_STACK = "16777216"
Comment on lines +1 to +6
⚠️ Potential issue | 🟡 Minor

RUST_MIN_STACK under [env] applies to all cargo invocations, not just tests.

The comment says "without affecting production code," but .cargo/config.toml [env] sets environment variables for every cargo command (cargo run, cargo build, etc.), not only cargo test. While RUST_MIN_STACK only reserves virtual address space (pages are committed on demand), this is worth being aware of.

If you want to limit this strictly to tests, consider setting it in a test harness or via a CI environment variable instead. That said, the practical impact on production binaries is negligible since the OS only commits stack pages as they're touched, so this is fine to keep as-is if the broader scope is acceptable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.cargo/config.toml around lines 1 - 6, the RUST_MIN_STACK entry under the
[env] table applies to all cargo commands — remove this global
setting from the [env] table and instead set RUST_MIN_STACK only for test runs
by exporting it in the test harness/CI job or by prefixing the test command
(e.g., set the environment variable in the CI job or the test runner script that
invokes cargo test) so production builds and runs are not affected;
alternatively keep the current global if you accept the broader scope but
document the trade-off where RUST_MIN_STACK is defined.
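
A minimal sketch of the test-scoped alternative the review suggests (the helper name and recursion below are illustrative, not from the PR): a stack-hungry test can run its body on a dedicated thread with an explicit stack size, leaving RUST_MIN_STACK unset globally.

```rust
use std::thread;

/// Run a closure on a thread with a 16 MiB stack, mirroring the
/// RUST_MIN_STACK value above but scoped to a single call site.
fn with_large_stack<T: Send + 'static>(f: impl FnOnce() -> T + Send + 'static) -> T {
    thread::Builder::new()
        .stack_size(16 * 1024 * 1024)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

/// Deliberately non-tail recursion: each call adds a stack frame.
fn depth(n: u64) -> u64 {
    if n == 0 { 0 } else { 1 + depth(n - 1) }
}

fn main() {
    // Deep recursion that stresses the 2 MiB default but fits in 16 MiB.
    println!("{}", with_large_stack(|| depth(50_000)));
}
```

In a real test the closure would wrap the `into_merkleized()` call; this keeps production builds entirely unaffected, at the cost of a per-test helper instead of one config line.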

20 changes: 15 additions & 5 deletions Cargo.toml
@@ -26,6 +26,8 @@ members = [
"crates/testing/simulator",
"crates/testing/debugger",
"crates/testing/cli",
"crates/consensus",
"crates/p2p",
]
resolver = "2"

@@ -67,14 +69,22 @@ evolve_simulator = { path = "crates/testing/simulator" }
evolve_debugger = { path = "crates/testing/debugger" }
evolve_node = { path = "crates/app/node" }
evolve_testapp = { path = "bin/testapp" }
evolve_p2p = { path = "crates/p2p" }

# outside deps
linkme = { version = "0.3", default-features = false }
commonware-cryptography = "0.0.65"
commonware-runtime = "0.0.65"
commonware-storage = "0.0.65"
commonware-utils = "0.0.65"
commonware-codec = "0.0.65"
commonware-cryptography = "2026.2.0"
commonware-parallel = "2026.2.0"
commonware-runtime = "2026.2.0"
commonware-storage = "2026.2.0"
commonware-utils = "2026.2.0"
commonware-codec = "2026.2.0"
commonware-consensus = "2026.2.0"
commonware-broadcast = "2026.2.0"
commonware-macros = "2026.2.0"
commonware-p2p = "2026.2.0"
commonware-resolver = "2026.2.0"
evolve_consensus = { path = "crates/consensus" }
borsh = { features = ["derive"], version = "1.5.5" }
clap = { version = "4.5.31", features = ["derive"] }
fixed = { version = "1.29", features = ["borsh", "serde"] }
44 changes: 44 additions & 0 deletions crates/consensus/Cargo.toml
@@ -0,0 +1,44 @@
[package]
name = "evolve_consensus"
version = "0.1.0"
edition = "2021"
license.workspace = true
repository.workspace = true
rust-version.workspace = true

[dependencies]
commonware-broadcast = { workspace = true }
commonware-consensus = { workspace = true }
commonware-cryptography = { workspace = true }
commonware-codec = { workspace = true }
commonware-p2p = { workspace = true }
commonware-parallel = { workspace = true }
commonware-resolver = { workspace = true }
commonware-runtime = { workspace = true }
commonware-storage = { workspace = true }
commonware-utils = { workspace = true }

evolve_core = { workspace = true }
evolve_server = { workspace = true }
evolve_stf_traits = { workspace = true }
evolve_mempool = { workspace = true }
evolve_storage = { workspace = true }
evolve_p2p = { package = "evolve-p2p", path = "../p2p" }

rand_core = "0.6"

alloy-primitives = { workspace = true }
borsh = { workspace = true }
bytes = { workspace = true }
tokio = { workspace = true }
tracing = { workspace = true }

[dev-dependencies]
commonware-consensus = { workspace = true, features = ["fuzz"] }
commonware-cryptography = { workspace = true, features = ["mocks"] }
commonware-p2p = { workspace = true }
commonware-runtime = { workspace = true, features = ["test-utils"] }
commonware-macros = { workspace = true }

[lints]
workspace = true
283 changes: 283 additions & 0 deletions crates/consensus/src/automaton.rs
@@ -0,0 +1,283 @@
use crate::block::ConsensusBlock;
use crate::config::ConsensusConfig;
use alloy_primitives::B256;
use commonware_consensus::types::Epoch;
use commonware_consensus::{Automaton, CertifiableAutomaton};
use commonware_cryptography::{Hasher, Sha256};
use commonware_utils::channel::oneshot;
use evolve_core::ReadonlyKV;
use evolve_mempool::{Mempool, MempoolTx, SharedMempool};
use evolve_server::BlockBuilder;
use evolve_server::StfExecutor;
use evolve_stf_traits::{AccountsCodeStorage, Transaction};
use evolve_storage::Storage;
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};
use tokio::sync::RwLock as TokioRwLock;

/// EvolveAutomaton bridges Evolve's STF and mempool with commonware's consensus.
///
/// It implements the `Automaton` and `CertifiableAutomaton` traits, allowing
/// the simplex consensus engine to propose and verify blocks through Evolve's
/// state transition function.
///
/// # Design
///
/// Consensus operates on opaque digests, not full blocks. The automaton:
/// - On `propose()`: builds a block from mempool txs, stores it locally,
/// returns only the digest to consensus.
/// - On `verify()`: looks up the block by digest (populated via Relay),
/// validates parent chain and height.
pub struct EvolveAutomaton<Stf, S, Codes, Tx: MempoolTx, Ctx> {
stf: Stf,
storage: S,
codes: Codes,
mempool: SharedMempool<Mempool<Tx>>,
/// Block cache: stores proposed/received blocks by their digest.
pending_blocks: Arc<RwLock<BTreeMap<[u8; 32], ConsensusBlock<Tx>>>>,

high

pending_blocks uses std::sync::RwLock, which is a blocking lock. Using this in an async function can block the executor thread, leading to poor performance or deadlocks. It's recommended to use tokio::sync::RwLock for asynchronous code.

This will require changing lock acquisitions from .write().unwrap() to .write().await (and similarly for read locks) in genesis, propose, and verify methods. This change will also make the code safer by avoiding panics from poisoned locks that .unwrap() can cause.

    pending_blocks: Arc<tokio::sync::RwLock<BTreeMap<[u8; 32], ConsensusBlock<Tx>>>>,

/// Current chain height.
height: Arc<AtomicU64>,
/// Last block hash.
last_hash: Arc<TokioRwLock<B256>>,
/// Consensus configuration.
config: ConsensusConfig,
/// Phantom for the context type.
_ctx: std::marker::PhantomData<Ctx>,
}

impl<Stf, S, Codes, Tx: MempoolTx, Ctx> Clone for EvolveAutomaton<Stf, S, Codes, Tx, Ctx>
where
Stf: Clone,
S: Clone,
Codes: Clone,
{
fn clone(&self) -> Self {
Self {
stf: self.stf.clone(),
storage: self.storage.clone(),
codes: self.codes.clone(),
mempool: self.mempool.clone(),
pending_blocks: self.pending_blocks.clone(),
height: self.height.clone(),
last_hash: self.last_hash.clone(),
config: self.config.clone(),
_ctx: std::marker::PhantomData,
}
}
}

impl<Stf, S, Codes, Tx: MempoolTx, Ctx> EvolveAutomaton<Stf, S, Codes, Tx, Ctx> {
pub fn new(
stf: Stf,
storage: S,
codes: Codes,
mempool: SharedMempool<Mempool<Tx>>,
pending_blocks: Arc<RwLock<BTreeMap<[u8; 32], ConsensusBlock<Tx>>>>,
config: ConsensusConfig,
) -> Self {
Self {
stf,
storage,
codes,
mempool,
pending_blocks,
height: Arc::new(AtomicU64::new(1)),
last_hash: Arc::new(TokioRwLock::new(B256::ZERO)),
config,
_ctx: std::marker::PhantomData,
}
}

/// Get the current height.
pub fn height(&self) -> u64 {
self.height.load(Ordering::SeqCst)
}

/// Get a reference to the shared pending blocks.
pub fn pending_blocks(&self) -> &Arc<RwLock<BTreeMap<[u8; 32], ConsensusBlock<Tx>>>> {
&self.pending_blocks
}

/// Get a reference to the shared last block hash.
///
/// The finalization path (reporter) must update this after a block is
/// certified so that subsequent `propose()` calls use the correct parent.
pub fn last_hash(&self) -> &Arc<TokioRwLock<B256>> {
&self.last_hash
}

/// Get a reference to the shared height counter.
///
/// The finalization path (reporter) may use this to reconcile height
/// after block finalization.
pub fn height_atomic(&self) -> &Arc<AtomicU64> {
&self.height
}
}

impl<Stf, S, Codes, Tx, Ctx> Automaton for EvolveAutomaton<Stf, S, Codes, Tx, Ctx>
where
Tx: Transaction + MempoolTx + Clone + Send + Sync + 'static,
S: ReadonlyKV + Storage + Clone + Send + Sync + 'static,
Codes: AccountsCodeStorage + Clone + Send + Sync + 'static,
Stf: StfExecutor<Tx, S, Codes> + Send + Sync + Clone + 'static,
Ctx: Clone + Send + 'static,
{
type Context = Ctx;
type Digest = commonware_cryptography::sha256::Digest;

async fn genesis(&mut self, _epoch: Epoch) -> Self::Digest {
// Genesis: return the digest of the empty genesis block at height 0.
let genesis_block = BlockBuilder::<Tx>::new()
.number(0)
.timestamp(0)
.parent_hash(B256::ZERO)
.gas_limit(self.config.gas_limit)
.build();

let mut genesis_block = genesis_block;
genesis_block.header.transactions_root =
compute_transactions_root(&genesis_block.transactions);

let cb = ConsensusBlock::new(genesis_block);
let digest = cb.digest_value();

// Store genesis in pending blocks.
self.pending_blocks.write().unwrap().insert(digest.0, cb);

digest
}

// SystemTime::now() is used here for block timestamps only. This is acceptable
// in a consensus proposer context — the timestamp is validated by verifiers.
#[allow(clippy::disallowed_types)]
async fn propose(&mut self, _context: Self::Context) -> oneshot::Receiver<Self::Digest> {
let (sender, receiver) = oneshot::channel();

let height = self.height.clone();
let last_hash = self.last_hash.clone();
let gas_limit = self.config.gas_limit;
let mempool = self.mempool.clone();
let pending_blocks = self.pending_blocks.clone();

// Spawn block building onto a background task.
tokio::spawn(async move {
// Pull transactions from mempool.
let selected = {
let mut pool = mempool.write().await;
pool.select(1000) // max txs per block
Comment on lines +160 to +169


medium

The maximum number of transactions to select from the mempool is hardcoded as 1000. This value should be configurable to allow for easier tuning of block size without changing the code. This suggestion makes the entire ConsensusConfig available in the spawned task by cloning it.

You will need to:

  1. Add a max_txs_per_block: usize field to ConsensusConfig (and its Default impl).
  2. Update line 166 to use config.gas_limit instead of gas_limit.
        let config = self.config.clone();
        let mempool = self.mempool.clone();
        let pending_blocks = self.pending_blocks.clone();

        // Spawn block building onto a background task.
        tokio::spawn(async move {
            // Pull transactions from mempool.
            let selected = {
                let mut pool = mempool.write().await;
                pool.select(config.max_txs_per_block) // max txs per block

};

let transactions: Vec<Tx> = selected
.into_iter()
.map(|arc_tx| (*arc_tx).clone())
.collect();

// Build the block.
let timestamp = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap_or_default()
.as_secs();
Comment on lines 154 to 181

⚠️ Potential issue | 🟠 Major

SystemTime::now() violates the coding guideline and creates non-deterministic block timestamps.

The _context parameter already carries per-round information — the block timestamp should be sourced from there (or a ConsensusConfig field) rather than the wall clock. Different proposers running on machines with clock skew will produce different timestamps for the same logical view, and the current verify() does not validate timestamps, so skewed values go unchecked.

🛠️ Sketch: source timestamp from config/context instead of wall clock
-    // SystemTime::now() is used here for block timestamps only. This is acceptable
-    // in a consensus proposer context — the timestamp is validated by verifiers.
-    #[allow(clippy::disallowed_types)]
     async fn propose(&mut self, _context: Self::Context) -> oneshot::Receiver<Self::Digest> {
         ...
-        let timestamp = std::time::SystemTime::now()
-            .duration_since(std::time::UNIX_EPOCH)
-            .unwrap_or_default()
-            .as_secs();
+        // Source the timestamp from config or context to avoid reliance
+        // on a non-deterministic wall clock.
+        let timestamp = self.config.block_timestamp(); // or derive from context

As per coding guidelines, "Use block time from context instead of std::time::Instant or SystemTime to avoid non-deterministic time sources."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/consensus/src/automaton.rs` around lines 154 - 181, The propose async
task currently uses SystemTime::now() to set the block timestamp (the local
"timestamp" variable), which introduces non-determinism; instead, obtain the
per-round/block time from the provided context or consensus configuration (e.g.,
a timestamp/round_time field on Self::Context or ConsensusConfig) and use that
value when building the block. Replace the SystemTime::now() usage in propose
with reading the context's timestamp (convert to the same numeric type, e.g.,
u64 seconds) before spawning the background task or capture it from self.config
if your config exposes a deterministic block time, and ensure the variable named
"timestamp" is assigned from that context/config value.


let parent_hash = *last_hash.read().await;

// Build first, then consume the height counter so a failed build
// cannot permanently skip a height.
let mut block = BlockBuilder::<Tx>::new()
.number(0)
.timestamp(timestamp)
.parent_hash(parent_hash)
.gas_limit(gas_limit)
.transactions(transactions)
.build();

let block_height = height.fetch_add(1, Ordering::SeqCst);
block.header.number = block_height;

block.header.transactions_root = compute_transactions_root(&block.transactions);

let cb = ConsensusBlock::new(block);
let digest = cb.digest_value();

// Store in pending blocks for later retrieval.
pending_blocks.write().unwrap().insert(digest.0, cb);
Comment on lines 162 to 204

⚠️ Potential issue | 🟠 Major

pending_blocks grows without bound on nullified views — memory leak.

Every propose() and genesis() call inserts into pending_blocks, but only EvolveReporter::report(Finalization) removes entries. In Simplex consensus, a view can be nullified (leader timeout) without finalizing the proposed block, leaving the entry in pending_blocks permanently. Under sustained network degradation or Byzantine conditions, this map can grow without bound, consuming unbounded memory.

Pruning should be added: either bounded by height (remove blocks older than the last finalized height) or by receiving nullification events in the reporter.

🛠️ Sketch: prune on finalization by removing all entries below the finalized height
// In EvolveReporter::report after updating last_hash / height:
let finalized_number = block.inner.header.number;
let mut pending = state.pending_blocks.write().unwrap();
pending.retain(|_, v| v.inner.header.number >= finalized_number);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/consensus/src/automaton.rs` around lines 162 - 202, The pending_blocks
map (used in automaton.rs where the background block-builder inserts via
pending_blocks.write().unwrap().insert(...)) can grow unbounded because
propose()/genesis() insert entries but only EvolveReporter::report(Finalization)
currently removes them; add pruning logic to remove stale/unfinalized entries:
in EvolveReporter::report when handling Finalization (after updating
last_hash/height) acquire state.pending_blocks.write().unwrap() and retain only
blocks whose inner.header.number is >= finalized_number (or otherwise within a
bounded window relative to last_finalized_height); additionally consider
removing entries when a view is explicitly nullified (e.g., on timeout handlers)
to avoid leaks from nullified proposals.


// Return the digest to consensus.
let _ = sender.send(digest);
});

receiver
}
Comment on lines 155 to 211

⚠️ Potential issue | 🟠 Major

propose() increments height eagerly — a failed/cancelled spawn leaves a height gap.

fetch_add on line 137 increments the height before the spawned task runs. If the task panics, is cancelled, or fails to build a block, the height has already been consumed. This creates a permanent gap in block heights.

Consider incrementing inside the spawned task after successful block creation, or using a reservation pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/consensus/src/automaton.rs` around lines 134 - 181, The height is
incremented eagerly via height.fetch_add(...) in propose(), creating permanent
gaps if the spawned task fails; fix by removing the early fetch_add and instead
clone the atomic height into the spawned task (like mempool, pending_blocks,
last_hash, sender) and call height.fetch_add(1, Ordering::SeqCst) inside the
tokio::spawn after successfully building/preparing the block but before using
that number in BlockBuilder::number(...); ensure you keep SeqCst ordering and
that no other code relies on the old eager increment (adjust any locals
accordingly) so the counter is only consumed on successful block creation.
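
A minimal sketch of the deferred-increment idea (the helper and the success flag are hypothetical, not from the PR): the counter is consumed only after the build succeeds, so a failed or cancelled build cannot create a permanent height gap.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical build step: returns the claimed height on success,
/// and leaves `height` untouched on failure — no gap is created.
fn build_block(height: &AtomicU64, build_ok: bool) -> Option<u64> {
    if !build_ok {
        return None; // build failed before the counter was consumed
    }
    // Build succeeded: only now claim the next height.
    Some(height.fetch_add(1, Ordering::SeqCst))
}

fn main() {
    let height = AtomicU64::new(1);
    assert_eq!(build_block(&height, false), None);
    assert_eq!(height.load(Ordering::SeqCst), 1); // unchanged after failure
    assert_eq!(build_block(&height, true), Some(1));
    assert_eq!(height.load(Ordering::SeqCst), 2);
    println!("ok");
}
```

In `propose()` this corresponds to moving the `fetch_add` inside the spawned task, after `BlockBuilder` has produced a block, as the comment above describes.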


async fn verify(
&mut self,
_context: Self::Context,
payload: Self::Digest,
) -> oneshot::Receiver<bool> {
let (sender, receiver) = oneshot::channel();

let pending_blocks = self.pending_blocks.clone();
let last_hash = self.last_hash.clone();

tokio::spawn(async move {
// Look up the block by digest.
let block = {
let blocks = pending_blocks.read().unwrap();
blocks.get(&payload.0).cloned()
};

let Some(block) = block else {
tracing::warn!(
digest = ?payload,
"verify: block not found in pending blocks"
);
let _ = sender.send(false);
return;
};

// Validate parent hash chain.
let expected_parent = *last_hash.read().await;
if block.inner.header.parent_hash != expected_parent {
tracing::warn!(
expected = ?expected_parent,
actual = ?block.inner.header.parent_hash,
"verify: parent hash mismatch"
);
let _ = sender.send(false);
return;
}

// Validate height is positive.
if block.inner.header.number == 0 {
tracing::warn!("verify: block height cannot be 0 (genesis)");
let _ = sender.send(false);
return;
}
Comment on lines +251 to +256


security-medium medium

The verify method in EvolveAutomaton does not validate that the block timestamp is monotonically increasing with respect to the parent block's timestamp. Although the comments suggest this validation should occur, the implementation only checks the parent hash and the block height. This could allow a malicious proposer to use incorrect timestamps, potentially affecting time-dependent application logic.

            // Validate height is positive.
            if block.inner.header.number == 0 {
                tracing::warn!("verify: block height cannot be 0 (genesis)");
                let _ = sender.send(false);
                return;
            }

            // Validate timestamp monotonicity.
            // Note: In a real implementation, you would need the parent block's timestamp.
            if block.inner.header.timestamp == 0 {
                tracing::warn!("verify: block timestamp cannot be 0");
                let _ = sender.send(false);
                return;
            }


let _ = sender.send(true);
});

receiver
}
Comment on lines +213 to +262

⚠️ Potential issue | 🟡 Minor

verify() doesn't validate timestamp monotonicity despite the struct doc claiming it (line 30).

The struct-level documentation states the verifier "validates parent chain, timestamp monotonicity, etc." but the actual implementation only checks parent hash and height > 0. Either add timestamp validation or correct the documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/consensus/src/automaton.rs` around lines 183 - 232, The verify()
implementation currently checks parent hash and height but lacks the timestamp
monotonicity check promised by the struct docs; add a check that the candidate
block's timestamp (block.inner.header.timestamp) is strictly greater than its
parent's timestamp: look up the parent block using pending_blocks (using the
expected_parent hash obtained from last_hash.read().await), fail with
tracing::warn! and send false if the parent is missing or if
block.inner.header.timestamp <= parent.inner.header.timestamp, otherwise proceed
to send true; update the same async tokio::spawn task inside verify() and use
the existing sender/receiver flow so behavior and logging remain consistent.

}

fn compute_transactions_root<Tx: Transaction>(transactions: &[Tx]) -> B256 {
let mut hasher = Sha256::new();
hasher.update(&(transactions.len() as u64).to_le_bytes());
for tx in transactions {
hasher.update(&tx.compute_identifier());
}
B256::from_slice(&hasher.finalize().0)
}

impl<Stf, S, Codes, Tx, Ctx> CertifiableAutomaton for EvolveAutomaton<Stf, S, Codes, Tx, Ctx>
where
Tx: Transaction + MempoolTx + Clone + Send + Sync + 'static,
S: ReadonlyKV + Storage + Clone + Send + Sync + 'static,
Codes: AccountsCodeStorage + Clone + Send + Sync + 'static,
Stf: StfExecutor<Tx, S, Codes> + Send + Sync + Clone + 'static,
Ctx: Clone + Send + 'static,
{
// Use the default implementation which always certifies.
}