feat(sdk): add WASM route handling, executor methods, and subnet owner resolution#55
…thods, and subnet owner resolution
- Add WasmRouteDefinition, WasmRouteRequest, WasmRouteResponse types to challenge-sdk-wasm
- Extend Challenge trait with routes() and handle_route() default methods
- Add get_routes and handle_route exports to register_challenge! macro
- Add execute_get_routes and execute_handle_route methods to WasmChallengeExecutor
- Resolve subnet owner from UID 0 hotkey during validator sync
📝 Walkthrough
This PR introduces WASM route handling across the codebase by adding new route types and trait extensions to the WASM SDK, implementing executor methods for route operations, establishing subnet owner resolution via UID 0 hotkey extraction, and documenting the flows with architecture diagrams.
Sequence Diagram
sequenceDiagram
participant Client
participant Executor as WASM Executor
participant Module as WASM Module
participant Memory as WASM Memory
Client->>Executor: execute_get_routes(module_path, policies)
activate Executor
Executor->>Module: load_module(module_path)
Executor->>Module: instantiate(policies)
Executor->>Module: call get_routes()
Module->>Memory: write route definitions
Module-->>Executor: return pointer, length
Executor->>Memory: read output from pointer
Executor->>Executor: compute ExecutionMetrics
deactivate Executor
Executor-->>Client: return (Vec<u8>, ExecutionMetrics)
Client->>Executor: execute_handle_route(module_path, policies, request_data)
activate Executor
Executor->>Module: load_module(module_path)
Executor->>Module: instantiate(policies)
Executor->>Memory: allocate memory for request
Executor->>Memory: write request_data to allocated memory
Executor->>Module: call handle_route(ptr, len)
Module->>Memory: process request, write response
Module-->>Executor: return pointer, length
Executor->>Memory: read response from pointer
Executor->>Executor: compute ExecutionMetrics
deactivate Executor
Executor-->>Client: return (Vec<u8>, ExecutionMetrics)
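The second half of the diagram (allocate, write, call, read back) can be sketched against a mock instance. This is only an illustration of the pointer/length handshake, not the executor's actual code: `MockInstance`, its bump allocator, the uppercase "handler", and the use of a plain `Duration` in place of `ExecutionMetrics` are all assumptions made for the example.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for an instantiated WASM module and its linear memory.
pub struct MockInstance {
    memory: Vec<u8>,
    next_free: usize,
}

impl MockInstance {
    pub fn new(size: usize) -> Self {
        // Offset 0 is left unused so a zero pointer can mean "no output".
        Self { memory: vec![0; size], next_free: 1 }
    }

    /// Bump allocator standing in for the module's exported allocator.
    pub fn alloc(&mut self, len: usize) -> Option<usize> {
        let end = self.next_free.checked_add(len)?;
        if end > self.memory.len() {
            return None;
        }
        let ptr = self.next_free;
        self.next_free = end;
        Some(ptr)
    }

    pub fn write_memory(&mut self, ptr: usize, data: &[u8]) -> Option<()> {
        let end = ptr.checked_add(data.len())?;
        self.memory.get_mut(ptr..end)?.copy_from_slice(data);
        Some(())
    }

    pub fn read_memory(&self, ptr: usize, len: usize) -> Option<Vec<u8>> {
        let end = ptr.checked_add(len)?;
        self.memory.get(ptr..end).map(|s| s.to_vec())
    }

    /// Stand-in for the exported `handle_route(ptr, len)`: it "processes" the
    /// request by uppercasing it and returns the response pointer and length.
    pub fn handle_route(&mut self, ptr: usize, len: usize) -> Option<(usize, usize)> {
        let request = self.read_memory(ptr, len)?;
        let response = request.to_ascii_uppercase();
        let out_ptr = self.alloc(response.len())?;
        self.write_memory(out_ptr, &response)?;
        Some((out_ptr, response.len()))
    }
}

/// Host-side flow from the diagram: allocate, write request, call, read response.
pub fn execute_handle_route(
    instance: &mut MockInstance,
    request_data: &[u8],
) -> Option<(Vec<u8>, Duration)> {
    let start = Instant::now();
    let ptr = instance.alloc(request_data.len())?;
    instance.write_memory(ptr, request_data)?;
    let (out_ptr, out_len) = instance.handle_route(ptr, request_data.len())?;
    let output = if out_ptr > 0 && out_len > 0 {
        instance.read_memory(out_ptr, out_len)?
    } else {
        Vec::new()
    };
    Some((output, start.elapsed()))
}
```

In this mock, a request that does not fit in memory makes `alloc` fail and the whole call returns `None`; the real executor reports such failures through its own error type.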
Estimated Code Review Effort
🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
crates/bittensor-integration/src/validator_sync.rs (1)
108-116: Subnet owner update happens after update_state, so the removal phase uses the stale sudo_key.
update_state (Line 109) calls state.is_sudo(&hotkey) to protect the sudo validator from removal, but sudo_key isn't updated until Line 114. If UID 0 changed between syncs, the old UID 0 hotkey gets is_sudo protection during this sync even though it's no longer the owner, while the new UID 0 hotkey is protected only by being present in bt_map.
In practice this is benign (the new UID 0 is a neuron so it's in bt_map and won't be removed; the old one gets one extra sync of protection), but moving the sudo_key update before update_state would make the intent clearer and avoid the stale-key edge.
♻️ Suggested reordering
     drop(client); // Release lock

     // Update registered hotkeys in state (all miners + validators)
     {
         let mut state_guard = state.write();
         state_guard.registered_hotkeys = all_hotkeys;
     }

+    // Resolve subnet owner from UID 0 (before update_state so is_sudo uses new key)
+    if let Some(hotkey) = uid0_hotkey {
+        let mut state_guard = state.write();
+        state_guard.sudo_key = hotkey;
+        debug!("Subnet owner set to UID 0 hotkey: {}", state_guard.sudo_key);
+    }
+
     // Update state with validators
     let result = self.update_state(state, bt_validators, banned_validators);

-    // Resolve subnet owner from UID 0
-    if let Some(hotkey) = uid0_hotkey {
-        let mut state_guard = state.write();
-        state_guard.sudo_key = hotkey;
-        debug!("Subnet owner set to UID 0 hotkey: {}", state_guard.sudo_key);
-    }
-
     // Update last sync block
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/bittensor-integration/src/validator_sync.rs` around lines 108 - 116, The sudo_key is updated after calling update_state, causing state.is_sudo checks inside update_state to use a stale sudo key; move the block that writes state_guard.sudo_key (the uid0_hotkey handling that acquires state.write and sets sudo_key and logs) to occur before calling self.update_state(...) so that update_state and its state.is_sudo checks see the current sudo_key (refer to uid0_hotkey, sudo_key, update_state, and state.is_sudo).
bins/validator-node/src/wasm_executor.rs (1)
583-665: execute_get_routes is a near-exact clone of execute_get_tasks.
The ~80-line InstanceConfig boilerplate is now duplicated across six methods (execute_evaluation_with_sandbox, execute_validation, execute_get_tasks, execute_configure, execute_get_routes, execute_handle_route). Consider extracting a helper that builds a default InstanceConfig from the executor's config + the provided policies:
♻️ Sketch: extract config builder
// Inside WasmChallengeExecutor impl
fn build_instance_config(
    &self,
    network_policy: &NetworkPolicy,
    sandbox_policy: &SandboxPolicy,
    challenge_id: &str,
    storage_override: Option<(StorageHostConfig, Arc<dyn StorageBackend>)>,
) -> InstanceConfig {
    let (shc, sb) = storage_override.unwrap_or_else(|| {
        (StorageHostConfig::default(), Arc::new(InMemoryStorageBackend::new()))
    });
    InstanceConfig {
        network_policy: network_policy.clone(),
        sandbox_policy: sandbox_policy.clone(),
        exec_policy: ExecPolicy::default(),
        time_policy: TimePolicy::default(),
        audit_logger: None,
        memory_export: "memory".to_string(),
        challenge_id: challenge_id.to_string(),
        validator_id: "validator".to_string(),
        restart_id: String::new(),
        config_version: 0,
        storage_host_config: shc,
        storage_backend: sb,
        fixed_timestamp_ms: None,
        consensus_policy: ConsensusPolicy::default(),
        terminal_policy: TerminalPolicy::default(),
        llm_policy: match &self.config.chutes_api_key {
            Some(key) => LlmPolicy::with_api_key(key.clone()),
            None => LlmPolicy::default(),
        },
        ..Default::default()
    }
}
Each executor method then becomes a thin wrapper calling the helper.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bins/validator-node/src/wasm_executor.rs` around lines 583 - 665, The InstanceConfig block duplicated in execute_get_routes (and siblings execute_evaluation_with_sandbox, execute_validation, execute_get_tasks, execute_configure, execute_handle_route) should be extracted into a helper on WasmChallengeExecutor (e.g., fn build_instance_config(&self, network_policy: &NetworkPolicy, sandbox_policy: &SandboxPolicy, challenge_id: &str, storage_override: Option<(StorageHostConfig, Arc<dyn StorageBackend>)>) -> InstanceConfig). Implement that helper to construct and return the InstanceConfig (including LlmPolicy selection from self.config.chutes_api_key and defaulting storage to InMemoryStorageBackend when no override), then replace the inline InstanceConfig construction in execute_get_routes with a call to build_instance_config(network_policy, sandbox_policy, module_path, None) (or pass a storage_override where needed) so all executor methods reuse the same builder.
let result_data = if out_ptr > 0 && out_len > 0 {
    instance
        .read_memory(out_ptr as usize, out_len as usize)
        .map_err(|e| {
            anyhow::anyhow!("failed to read WASM memory for handle_route output: {}", e)
        })?
} else {
    Vec::new()
};
No output size limit on route response deserialization.
execute_evaluation applies MAX_EVALUATION_OUTPUT_SIZE (64 MiB) via bincode::DefaultOptions::new().with_limit(...) when deserializing evaluation output (Line 255-260). The route methods return raw Vec<u8> without any size cap — a malicious or buggy WASM module could return an arbitrarily large out_len, causing the host to allocate unbounded memory in read_memory.
Consider validating out_len against a reasonable upper bound before reading.
🛡️ Proposed guard
+    const MAX_ROUTE_OUTPUT_SIZE: usize = 64 * 1024 * 1024; // 64 MiB
+
     let result_data = if out_ptr > 0 && out_len > 0 {
+        if out_len as usize > MAX_ROUTE_OUTPUT_SIZE {
+            return Err(anyhow::anyhow!(
+                "WASM handle_route output exceeds maximum size ({} bytes)",
+                out_len
+            ));
+        }
         instance
             .read_memory(out_ptr as usize, out_len as usize)
Apply the same check in execute_get_routes (Line 635).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bins/validator-node/src/wasm_executor.rs` around lines 738 - 746, The
route-output branch in execute_get_routes/handle_route reads unbounded memory
(the out_len from WASM) — add a size check before calling instance.read_memory:
validate out_len is non-negative and <= a defined max (reuse
MAX_EVALUATION_OUTPUT_SIZE or introduce MAX_ROUTE_OUTPUT_SIZE ~64 MiB), and
return an error (anyhow::anyhow! with clear context referencing
handle_route/output size) if it exceeds the limit; apply the same guard where
route Vec<u8> is deserialized (the code around result_data and in
execute_get_routes/execute_evaluation) so the host never allocates or reads more
than the allowed limit.
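The guard described in this comment can also be sketched as a standalone helper that checks the length before touching memory. This is a minimal illustration under assumptions: `read_route_output`, `ReadError`, and the byte-slice model of WASM memory are hypothetical names for the example, and the cap is passed as a parameter so the check is easy to exercise with small values.

```rust
/// Assumed cap, mirroring the 64 MiB figure suggested in the review.
pub const MAX_ROUTE_OUTPUT_SIZE: usize = 64 * 1024 * 1024;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    /// The module reported more output than the host is willing to copy.
    TooLarge { len: usize, max: usize },
    /// The (ptr, len) range does not fit inside linear memory.
    OutOfBounds,
}

/// Copies `out_len` bytes starting at `out_ptr` out of `memory`.
/// The size limit is enforced before any allocation or read happens.
pub fn read_route_output(
    memory: &[u8],
    out_ptr: i32,
    out_len: i32,
    max: usize,
) -> Result<Vec<u8>, ReadError> {
    // Same convention as the reviewed code: non-positive values mean "no output".
    if out_ptr <= 0 || out_len <= 0 {
        return Ok(Vec::new());
    }
    let (ptr, len) = (out_ptr as usize, out_len as usize);
    if len > max {
        return Err(ReadError::TooLarge { len, max });
    }
    let end = ptr.checked_add(len).ok_or(ReadError::OutOfBounds)?;
    memory
        .get(ptr..end)
        .map(|bytes| bytes.to_vec())
        .ok_or(ReadError::OutOfBounds)
}
```

A caller would pass `MAX_ROUTE_OUTPUT_SIZE` as `max` in both route methods, so the limit is defined in one place.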
Summary
Extends the platform-v2 WASM challenge infrastructure with route handling capabilities, allowing challenge modules to define and serve custom HTTP-like routes through the existing RPC challenge_call mechanism. Also adds subnet owner resolution via UID 0 hotkey from the Bittensor metagraph.
Changes
Challenge SDK WASM (crates/challenge-sdk-wasm)
- Add WasmRouteDefinition, WasmRouteRequest, WasmRouteResponse types to types.rs for no_std-compatible route handling
- Add routes() and handle_route() default trait methods to the Challenge trait in lib.rs
- Add get_routes and handle_route WASM ABI exports to the register_challenge! macro
Validator Node (bins/validator-node)
- Add execute_handle_route() method to WasmChallengeExecutor for dispatching route requests to WASM modules
- Add execute_get_routes() method for retrieving route definitions from WASM modules
- challenge_call RPC → WASM executor
Bittensor Integration (crates/bittensor-integration)
- Add resolve_subnet_owner() function to identify UID 0 hotkey as the authoritative subnet owner from metagraph data
Documentation
Backward Compatibility
All changes are additive. The Challenge trait methods have default implementations returning empty vecs, so existing challenge modules that do not implement routes()/handle_route() continue to compile without changes. The register_challenge! macro exports are added alongside existing ones.
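The backward-compatibility claim rests on default trait methods. A minimal sketch of the idea follows; the byte-oriented signatures, the `name` method, and both implementing types are assumptions for illustration, and the real trait in challenge-sdk-wasm may differ.

```rust
/// Simplified stand-in for the SDK's Challenge trait (signatures are assumed).
pub trait Challenge {
    /// A required method that existing modules already implement.
    fn name(&self) -> &'static str;

    /// New in this PR: serialized route definitions. Defaults to empty.
    fn routes(&self) -> Vec<u8> {
        Vec::new()
    }

    /// New in this PR: handles a serialized route request. Defaults to empty.
    fn handle_route(&self, _request: &[u8]) -> Vec<u8> {
        Vec::new()
    }
}

/// A module written before routes existed: it compiles unchanged.
pub struct LegacyChallenge;

impl Challenge for LegacyChallenge {
    fn name(&self) -> &'static str {
        "legacy"
    }
}

/// A module that opts in by overriding the new methods.
pub struct RoutedChallenge;

impl Challenge for RoutedChallenge {
    fn name(&self) -> &'static str {
        "routed"
    }

    fn routes(&self) -> Vec<u8> {
        // Placeholder payload; a real module would serialize its route definitions.
        b"GET /status".to_vec()
    }

    fn handle_route(&self, request: &[u8]) -> Vec<u8> {
        // Echo handler used only to show that the override is called.
        request.to_vec()
    }
}
```

Because the defaults return empty vectors, a host that calls `routes()` on an older module simply sees no routes rather than a missing export.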