fix: runner alloc idx logic, api auth for actor get #4443

Open
MasterPtato wants to merge 1 commit into main from 03-17-fix_runner_alloc_idx_logic_api_auth_for_actor_get

Conversation

@MasterPtato
Contributor

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app

railway-app bot commented Mar 17, 2026

🚅 Deployed to the rivet-pr-4443 environment in rivet-frontend

Service | Status | Web | Updated (UTC)
frontend-cloud | 😴 Sleeping (View Logs) | Web | Mar 25, 2026 at 5:34 am
frontend-inspector | 😴 Sleeping (View Logs) | Web | Mar 19, 2026 at 3:17 pm
website | ❌ Build Failed (View Logs) | Web | Mar 17, 2026 at 7:41 pm
mcp-hub | ✅ Success (View Logs) | Web | Mar 17, 2026 at 7:39 pm
ladle | ❌ Build Failed (View Logs) | Web | Mar 17, 2026 at 7:38 pm

@pkg-pr-new

pkg-pr-new bot commented Mar 17, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4443

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4443

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4443

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4443

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4443

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4443

@rivetkit/sqlite-vfs

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sqlite-vfs@4443

@rivetkit/traces

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/traces@4443

@rivetkit/workflow-engine

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/workflow-engine@4443

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4443

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4443

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4443

commit: 73107a6

@claude

claude bot commented Mar 17, 2026

PR Review: fix: runner alloc idx logic, api auth for actor get

Overview

This PR fixes two separate issues: (1) incorrect alloc idx behavior during runner eviction/draining, and (2) unnecessary auth requirement on actor lookup-by-ID endpoints. The changes are generally well-structured and fix real bugs.


Runner Alloc Idx Logic

Eviction propagation via result type — The refactor from returning Err(WsError::Eviction) to Ok(Err(LifecycleResult::Evicted)) in both WebSocket tasks is the right approach. It cleanly separates "expected eviction" from "unexpected errors" and allows the caller to decide whether to clear the alloc idx. Previously, eviction and error were treated the same, leading to the alloc idx being cleared on eviction when it shouldn't be (since pegboard's coordinator handles that).
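To illustrate the separation described above, here is a minimal sketch of a caller that treats eviction as an expected outcome rather than an error. The `Closed` variant, `TaskError`, `Cleanup`, and `cleanup_for` are hypothetical names introduced for illustration, and the result shape is simplified to a single `Result`; the PR's actual types may differ.

```rust
/// Hypothetical stand-in for the ways a WebSocket task can end without an
/// unexpected error. Only `Evicted` is named in the review.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LifecycleResult {
    Closed,
    Evicted,
}

/// Hypothetical stand-in for an unexpected task failure.
#[derive(Debug, PartialEq, Eq)]
struct TaskError(String);

/// What the caller does with the alloc idx once the task has finished.
#[derive(Debug, PartialEq, Eq)]
enum Cleanup {
    ClearAllocIdx,
    LeaveAllocIdx,
}

fn cleanup_for(res: &Result<LifecycleResult, TaskError>) -> Cleanup {
    match res {
        // Expected eviction: per the review, the coordinator handles the
        // alloc idx, so this connection leaves it alone.
        Ok(LifecycleResult::Evicted) => Cleanup::LeaveAllocIdx,
        // Normal close or unexpected error: this connection cleans up.
        Ok(LifecycleResult::Closed) | Err(_) => Cleanup::ClearAllocIdx,
    }
}
```

Because eviction travels in the `Ok` arm, the caller can distinguish it with a plain pattern match instead of inspecting error values.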

Draining ≠ Expired — Removing the ExpiredTsKey write in the Draining branch of clear_db is correct. Draining a runner was incorrectly preventing UpdatePing from refreshing the alloc idx (since expired short-circuits those actions), which may have caused runners in drain to not properly update their allocation state.

Condition change in update_alloc_idx — Swapping from tx.exists(&old_alloc_key) to !draining for the UpdatePing path is a behavioral change worth noting: the old check required the alloc key to already exist before updating it (idempotent-safe), while the new check skips the update unconditionally when draining. This is the intended behavior, but it is worth confirming that a draining runner that somehow ends up in UpdatePing should never refresh its index.
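The difference between the two conditions can be made concrete with a small truth-table sketch. `RunnerState` and both function names are hypothetical, and modelling the expired short-circuit in the same predicate is an assumption; the sketch only mirrors the behavior described in the review, not the actual `update_alloc_idx` code.

```rust
/// Hypothetical snapshot of the flags that matter on the UpdatePing path.
#[derive(Debug, Clone, Copy)]
struct RunnerState {
    expired: bool,
    draining: bool,
    alloc_key_exists: bool,
}

/// Old condition as described: refresh only if the alloc key already exists.
fn refresh_on_ping_old(s: RunnerState) -> bool {
    !s.expired && s.alloc_key_exists
}

/// New condition as described: refresh unless the runner is draining.
fn refresh_on_ping_new(s: RunnerState) -> bool {
    !s.expired && !s.draining
}
```

The two predicates disagree in exactly two cases: a draining runner whose key still exists (old refreshes, new skips) and a non-draining runner whose key is missing (old skips, new refreshes). Those are the cases worth confirming against the intended behavior.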

Minor comment placement (lib.rs) — The comment between } and else is a bit unusual:

if let Ok(LifecycleResult::Evicted) = &lifecycle_res {
    lifecycle_res = Err(errors::WsError::Eviction.build());
}
// Clear alloc idx if not evicted
else {

Moving the comment inside the else block or before the if would be cleaner.
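As a sketch of the suggested placement, the comment can sit inside the `else` block. The surrounding function, the simplified `WsError` enum, and the `alloc_idx_cleared` flag are hypothetical scaffolding added so the snippet compiles on its own; the real code builds the error with `errors::WsError::Eviction.build()`.

```rust
#[derive(Debug, PartialEq, Eq)]
enum LifecycleResult {
    Closed,
    Evicted,
}

/// Hypothetical simplified error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum WsError {
    Eviction,
}

fn finalize(
    mut lifecycle_res: Result<LifecycleResult, WsError>,
    alloc_idx_cleared: &mut bool,
) -> Result<LifecycleResult, WsError> {
    if let Ok(LifecycleResult::Evicted) = &lifecycle_res {
        lifecycle_res = Err(WsError::Eviction);
    } else {
        // Clear alloc idx if not evicted
        *alloc_idx_cleared = true;
    }
    lifecycle_res
}
```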


API Auth for Actor Get

if query.actor_ids.is_none() && query.actor_id.is_empty() && query.key.is_none() {
    ctx.auth().await?;
} else {
    ctx.skip_auth();
}

Security concern worth discussing: This change allows completely unauthenticated access to actors when any of actor_ids, actor_id, or key are provided. A few questions:

  1. Is the intent that all actors are readable without auth if you know the ID/key? Or only actors that have been explicitly marked public?
  2. What does ctx.skip_auth() do beyond skipping the auth check — does it still establish any identity context, or is the request treated as fully anonymous?
  3. If an actor contains sensitive state, any caller who can guess or enumerate an actor ID could read it unauthenticated. Is there a namespace scoping or project scoping check downstream that prevents cross-project reads?

The comment // Reading is allowed, list requires auth does not fully explain the intent. A more descriptive comment explaining why reads by ID are allowed without auth (e.g., "actors with known IDs are considered public for SDK discovery") would help future maintainers.
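One way to make the intent easier to document and test is to pull the decision into a small pure function. This is only a sketch: `ActorQuery`, `AuthMode`, and `auth_mode` are hypothetical names, and the field types are inferred from the `is_none()` and `is_empty()` calls in the snippet above.

```rust
/// Hypothetical query shape inferred from the condition in the PR.
#[derive(Debug, Default)]
struct ActorQuery {
    actor_ids: Option<String>,
    actor_id: Vec<String>,
    key: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
enum AuthMode {
    Required,
    Skipped,
}

fn auth_mode(query: &ActorQuery) -> AuthMode {
    let is_list =
        query.actor_ids.is_none() && query.actor_id.is_empty() && query.key.is_none();
    if is_list {
        // No ID or key filter: the request can enumerate actors, so require auth.
        AuthMode::Required
    } else {
        // Lookup by a known ID or key: auth is skipped. The rationale is not
        // stated in the PR and should be written down here once confirmed.
        AuthMode::Skipped
    }
}
```

With the decision isolated, each combination of filters can be covered by a unit test without constructing a request context.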


Small Wins

  • tracing::info!(db_path=%db_path.display(), ...) — Good fix. % (Display) is more readable than ? (Debug) for PathBuf.
  • Downgrading "critical: failed to evict runner..." to a plain error log is appropriate now that eviction is handled separately.
  • The mark_eligible warn log for non-empty notifications is a useful diagnostic addition.
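For the first bullet, the difference between the two sigils comes down to `Display` versus `Debug` formatting. The sketch below uses plain `format!` from the standard library rather than `tracing`, and the helper names and the path are made up for illustration.

```rust
use std::path::Path;

/// Mirrors what the `%` sigil does in tracing: format with `Display`.
fn render_display(p: &Path) -> String {
    format!("db_path={}", p.display())
}

/// Mirrors what the `?` sigil does in tracing: format with `Debug`.
fn render_debug(p: &Path) -> String {
    format!("db_path={:?}", p)
}
```

For a path such as `/var/lib/rivet/db`, the `Display` form yields `db_path=/var/lib/rivet/db`, while the `Debug` form wraps the path in quotes and escapes special characters.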

Summary

Area | Assessment
Eviction lifecycle refactor | Correct and clean
Draining/Expired separation | Correct fix
Alloc idx condition change | Likely correct, worth a comment on why
Auth bypass for actor reads | Needs discussion / clarification on intent
Logging improvements | Good

The runner-side changes look solid. The auth change is the part that warrants the most scrutiny before merging.

@MasterPtato force-pushed the 03-17-fix_runner_alloc_idx_logic_api_auth_for_actor_get branch from 5b2bead to cfc4fad on March 17, 2026 20:38
@MasterPtato force-pushed the 03-17-fix_runner_alloc_idx_logic_api_auth_for_actor_get branch from cfc4fad to 65280b5 on March 18, 2026 22:12
@MasterPtato force-pushed the 03-17-fix_runner_alloc_idx_logic_api_auth_for_actor_get branch from 65280b5 to 84dbf32 on March 21, 2026 01:55
@MasterPtato mentioned this pull request Mar 21, 2026
11 tasks
@MasterPtato force-pushed the 03-17-fix_runner_alloc_idx_logic_api_auth_for_actor_get branch from 84dbf32 to 90c2e97 on March 24, 2026 00:30
@MasterPtato mentioned this pull request Mar 24, 2026
11 tasks
@MasterPtato force-pushed the 03-17-fix_runner_alloc_idx_logic_api_auth_for_actor_get branch 2 times, most recently from ebdaa13 to 63b3a1f on March 25, 2026 00:05
@MasterPtato force-pushed the 03-17-fix_runner_alloc_idx_logic_api_auth_for_actor_get branch from 63b3a1f to 73107a6 on March 26, 2026 01:18