Add burst-proof admission control and fast resource accounting #187
sjmiller609 wants to merge 5 commits into main
Addressed the review follow-ups in the latest commits:
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 9566638.
    state := string(StateStopped)
    if active {
        state = string(StateCreated)
    }
Admission allocation uses stale state for "Created" only
Low Severity
allocationFromStoredMetadata maps all active instances to state "Created" and all inactive to "Stopped". GetFullStatus in resource.go filters per-instance allocation breakdowns with a hardcoded check for "Running" || "Paused" || "Created". Since both paths now share the same ListInstanceAllocations source, instances that were previously in "Initializing" state (and excluded from the allocation breakdown) are now included as "Created". This is a subtle behavioral change to the /resources API response for any consumer relying on the previous filtering semantics.
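The behavioral change described above can be illustrated with a minimal sketch. The state strings and the filter condition are taken from this comment; the function names `cachedState` and `includedInBreakdown` are hypothetical stand-ins, not the PR's actual code, and the sketch assumes an "Initializing" instance counts as active.

```go
package main

import "fmt"

// cachedState mirrors the two-way mapping described in the comment:
// every active instance is reported as "Created", everything else as
// "Stopped". (Hypothetical helper, not the PR's function.)
func cachedState(active bool) string {
	if active {
		return "Created"
	}
	return "Stopped"
}

// includedInBreakdown mirrors the hardcoded filter quoted in the comment.
// (Hypothetical helper, not the PR's function.)
func includedInBreakdown(state string) bool {
	return state == "Running" || state == "Paused" || state == "Created"
}

func main() {
	// Filtering on the real lifecycle state excludes an initializing instance.
	fmt.Println(includedInBreakdown("Initializing")) // false

	// Filtering on the cached state includes it, because the cache has
	// collapsed "Initializing" (assumed active) into "Created".
	fmt.Println(includedInBreakdown(cachedState(true))) // true
}
```

If the previous filtering semantics matter to consumers, one option would be to carry the real lifecycle state in the cached view rather than a two-valued approximation; whether that is worth the extra bookkeeping is a judgment call for the author.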
            State:       state,
            VolumeBytes: volumeBytes,
        }
    }
Volume lookups hold admission lock causing contention
Medium Severity
allocationFromStoredMetadata calls m.volumeManager.GetVolume (disk I/O per volume) with context.Background() while every caller holds admissionAllocationsMu.Lock(). This runs on every saveMetadata call via syncAdmissionAllocation and on every setAdmissionAllocationActive call, blocking all concurrent ListInstanceAllocations readers. For instances with many volumes, this lock-holding duration scales linearly with attached volume count.
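One common way to address this kind of contention is to do the slow lookups before taking the lock and hold the lock only for the map update. The sketch below shows that pattern under stated assumptions: `volumeGetter`, `manager`, `allocation`, `syncAllocation`, and `fakeVolumes` are hypothetical names, and the `GetVolume` signature here is simplified and may not match the real volume manager.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// volumeGetter is a hypothetical stand-in for the volume manager; the real
// GetVolume signature may differ.
type volumeGetter interface {
	GetVolume(ctx context.Context, id string) (sizeBytes int64, err error)
}

type allocation struct {
	Active      bool
	VolumeBytes int64
}

type manager struct {
	mu          sync.RWMutex
	allocations map[string]allocation
	volumes     volumeGetter
}

// syncAllocation performs the per-volume lookups (potentially disk I/O)
// without holding the lock, then takes the lock only to publish the result.
func (m *manager) syncAllocation(ctx context.Context, id string, active bool, volumeIDs []string) error {
	var total int64
	for _, v := range volumeIDs {
		size, err := m.volumes.GetVolume(ctx, v)
		if err != nil {
			return fmt.Errorf("get volume %s: %w", v, err)
		}
		total += size
	}

	m.mu.Lock()
	defer m.mu.Unlock()
	m.allocations[id] = allocation{Active: active, VolumeBytes: total}
	return nil
}

// volumeBytes is a cheap reader that only contends with the short write above.
func (m *manager) volumeBytes(id string) (int64, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	a, ok := m.allocations[id]
	return a.VolumeBytes, ok
}

// fakeVolumes is an in-memory volumeGetter used only for this example.
type fakeVolumes map[string]int64

func (f fakeVolumes) GetVolume(_ context.Context, id string) (int64, error) {
	size, ok := f[id]
	if !ok {
		return 0, fmt.Errorf("volume %q not found", id)
	}
	return size, nil
}

// demo syncs one instance with two volumes and returns its cached volume bytes.
func demo() (int64, error) {
	m := &manager{
		allocations: map[string]allocation{},
		volumes:     fakeVolumes{"vol-a": 10, "vol-b": 20},
	}
	if err := m.syncAllocation(context.Background(), "inst-1", true, []string{"vol-a", "vol-b"}); err != nil {
		return 0, err
	}
	total, _ := m.volumeBytes("inst-1")
	return total, nil
}

func main() {
	fmt.Println(demo()) // 30 <nil>
}
```

The trade-off is that two concurrent syncs for the same instance can now publish out of order, so a real implementation would need per-instance serialization or a version check. Passing the caller's context instead of `context.Background()` would also let the lookups be cancelled.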


Summary
`hypeman resources`
The main thing to balance in this PR is making admission reservation consistent (i.e. burst-proof) without slowing down concurrent creation.
Existing Issue
`hypeman resources` command too slow
What This Changes
Testing
- `go test ./lib/resources`
- `dev-yul-hypeman-1` with a temporary read-only probe built from this code
  - `ReserveAllocation`: mean 27.6 us, p95 56.1 us
  - `GetFullStatus` / `hypeman resources` path: mean 20.0 ms, p95 26.2 ms

Note
Medium Risk
Touches core admission/resource accounting paths and introduces new in-memory reservation/caching behavior, which could affect capacity enforcement under concurrency. Changes are internal but impact scheduling decisions across CPU/memory/disk/network/disk I/O.
Overview
Implements burst-proof admission control by adding pending resource reservations for in-flight create/start/restore operations, and releasing them only once the instance is marked visible to allocation tracking.
Extends aggregate admission validation to include disk bytes (overlay + volume overlays + images + OCI cache + volumes), and speeds up accounting by caching a conservative instance-allocation view in the instances manager plus cached totals for image/OCI cache bytes and volume bytes (with startup warm-up and incremental refresh on lifecycle/storage changes).
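The pending-reservation idea in the overview can be sketched for a single resource dimension as follows. This is an illustration of the general technique, not the PR's implementation: `admitter`, `reserve`, `commit`, and `release` are hypothetical names, and the real code validates several dimensions (CPU, memory, disk, network, disk I/O) together.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errInsufficient   = errors.New("insufficient capacity")
	errAlreadyPending = errors.New("reservation already pending")
)

// admitter tracks one resource dimension (for example, memory bytes).
type admitter struct {
	mu           sync.Mutex
	capacity     int64
	committed    int64            // usage of instances visible to allocation tracking
	pending      map[string]int64 // in-flight reservations keyed by instance ID
	pendingTotal int64
}

func newAdmitter(capacity int64) *admitter {
	return &admitter{capacity: capacity, pending: map[string]int64{}}
}

// reserve checks capacity against committed plus pending usage and records
// the reservation in the same critical section, so a burst of concurrent
// callers cannot all pass the check before any of them is counted.
func (a *admitter) reserve(id string, amount int64) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, ok := a.pending[id]; ok {
		return errAlreadyPending
	}
	if a.committed+a.pendingTotal+amount > a.capacity {
		return errInsufficient
	}
	a.pending[id] = amount
	a.pendingTotal += amount
	return nil
}

// commit converts a pending reservation into committed usage once the
// instance is visible to allocation tracking.
func (a *admitter) commit(id string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if amount, ok := a.pending[id]; ok {
		delete(a.pending, id)
		a.pendingTotal -= amount
		a.committed += amount
	}
}

// release drops a pending reservation, for example after a failed create.
func (a *admitter) release(id string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	if amount, ok := a.pending[id]; ok {
		delete(a.pending, id)
		a.pendingTotal -= amount
	}
}

func main() {
	a := newAdmitter(16)
	var wg sync.WaitGroup
	var mu sync.Mutex
	admitted := 0
	// Ten concurrent requests of 4 units each against a capacity of 16.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := a.reserve(fmt.Sprintf("inst-%d", i), 4); err == nil {
				mu.Lock()
				admitted++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("admitted:", admitted) // admitted: 4
}
```

Because the check and the bookkeeping happen under one short lock with no I/O, the reservation step stays cheap, which is in line with the goal of not slowing down concurrent creation.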