Conversation
✱ Stainless preview buildsThis PR will update the Edit this comment to update it. It will appear in the SDK's changelogs. ✅ hypeman-typescript studio · code · diff
✅ hypeman-openapi studio · code · diff
✅ hypeman-go studio · code · diff
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push. |
There was a problem hiding this comment.
Automated Review: Add standby compression start delay
Overall this is a well-structured PR. The state machine extension (pending_delay → running) is clean, the preemption metric suppression for pending jobs is correctly implemented, and test coverage is thorough. A few items worth discussing:
Design: No upper bound on compression delay
parseOptionalDuration and normalizeStandbyCompressionDelay reject negative values but allow arbitrarily large positive durations. A user could set compression_delay: "8760h" which would keep a job in pending_delay state for a year, holding a slot in the compressionJobs map and effectively meaning "never compress" without being explicit about it. Consider adding a reasonable ceiling (e.g. 1h or configurable max) or at least documenting the expected range in the OpenAPI description.
Correctness: canceledCompressionJob.State read safety
The read of job.state in cancelAndWaitCompressionJob after <-job.done is safe — Go's memory model guarantees that close(job.done) happens-before the receive returns, and all state mutations occur before the close. The returned state correctly reflects whether actual compression work was performed (running) vs the job was still waiting (pending_delay). Good.
Correctness: Preemption metric gating
All five call sites of cancelAndWaitCompressionJob now gate recordSnapshotCompressionPreemption on target.State == compressionJobStateRunning. This is the right call — preemption implies interrupting active work, not canceling a job that hasn't started yet. The new skipped result and separate wait duration metric provide the observability for the pending-delay cancellation path. Clean separation.
Correctness: Timer cleanup
The stopCompressionTimer helper correctly drains the channel after Stop() returns false. In the happy path (<-timer.Chan() fires), the timer has already fired so no cleanup is needed — this is correct.
Pre-existing bug fix: cloneStoredMetadataForFork
The addition of cloneSnapshotPolicy in cloneStoredMetadataForFork (fork.go) fixes what looks like a pre-existing bug — without it, forked instances share a pointer to the parent's SnapshotPolicy, risking mutation side effects. Worth calling out in the PR description since it's a correctness fix independent of the delay feature.
Minor: newCompressionTimer nil receiver guard
func (m *manager) newCompressionTimer(delay time.Duration) compressionTimer {
if m != nil && m.compressionTimerFactory != nil {The m != nil check is defensive against a nil receiver call which would indicate a bug elsewhere. Not a problem, just noting it's an unusual pattern for a method receiver.
Tests
Good coverage of the key paths:
- Delay precedence resolution (override > stored > zero default)
- Negative delay rejection
- Pending-delay cancellation records
skipped(not preemption) ensureSnapshotMemoryReadyskipping pending compression without preemption metric- Fake timer abstraction is clean and threadsafe
LGTM with the upper-bound consideration as the main open question.
Summary
POST /instances/{id}/standbyand a per-instance default insnapshot_policyTesting
go test ./lib/instancesgo test ./cmd/api/api -run 'Test(CreateInstance_MapsStandbyCompressionDelayInSnapshotPolicy|CreateInstance_InvalidStandbyCompressionDelayInSnapshotPolicy|InstanceToOAPI_EmitsStandbyCompressionDelayInSnapshotPolicy|StandbyInstance_MapsCompressionDelay|StandbyInstance_InvalidCompressionDelay|StandbyInstance_InvalidRequest)$'Notes
go test ./cmd/api/apistill hits unrelated environment-dependent volume tests on this machine becausemkfs.ext4is not available in$PATH.