feat(instances): integrate runtime management with instance manager#64
feloy wants to merge 5 commits into kortex-hub:main
Conversation
Codecov Report: ❌ patch coverage check failed.
📝 Walkthrough

Adds runtime management to the instance manager: introduces RuntimeData on instances, a runtime registry, a runtime-aware Manager API (RegisterRuntime; Add/Start/Stop/Delete with context and runtime type), and updates commands/tests to register and use a fake runtime.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Cmd as Command
    participant Mgr as Manager
    participant Reg as Runtime Registry
    participant RT as Runtime
    participant Store as Storage
    Cmd->>Mgr: RegisterRuntime(fakeRuntime)
    activate Mgr
    Mgr->>Reg: Register(fakeRuntime)
    Reg-->>Mgr: ok
    deactivate Mgr
    Cmd->>Mgr: Add(ctx, Instance, "fake")
    activate Mgr
    Mgr->>Reg: Get("fake")
    activate Reg
    Reg-->>Mgr: fakeRuntime
    deactivate Reg
    Mgr->>RT: Create(ctx, params)
    activate RT
    RT-->>Mgr: RuntimeInstance{id, state, info}
    deactivate RT
    Mgr->>Store: Save(instance with Runtime)
    Store-->>Mgr: ok
    Mgr-->>Cmd: Instance(with RuntimeData)
    deactivate Mgr
    Cmd->>Mgr: Start(ctx, instanceID)
    activate Mgr
    Mgr->>Store: Load(instance)
    Store-->>Mgr: Instance(with RuntimeData)
    Mgr->>Reg: Get(instance.Runtime.Type)
    Reg-->>Mgr: fakeRuntime
    Mgr->>RT: Start(ctx, runtimeInstance)
    activate RT
    RT-->>Mgr: started
    deactivate RT
    Mgr->>Store: Save(updated Runtime.State)
    Store-->>Mgr: ok
    Mgr-->>Cmd: success
    deactivate Mgr
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/instances/instance.go (1)
157-166: ⚠️ Potential issue | 🟠 Major

`Runtime.Info` still leaks mutable state through `Dump()` and `NewInstanceFromData()`.

`GetRuntimeData()` defensively copies the map, but these two paths still share the same `Info` map reference. Mutating the `InstanceData` passed into `NewInstanceFromData()` or the value returned by `Dump()` will silently change the live instance.

🛡️ Suggested fix
```diff
 func (i *instance) Dump() InstanceData {
+	infoCopy := make(map[string]string, len(i.Runtime.Info))
+	maps.Copy(infoCopy, i.Runtime.Info)
+
 	return InstanceData{
 		ID:   i.ID,
 		Name: i.Name,
 		Paths: InstancePaths{
 			Source:        i.SourceDir,
 			Configuration: i.ConfigDir,
 		},
-		Runtime: i.Runtime,
+		Runtime: RuntimeData{
+			Type:       i.Runtime.Type,
+			InstanceID: i.Runtime.InstanceID,
+			State:      i.Runtime.State,
+			Info:       infoCopy,
+		},
 	}
 }
@@
+	runtimeInfo := RuntimeData{
+		Type:       data.Runtime.Type,
+		InstanceID: data.Runtime.InstanceID,
+		State:      data.Runtime.State,
+	}
+	if data.Runtime.Info != nil {
+		runtimeInfo.Info = make(map[string]string, len(data.Runtime.Info))
+		maps.Copy(runtimeInfo.Info, data.Runtime.Info)
+	}
+
 	return &instance{
 		ID:        data.ID,
 		Name:      data.Name,
 		SourceDir: data.Paths.Source,
 		ConfigDir: data.Paths.Configuration,
-		Runtime:   data.Runtime,
+		Runtime:   runtimeInfo,
 	}, nil
 }
```

Also applies to: 225-230
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/instances/instance.go` around lines 157 - 166, Dump() and NewInstanceFromData() currently leak mutable Runtime.Info by returning or accepting the same map reference; update both to perform a defensive deep copy of the Runtime.Info map so callers cannot mutate the live instance. Specifically, in instance.Dump() copy i.Runtime.Info into a new map before assigning to the returned InstanceData.Runtime.Info, and in NewInstanceFromData() create a new map and copy values from the incoming InstanceData.Runtime.Info into the instance's Runtime.Info (mirror the behavior of GetRuntimeData()). Ensure the copy handles nil maps safely and preserves all key/value pairs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/cmd/init.go`:
- Around line 62-65: The code currently hard-wires the test runtime by calling
manager.RegisterRuntime(fake.New()) and later persisting workspaces with runtime
type "fake" (see manager.RegisterRuntime(fake.New()) and the workspace
persistence around lines where runtime type is set to "fake"); change this to
accept a runtime type from flags/config, thread that runtime identifier through
the init flow, and initialize/register the correct runtime implementation via a
single bootstrap path instead of always using fake.New(); update the workspace
persistence to save the configured runtime type rather than the literal "fake"
so created workspaces bind to the user-selected runtime.
In `@pkg/cmd/workspace_remove.go`:
- Line 65: preRun() currently sets w.manager =
instances.NewManager(absStorageDir) but does not register any runtimes, so the
later call to w.manager.Delete(cmd.Context(), w.id) runs with an empty runtime
registry and skips backend cleanup; fix by bootstrapping/registering the
supported runtimes immediately after creating the manager and before storing it
on the command (i.e., after instances.NewManager(absStorageDir) call, invoke the
code that registers/bootstraps supported runtimes into that manager so w.manager
has the runtime implementations available before Delete is called).
In `@pkg/instances/manager.go`:
- Around line 384-407: The code currently calls rt.Remove (via
m.runtimeRegistry.Get and rt.Remove) and then proceeds to persist the filtered
instances with m.saveInstances even if removeErr != nil; instead, change the
flow so you do not call m.saveInstances or remove the instance from storage when
runtime cleanup (runtimeInfo from instanceToDelete.GetRuntimeData and rt.Remove)
fails — either return the removeErr immediately (after wrapping/context) or mark
the instance with a recoverable cleanup-pending state and persist that state
(but not deletion) so retry is possible; ensure all references to filtered are
only persisted when removeErr == nil (or when you explicitly set a
cleanup-pending flag on the instance before calling m.saveInstances).
- Around line 236-256: The Start/Stop flows call rt.Start/rt.Stop before
persisting the updated instance list, which can leave runtime state and
instances.json out of sync if m.saveInstances(...) fails; update Start (and
symmetric Stop) to perform the persistent save in a transactional/recovery
style: after calling rt.Start/rt.Stop and computing the updatedInstance (the
local instance struct and RuntimeData), attempt m.saveInstances(instances) and
if it fails, roll back the runtime change by calling rt.Stop/rt.Start
(respectively) to restore the original runtime state, logging and returning the
save error; reuse the same retry/rollback pattern used in Add() to ensure
instances[index] remains consistent and mirror the behavior at the analogous
block around lines 298-324.
---
Outside diff comments:
In `@pkg/instances/instance.go`:
- Around line 157-166: Dump() and NewInstanceFromData() currently leak mutable
Runtime.Info by returning or accepting the same map reference; update both to
perform a defensive deep copy of the Runtime.Info map so callers cannot mutate
the live instance. Specifically, in instance.Dump() copy i.Runtime.Info into a
new map before assigning to the returned InstanceData.Runtime.Info, and in
NewInstanceFromData() create a new map and copy values from the incoming
InstanceData.Runtime.Info into the instance's Runtime.Info (mirror the behavior
of GetRuntimeData()). Ensure the copy handles nil maps safely and preserves all
key/value pairs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9459b466-a7a0-4db4-a53c-7bb4e93a7866
📒 Files selected for processing (8)
- pkg/cmd/init.go
- pkg/cmd/workspace_list_test.go
- pkg/cmd/workspace_remove.go
- pkg/cmd/workspace_remove_test.go
- pkg/instances/instance.go
- pkg/instances/instance_runtime_test.go
- pkg/instances/manager.go
- pkg/instances/manager_test.go
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/instances/manager.go (1)
428-457: ⚠️ Potential issue | 🟡 Minor

Reconcile removes instances without runtime cleanup.

`Reconcile()` removes instances with inaccessible directories from storage but doesn't clean up their associated runtime resources. This could leave orphaned runtime instances (containers, processes) when directories become inaccessible. Consider whether Reconcile should attempt runtime cleanup for removed instances, or at minimum log/return which instances had runtime data that may need manual cleanup.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/instances/manager.go` around lines 428 - 457, Reconcile currently drops instances whose directories are inaccessible but never attempts to clean up their runtime resources; update manager.Reconcile to for each removed instance attempt runtime cleanup (e.g., call a manager runtime cleanup method such as m.runtime.Cleanup(instance.GetID()) or an instance-level cleanup like instance.CleanupRuntime()), handle/log any cleanup errors without aborting the storage removal, and ensure saveInstances(accessible) still runs; if a runtime cleanup API isn't present, add a cleanup hook on manager (e.g., CleanupRuntimeByID) and/or include the IDs of instances that still may have runtime artifacts in the returned result or logs so callers can perform/manual cleanup.
🧹 Nitpick comments (3)
pkg/runtime/fake/fake_test.go (1)
258-264: Subtest should call `t.Parallel()` for consistency.

Per coding guidelines, all test functions should call `t.Parallel()`. The subtests in `TestFakeRuntime_InvalidParams` share the same `rt` instance, which is safe since `Create` only reads/writes distinct entries based on the `Name` parameter.

♻️ Suggested fix

```diff
 for _, tt := range tests {
 	t.Run(tt.name, func(t *testing.T) {
+		t.Parallel()
 		_, err := rt.Create(ctx, tt.params)
 		if !errors.Is(err, runtime.ErrInvalidParams) {
 			t.Errorf("Expected ErrInvalidParams, got %v", err)
 		}
 	})
 }
```

As per coding guidelines: "All Go test functions must call `t.Parallel()` as the first line to ensure faster execution and better resource utilization."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/runtime/fake/fake_test.go` around lines 258 - 264, add t.Parallel() as the first statement inside the subtest function passed to t.Run in TestFakeRuntime_InvalidParams so each subtest runs concurrently; this is safe because the shared rt instance is used only by rt.Create which operates on distinct entries keyed by tt.params.Name. Locate the anonymous func(t *testing.T) in the loop over tests and insert t.Parallel() at the top of that function to comply with the test parallelization guideline.

pkg/cmd/workspace_remove.go (1)
58-63: TODO indicates incomplete implementation requiring follow-up.

The comment acknowledges this is temporary scaffolding. Consider creating a tracking issue to implement proper runtime discovery/registration before merging to main.
Would you like me to open an issue to track implementing production runtime registration?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/cmd/workspace_remove.go` around lines 58 - 63, the TODO shows temporary test scaffolding where a fake runtime is always registered via manager.RegisterRuntime(fake.New()); create a tracking issue to implement proper production runtime discovery/registration, then update this code: replace the unconditional fake.New() registration with a production-aware registration path (use configuration/feature-flag or environment to enumerate and register real runtimes via manager.RegisterRuntime) and limit fake.New() to tests (use a build tag, test-only helper, or conditional on a test flag); also annotate the TODO with the created issue ID/link so future reviewers know the follow-up task.

pkg/instances/manager_runtime_test.go (1)
693-699: Consider using file-appropriate permissions for `makeWritable`.

The helper uses `0755` (executable) for restoring file permissions, but `instances.json` is a data file that doesn't need execute permission. This works but is slightly imprecise.

♻️ Suggested improvement

```diff
 func makeReadOnly(path string) error {
-	return chmod(path, 0555)
+	return chmod(path, 0444)
 }

 func makeWritable(path string) error {
-	return chmod(path, 0755)
+	return chmod(path, 0644)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/instances/manager_runtime_test.go` around lines 693 - 699, The helper makeWritable currently restores file permissions with chmod(path, 0755) which grants unnecessary execute bits for data files like instances.json; update makeWritable to use file-appropriate mode (e.g., 0644) instead of 0755 so files become readable/writable by owner and readable by others, and keep makeReadOnly using 0555 only if you intentionally want to make a file non-writable and non-executable—adjust the call in makeWritable (the chmod invocation) accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/cmd/workspace_remove.go`:
- Line 27: The production command imports the test-only runtime package
"github.com/kortex-hub/kortex-cli/pkg/runtime/fake"; remove that import and stop
depending on the fake runtime in workspace removal logic. Replace the direct
fake import/registration with one of: (a) wire in a real runtime implementation
(e.g., podman/docker) where the current runtime registration happens, (b)
implement a configuration-driven runtime selection/registration path that reads
configured/available runtimes and registers only those, or (c) add graceful
handling when no runtime is available (skip runtime cleanup and log a clear
warning) — update the runtime registration code referenced by the TODO phrase
"In production, register only the runtimes that are available/configured." so it
no longer imports or uses the fake package.
---
Outside diff comments:
In `@pkg/instances/manager.go`:
- Around line 428-457: Reconcile currently drops instances whose directories are
inaccessible but never attempts to clean up their runtime resources; update
manager.Reconcile to for each removed instance attempt runtime cleanup (e.g.,
call a manager runtime cleanup method such as
m.runtime.Cleanup(instance.GetID()) or an instance-level cleanup like
instance.CleanupRuntime()), handle/log any cleanup errors without aborting the
storage removal, and ensure saveInstances(accessible) still runs; if a runtime
cleanup API isn't present, add a cleanup hook on manager (e.g.,
CleanupRuntimeByID) and/or include the IDs of instances that still may have
runtime artifacts in the returned result or logs so callers can perform/manual
cleanup.
---
Nitpick comments:
In `@pkg/cmd/workspace_remove.go`:
- Around line 58-63: The TODO shows temporary test scaffolding where a fake
runtime is always registered via manager.RegisterRuntime(fake.New()); create a
tracking issue to implement proper production runtime discovery/registration,
then update this code: replace the unconditional fake.New() registration with a
production-aware registration path (use configuration/feature-flag or
environment to enumerate and register real runtimes via manager.RegisterRuntime)
and limit fake.New() to tests (use a build tag, test-only helper, or conditional
on a test flag); also annotate the TODO with the created issue ID/link so future
reviewers know the follow-up task.
In `@pkg/instances/manager_runtime_test.go`:
- Around line 693-699: The helper makeWritable currently restores file
permissions with chmod(path, 0755) which grants unnecessary execute bits for
data files like instances.json; update makeWritable to use file-appropriate mode
(e.g., 0644) instead of 0755 so files become readable/writable by owner and
readable by others, and keep makeReadOnly using 0555 only if you intentionally
want to make a file non-writable and non-executable—adjust the call in
makeWritable (the chmod invocation) accordingly.
In `@pkg/runtime/fake/fake_test.go`:
- Around line 258-264: Add t.Parallel() as the first statement inside the
subtest function passed to t.Run in TestFakeRuntime_InvalidParams so each
subtest runs concurrently; this is safe because the shared rt instance is used
only by rt.Create which operates on distinct entries keyed by tt.params.Name.
Locate the anonymous func(t *testing.T) in the loop over tests and insert
t.Parallel() at the top of that function to comply with the test parallelization
guideline.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 943efba9-4cf4-4e5d-ac94-acbf17a24c58
📒 Files selected for processing (5)
- pkg/cmd/workspace_remove.go
- pkg/instances/manager.go
- pkg/instances/manager_runtime_test.go
- pkg/runtime/fake/fake.go
- pkg/runtime/fake/fake_test.go
Instances can now have associated runtimes that manage their lifecycle. The manager handles runtime registration, instance creation with runtimes, and starting/stopping runtime instances. Context is propagated through all runtime operations to support cancellation and timeouts.

Co-Authored-By: Claude Code (Claude Sonnet 4.5) <noreply@anthropic.com>
Signed-off-by: Philippe Martin <phmartin@redhat.com>

The workspace remove command was creating a manager without registering any runtimes, causing Delete() to skip backend cleanup with an empty runtime registry. Now registers the fake runtime in preRun() to enable proper cleanup operations. Additionally, made the fake runtime's Remove() method idempotent to handle the non-persistent nature of fake runtime instances (each New() creates separate in-memory state). This prevents errors in tests where instances are created with one runtime and removed with another.

Co-Authored-By: Claude Code (Claude Sonnet 4.5) <noreply@anthropic.com>
Signed-off-by: Philippe Martin <phmartin@redhat.com>

The Start and Stop methods were calling runtime operations before persisting state, which could leave runtime state and instances.json out of sync if saveInstances() failed. Now both methods use transactional rollback: if persistence fails, the runtime operation is reversed (Stop after Start, Start after Stop) to restore the original state, mirroring the pattern used in Add().

Co-Authored-By: Claude Code (Claude Sonnet 4.5) <noreply@anthropic.com>
Signed-off-by: Philippe Martin <phmartin@redhat.com>

Delete now requires successful runtime cleanup before removing the instance from storage. Previously, the instance would be deleted even if runtime cleanup failed, leaving orphaned runtime instances. Added comprehensive tests with configurable failable runtime to verify transactional rollback in Start/Stop and proper Delete behavior when runtime operations fail.

Co-Authored-By: Claude Code (Claude Sonnet 4.5) <noreply@anthropic.com>
Signed-off-by: Philippe Martin <phmartin@redhat.com>

Ensures all errors in preRun methods use the outputErrorIfJSON helper to maintain consistent JSON error output when --output=json is specified. Also updates tests to use the new manager.Add() signature with context and runtime type.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Philippe Martin <phmartin@redhat.com>
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/instances/instance.go (1)
157-166: ⚠️ Potential issue | 🟡 Minor

Clone `Runtime.Info` on input/output.

`GetRuntimeData()` is defensive, but `NewInstanceFromData()` stores `data.Runtime.Info` by reference and `Dump()` returns that same backing map back out. Mutating either the source `InstanceData` or the dumped snapshot can still mutate the live instance.

Also applies to: 225-231
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/instances/instance.go` around lines 157 - 166, Dump() and NewInstanceFromData() currently share the same Runtime.Info map by reference (even though GetRuntimeData() is defensive), allowing external mutation of the live instance; fix this by deep-copying the Runtime.Info map whenever you accept it into the instance and whenever you expose it: in NewInstanceFromData() (and any code path that sets i.Runtime.Info) allocate a new map[string]interface{} and copy all key/value pairs from data.Runtime.Info into it before assigning to i.Runtime.Info, and in Dump() return a copy of i.Runtime.Info (again allocate a new map and copy entries) so callers receive a snapshot rather than the live backing map; apply the same copy approach for the other location noted (the block around lines 225-231) where Runtime.Info is passed through.
🧹 Nitpick comments (1)
pkg/instances/manager_test.go (1)
64-70: Make the fake mirror `instance.GetRuntimeData()` semantics.

The production implementation copies `Runtime.Info`, but this fake returns the backing map directly. That mismatch can hide aliasing bugs or make tests pass on behavior the real type does not expose.
Verify each finding against the current code and only fix it if needed. In `@pkg/instances/manager_test.go` around lines 64 - 70, The fake mirror's GetRuntimeData() on fakeInstance currently returns the backing f.runtime directly which allows callers to mutate Runtime.Info and hide aliasing bugs; change fakeInstance.GetRuntimeData to return a defensive copy of the runtime (at minimum deep-copy the Runtime.Info map) so its semantics match the production implementation that copies Runtime.Info—locate the fakeInstance type and its GetRuntimeData method and construct and return a new RuntimeData value with a copied Info map instead of returning f.runtime directly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/instances/manager.go`:
- Around line 185-190: The rollback/cleanup calls (e.g., m.saveInstances failure
handling that calls rt.Remove(ctx, runtimeInfo.ID)) must not reuse the caller's
ctx; create a fresh bounded context (e.g., via
context.WithTimeout(context.Background(), X)) for best-effort cleanup and ensure
you call its cancel() after use, then call rt.Remove/rt.Stop/rt.Start with that
cleanupCtx and surface both errors as before. Apply the same change to the other
compensation sites that call rt.Stop/rt.Remove/rt.Start so they all use a
dedicated timeout-bound cleanup context instead of the incoming ctx.
- Around line 308-318: After a successful call to rt.Stop(ctx,
runtimeData.InstanceID) but a failing rt.Info(ctx, runtimeData.InstanceID),
avoid returning the error directly because storage may still mark the instance
as running; instead, recover by persisting a conservative "stopped"
snapshot/state for runtimeData (or the storage record) before returning the
error. Concretely, in the function around the Stop and Info calls, detect when
Stop returns nil and Info returns an error, then update the stored runtime
status for runtimeData.InstanceID (or the runtimeData struct) to "stopped" (or
equivalent) and persist that state; alternatively, you may attempt to restart
the runtime (call rt.Start) and only return an error if restart also fails.
Ensure you reference rt.Stop, rt.Info, and runtimeData.InstanceID when making
the change so the storage reflects the actual stopped transition even if Info
failed.
- Around line 414-423: The Delete() flow can leave instances.json inconsistent
if m.saveInstances() fails after rt.Remove(); change to a two-phase, recoverable
delete: 1) mark the target instance with a transient state (e.g.,
Instance.Status = "deleting" or a Deleting flag) and call
m.saveInstances(updatedList) so the storage reflects an in-progress delete, 2)
call rt.Remove(ctx, runtimeInfo.InstanceID) and treat "not found" as success,
and 3) on successful removal clear the instance from the list and call
m.saveInstances(finalList); use the existing Delete() function, runtime removal
call rt.Remove, and m.saveInstances symbols to locate where to add the state
transition and the tolerant re-delete handling. Ensure atomic file writes
(temp+rename) are used by m.saveInstances if not already implemented so
intermediate saves are durable.
---
Outside diff comments:
In `@pkg/instances/instance.go`:
- Around line 157-166: Dump() and NewInstanceFromData() currently share the same
Runtime.Info map by reference (even though GetRuntimeData() is defensive),
allowing external mutation of the live instance; fix this by deep-copying the
Runtime.Info map whenever you accept it into the instance and whenever you
expose it: in NewInstanceFromData() (and any code path that sets i.Runtime.Info)
allocate a new map[string]interface{} and copy all key/value pairs from
data.Runtime.Info into it before assigning to i.Runtime.Info, and in Dump()
return a copy of i.Runtime.Info (again allocate a new map and copy entries) so
callers receive a snapshot rather than the live backing map; apply the same copy
approach for the other location noted (the block around lines 225-231) where
Runtime.Info is passed through.
---
Nitpick comments:
In `@pkg/instances/manager_test.go`:
- Around line 64-70: The fake mirror's GetRuntimeData() on fakeInstance
currently returns the backing f.runtime directly which allows callers to mutate
Runtime.Info and hide aliasing bugs; change fakeInstance.GetRuntimeData to
return a defensive copy of the runtime (at minimum deep-copy the Runtime.Info
map) so its semantics match the production implementation that copies
Runtime.Info—locate the fakeInstance type and its GetRuntimeData method and
construct and return a new RuntimeData value with a copied Info map instead of
returning f.runtime directly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6ef56b94-3202-4f11-b86e-f0463f5ff2bf
📒 Files selected for processing (11)
- pkg/cmd/init.go
- pkg/cmd/workspace_list_test.go
- pkg/cmd/workspace_remove.go
- pkg/cmd/workspace_remove_test.go
- pkg/instances/instance.go
- pkg/instances/instance_runtime_test.go
- pkg/instances/manager.go
- pkg/instances/manager_runtime_test.go
- pkg/instances/manager_test.go
- pkg/runtime/fake/fake.go
- pkg/runtime/fake/fake_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- pkg/runtime/fake/fake.go
- pkg/cmd/workspace_list_test.go
- pkg/cmd/init.go
- pkg/runtime/fake/fake_test.go
- pkg/instances/manager_runtime_test.go
- pkg/cmd/workspace_remove.go
```go
	if err := m.saveInstances(instances); err != nil {
		// Cleanup: try to remove the runtime instance
		removeErr := rt.Remove(ctx, runtimeInfo.ID)
		if removeErr != nil {
			return nil, fmt.Errorf("failed to save instance and cleanup runtime: save error: %w, remove error: %v", err, removeErr)
		}
```
Use a fresh context for rollback and cleanup.
These compensation calls run after the runtime state has already changed. If the caller's ctx is already canceled or past deadline, Remove/Stop/Start will short-circuit and leave leaked or desynchronized runtime state behind. Use a fresh bounded context for best-effort recovery.
Also applies to: 257-264, 336-343
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/instances/manager.go` around lines 185 - 190, The rollback/cleanup calls
(e.g., m.saveInstances failure handling that calls rt.Remove(ctx,
runtimeInfo.ID)) must not reuse the caller's ctx; create a fresh bounded context
(e.g., via context.WithTimeout(context.Background(), X)) for best-effort cleanup
and ensure you call its cancel() after use, then call rt.Remove/rt.Stop/rt.Start
with that cleanupCtx and surface both errors as before. Apply the same change to
the other compensation sites that call rt.Stop/rt.Remove/rt.Start so they all
use a dedicated timeout-bound cleanup context instead of the incoming ctx.
```go
	// Stop the runtime instance
	err = rt.Stop(ctx, runtimeData.InstanceID)
	if err != nil {
		return fmt.Errorf("failed to stop runtime instance: %w", err)
	}

	// Get updated runtime info
	runtimeInfo, err := rt.Info(ctx, runtimeData.InstanceID)
	if err != nil {
		return fmt.Errorf("failed to get runtime info: %w", err)
	}
```
Stop() needs recovery when Info() fails after a successful stop.
Once rt.Stop() returns nil, the backend has already transitioned. Returning on the later rt.Info() error leaves storage saying "running" while the runtime is actually stopped. Either persist a conservative "stopped" snapshot from the existing metadata or restart the runtime before returning the error.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/instances/manager.go` around lines 308 - 318, After a successful call to
rt.Stop(ctx, runtimeData.InstanceID) but a failing rt.Info(ctx,
runtimeData.InstanceID), avoid returning the error directly because storage may
still mark the instance as running; instead, recover by persisting a
conservative "stopped" snapshot/state for runtimeData (or the storage record)
before returning the error. Concretely, in the function around the Stop and Info
calls, detect when Stop returns nil and Info returns an error, then update the
stored runtime status for runtimeData.InstanceID (or the runtimeData struct) to
"stopped" (or equivalent) and persist that state; alternatively, you may attempt
to restart the runtime (call rt.Start) and only return an error if restart also
fails. Ensure you reference rt.Stop, rt.Info, and runtimeData.InstanceID when
making the change so the storage reflects the actual stopped transition even if
Info failed.
```go
		// Remove runtime instance first - if this fails, we don't delete from storage
		if err := rt.Remove(ctx, runtimeInfo.InstanceID); err != nil {
			return fmt.Errorf("failed to remove runtime instance: %w", err)
		}
	}

	// Only save the updated instances list after successful runtime cleanup
	if err := m.saveInstances(filtered); err != nil {
		return fmt.Errorf("runtime removed but failed to save instances: %w", err)
	}
```
Delete() still loses consistency if the final save fails.
After rt.Remove() succeeds, a write error leaves instances.json pointing at a runtime that no longer exists. Retries are only safe if every backend treats "already removed" as success, so this path needs a recoverable intermediate state or tolerant re-delete semantics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/instances/manager.go` around lines 414 - 423, The Delete() flow can leave
instances.json inconsistent if m.saveInstances() fails after rt.Remove(); change
to a two-phase, recoverable delete: 1) mark the target instance with a transient
state (e.g., Instance.Status = "deleting" or a Deleting flag) and call
m.saveInstances(updatedList) so the storage reflects an in-progress delete, 2)
call rt.Remove(ctx, runtimeInfo.InstanceID) and treat "not found" as success,
and 3) on successful removal clear the instance from the list and call
m.saveInstances(finalList); use the existing Delete() function, runtime removal
call rt.Remove, and m.saveInstances symbols to locate where to add the state
transition and the tolerant re-delete handling. Ensure atomic file writes
(temp+rename) are used by m.saveInstances if not already implemented so
intermediate saves are durable.
Instances can now have associated runtimes that manage their lifecycle. The manager handles runtime registration, instance creation with runtimes, and starting/stopping runtime instances. Context is propagated through all runtime operations to support cancellation and timeouts.
Fixes #63