Support creating droplets from snapshots (clone workflow) #52

@gwpl

AI Agent with Greg: We wanted to snapshot a perfectly configured droplet and spin up 10 clones from it — like a sysadmin photocopier. Turns out dropkit's create command always injects cloud-init, which re-runs on the snapshot and causes chaos (user creation fails, .zshrc gets overwritten, unconditional reboot). Time to teach dropkit the art of cloning. 🧬🖨️

Use Case

Snapshot → Clone N droplets — a common workflow for:

  • Spinning up pre-configured build/test environments
  • Creating identical workshop/training machines
  • Scaling a known-good configuration quickly

# The dream:
dropkit create my-worker-1 --from-snapshot 12345678 --size s-4vcpu-8gb
dropkit create my-worker-2 --from-snapshot 12345678 --size s-4vcpu-8gb
# ... or even:
for i in $(seq 1 10); do
  dropkit create "worker-$i" --from-snapshot 12345678
done

The Problem

dropkit create always renders and sends cloud-init user_data to the DigitalOcean API. When creating from a snapshot:

  1. Cloud-init re-runs — DO assigns a new droplet ID → instance-ID mismatch → cloud-init treats it as first boot
  2. The template is NOT idempotent — several critical issues:
    • users: directive fails or is skipped if user already exists
    • write_files: overwrites .zshrc (loses user customizations)
    • runcmd: ends with unconditional reboot
    • git config --global resets any user-modified values

Cloud-init is fundamentally a provisioning tool, not an idempotent configuration manager. Making the template fully idempotent is possible but would be a significant effort touching every directive.
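
The re-run in item 1 can be modelled in a few lines. This is a simplified sketch, not cloud-init's actual code: it assumes cloud-init caches the last seen instance ID on disk (typically /var/lib/cloud/data/instance-id) and that the DigitalOcean datasource reports the droplet ID as the instance ID. The function name is hypothetical.

```python
from __future__ import annotations


def treats_as_first_boot(cached_instance_id: str | None, current_instance_id: str) -> bool:
    """Return True when per-instance cloud-init modules would run again.

    cached_instance_id: ID baked into the snapshot's disk (None on a fresh image).
    current_instance_id: ID reported by the metadata service for this droplet.
    """
    if cached_instance_id is None:
        # Fresh image: nothing cached, so this really is the first boot.
        return True
    # A clone carries the source droplet's cached ID, which never matches
    # the new droplet's ID, so the clone looks like a first boot too.
    return cached_instance_id.strip() != current_instance_id.strip()
```

Under this model every clone of a snapshot hits the mismatch branch, which is why the users:, write_files: and runcmd: sections all fire again.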

Current State

| Component | Exists? | Notes |
|---|---|---|
| api.create_droplet_from_snapshot() | ✅ Yes | Used by wake command, takes snapshot ID, no user_data |
| dropkit create --image | ✅ Yes | But always sends cloud-init; image is a slug, not snapshot ID |
| dropkit wake | ✅ Yes | Restores from hibernation snapshot only (expects dropkit-<name> naming + metadata tags) |
| Snapshot-based create without cloud-init | ❌ No | The missing piece |

Proposed Approaches

Option A: --from-snapshot <id> flag on create (Recommended — simplest)

Add a --from-snapshot flag to dropkit create that:

  • Uses api.create_droplet_from_snapshot() instead of api.create_droplet()
  • Skips cloud-init rendering and sending entirely
  • Skips cloud-init completion monitoring
  • Still performs: wait for active, SSH config setup, project assignment
  • Optionally still runs Tailscale setup (snapshot may not have it)

# Mutually exclusive with --image
@app.command()
def create(
    # ... existing parameters ...
    from_snapshot: int | None = typer.Option(
        None, "--from-snapshot",
        help="Create from snapshot ID (skips cloud-init)",
    ),
    # ... remaining parameters ...
):
    ...  # body elided

Pros: Minimal change (~30 lines), reuses existing API method, clear intent
Cons: Slightly different code path within create, snapshot ID must be known by user
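
A rough sketch of how the branch inside create might look, under stated assumptions: the step names and the CreatePlan/plan_create helpers are hypothetical labels for illustration, not dropkit's real functions, and the step ordering is a guess. Only the two API method names come from the table above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CreatePlan:
    api_method: str                      # which api.* method would be called
    steps: list[str] = field(default_factory=list)


def plan_create(
    image: str | None = None,
    from_snapshot: int | None = None,
    tailscale: bool = False,
) -> CreatePlan:
    """Decide which steps a create run would perform (hypothetical model)."""
    if image is not None and from_snapshot is not None:
        raise ValueError("--image and --from-snapshot are mutually exclusive")

    if from_snapshot is not None:
        # Snapshot path: no cloud-init rendering, no user_data sent.
        plan = CreatePlan("create_droplet_from_snapshot")
    else:
        plan = CreatePlan("create_droplet", ["render_cloud_init"])

    plan.steps.append("wait_for_active")
    if from_snapshot is None:
        # Only a fresh install has a cloud-init run to wait for.
        plan.steps.append("monitor_cloud_init")
    plan.steps += ["setup_ssh_config", "assign_project"]
    if tailscale:
        plan.steps.append("setup_tailscale")
    return plan
```

Keeping the decision in one small function like this would make the snapshot path easy to unit-test without touching the DigitalOcean API.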

Option B: --no-cloud-init flag (More general)

A flag to skip cloud-init regardless of image source. Combined with --image <snapshot-id>:

dropkit create my-box --image 12345678 --no-cloud-init

Pros: More composable, works with any image scenario
Cons: Two flags needed, easy to forget --no-cloud-init with a snapshot (leading to the reboot-of-doom)
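
For illustration, a sketch of what --no-cloud-init could change in the request body. The field names (name, region, size, image, user_data) follow the DigitalOcean droplet-create request, where image accepts either a slug string or a numeric ID; build_droplet_payload itself is a hypothetical helper, and dropkit's real request code may be organized differently.

```python
from __future__ import annotations

from typing import Any


def build_droplet_payload(
    name: str,
    region: str,
    size: str,
    image: str,
    user_data: str | None = None,
    no_cloud_init: bool = False,
) -> dict[str, Any]:
    """Build a droplet-create request body, optionally without user_data."""
    payload: dict[str, Any] = {
        "name": name,
        "region": region,
        "size": size,
        # An all-digit value is treated as a snapshot/image ID, otherwise a slug.
        "image": int(image) if image.isdigit() else image,
    }
    if not no_cloud_init and user_data:
        payload["user_data"] = user_data
    return payload
```

The same helper also shows the failure mode named in the cons: call it with a numeric image and forget no_cloud_init=True, and user_data is still sent.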

Option C: Make cloud-init template idempotent (Long-term)

Refactor the template to be safe for re-execution:

  • Guard user creation: id {{ username }} || useradd ...
  • Use marker files: [ -f /etc/dropkit/.initialized ] || ...
  • Remove unconditional reboot; use cloud-init-per instance
  • Make write_files conditional or append-only

Pros: dropkit create --image <snapshot-id> "just works"
Cons: Significant template refactor, hard to test all edge cases, changes behavior for fresh installs too
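
The marker-file guard is the core of this option. In the template it would be a shell one-liner as shown in the bullets; the sketch below expresses the same logic in Python purely to make the behavior explicit and testable. run_once is a hypothetical helper, and the marker path is passed in rather than hard-coded.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def run_once(marker: Path, action: Callable[[], None]) -> bool:
    """Run action only if marker does not exist yet.

    Returns True if the action ran, False if it was skipped.
    The marker is written after the action, so a failed action
    (one that raises) is retried on the next boot.
    """
    if marker.exists():
        return False
    action()
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return True
```

Because the marker lives on disk, it is captured in the snapshot, so a clone would skip the guarded step even though cloud-init itself re-runs.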

Option D: New dropkit clone command (Most ergonomic)

Dedicated command for the clone workflow:

dropkit clone my-worker --from my-golden-image --count 5 --size s-4vcpu-8gb

Pros: Best UX, can add clone-specific features (auto-naming, parallel creation)
Cons: Largest scope, new command surface area
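
A minimal sketch of the two clone-specific features mentioned above, auto-naming and parallel creation. Everything here is hypothetical: the "<base>-<n>" naming scheme is an assumption modelled on the worker-$i loop in the use case, and create_one stands in for whatever function would actually create a single droplet and return its ID.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def clone_names(base: str, count: int) -> list[str]:
    """Generate names base-1 .. base-count."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [f"{base}-{i}" for i in range(1, count + 1)]


def clone_parallel(
    base: str,
    count: int,
    create_one: Callable[[str], int],
    max_workers: int = 4,
) -> dict[str, int]:
    """Create count clones concurrently; returns a name -> droplet ID mapping."""
    names = clone_names(base, count)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip lines up correctly.
        ids = list(pool.map(create_one, names))
    return dict(zip(names, ids))
```

A real implementation would also need to decide what happens when one of the creations fails partway through; in this sketch the first exception simply propagates out of pool.map.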

Recommendation

Start with Option A (--from-snapshot). It's the smallest change, reuses existing infrastructure, and solves the immediate need. Options C and D can follow later as enhancements.

Happy to implement whichever approach the team prefers!

🤖 Generated with Claude Code — your AI that learned the hard way that cloud-init and snapshots are like mixing sudo with optimism
