open-gitagent/gitmachine-python

GitMachine

A Python package for running git-aware sandboxed virtual machines. Clone a repo into an isolated VM, run commands, and auto-commit results back -- with branch-based session persistence.

GitMachine is the infrastructure layer for GitAgent. It handles the VM lifecycle and git operations so that agent orchestration layers can focus on agent logic.

Install

pip install gitmachine

Quick Start

import asyncio
from gitmachine import GitMachine, E2BMachine

machine = E2BMachine(api_key="e2b_...")

gm = GitMachine(
    machine=machine,
    repository="https://github.com/user/my-agent.git",
    token="ghp_...",
    session="feature-work",
    auto_commit=True,
    env={"ANTHROPIC_API_KEY": "sk-..."},
    on_start=lambda gm: gm.run("pip install -r requirements.txt"),
    on_event=lambda event, data, gm: print(f"[{event}]", data),
)

async def main():
    await gm.start()   # VM up -> repo cloned -> branch checked out -> on_start runs

    result = await gm.run("python agent.py --review")
    print(result.stdout)

    await gm.pause()    # auto-commits -> VM paused

    # ... later ...

    await gm.resume()   # VM comes back, repo state intact
    await gm.run("python agent.py --continue")

    await gm.stop()     # auto-commits -> pushes -> VM torn down

asyncio.run(main())

Architecture

+------------------------------------------+
|              GitMachine                  |
|  git clone / commit / push / sessions    |
|  auto-commit on pause/stop               |
+------------------------------------------+
|              Machine (ABC)               |
|  start / pause / resume / stop           |
|  execute / read_file / write_file        |
+------------------------------------------+
|            E2BMachine                    |
|  E2B AsyncSandbox wrapper                |
|  Full API access via get_sandbox()       |
+------------------------------------------+

Machine -- Abstract base class. Any VM provider (E2B, Fly, bare metal) implements this.

E2BMachine -- Concrete implementation using E2B sandboxes. Thin wrapper that exposes the full E2B SDK via get_sandbox().

GitMachine -- Wraps any Machine with git lifecycle management. Clones a repo on start, auto-commits on pause/stop, persists sessions as branches.

API

E2BMachine

machine = E2BMachine(
    api_key="...",          # defaults to E2B_API_KEY env var
    template="base",        # default "base"
    timeout=300,            # seconds, default 300
    envs={"KEY": "val"},    # sandbox env vars
    metadata={"k": "v"},    # sandbox metadata
)

Implements the Machine interface and exposes additional E2B-specific methods:

Method                Description
get_sandbox()         Raw AsyncSandbox instance for full API access
get_sandbox_id()      Current sandbox ID
set_timeout(seconds)  Extend sandbox timeout
get_info()            Sandbox metadata
make_dir(path)        Create directory
remove(path)          Remove file/directory
exists(path)          Check if path exists
is_running()          Check sandbox status
get_host(port)        Get host address for a sandbox port

Reconnect to an existing sandbox:

machine = await E2BMachine.connect("sandbox_id", api_key="...")

GitMachine

gm = GitMachine(
    machine=machine,                         # VM provider instance
    repository="https://github.com/u/r.git", # git repo URL (https)
    token="ghp_...",                          # PAT for git auth
    on_start=async_callback,                  # called after clone, before work
    on_end=async_callback,                    # called before stop
    env={"KEY": "val"},                       # env vars for all commands
    timeout=300,                              # seconds
    session="branch-name",                    # branch name for persistence
    auto_commit=True,                         # default True
    on_event=event_callback,                  # lifecycle event callback
)

Lifecycle

Method    Description
start()   Start VM -> clone repo -> checkout session branch -> run on_start
pause()   Auto-commit (if enabled) -> pause VM
resume()  Resume VM
stop()    Auto-commit -> push -> run on_end -> kill VM

Git Operations

Method                Description
diff()                Git diff against HEAD
commit(message=None)  Stage all + commit, returns SHA or None if clean
push()                Push to origin
pull()                Pull from origin
hash()                Current HEAD SHA

All git operations use the while_running pattern -- if the machine is paused, they transparently resume, do the work, and re-pause.
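The README does not show how while_running is implemented. As a rough sketch of the idea only, the pattern can be written as an async context manager that resumes a paused machine, yields, and re-pauses afterwards. FakeMachine, State, and the while_running function here are hypothetical stand-ins, not GitMachine's own code.

```python
import asyncio
from contextlib import asynccontextmanager
from enum import Enum


class State(Enum):
    # Stand-in for gitmachine's MachineState; member values are assumed.
    RUNNING = "running"
    PAUSED = "paused"


class FakeMachine:
    """Hypothetical in-memory machine used only to demonstrate the pattern."""

    def __init__(self) -> None:
        self.state = State.PAUSED
        self.calls = []  # records the order of operations

    async def resume(self) -> None:
        self.calls.append("resume")
        self.state = State.RUNNING

    async def pause(self) -> None:
        self.calls.append("pause")
        self.state = State.PAUSED


@asynccontextmanager
async def while_running(machine: FakeMachine):
    """Resume a paused machine for the duration of the block, then re-pause."""
    was_paused = machine.state is State.PAUSED
    if was_paused:
        await machine.resume()
    try:
        yield machine
    finally:
        # Re-pause only if this helper was the one that resumed the machine.
        if was_paused:
            await machine.pause()


async def demo() -> list:
    machine = FakeMachine()
    async with while_running(machine):
        machine.calls.append("git diff")  # placeholder for the real git work
    return machine.calls
```

Tracking was_paused means a machine that was already running is left running, which matches the "transparently" wording above.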

Runtime

Method                            Description
run(command, options=None)        Execute a command in the sandbox
update(env=None, on_update=None)  Update environment variables
logs()                            Get command execution history

Properties

Property  Description
state     Current MachineState (idle/running/paused/stopped)
path      Repo path inside the VM
id        Machine / sandbox ID
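The four states and the Lifecycle methods suggest a small state machine. The sketch below is an inference from this README, not the library's code: the enum is a local stand-in for MachineState, its member values are assumed, and the transition table (including stop from a paused state) is a guess at the intended behaviour.

```python
from enum import Enum


class MachineState(Enum):
    # Stand-in for gitmachine.MachineState; member values are assumed.
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"


# Allowed transitions, inferred from the Lifecycle section (an assumption).
TRANSITIONS = {
    ("start", MachineState.IDLE): MachineState.RUNNING,
    ("pause", MachineState.RUNNING): MachineState.PAUSED,
    ("resume", MachineState.PAUSED): MachineState.RUNNING,
    ("stop", MachineState.RUNNING): MachineState.STOPPED,
    ("stop", MachineState.PAUSED): MachineState.STOPPED,
}


def next_state(state: MachineState, action: str) -> MachineState:
    """Return the state reached by applying action, or raise ValueError."""
    try:
        return TRANSITIONS[(action, state)]
    except KeyError:
        raise ValueError(f"cannot {action} from state {state.value}") from None
```

A table like this is handy in tests for custom providers, where each lifecycle call can be checked against the expected resulting state.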

Reconnect to an already-running GitMachine:

machine = await E2BMachine.connect(sandbox_id, api_key="...")
gm = await GitMachine.connect(
    machine,
    repository="https://github.com/user/repo.git",
    token="ghp_...",
    session="feature-work",
)

Events

The on_event callback receives lifecycle events:

Event        Data
started      {"session": ..., "repo_path": ...}
paused       {}
resumed      {}
stopping     {}
stopped      {}
committed    {"sha": ..., "message": ...}
pushed       {}
pulled       {}
reconnected  {"id": ...}
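A handler for these events can be an ordinary function. The sketch below assumes the three-argument signature used by the Quick Start lambda (event, data, gm) and the payload keys listed above; make_event_recorder is a hypothetical helper name, not part of the package.

```python
from typing import Any, Callable

# Signature follows the Quick Start lambda: (event, data, gm).
EventHandler = Callable[[str, dict, Any], None]


def make_event_recorder(log: list) -> EventHandler:
    """Return an on_event-style callback that appends (event, data) to log."""

    def on_event(event: str, data: dict, gm: Any) -> None:
        # Copy the payload so later mutation cannot alter the record.
        log.append((event, dict(data)))
        if event == "committed":
            sha = str(data.get("sha", ""))
            print(f"commit {sha[:7]}: {data.get('message', '')}")

    return on_event
```

It would be passed as on_event=make_event_recorder(events) when constructing GitMachine, leaving events as a plain list to inspect afterwards.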

Sessions

Sessions map to git branches. When you specify a session:

  1. Start -- checks out the branch (creates it if new)
  2. Pause -- auto-commits all changes to the branch
  3. Stop -- auto-commits and pushes to the branch
  4. Next run -- resumes from the same branch state

This gives you persistent, resumable agent sessions backed by git.
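The README does not list the git commands behind these steps. One plausible mapping onto the plain git CLI is sketched below; session_commands is a hypothetical helper, the default commit message is made up, and GitMachine's actual commands may differ.

```python
import shlex


def session_commands(session: str, message: str = "auto-commit") -> dict:
    """Build shell commands for the start, pause, and stop steps of a session."""
    branch = shlex.quote(session)  # guard against spaces or shell metacharacters
    msg = shlex.quote(message)
    commit = f"git add -A && git commit -m {msg}"
    return {
        # Check out the branch; fall back to creating it if it does not exist.
        "start": f"git checkout {branch} 2>/dev/null || git checkout -b {branch}",
        # Stage everything and commit.
        "pause": commit,
        # ';' rather than '&&' so the push still runs when there is nothing to commit.
        "stop": f"{commit}; git push origin {branch}",
    }
```

The fallback in the start command relies on git checkout creating a local tracking branch when a remote branch with that name exists, which is how a later run picks up the pushed session state.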

Patterns

Sequential

await gm.start()
result = await gm.run("python agent.py")
await gm.pause()
# ... later ...
await gm.resume()
await gm.run("python agent.py --continue")
await gm.stop()

Fire-and-Forget

import asyncio

await gm.start()

# Kick off work in the background
async def background_work():
    await gm.run("python long_task.py")
    await gm.stop()

task = asyncio.create_task(background_work())

# Save the sandbox ID for later reconnection
sandbox_id = gm.id

# ... in a different process / request ...

machine = await E2BMachine.connect(sandbox_id, api_key="...")
gm = await GitMachine.connect(
    machine,
    repository="https://github.com/user/repo.git",
    token="ghp_...",
    session="bg-work",
)
print(gm.state)  # MachineState.RUNNING
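Reconnecting from a different process means the sandbox ID has to be stored somewhere both processes can reach. A minimal sketch, assuming a local JSON file is acceptable; save_session and load_session are hypothetical helpers and not part of GitMachine.

```python
import json
from pathlib import Path


def save_session(path: Path, sandbox_id: str, session: str) -> None:
    """Write the values needed to reconnect later."""
    path.write_text(json.dumps({"sandbox_id": sandbox_id, "session": session}))


def load_session(path: Path) -> tuple:
    """Read back (sandbox_id, session) written by save_session."""
    record = json.loads(path.read_text())
    return record["sandbox_id"], record["session"]
```

Storing the session name next to the ID keeps the branch passed to GitMachine.connect consistent with the one the machine was started with.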

Extending

Custom Machine Provider

from gitmachine import Machine, MachineState, ExecutionResult

class FlyMachine(Machine):
    def __init__(self, region: str) -> None: ...  # accepts the region passed below

    @property
    def id(self) -> str | None: ...

    @property
    def state(self) -> MachineState: ...

    async def start(self) -> None: ...
    async def pause(self) -> None: ...
    async def resume(self) -> None: ...
    async def stop(self) -> None: ...

    async def execute(self, command, options=None) -> ExecutionResult: ...
    async def read_file(self, path) -> str: ...
    async def write_file(self, path, content) -> None: ...
    async def list_files(self, path) -> list[str]: ...

# Use it with GitMachine
gm = GitMachine(
    machine=FlyMachine(region="ord"),
    repository="...",
    token="...",
)
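To make the interface concrete, here is a sketch of a provider that runs commands on the local host in a working directory. It is illustrative only and offers no isolation. So that the block runs without gitmachine installed, MachineState and ExecutionResult are redefined as local stand-ins (the real ExecutionResult fields may differ; only stdout appears in this README) and LocalMachine does not subclass Machine. In real use it would subclass Machine and import those types from gitmachine.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Optional


class MachineState(Enum):  # stand-in; import from gitmachine in real code
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"


@dataclass
class ExecutionResult:  # stand-in; the real class's fields may differ
    exit_code: int
    stdout: str
    stderr: str


class LocalMachine:
    """Hypothetical provider that runs commands on the local host."""

    def __init__(self, workdir: str) -> None:
        self._workdir = Path(workdir)
        self._state = MachineState.IDLE

    @property
    def id(self) -> Optional[str]:
        # No ID until the machine has been started.
        return None if self._state is MachineState.IDLE else str(self._workdir)

    @property
    def state(self) -> MachineState:
        return self._state

    async def start(self) -> None:
        self._workdir.mkdir(parents=True, exist_ok=True)
        self._state = MachineState.RUNNING

    async def pause(self) -> None:
        self._state = MachineState.PAUSED  # nothing to suspend locally

    async def resume(self) -> None:
        self._state = MachineState.RUNNING

    async def stop(self) -> None:
        self._state = MachineState.STOPPED

    async def execute(self, command: str, options=None) -> ExecutionResult:
        # Run the command through the shell inside the working directory.
        proc = await asyncio.create_subprocess_shell(
            command,
            cwd=self._workdir,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        out, err = await proc.communicate()
        return ExecutionResult(proc.returncode, out.decode(), err.decode())

    async def read_file(self, path: str) -> str:
        return (self._workdir / path).read_text()

    async def write_file(self, path: str, content: str) -> None:
        (self._workdir / path).write_text(content)

    async def list_files(self, path: str) -> list:
        return sorted(p.name for p in (self._workdir / path).iterdir())
```

A provider like this is mainly useful for exercising orchestration code in tests without creating a remote sandbox.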

License

MIT
