Merged
4 changes: 4 additions & 0 deletions .agents/skills/debug-openshell-cluster/SKILL.md
@@ -128,6 +128,10 @@ helm -n openshell get values openshell | grep -E 'repository|tag|supervisorImage

 The gateway image and `server.supervisorImage` should use the same build tag in branch and E2E deploys. A stale supervisor image can make sandbox behavior lag behind gateway policy or proto changes.

+For local/external pull mode (the default local path via `mise run cluster`), local images are tagged to the configured local registry base, pushed to that registry, and pulled by k3s via the `registries.yaml` mirror endpoint. The `cluster` task pushes prebuilt local tags (`openshell/*:dev`, falling back to `localhost:5000/openshell/*:dev` or `127.0.0.1:5000/openshell/*:dev`).
+
+Gateway image builds stage a partial Rust workspace from `deploy/docker/Dockerfile.images`. If cargo fails with a missing manifest under `/build/crates/...`, or an imported symbol exists locally but is missing in the image build, verify that every current gateway dependency crate, including `openshell-driver-docker`, `openshell-driver-kubernetes`, and `openshell-ocsf`, is copied into the staged workspace there.
+
 For plaintext local evaluation, confirm the chart has:

```bash
42 changes: 28 additions & 14 deletions architecture/gateway.md
@@ -9,11 +9,12 @@ workloads.

 - Authenticate clients and sandbox callbacks.
 - Serve gRPC APIs for sandbox lifecycle, provider management, policy updates,
-  settings, inference configuration, logs, and watch streams.
-- Serve HTTP endpoints for health, SSH tunnel upgrades, and edge-auth flows.
+  settings, inference configuration, logs, watch streams, and relay forwarding.
+- Serve HTTP endpoints for health, WebSocket tunnels, and edge-auth flows.
 - Persist domain objects in SQLite or Postgres.
 - Resolve provider credentials and inference bundles for sandbox supervisors.
-- Coordinate supervisor relay sessions for connect, exec, and file sync.
+- Coordinate supervisor relay sessions for connect, exec, file sync, and
+  service forwarding.

 The gateway does not enforce agent network policy at request time. That happens
 inside each sandbox, where the supervisor and proxy can observe local process
@@ -44,7 +45,7 @@ The gateway API is organized around platform objects and operational streams:

 | Area | Examples |
 |---|---|
-| Sandbox lifecycle | Create, list, delete, watch, exec, SSH session bootstrap. |
+| Sandbox lifecycle | Create, list, delete, watch, exec, SSH session bootstrap, ForwardTcp service forwarding. |
 | Providers | Store provider records, discover credentials, resolve runtime environment. |
 | Policy and settings | Get effective sandbox config, update sandbox policy, manage global settings. |
 | Inference | Set gateway-level model/provider config and resolve sandbox route bundles. |
@@ -115,22 +116,35 @@ sequenceDiagram
     participant CLI
     participant GW as Gateway
     participant SUP as Sandbox supervisor
-    participant SSH as Sandbox SSH socket
+    participant Target as Sandbox target

     SUP->>GW: ConnectSupervisor stream
-    CLI->>GW: connect / exec / sync request
-    GW->>SUP: RelayOpen(channel)
+    CLI->>GW: ForwardTcp / exec / sync request
+    GW->>SUP: RelayOpen(channel, target)
+    SUP->>Target: Dial SSH socket or loopback service
     SUP->>GW: RelayStream(channel)
-    SUP->>SSH: Bridge bytes to Unix socket
     CLI->>GW: Client bytes
     GW-->>CLI: Client bytes
     GW->>SUP: Relay bytes
     SUP-->>GW: Relay bytes
```

-The same relay pattern backs interactive SSH, command execution, and file sync.
-The gateway tracks live sessions in memory and persists session records so
-tokens can expire or be revoked.
+The same relay pattern backs interactive SSH, command execution, file sync, and
+local service forwarding. The gateway tracks live sessions in memory and
+persists session records so tokens can expire or be revoked.
+
+`ForwardTcp` is the client-facing byte stream for SSH and service forwarding.
+The first frame is a `TcpForwardInit` that carries the sandbox ID, an
+authorization token from `CreateSshSession`, and an explicit target:
+`target.ssh` for the sandbox SSH socket or `target.tcp` for a loopback service
+inside the sandbox. The gateway validates the token and sandbox readiness,
+sends a targeted `RelayOpen` to the supervisor, then bridges
+`TcpForwardFrame::Data` to `RelayFrame::Data` until either side closes.
+
+For `target.tcp`, the gateway only accepts loopback destinations such as
+`localhost`, `127.0.0.0/8`, or `::1`. The gateway never needs to know or dial a
+sandbox pod IP; supervisors connect outbound and bridge only the explicit target
+requested for that relay.
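
The loopback restriction on `target.tcp` described above can be illustrated with a small check. This is a minimal sketch, not the gateway's actual validation code: `is_loopback_target` is a hypothetical helper name, and accepting bracketed IPv6 literals is an assumption. It relies on the standard library's `IpAddr::is_loopback`, which covers `127.0.0.0/8` and `::1`.

```rust
use std::net::IpAddr;

/// Hypothetical helper (not from the PR): returns true when `host` names a
/// loopback destination such as `localhost`, `127.0.0.0/8`, or `::1`.
fn is_loopback_target(host: &str) -> bool {
    // The literal name `localhost` is accepted without DNS resolution.
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    // Assumption: bracketed IPv6 literals such as "[::1]" are allowed too.
    let bare = host
        .strip_prefix('[')
        .and_then(|h| h.strip_suffix(']'))
        .unwrap_or(host);
    // Anything that is not an IP literal is rejected rather than resolved.
    bare.parse::<IpAddr>()
        .map(|ip| ip.is_loopback())
        .unwrap_or(false)
}
```

Under these assumptions, `127.8.9.10` and `::1` are accepted, while `10.0.0.5` and arbitrary hostnames are rejected. A real implementation would also have to decide how to treat IPv4-mapped IPv6 addresses, which `is_loopback` does not consider loopback.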

## PKI Bootstrap

@@ -143,13 +157,13 @@ created. Both deployment paths use it:
 | Filesystem | `--output-dir <DIR>` | `<dir>/{ca.crt, ca.key, server/tls.{crt,key}, client/tls.{crt,key}}`. Also copies client materials to `$XDG_CONFIG_HOME/openshell/gateways/openshell/mtls/` for CLI auto-discovery. |

 On Kubernetes, the Helm chart runs the command via a pre-install/pre-upgrade
-hook Job using the gateway image itself no separate cert-generation image,
+hook Job using the gateway image itself -- no separate cert-generation image,
 no extra mirror burden in air-gapped environments. On the RPM gateway, the
 same command runs from the systemd unit's `ExecStartPre` to bootstrap PKI
 into the user's state directory on first start.

-Both modes share the same idempotency contract: all targets present skip;
-partial state fail with a recovery hint; nothing present generate and
+Both modes share the same idempotency contract: all targets present -> skip;
+partial state -> fail with a recovery hint; nothing present -> generate and
 write. This guards mTLS continuity across restarts and upgrades while still
 recovering cleanly if an operator deletes everything and starts over.
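
The three-way idempotency contract above can be sketched as a pure decision function. This is an illustrative sketch only: `BootstrapAction` and `plan_bootstrap` are hypothetical names, and the real command presumably inspects Kubernetes secrets or files rather than a slice of flags.

```rust
/// Hypothetical outcome of the PKI bootstrap check (names are illustrative).
#[derive(Debug, PartialEq, Eq)]
enum BootstrapAction {
    /// Every target already exists: leave it untouched.
    Skip,
    /// Nothing exists yet: generate and write everything.
    Generate,
    /// Some targets exist and some do not: refuse and report what is missing.
    FailPartial { missing: Vec<String> },
}

/// Decide what to do from a list of `(target name, is present)` pairs.
fn plan_bootstrap(targets: &[(&str, bool)]) -> BootstrapAction {
    let missing: Vec<String> = targets
        .iter()
        .filter(|(_, present)| !present)
        .map(|(name, _)| name.to_string())
        .collect();

    if missing.is_empty() {
        BootstrapAction::Skip
    } else if missing.len() == targets.len() {
        BootstrapAction::Generate
    } else {
        BootstrapAction::FailPartial { missing }
    }
}
```

Returning the missing names in the partial case is one way to build the recovery hint the contract calls for. Failing instead of regenerating avoids silently replacing a CA that existing clients still trust.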

1 change: 1 addition & 0 deletions crates/openshell-cli/Cargo.toml
@@ -68,6 +68,7 @@ tokio-tungstenite = { workspace = true }

 # Streams
 futures = { workspace = true }
+tokio-stream = { workspace = true }
 nix = { workspace = true }

 # URL parsing
45 changes: 44 additions & 1 deletion crates/openshell-cli/src/main.rs
@@ -8,6 +8,7 @@ use clap_complete::engine::ArgValueCompleter;
 use clap_complete::env::CompleteEnv;
 use miette::Result;
 use owo_colors::OwoColorize;
+use std::collections::HashMap;
 use std::io::Write;
 use std::path::PathBuf;

@@ -266,6 +267,7 @@ const FORWARD_EXAMPLES: &str = "\x1b[1mALIAS\x1b[0m
 \x1b[1mEXAMPLES\x1b[0m
 $ openshell forward start 8080
 $ openshell forward start 3000 my-sandbox
+$ openshell forward service my-sandbox --target-port 8000 --local 8000
 $ openshell forward stop 8080
 $ openshell forward list
 ";
@@ -1612,6 +1614,26 @@ enum ForwardCommands {
     /// List active port forwards.
     #[command(help_template = LEAF_HELP_TEMPLATE, next_help_heading = "FLAGS")]
     List,
+
+    /// Forward a local TCP port to a loopback service inside a sandbox over gRPC.
+    #[command(help_template = LEAF_HELP_TEMPLATE, next_help_heading = "FLAGS")]
+    Service {
+        /// Sandbox name (defaults to last-used sandbox).
+        #[arg(add = ArgValueCompleter::new(completers::complete_sandbox_names))]
+        name: Option<String>,
+
+        /// Target service port inside the sandbox.
+        #[arg(long)]
+        target_port: u16,
+
+        /// Target service host inside the sandbox. Phase 1 accepts loopback only.
+        #[arg(long, default_value = "127.0.0.1")]
+        target_host: String,
+
+        /// Local bind address and port: `[bind_address:]port`. Defaults to the target port. Use port 0 for dynamic assignment.
+        #[arg(long)]
+        local: Option<String>,
+    },
 }

 #[tokio::main]
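
The `--local` flag above takes a `[bind_address:]port` spec. The PR excerpt does not show how `run::service_forward_tcp` parses it, so the following is only a sketch of one plausible approach. `LocalBind` and `parse_local_spec` are hypothetical names, and the `127.0.0.1` default bind address is an assumption.

```rust
/// Hypothetical parsed form of a `[bind_address:]port` spec.
#[derive(Debug, PartialEq, Eq)]
struct LocalBind {
    host: String,
    port: u16,
}

/// Parse `[bind_address:]port`. Port 0 is allowed and means "let the OS pick".
fn parse_local_spec(spec: &str) -> Result<LocalBind, String> {
    // Split on the last ':' so a bracketed IPv6 bind address keeps its colons.
    let (host, port_str) = match spec.rsplit_once(':') {
        Some((host, port)) => (host, port),
        // Assumption: with no bind address, listen on IPv4 loopback only.
        None => ("127.0.0.1", spec),
    };
    if host.is_empty() {
        return Err(format!("missing bind address in `{spec}`"));
    }
    // `u16` parsing rejects non-numeric input and ports above 65535.
    let port: u16 = port_str
        .parse()
        .map_err(|e| format!("invalid port `{port_str}`: {e}"))?;
    Ok(LocalBind {
        host: host.to_string(),
        port,
    })
}
```

With this sketch, `8000` becomes `127.0.0.1:8000`, `0.0.0.0:0` requests a dynamically assigned port on all interfaces, and `70000` is rejected as out of range.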
@@ -1854,6 +1876,27 @@ async fn main() -> Result<()> {
                     }
                 }
             }
+            ForwardCommands::Service {
+                name,
+                target_port,
+                target_host,
+                local,
+            } => {
+                let ctx = resolve_gateway(&cli.gateway, &cli.gateway_endpoint)?;
+                let mut tls = tls.with_gateway_name(&ctx.name);
+                apply_auth(&mut tls, &ctx.name);
+                let name = resolve_sandbox_name(name, &ctx.name)?;
+                let local = local.unwrap_or_else(|| target_port.to_string());
+                run::service_forward_tcp(
+                    &ctx.endpoint,
+                    &name,
+                    Some(&local),
+                    &target_host,
+                    target_port,
+                    &tls,
+                )
+                .await?;
+            }
             ForwardCommands::Start {
                 port,
                 name,
@@ -2237,7 +2280,7 @@ async fn main() -> Result<()> {
             };

             // Parse --label flags into a HashMap<String, String>.
-            let mut labels_map = std::collections::HashMap::new();
+            let mut labels_map = HashMap::new();
             for label_str in &labels {
                 let parts: Vec<&str> = label_str.splitn(2, '=').collect();
                 if parts.len() != 2 {