feat(actor-template): per-container securityContext #73

Open
Davanum Srinivas (dims) wants to merge 4 commits into agent-substrate:main from dims:feat/actor-template-capabilities

Davanum Srinivas (dims) commented May 24, 2026

Add an opt-in securityContext block on ActorTemplate.spec.containers[], plumbed through ateletpb to atelet's OCI bundle builder. Templates that omit it produce the same OCI bundle as before.

Two fields are exposed:

  • capabilities.add: Linux capabilities to grant on top of the default sandbox set (CAP_AUDIT_WRITE, CAP_KILL, CAP_NET_BIND_SERVICE). Entries may be written with or without the CAP_ prefix; case is normalised; entries that duplicate a default are dropped.

  • runAsUser / runAsGroup: the UID and GID to start the container process as. Leaving them unset preserves atelet's existing default of root.

The motivating workload is NVIDIA OpenShell's openshell-sandbox supervisor, which needs CAP_NET_ADMIN, CAP_SETUID, and CAP_SETGID to configure the actor's network and user namespaces, and a non-root start UID for the supervisor process itself. Capabilities alone are not enough: the entry point still runs as root until something drops privileges.

Test plan:

  • go vet clean on touched packages
  • Unit tests for resolveCapabilities: defaults, prefix normalisation, case folding, dedup, blank-entry skip
  • ContainerSecurityContext DeepCopy round-trip with pointer-isolation assertions for Capabilities.Add and RunAsUser
  • cmd/ateapi/internal/controlapi workflow tests pass with the new copy block in resume + suspend
  • kind end-to-end with a template that opts into both fields
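
The pointer-isolation check in the DeepCopy item can be illustrated with a minimal sketch. The struct shapes below are assumptions reconstructed from the field names in this PR (whether `Capabilities` is a pointer, for instance, is a guess), and the hand-written `DeepCopy` stands in for the generated one:

```go
package main

import "fmt"

// Capabilities and ContainerSecurityContext are assumed shapes, not the
// real API types; only the field names come from the PR text.
type Capabilities struct {
	Add []string
}

type ContainerSecurityContext struct {
	Capabilities *Capabilities
	RunAsUser    *int64
	RunAsGroup   *int64
}

// DeepCopy is a hand-written stand-in for the generated deep-copy method.
// Every pointer and slice is re-allocated so the copy shares no memory
// with the original.
func (in *ContainerSecurityContext) DeepCopy() *ContainerSecurityContext {
	if in == nil {
		return nil
	}
	out := &ContainerSecurityContext{}
	if in.Capabilities != nil {
		out.Capabilities = &Capabilities{}
		if in.Capabilities.Add != nil {
			out.Capabilities.Add = make([]string, len(in.Capabilities.Add))
			copy(out.Capabilities.Add, in.Capabilities.Add)
		}
	}
	if in.RunAsUser != nil {
		v := *in.RunAsUser
		out.RunAsUser = &v
	}
	if in.RunAsGroup != nil {
		v := *in.RunAsGroup
		out.RunAsGroup = &v
	}
	return out
}

func main() {
	uid := int64(1000)
	orig := &ContainerSecurityContext{
		Capabilities: &Capabilities{Add: []string{"NET_ADMIN"}},
		RunAsUser:    &uid,
	}
	cp := orig.DeepCopy()

	// Mutate the copy; the original must be unaffected.
	cp.Capabilities.Add[0] = "SETUID"
	*cp.RunAsUser = 0

	fmt.Println(orig.Capabilities.Add[0], *orig.RunAsUser) // NET_ADMIN 1000
}
```

A shallow copy would fail this check, because the slice backing array and the `*int64` would be shared between the two values.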

Add an opt-in `securityContext.capabilities.add` field on
`ActorTemplate.spec.containers[]`. Templates that omit it produce the
same OCI bundle as before: the default sandbox set
(`CAP_AUDIT_WRITE`, `CAP_KILL`, `CAP_NET_BIND_SERVICE`) still applies
unconditionally, and the `add` list extends it.

The field plumbs through `ateletpb` to atelet's OCI bundle builder:

  ActorTemplate.spec.containers[].securityContext.capabilities.add
    → ateletpb.Container.security_context.capabilities.add
    → resolveCapabilities() in cmd/atelet/oci.go
    → process.capabilities.{Bounding,Effective,Inheritable,Permitted}

`resolveCapabilities` normalises each entry to its `CAP_…` form so
templates may write either `NET_ADMIN` or `CAP_NET_ADMIN`, and
de-duplicates against the default set. The pause container always
uses the default set unmodified — it never carries the workload's
capabilities.
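
The normalisation rules described above could look roughly like the sketch below. It is an illustration only: `resolveCapabilitiesSketch` and `defaultCaps` are hypothetical names, and the signature and output ordering of the real `resolveCapabilities` in cmd/atelet/oci.go may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultCaps is the default sandbox set named in this commit message.
var defaultCaps = []string{"CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE"}

// resolveCapabilitiesSketch is a hypothetical stand-in for
// resolveCapabilities. It returns the defaults followed by any new,
// normalised entries from add (this ordering is an assumption).
func resolveCapabilitiesSketch(add []string) []string {
	out := make([]string, 0, len(defaultCaps)+len(add))
	seen := make(map[string]bool, len(defaultCaps)+len(add))
	for _, c := range defaultCaps {
		seen[c] = true
		out = append(out, c)
	}
	for _, raw := range add {
		// Fold case and trim whitespace so "net_admin" matches "NET_ADMIN".
		name := strings.ToUpper(strings.TrimSpace(raw))
		if name == "" {
			continue // blank entries are skipped
		}
		if !strings.HasPrefix(name, "CAP_") {
			name = "CAP_" + name
		}
		if seen[name] {
			continue // already in the defaults or added earlier
		}
		seen[name] = true
		out = append(out, name)
	}
	return out
}

func main() {
	fmt.Println(resolveCapabilitiesSketch([]string{"net_admin", "CAP_SETUID", " ", "kill"}))
	// [CAP_AUDIT_WRITE CAP_KILL CAP_NET_BIND_SERVICE CAP_NET_ADMIN CAP_SETUID]
}
```

In this sketch the pause container would simply receive `resolveCapabilitiesSketch(nil)`, which is the default set unmodified.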

The motivating workload is NVIDIA OpenShell's `openshell-sandbox`
supervisor, which needs `CAP_NET_ADMIN`, `CAP_SETUID`, `CAP_SETGID`
to configure the actor's network and user namespaces before
launching the inner workload. A gVisor compatibility spike confirmed
runsc honours the OCI cap set exactly: granting `CAP_SETUID` and
`CAP_SETGID` unblocks `setresuid` inside the actor, while
`unshare(CLONE_NEWNET)` remains refused regardless of caps
(architectural refusal in the sentry, unrelated to capability bits).

Tests cover `resolveCapabilities` normalisation/dedup/blank-handling
and round-trip DeepCopy of `ContainerSecurityContext`.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Add `RunAsUser` and `RunAsGroup` (both `*int64`) to
`ContainerSecurityContext`, plumbed through `ateletpb.SecurityContext`
to atelet's OCI bundle builder. Unset fields preserve the existing
behaviour of `Process.User.{UID,GID} = 0` (root); a set value lands
directly in the OCI spec for that container.

The pause container always runs as root: `prepareOCIDirectory` is
called with `0, 0` from both the run and the restore paths. Only the
application container call sites read from
`ctr.GetSecurityContext().GetRunAs{User,Group}()`.

The proto fields are bare `int64` rather than `optional int64`. At the
proto boundary "unset" and "0" both mean root, and atelet's OCI bundle
builder collapses them into the same `Process.User` block, so the
extra nullability buys nothing on the wire. The CRD shape keeps
`*int64` so K8s users can express the usual "unset vs. explicit 0"
distinction in YAML even though the runtime ignores it.
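
As a rough sketch of that collapse, the conversion from the CRD's `*int64` fields to bare `int64` values might look like this. The `crdSecurityContext` type and `wireIDs` helper are hypothetical names introduced for illustration, not the code in this commit:

```go
package main

import "fmt"

// crdSecurityContext mirrors only the two CRD fields discussed above;
// it is an assumed shape, not the real ContainerSecurityContext.
type crdSecurityContext struct {
	RunAsUser  *int64
	RunAsGroup *int64
}

// wireIDs is a hypothetical helper that flattens the nullable CRD fields
// into bare int64 values. A nil context, a nil field, and an explicit 0
// all come out as 0, which the runtime treats as root.
func wireIDs(sc *crdSecurityContext) (uid, gid int64) {
	if sc == nil {
		return 0, 0
	}
	if sc.RunAsUser != nil {
		uid = *sc.RunAsUser
	}
	if sc.RunAsGroup != nil {
		gid = *sc.RunAsGroup
	}
	return uid, gid
}

func main() {
	zero, nonRoot := int64(0), int64(1000)
	fmt.Println(wireIDs(nil))                                                     // 0 0
	fmt.Println(wireIDs(&crdSecurityContext{RunAsUser: &zero}))                   // 0 0
	fmt.Println(wireIDs(&crdSecurityContext{RunAsUser: &nonRoot, RunAsGroup: &nonRoot})) // 1000 1000
}
```

The first two calls are indistinguishable after conversion, which is why an `optional int64` on the wire would carry no extra information for the runtime.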

This is the field that actually makes the actor *start* at a non-root
UID. `Capabilities.Add` alone (12b) only enables `setresuid` inside
the running process — useful for supervisors that drop privileges
mid-startup, but the entry point still runs as root until they do.
NVIDIA OpenShell's `prepare_filesystem` step requires `CAP_CHOWN` plus
this field together to chown the workload's `read_write` paths and
hand the namespace over to a non-root supervisor.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>

Davanum Srinivas (dims) changed the title from "[WIP] Feat/actor template capabilities" to "feat(actor-template): per-container securityContext" on May 24, 2026

CI's `./hack/verify/gofmt.sh` rejected the single-space alignment
around the `nil,` / `0, 0,` comments at the pause-container call sites.
gofmt wants the comment column aligned across the `nil,` and `0, 0,`
lines (the longer prefix wins), which means two spaces after `nil,`.

No behaviour change; whitespace only.
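
The alignment rule can be reproduced with the standard library's `go/format` package, which applies gofmt's formatting. The `prepare` call below is a hypothetical stand-in for the pause-container call site, not the real `prepareOCIDirectory` signature:

```go
package main

import (
	"fmt"
	"go/format"
)

// misaligned mimics the shape of the rejected call site: a single space
// before each trailing comment. The function name and arguments are
// illustrative assumptions.
const misaligned = "package p\n\n" +
	"func f() {\n" +
	"\tprepare(\n" +
	"\t\tnil, // no per-container security context\n" +
	"\t\t0, 0, // uid, gid: pause container runs as root\n" +
	"\t)\n" +
	"}\n"

// gofmtSource formats src the way gofmt would and returns the result.
func gofmtSource(src string) (string, error) {
	out, err := format.Source([]byte(src))
	return string(out), err
}

func main() {
	out, err := gofmtSource(misaligned)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Running it prints the call with the two comments in one column, so the `nil,` line gains a second space while the `0, 0,` line keeps one.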

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The `SecurityContext` field godoc on `Container` named a specific
downstream consumer as the motivating workload. Substrate's public
surface should describe the field in generic terms: any workload that
sets up its own network or user namespaces — for example a privileged
supervisor that hands off to a less-privileged inner process — may
need additional capabilities beyond the default sandbox set.

Also retitle the `TestContainerSecurityContextDeepCopy` fixture
container from a workload-specific name to a generic `app` /
`registry.example/app:test`.

Regenerated `manifests/ate-install/generated/ate.dev_actortemplates.yaml`
to flush the godoc change through to the CRD description.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Davanum Srinivas (dims) added a commit to dims/openshell-driver-substrate that referenced this pull request May 24, 2026
…text)

The substrate-side PR #73 — per-container `securityContext` on
`ActorTemplate.spec.containers[]` with both `capabilities.add` and
`runAsUser` / `runAsGroup` — is the field that lets this driver's
`synthesize_template` start emitting capability adds and a non-root
supervisor start UID once it merges. Templates that omit the block
produce the same OCI bundle as before; the block is opt-in per container.

Surface the PR in three places: the top-of-doc header in poc-intro
(alongside #66 and #67), the §3 "Companion changes" component table,
and the §9 "Where to next" item 8 that was previously an open TODO
about capability plumbing.

Also tidy the embedded `~/notes/...` references in poc-intro: the
local agent-substrate notes (kind-local-dev runbook, Shorewall recipe)
moved from `~/notes/` to `~/notes/agent-substrate/` to mirror the
existing `~/notes/openshell-on-substrate/` layout.

Signed-off-by: Davanum Srinivas <dsrinivas@nvidia.com>