Sandbox pods reference

Exhaustive reference for pod-per-run sandbox execution: configuration flags, pod identity and quota, run-scoped GitHub token injection, pod naming, and the security properties of the model. For the reasoning behind these mechanics, see the Sandbox pod execution deep dive; for the operator/user view, see the Sandbox pod execution experience.

This page documents the sandbox-pod execution surface (where the agent turn runs). The broader sandbox isolation model — filesystem containment, governance, executor selection, and claim lifecycle — is the Sandbox deep dive, and operator install/config is Sandbox setup.

Configuration flags

Flag	Values	Default	Effect
`Sandbox:AgentExecutionMode`	`in-api`, `pod-per-run`	`in-api`	`in-api` runs the agent turn in-process in the API/worker (today's behavior, the rollback path). `pod-per-run` relocates each run's agent turn into its own Kata-isolated sandbox pod via the A2A bridge.
`Sandbox:ReleasePodOnSuspend`	`true`, `false`	`true`	When `pod-per-run` is active and the workflow graph suspends on an external gate (a HITL/review `RequestPort`, or the coordinator idling while it awaits child runs), `true` checkpoints the run and releases the pod back to the warm pool. `false` keeps the pod warm across the suspension for low-latency resume or debugging, at the cost of held capacity.
`Sandbox:Kubernetes:AgentHostClaimCreationGraceSeconds`	Positive integer seconds	`300`	Minimum age before the orphan reaper may delete an AgentHost claim that is absent from the active-run map. The effective grace is the larger of this value and `Sandbox:Kubernetes:AgentHostReadyTimeoutSeconds + 30` seconds.
`AgentHost:KeyVaultUri`	URI	(unset)	Vault URI for the legacy runtime user-token fetch fallback. With the dedicated KV-less sandbox identity (issue #471) this fallback fails closed; the run owner's token is delivered via the brokered `gitHubAccessToken` in `/configure`.
`AgentHost:ExecutionScratchRoot`	Absolute path	`/local-workspace`	Root of the disk-backed emptyDir used for pod-local execution workspaces and package caches.
`AgentHost:ExecutionScratchMinimumFreeBytes`	Non-negative integer bytes	`8589934592` (8 GiB)	Minimum available scratch space required before AgentHost prepares a local workspace. Failure returns typed reason `insufficient_ephemeral_storage`.
`Coordinator:AssemblyBuildTestTimeoutMinutes`	Positive number	`20`	Total assembly Build/Test wall-clock limit. Expiry cancels the gate and releases its retained AgentHost claim.
`Coordinator:AssemblyBuildTestStallTimeoutMinutes`	Positive number	`12`	Maximum interval without a forwarded Build/Test run event before the stall watchdog fails the gate.

Flag semantics

pod-per-run is the only value that activates the bridge. Any other value (the in-api default) keeps execution in-process. There is no separate "pod-per-turn" mode — granularity within pod-per-run is the hybrid model (warm across consecutive turns, release on suspend), governed by Sandbox:ReleasePodOnSuspend, not by a distinct execution-mode value.
ReleasePodOnSuspend only matters under pod-per-run. It is a tuning sub-flag; it never changes the execution-mode value. The release is internal behavior of pod-per-run.
Rollback is a flag flip, not a redeploy. Setting Sandbox:AgentExecutionMode=in-api restores in-process execution immediately. This is the documented mitigation for any instability in the -preview A2A transport — there is no alternate wire transport to deploy. See the A2A reference for the transport's preview status and pinning.
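
The flag semantics above reduce to two small predicates. The following is a minimal sketch, not the project's code: the function names are hypothetical, and exact, case-sensitive string matching is an assumption.

```python
POD_PER_RUN = "pod-per-run"


def uses_sandbox_bridge(agent_execution_mode: str) -> bool:
    """Only the value 'pod-per-run' activates the bridge; anything else stays in-process."""
    # Assumption: exact, case-sensitive comparison.
    return agent_execution_mode == POD_PER_RUN


def releases_pod_on_suspend(agent_execution_mode: str,
                            release_pod_on_suspend: bool = True) -> bool:
    """ReleasePodOnSuspend is a tuning sub-flag that only matters under pod-per-run."""
    return uses_sandbox_bridge(agent_execution_mode) and release_pod_on_suspend
```

With these predicates, an unrecognized value such as `pod-per-turn` behaves like `in-api`, and flipping the mode back to `in-api` makes the sub-flag irrelevant.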

In AKS, AgentHost:KeyVaultUri configures only the legacy Key Vault fetch fallback for user tokens, and File/CSI settings exist only for local compatibility. The warm-pool path receives the run owner's token from the API in /configure (issue #471), because the sandbox identity has no Key Vault access.

Pod identity and quota

A pod-per-run sandbox is the same Kata-isolated pod shape the sandbox subsystem already uses, claimed from a warm pool, but now hosting the full agent (worker agents and the coordinator's own agent turns) rather than only ad-hoc shell commands.

Property	Value / behavior
Runtime class	`kata-vm-isolation` — a VM boundary around the container, so each run's secret and execution live inside a per-run microVM and are destroyed with it.
Identity	Dedicated sandbox service account federated to `agentweaver-agenthost-identity`, a managed identity with no Key Vault role assignments (issue #471). Workload identity (federated OIDC) projects only the narrowly-scoped workload-identity token volume — not the full Kubernetes API service-account token — but it grants no vault access, so the sandbox cannot read any user's secrets.
Cluster API access	None. The pod does not automatically receive Kubernetes API credentials; the sandbox stays tokenless for the cluster API even when workload identity is enabled for the model endpoint.
Provisioning	Claimed from a warm pool via a `SandboxClaim`; the executor waits until the claim is bound to a concrete pod. AgentHost uses the shared `agentweaver-agent-host` pool (`replicas: 2`), then receives per-run context through `POST /configure` before `/healthz` is expected to become ready. No separate per-run template or per-run warm pool is created for AgentHost. A claim that stays unbound (pod Pending) while Kubernetes schedules is a legitimate wait — there is no app-side capacity pre-check — surfaced on the child run's stream via `sandbox.provisioning_pending` heartbeats (issue #217).
AgentHost readiness gate	Warm AgentHost pods start in standby. After binding, the executor calls `POST /configure` with run/user/token/KV secret context plus the workspace descriptor, then polls `GET {scheme}://{podIP}:8088/healthz` (bounded by `Sandbox:Kubernetes:AgentHostReadyTimeoutSeconds`, default `90`s; `…ReadyPollIntervalMs`, default `1000`) before the first A2A turn. `/configure` is excluded from readiness and returns `409` if called again. The `a2a-sandbox-pod` HttpClient additionally retries only connection-refused errors.
Transient API resilience	The idempotent claim create and the bind/IP polls (`WaitForBoundAsync`, `GetPodIpAsync`) retry transient Kubernetes API faults up to `MaxK8sAttempts` (3 total) with exponential backoff + jitter (`ExecuteK8sWithRetryAsync`): connection resets (`SocketException 104`/`IOException`/`HttpRequestException`), `429`/`5xx`, and `HttpClient` timeouts. `409 Conflict` is not treated as transient; it is handled attempt-aware to preserve idempotency (a `409` on a retry means our own create committed before a reset, so the claim is configured, not reused). Caller cancellation is never retried. The non-idempotent `POST /configure` is intentionally excluded (issue #230).
A2A turn authentication	Run launch generates a 256-bit random turn bearer token, sends it to the claimed warm pod in `POST /configure`, and registers it in `IAgentHostTurnTokenRegistry`. `RemoteAgentProxy` sends `Authorization: Bearer {token}` on `message:stream`; each pod accepts only its configured run token.
Tool-approval return path	When the API-side durable approval gate reports `Unknown`, pod-per-run mode forwards the grant/deny to the owning AgentHost pod's authenticated root endpoint so its in-memory gate can resolve.
Per-pod resources	AgentHost requests `500m` CPU, `1Gi` memory, and `1Gi` ephemeral storage; limits are `2000m`, `4Gi`, and `8Gi`. The lower storage request avoids reserving the full workspace budget for each warm standby replica.
Quota	Namespace `ResourceQuota` (`k8s/base/quota.yaml`) bounds only object counts — pod count, sandbox-claim count, PVCs, and storage. It no longer caps CPU/memory: Kubernetes schedules on pod requests and the cluster autoscaler owns headroom, so a Pending pod waits for the pool to scale rather than being rejected on admission (issue #217). The object-count caps are raised deliberately via a reviewed manifest change, never a live patch.
Lifetime	Bounded by the run and the claim TTL. Under the hybrid model, a pod is released on suspend and a fresh pod is re-claimed on resume; pods never persist past the run.
Egress	Default-deny NetworkPolicy with a narrow allowlist (see Security properties).
Storage	Mounts the shared workspace volume plus a dedicated disk-backed `execution-scratch` emptyDir at `/local-workspace` (`sizeLimit: 8Gi`) for pod-local execution. Assembly Build/Test and preview use `LocalReadOnly`; implementation turns use `LocalWritable` and publish through the verified Git write-back flow. Existing disk-backed `tmp` and `home` emptyDirs remain separate.
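
The transient-retry row above can be sketched as a bounded retry loop. This is an illustration only: the exception classes, the function name, the base delay, and the choice to re-raise a first-attempt `409` are assumptions, not the behavior of `ExecuteK8sWithRetryAsync`.

```python
import random
import time

MAX_K8S_ATTEMPTS = 3  # total attempts, per the table above


class TransientK8sError(Exception):
    """Hypothetical stand-in for connection resets, 429/5xx, and client timeouts."""


class ConflictError(Exception):
    """Hypothetical stand-in for a 409 Conflict from the Kubernetes API."""


def create_claim_with_retry(create, base_delay=0.2, sleep=time.sleep):
    """Call create() up to MAX_K8S_ATTEMPTS times with exponential backoff + jitter.

    Returns "created" on success, or "already_committed" when a 409 arrives on a
    retry (our own earlier create landed before the connection reset).
    """
    for attempt in range(1, MAX_K8S_ATTEMPTS + 1):
        try:
            create()
            return "created"
        except ConflictError:
            if attempt == 1:
                raise  # assumption: a first-attempt 409 is a real conflict
            return "already_committed"
        except TransientK8sError:
            if attempt == MAX_K8S_ATTEMPTS:
                raise
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))  # jitter spreads retries out
```

Either return value lets the caller proceed to the one-time `/configure` call, which stays outside this loop because it is not idempotent.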

Orphan reaper creation grace

An AgentHost claim missing from the active-run map is not reaped while its Kubernetes creationTimestamp is inside the effective creation-grace window. This keeps a newly bound claim alive through the readiness wait (AgentHostReadyTimeoutSeconds, default 90 seconds); a missing or unparseable timestamp receives no grace and remains eligible for cleanup.

Run-scoped GitHub token delivery

A pod-per-run sandbox acts as the run's signed-in user and needs a GitHub credential to clone/push the worktree and call GitHub API tools. In AKS, user tokens are stored in Azure Key Vault, resolved by the API, and delivered to the configured AgentHost pod in the one-time /configure call; they are not mounted via per-run CSI, and the sandbox identity itself has no Key Vault access (issue #471).

Sourcing

Each authenticated user's GitHub token is stored in Key Vault as ghtok-user--{base32(userId)}.
The executor resolves the run's submitting user, pre-resolves that user's GitHub token via the API-side token store, and passes it as gitHubAccessToken in /configure. If the user cannot be resolved or has no usable token, the launch fails before the first turn rather than falling back to another scope.
Installation scope remains for background/system work with no caller; user runs use the owning user's scope.
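
The secret-name pattern above can be illustrated as follows. This is a sketch under stated assumptions: UTF-8 encoding of the user id, the RFC 4648 base32 alphabet, and stripped `=` padding (Key Vault secret names accept only alphanumerics and hyphens). The real encoder may differ in case or padding handling.

```python
import base64

SECRET_PREFIX = "ghtok-user--"


def user_token_secret_name(user_id: str) -> str:
    """Build ghtok-user--{base32(userId)} for a per-user GitHub token secret."""
    encoded = base64.b32encode(user_id.encode("utf-8")).decode("ascii")
    # Assumption: padding is dropped so the name stays valid for Key Vault.
    return SECRET_PREFIX + encoded.rstrip("=")
```

Encoding the id keeps arbitrary user-id characters out of the secret name while remaining reversible.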

Delivery to the executing pod

The shared AgentHost warm pool (agentweaver-agent-host, replicas: 2) keeps pods in standby with no RunId.
The SandboxClaim binds one warm pod. Static config such as AgentHost__KeyVaultUri is already present because the pod needs the vault URI before configuration.
KubernetesSandboxExecutor calls POST /configure with run identity, credentials, and the shared/local workspace descriptor.
AgentHostRuntimeState.TryConfigure(...) stores those values once.
KeyVaultUserTokenProvider prefers the pre-resolved gitHubAccessToken from /configure and serves it to the runtime, caching it in memory for the pod lifetime. The legacy SecretClient + DefaultAzureCredential fetch of kvUserSecretName remains only as a fallback and fails closed under the KV-less sandbox identity (issue #471).

No per-run SecretProviderClass, cloned SandboxTemplate, CSI user-token volume, or per-run warm pool is created. The JSON secret value matches the old file-mounted format, so downstream consumers still see the same token-store contract.
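
The configure-once behavior behind `AgentHostRuntimeState.TryConfigure(...)` can be sketched as a guarded, single-assignment store. The class and handler below are hypothetical Python stand-ins, and the lock is an assumption about how concurrent calls are serialized.

```python
import threading


class RuntimeState:
    """Holds the per-run configuration of a warm pod; it can be set only once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._config = None

    def try_configure(self, config: dict) -> bool:
        with self._lock:
            if self._config is not None:
                return False  # already bound to a run; never retarget
            self._config = dict(config)
            return True

    @property
    def run_id(self):
        return None if self._config is None else self._config.get("runId")


def handle_configure(state: RuntimeState, body: dict) -> int:
    """Return 200 for the first configuration and 409 for any later attempt."""
    return 200 if state.try_configure(body) else 409
```

Because the first call wins and later calls are rejected, a standby pod cannot be retargeted to a different run or user after it has been claimed.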

`/configure` request body

POST /configure is the one-time warm-pool configuration call from KubernetesSandboxExecutor to the bound AgentHost pod.

Field	Required	Meaning
`runId`	Yes	The Agentweaver run this pod executes. Missing or blank values return `400`.
`userId`	No	Submitting user id for run-scoped GitHub token lookup.
`turnBearerToken`	No	Per-run bearer token required by `POST /a2a/agent/v1/message:stream`.
`kvUserSecretName`	No	Key Vault secret name for the submitting user's GitHub token.
`gitHubAccessToken`	No	API-pre-resolved GitHub access token; when present, the pod skips the Key Vault fetch.
`sharedWorkingDirectory`	No	API-visible run worktree (for example `/workspace/{worktree}`). Used directly in `Shared` mode and retained as the source-tree coordinate in local modes.
`workingDirectory`	No	Backward-compatible alias for `sharedWorkingDirectory`. It never represents a pod-local path.
`previewRunnerCredential`	No	Fresh per-run bearer for authenticated pod-root control calls, including tool-approval forwarding. It is persisted using `PreviewRunnerCredential.SecretKey(runId)`; inside the pod it is stored only in AgentHost memory.
`autoApproveTools`	No	Seeds the pod-local run-options store; defaults to `false`.
`purpose`	No	String enum: `Default`, `AssemblyBuildTest`, or `ImplementationTurn`.
`workspaceMode`	No	String enum: `Shared` (default), `LocalReadOnly`, or `LocalWritable`. Assembly requires `LocalReadOnly`; implementation turns require `LocalWritable`.
`sourceRepositoryPath`	Local modes	Shared repository path used as the git fetch remote. It is a source, never the execution cwd.
`sourceRef`	Local modes	Branch/ref shallow-fetched from `sourceRepositoryPath`; assembly passes the integration ref.
`baseCommitSha`	Local modes	Immutable commit SHA expected at `sourceRef` (40–64 hexadecimal characters).
`expectedTreeHash`	Local modes	Immutable tree object expected for `baseCommitSha` (40–64 hexadecimal characters).
`scratchRoot`	Local modes	Mounted execution-scratch root. AgentHost derives the local path inside the pod as `{scratchRoot}/{run-hash}/{tree-hash}`.

IRunSubmittingUserResolver.GetWorkingDirectoryAsync(runId) resolves the shared directory from the run row and strips coordinator suffixes such as -coordinator-decompose, so sibling child stages share the parent's worktree. Local execution sends the explicit source contract above. AgentHost verifies the fetched commit and tree before setup, derives the workspace path inside the pod, and exposes it as the runtime state's effective working directory. Preview resolves its command against the API-visible detached worktree and maps the relative cwd into this checkout.

`/configure` result	Meaning
`200`	Configuration and purpose-specific setup completed.
`400`	Malformed JSON or missing `runId`.
`409`	Pod was already configured, or local workspace preparation failed (including SHA/tree/scratch mismatch).
`422`	Required local workspace fields or purpose/mode policy were missing or invalid.
`507`	`execution-scratch` had less than `AgentHost:ExecutionScratchMinimumFreeBytes` available.
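
The request fields and result codes above suggest a pre-check along the following lines. It is a sketch, not the AgentHost validator: the function name is hypothetical, and mapping unknown enum values or malformed hashes to `422` is an assumption. The `409` and `507` outcomes depend on pod state and disk space, so they are out of scope here.

```python
import re

HEX_40_64 = re.compile(r"[0-9a-fA-F]{40,64}")
LOCAL_MODES = {"LocalReadOnly", "LocalWritable"}
REQUIRED_MODE_BY_PURPOSE = {
    "AssemblyBuildTest": "LocalReadOnly",
    "ImplementationTurn": "LocalWritable",
}
LOCAL_REQUIRED_FIELDS = ("sourceRepositoryPath", "sourceRef", "baseCommitSha",
                         "expectedTreeHash", "scratchRoot")


def precheck_configure(body: dict):
    """Return an HTTP status for a rejected body, or None when these checks pass."""
    if not str(body.get("runId") or "").strip():
        return 400  # missing or blank runId
    purpose = body.get("purpose", "Default")
    mode = body.get("workspaceMode", "Shared")
    if purpose not in {"Default", *REQUIRED_MODE_BY_PURPOSE}:
        return 422
    if mode not in {"Shared", *LOCAL_MODES}:
        return 422
    required_mode = REQUIRED_MODE_BY_PURPOSE.get(purpose)
    if required_mode is not None and mode != required_mode:
        return 422  # purpose/mode policy violated
    if mode in LOCAL_MODES:
        if any(not body.get(field) for field in LOCAL_REQUIRED_FIELDS):
            return 422  # local modes need the full source contract
        for key in ("baseCommitSha", "expectedTreeHash"):
            if not HEX_40_64.fullmatch(str(body[key])):
                return 422
    return None
```

A `Shared`-mode body needs only `runId` to pass, while an assembly Build/Test body must carry `LocalReadOnly` plus all five local-workspace fields.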

Lifetime and cleanup

Key Vault is the source of truth. OAuth callbacks and refreshes write the per-user Key Vault secret.
Pod cache is in-memory. The configured token is cached only for that pod lifetime and disappears when the pod is released.
No SPC cleanup. The reaper no longer deletes per-run SPCs or per-run templates/warm pools for AgentHost because they are no longer created.

Security trade-off

The previous CSI design isolated user tokens at the infrastructure layer: the pod filesystem contained only one projected file. The warm-pool design uses application-layer brokering: the API resolves the run owner's token and delivers it in the one-time /configure call, and the sandbox runs as a dedicated identity with no Key Vault access (issue #471), so it cannot read any vault secret. NetworkPolicy protects /configure, one-time configuration prevents retargeting, and message:stream still requires the per-run bearer token.

A2A turn bearer token

The A2A turn endpoint has a separate per-run bearer token from the GitHub user token above:

KubernetesSandboxExecutor creates 32 random bytes (256 bits) at AgentHost run launch.
The token is sent to the claimed warm pod in POST /configure and stored in AgentHostRuntimeState.
The same token is stored in IAgentHostTurnTokenRegistry for the owning run.
RemoteAgentProxy reads the registry and sends Authorization: Bearer {token} on all calls to POST /a2a/agent/v1/message:stream.
AgentHost rejects turn requests whose header does not exactly match its own AgentHostOptions.TurnBearerToken.

This is application-layer auth on top of the A2A NetworkPolicy/mTLS boundary. The important blast-radius property is that a stolen token from one run cannot be reused against another run's pod.
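
A minimal sketch of the token lifecycle described above, in Python rather than the project's own language. The URL-safe text encoding and the constant-time comparison are assumptions; the source states only that the token is 32 random bytes and must match exactly.

```python
import hmac
import secrets


def new_turn_bearer_token() -> str:
    """Generate a per-run token from 32 random bytes (256 bits)."""
    return secrets.token_urlsafe(32)


def is_authorized(authorization_header, configured_token: str) -> bool:
    """Accept only 'Bearer {token}' where the token equals this pod's configured token."""
    prefix = "Bearer "
    if not authorization_header or not authorization_header.startswith(prefix):
        return False
    presented = authorization_header[len(prefix):]
    # Assumption: compare in constant time to avoid leaking match length.
    return hmac.compare_digest(presented.encode("utf-8"),
                               configured_token.encode("utf-8"))
```

Because each pod checks only its own configured value, a token captured from one run fails this check on every other run's pod.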

Tool-approval forwarding endpoints

These are internal API-to-AgentHost routes, not public client endpoints. The public caller continues to use /api/runs/{id}/tool-approvals and /api/runs/{id}/tool-denials.

Method	AgentHost path	Body	Purpose
`POST`	`/tool-approvals`	`runId`, `requestId`, `scope`	Grant the pod-local pending request. Unknown scope values use `once`; `always` is pod/run-scoped and does not survive restart.
`POST`	`/tool-denials`	`runId`, `requestId`	Deny the pod-local pending request.

Both routes accept the same pod-root bearer authorization used by PreviewRunner controls: either the configured turn bearer or the per-run previewRunnerCredential. A mismatched runId returns 409 with state: "run_mismatch".

AgentHost response	Meaning
`200` with `resolved: true`	State is `approved`, `denied`, or `expired`
`404` with `state: "unknown"`	The pod-local gate does not know the request
`409` with `state: "pending"`	The request remains pending
`401`	The bearer did not match the configured pod credentials

The API locates the pod with IAgentHostOriginResolver, calls it through the a2a-sandbox-pod client, and caps the decision call at 10 seconds. Missing origins, timeouts, transport failures, 5xx responses, and invalid responses surface publicly as 503 state: "agenthost_unreachable". Terminal forwards cause the API to emit tool.approval_resolved for the owning run.
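
The response handling above can be sketched as a small classifier on the API side. The function name and outcome labels are hypothetical, and the presence of a `state` field on the `200` body is an assumption. How non-terminal outcomes are surfaced to the public caller is not covered here.

```python
TERMINAL_STATES = {"approved", "denied", "expired"}


def classify_forward(status, body):
    """Map an AgentHost forwarding response to an outcome label.

    status is None when there was no origin, a timeout, or a transport failure.
    """
    state = body.get("state") if isinstance(body, dict) else None
    if status == 200 and isinstance(body, dict):
        if body.get("resolved") is True and state in TERMINAL_STATES:
            return state  # terminal: the API emits tool.approval_resolved
    if status == 404 and state == "unknown":
        return "unknown"
    if status == 409 and state in {"pending", "run_mismatch"}:
        return state
    if status == 401:
        return "unauthorized"
    # Missing origins, timeouts, transport failures, 5xx, and invalid responses.
    return "agenthost_unreachable"
```

Treating every unrecognized shape as `agenthost_unreachable` keeps the public contract small: callers see either a known gate state or a single retryable `503`.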

The credential's secret-store key is derived by PreviewRunnerCredential.SecretKey(runId) with the prefix preview-runner-cred--; KubernetesSandboxExecutor mints it, persists it, and delivers its value in-memory through /configure.

Sources: apps/Agentweaver.AgentHost/Program.cs:287-288,486-588, apps/Agentweaver.Api/Sandbox/AgentHostApprovalHttpClient.cs:28-112, apps/Agentweaver.Api/Endpoints/RunEndpoints.cs:2590-2718, apps/Agentweaver.Api/Sandbox/Preview/PreviewRunnerCredential.cs:22-35, and apps/Agentweaver.Api/Sandbox/KubernetesSandboxExecutor.cs:706-759.

Pod naming and the executing-pod surface

A run's executing pod name is tracked so the UI can show where a run is running.

PodNameRegistry is an in-memory map from run id → bound pod name. It is populated by the Kubernetes sandbox executor once a SandboxClaim reports its Ready condition True, and the entry is removed when the claim is deleted (e.g. on run cleanup or release).
The registry is consumed in two places:
- the system runtime endpoint (GET /api/system/runtime) returns { kubernetes, podName }, where podName is the API/host pod name when running inside Kubernetes — the global fallback; and
- the run graph endpoint (GET /api/runs/{id}/graph) populates an executionPodName field on each node from the registry, so a per-run/per-node pod name overrides the global fallback and the graph carries the correct per-pod value automatically as pod-per-run rolls out.
The frontend resolves node.executionPodName ?? globalPodName and renders it as a small pod pill (the "executing pod name" surfaced on agent boxes). The pill renders only on Kubernetes — when not running in-cluster (kubernetes: false) or when the pod name is null, nothing is shown, so local/dev runs stay clean. See the experience doc for the rendered behavior.

Field	Source	Meaning
`kubernetes`	`GET /api/system/runtime`	Whether the backend is running inside Kubernetes; gates whether any pod pill is shown.
`podName` (global)	`GET /api/system/runtime`	The host/API pod name — the fallback pill when no per-node value exists.
`executionPodName` (per node)	`GET /api/runs/{id}/graph`, topology deltas, `subtask.*` events	The bound sandbox pod name for that run/node, from `PodNameRegistry`; overrides the global fallback.
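
The resolution rule in the table above amounts to a nullish fallback behind a Kubernetes gate. The sketch below restates it in Python; the function name and the dictionary shapes are illustrative, not the frontend's actual types.

```python
def pod_pill_label(runtime: dict, node: dict):
    """Return the pod name to render on an agent box, or None to render nothing."""
    if not runtime.get("kubernetes"):
        return None  # local/dev: never show a pill
    name = node.get("executionPodName")
    if name is None:  # mirrors `??`: only a missing/null value falls back
        name = runtime.get("podName")
    return name
```

A per-node value always wins over the global pod name, and a `None` result means the pill is omitted entirely.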

The same PodNameRegistry also lets preview/port-forward tooling locate a run's pod. That preview path is documented in the Sandbox deep dive and, for its API surface, in Sandbox preview port-forward below.

Sandbox preview port-forward (Feature 017)

Dedicated pages: this feature now has its own Reference, User Guide, and Deep Dive. The summary below stays here for context within the sandbox-pods surface.

A preview port-forward exposes a port of a run's sandbox pod back through the API, so an operator can reach a server the agent started inside the pod (a dev server, a built app, a debug endpoint) as a live preview scoped to that one run's pod. PortForwardService shells out to kubectl port-forward --address 127.0.0.1 pod/{podName} :{targetPort} -n {namespace} rather than calling the Kubernetes API directly, parses the Forwarding from 127.0.0.1:<port> -> line to learn the local port, and probes loopback TCP until ready. The pod is the same one KubernetesSandboxExecutor provisions through the agent-sandbox controller: the preview tunnels into that pod, not an MXC local sandbox.
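
To make the kubectl interaction concrete, the sketch below builds the argument list and parses the forwarding line. The function names are hypothetical and the regular expression is an assumption about how the line is matched; it does not launch a process.

```python
import re

# kubectl prints a line such as: "Forwarding from 127.0.0.1:54321 -> 8080"
FORWARDING_LINE = re.compile(r"^Forwarding from 127\.0\.0\.1:(\d+) -> (\d+)$")


def port_forward_args(pod_name: str, target_port: int, namespace: str) -> list:
    """Build the kubectl argv; ':{port}' lets kubectl pick a free local port."""
    if not 1 <= target_port <= 65535:
        raise ValueError("target_port must be in 1..65535")
    return ["kubectl", "port-forward", "--address", "127.0.0.1",
            f"pod/{pod_name}", f":{target_port}", "-n", namespace]


def parse_local_port(line: str):
    """Return the loopback port from a forwarding line, or None if it does not match."""
    match = FORWARDING_LINE.match(line.strip())
    return int(match.group(1)) if match else None
```

Passing the arguments as a list (rather than a shell string) avoids quoting issues with pod names, and ignoring non-IPv4 forwarding lines keeps the parsed port tied to the `127.0.0.1` bind requested with `--address`.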

This surface is Kubernetes-only: it tunnels through the Kubernetes claim backend's pod, located by run id via the PodNameRegistry. On local/dev backends (no claim pod) there is nothing to forward, and the start call fails with a conflict — "the run must be in_progress with an active Kubernetes sandbox". Every call also verifies the run exists and the caller owns it (403/404 otherwise).

Endpoints

Method & path	Body	Returns	Effect
`POST /api/runs/{runId}/sandbox/port-forward`	`{ "target_port": <1..65535> }`	`PortForwardSessionDto`	Starts a `kubectl port-forward` from the run's pod's `target_port` to a loopback port on the API, and returns the new session. `429` when a session cap is hit; `409` when the run has no active sandbox pod.
`GET /api/runs/{runId}/sandbox/port-forward`	—	`PortForwardSessionDto[]`	Lists the active preview sessions for the run.
`DELETE /api/runs/{runId}/sandbox/port-forward/{sessionId}`	—	`{ session_id, stopped: true }`	Stops the identified session and tears down its tunnel.

`PortForwardSessionDto`

Field	Meaning
`session_id`	Identifier for this preview session; used as `{sessionId}` to stop it via `DELETE`.
`local_port`	The loopback port on the API host that `kubectl` bound; what the API forwards from. The backend returns this port, not a public URL.
`target_port`	The port inside the sandbox pod that is being forwarded.
`pod_name`	The bound sandbox pod the tunnel targets (from `PodNameRegistry`).
`started_at`	When the session started.
`preview_url` / `previewUrl`	Web-only, optional. The frontend reads these to render an embedded iframe, but the backend does not currently populate them; the UI explicitly says so when no proxied URL is returned.

Behavior

Per-port, explicit. A session forwards one target_port; opening another preview is a second POST. Sessions are listed and stopped individually.
Scoped to the run's pod. A session can only reach that run's sandbox pod — the run id resolves to a single bound pod, so a preview never crosses into another run's pod.
Inbound only, no egress widening. The tunnel is an inbound path the operator opens to the pod; it does not alter the pod's default-deny egress allowlist (see Security properties).
Capped per run and globally. Default 3 concurrent sessions per run (Sandbox:PortForward:MaxConcurrentSessionsPerRun, fallback :MaxPerRun) and 20 globally (Sandbox:PortForward:MaxConcurrentSessionsGlobal, fallback :MaxGlobal); exceeding either raises PortForwardLimitExceededException → 429.
In-memory, no TTL. Sessions live only in PortForwardService's in-process maps (_sessions / _sessionsByRun); there is no expiry timer. They end only on explicit DELETE, run end (via RunWatchLoopService, which also unregisters the pod), the kubectl process exiting on its own, or Dispose() at shutdown.
Bounded by the pod. A session is only valid while the run's pod is bound; releasing or replacing the pod (suspend/resume, run end) ends forwarding, and a new preview must be started against the re-claimed pod.
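
The cap and lifetime rules above can be sketched as an in-memory registry. The class and method names are hypothetical, and the order in which the two caps are checked is an assumption; the defaults (3 per run, 20 globally) come from the text.

```python
class PortForwardLimitExceeded(Exception):
    """Raised when a cap is hit; the API maps this to 429."""


class SessionRegistry:
    """In-memory session bookkeeping with no expiry timer."""

    def __init__(self, max_per_run=3, max_global=20):
        self.max_per_run = max_per_run
        self.max_global = max_global
        self._sessions_by_run = {}

    def _total(self):
        return sum(len(ids) for ids in self._sessions_by_run.values())

    def start(self, run_id, session_id):
        if self._total() >= self.max_global:
            raise PortForwardLimitExceeded("global session cap reached")
        if len(self._sessions_by_run.get(run_id, ())) >= self.max_per_run:
            raise PortForwardLimitExceeded("per-run session cap reached")
        self._sessions_by_run.setdefault(run_id, set()).add(session_id)

    def stop(self, run_id, session_id):
        """Explicit DELETE of one session."""
        ids = self._sessions_by_run.get(run_id, set())
        ids.discard(session_id)
        if not ids:
            self._sessions_by_run.pop(run_id, None)

    def end_run(self, run_id):
        """Run end or pod release drops every session for that run."""
        self._sessions_by_run.pop(run_id, None)
```

Since nothing expires on its own, capacity is reclaimed only through `stop` or `end_run`, which is why the caps matter for long-lived runs.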

Security properties

Property	Pod-per-run guarantee
Execution isolation	Each run's agent turn, tools, shell, and file ops run in the run's own Kata-isolated pod (`kata-vm-isolation`), not a shared process.
Control-plane isolation	The orchestration graph, HITL decisions, and run record stay in the worker; a compromised pod cannot alter what happens next.
Credential blast radius	The pod holds only a short-lived, run-scoped credential — never a broker key, never refresh material, never another run's or user's scope. There is no `CapabilityTokenService` and no central token broker.
A2A turn auth	`message:stream` requires `Authorization: Bearer {per-run token}`. The token is delivered only to the claimed AgentHost pod via `/configure` and removed from the registry when the pod is released.
GitHub token exposure	Brokered by the API for the configured run owner only and delivered in the one-time `/configure` call, then cached in memory for the pod lifetime; the sandbox identity has no Key Vault access (issue #471), and no CSI user-token file or shared workspace copy exists.
Egress	Default-deny with a narrow allowlist: model endpoint, the API/worker bridge endpoint, and the run's legitimate git remote(s). The database is not reachable from sandbox pods — all run-state I/O flows through the worker.
At rest / past run	Token material does not persist past the pod lifetime; no per-run Secret/SPC is created, and the bearer token is no longer written to `SandboxClaim.spec.env` in etcd.
Reversibility	The whole mode is gated by `Sandbox:AgentExecutionMode`; flipping to `in-api` restores in-process execution with no redeploy.

Related reference

Sandbox setup — operator install/config of the sandbox backends.
API reference — the endpoints surfaced above.
A2A reference — the -preview transport (experimental) that carries agent turns.
Sandbox pod execution deep dive — the reasoning.
Sandbox pod execution experience — the user/operator view.
Sandbox browser preview — preview routes (start/keepalive/stop) that expose a pod-internal server over a public HTTPS reverse proxy.
Tool Approval SSE Contract — public approval outcomes and coordinator-to-child routing.
