Challenge packs
Sandbox & E2B
Pack sandbox fields, worker sandbox provider selection, and how native execution reaches E2B today.
Native runs execute agent tool calls inside an isolated sandbox provider implementing sandbox.Provider (backend/internal/sandbox/sandbox.go). Production commonly uses E2B (backend/internal/sandbox/e2b/provider.go); local development may run unconfigured no-op providers so queues drain without real VMs.
Pack-level version.sandbox
Struct SandboxConfig (challengepack/bundle.go):
| Field | Purpose |
| --- | --- |
| network_access | Boolean gate paired with tool policy network tools |
| network_allowlist | CIDR strings; invalid entries fail ValidateBundle |
| env_vars | Literal environment injection into sandbox (secret placeholders rejected during native executor setup) |
| additional_packages | APT package names (aptPackagePattern validation) |
| sandbox_template_id | Optional override of default E2B template id per pack version |
Remember: the entire sandbox block is illegal for prompt_eval packs.
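The validation rules in the table above can be sketched roughly as follows. This is a minimal illustration, not the code in challengepack/bundle.go: the Go field names and types, the validateSandbox helper, and the aptNamePattern expression are all assumptions (the real aptPackagePattern is not shown on this page). Only the YAML keys come from the table.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
)

// SandboxConfig mirrors the keys in the table above. The Go field names and
// types are assumptions; only the YAML keys are taken from this page.
type SandboxConfig struct {
	NetworkAccess      bool              `yaml:"network_access"`
	NetworkAllowlist   []string          `yaml:"network_allowlist"`
	EnvVars            map[string]string `yaml:"env_vars"`
	AdditionalPackages []string          `yaml:"additional_packages"`
	SandboxTemplateID  string            `yaml:"sandbox_template_id"`
}

// aptNamePattern is a stand-in for the real aptPackagePattern. It follows the
// Debian naming rule: lowercase alphanumerics plus "+", "." and "-", at least
// two characters, starting with an alphanumeric.
var aptNamePattern = regexp.MustCompile(`^[a-z0-9][a-z0-9+.-]+$`)

// validateSandbox (hypothetical) rejects bad CIDR entries and package names.
func validateSandbox(cfg SandboxConfig) error {
	for _, entry := range cfg.NetworkAllowlist {
		if _, _, err := net.ParseCIDR(entry); err != nil {
			return fmt.Errorf("network_allowlist: %q is not a valid CIDR: %w", entry, err)
		}
	}
	for _, pkg := range cfg.AdditionalPackages {
		if !aptNamePattern.MatchString(pkg) {
			return fmt.Errorf("additional_packages: %q is not a valid package name", pkg)
		}
	}
	return nil
}

func main() {
	cfg := SandboxConfig{
		NetworkAccess:      true,
		NetworkAllowlist:   []string{"10.0.0.0/8", "example.com"}, // second entry is a hostname, not a CIDR
		AdditionalPackages: []string{"jq"},
	}
	fmt.Println(validateSandbox(cfg))
}
```

Hostnames such as example.com fail the CIDR check in this sketch, which matches the table's note that invalid allowlist entries fail validation.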
Worker configuration knobs
From environment / backend/internal/worker/config.go (mirror of the searchable Config reference tables):
| Variable | Effect |
| --- | --- |
| SANDBOX_PROVIDER | e2b vs unconfigured (noop) |
| E2B_API_KEY | Credentials |
| E2B_TEMPLATE_ID | Default template when pack omits sandbox_template_id |
| E2B_API_BASE_URL | Optional API override |
| E2B_REQUEST_TIMEOUT | HTTP budget for control-plane calls |
Misconfiguration does not rewrite your YAML. Instead, it surfaces as clear worker errors or as /doctor warnings when local sandboxes cannot start.
Tool policy vs network flags
Even if http_request is allowed by allowed_tool_kinds: [network], outbound traffic still respects:
- global sandbox network toggles
- CIDR allowlists
- provider-level enforcement inside E2B machines
Think of tool policy as “model may ask” and sandbox as “infrastructure may permit”.
Secrets & environment
The native executor refuses ${secrets.*} inside sandbox.env_vars because files and process listings could leak them. Keep secrets in tool args that go through hardened paths, notably http_request header sanitization (see the comments in primitive_secrets.go).
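A check of this kind might look like the sketch below. The function name envVarsWithSecretPlaceholders is hypothetical, and the simple substring match on ${secrets. is an assumption about how placeholders are detected; the real executor may parse them differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// envVarsWithSecretPlaceholders returns the names of env vars whose values
// contain a "${secrets." placeholder. Hypothetical helper for illustration.
func envVarsWithSecretPlaceholders(envVars map[string]string) []string {
	var offending []string
	for name, value := range envVars {
		if strings.Contains(value, "${secrets.") {
			offending = append(offending, name)
		}
	}
	sort.Strings(offending) // map iteration order is random; sort for stable errors
	return offending
}

func main() {
	envVars := map[string]string{
		"LOG_LEVEL": "debug",
		"API_TOKEN": "${secrets.api_token}",
	}
	if bad := envVarsWithSecretPlaceholders(envVars); len(bad) > 0 {
		fmt.Println("sandbox.env_vars must not reference secrets:", strings.Join(bad, ", "))
	}
}
```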
Failure modes to expect
- Template drift: changing additional_packages without rebuilding templates can cause first-run apt noise; pin templates when stable.
- Allowlist too tight: the model receives policy errors from http_request if DNS resolves but CIDR blocks egress.
- No provider: SANDBOX_PROVIDER=unconfigured means native runs do not execute real tools; useful for API-only integration tests, misleading if you expect live sandboxes.