Tools, primitives & policy

How tool_policy and tools.custom map to engine primitives in backend/internal/engine and sandbox.ToolPolicy.

AgentClash stacks three tool notions (also summarized in Tools, network, and secrets):

  1. Workspace tool resources: org-level infrastructure objects (not covered by pack YAML)
  2. Pack composed tools: tools.custom[] entries that expand to a JSON Schema plus an implementation
  3. Engine primitives: the concrete executors the worker runs (registered in nativePrimitiveTools, backend/internal/engine/primitive_tools.go)

Only (2) and (3) are pack-controlled.

Tool policy shape

Set version.tool_policy in your pack YAML. Two keys drive it (the JSON hydrates sandbox.ToolPolicy, backend/internal/sandbox/sandbox.go):

  • allowed_tool_kinds — the list of capability groups (kinds) the model may use
  • allow_shell — a separate boolean that gates the exec primitive
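As a minimal sketch, a policy that unlocks two kinds and also enables shell access might look like the fragment below. The nesting follows the version.tool_policy path named above; the particular kinds chosen are only illustrative.

```yaml
version:
  tool_policy:
    # Capability groups the model may use.
    allowed_tool_kinds:
      - data
      - network
    # Separate switch for the exec primitive; it is not part of the kinds list.
    allow_shell: true
```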

Recognized kind strings

allowed_tool_kinds accepts exactly these six values; anything else fails validation (the set is supportedToolKinds in backend/internal/challengepack/validation.go):

browser, build, data, file, network, terminal

Shell is not a kind—enable it with allow_shell: true instead.

Empty allowlist semantics

An empty allowed_tool_kinds means allow every kind (handled by allowsToolKind in primitive_helpers.go). In practice, prefer explicit lists so validation errors catch typos early.
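The two fragments below contrast these behaviors. They are illustrative and not taken from a real pack. The first leaves the allowlist empty, so every kind is permitted. The second names kinds explicitly, which narrows access and lets validation flag a misspelled kind.

```yaml
# Fragment 1: empty allowlist, every kind is allowed.
version:
  tool_policy:
    allowed_tool_kinds: []
---
# Fragment 2: explicit allowlist, only file and build are allowed.
# Writing "flie" here instead of "file" would fail validation.
version:
  tool_policy:
    allowed_tool_kinds: [file, build]
```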

Mode guardrails

A pack's version.execution_mode is one of native, prompt_eval, responses, or multi_turn. Of these, prompt_eval packs must omit tool_policy entirely (prompt-eval runs never call tools)—see Bundle YAML reference.
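For example, a prompt_eval pack would simply leave the key out. This fragment is a sketch that shows only the fields discussed here and omits everything else a real pack needs.

```yaml
version:
  execution_mode: prompt_eval
  # No tool_policy key: prompt_eval packs must omit it entirely.
```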

Built-in primitive names

These are the primitives the worker can run, and the policy that unlocks each (declared in executor_builders.go, registered in nativePrimitiveTools):

| Primitive | Gated by |
| --- | --- |
| submit | Always available (final answer) |
| read_file, write_file, list_files, search_files, search_text | file kind |
| query_json, query_sql | data kind |
| http_request | network kind (+ runtime network flags) |
| run_tests, build | build kind |
| exec | allow_shell |
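To make the table concrete, here is a hypothetical policy annotated with the primitives it would unlock according to the rows above. The comments restate the table; they are not output from the engine.

```yaml
version:
  tool_policy:
    allowed_tool_kinds: [file, build]
    allow_shell: false
# Available under this policy:
#   submit                                   (always)
#   read_file, write_file, list_files,
#   search_files, search_text                (file kind)
#   run_tests, build                         (build kind)
# Not available:
#   query_json, query_sql                    (data kind not listed)
#   http_request                             (network kind not listed)
#   exec                                     (allow_shell is false)
```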

The browser kind exists in policy (toolKindBrowser). Before relying on it in production, confirm that your template and worker build include whatever browser bridge your pack expects.

The terminal kind marks packs that pair with Try CLI: interactive, disposable terminal demos for README badges and human try-before-install flows. Eval runs still use standard primitives such as exec inside the sandbox.

Composed tools (tools.custom[])

Each item:

```yaml
tools:
  custom:
    - name: call_support_api
      description: Fetch ticket JSON
      parameters:
        type: object
        properties:
          ticket_id: { type: string }
        required: [ticket_id]
        additionalProperties: false
      implementation:
        primitive: http_request
        args:
          method: GET
          url: https://api.example.com/tickets/${ticket_id}
          headers:
            Authorization: Bearer ${secrets.SUPPORT_TOKEN}
```

Validation rules each composed tool must satisfy (enforced by validateComposedToolConfig):

  • A non-mock tool's implementation.primitive must name a different tool than itself (prevents the self-delegation footgun)
  • implementation.args must be an object; its templates are validated for placeholder safety
  • parameters must be a JSON Schema that passes templateutil.ValidateToolParameterSchema
  • The delegation graph cannot contain cycles, and delegation depth cannot exceed 8 jumps
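The delegation rules can be illustrated with a two-hop chain. This sketch assumes that implementation.primitive may name another composed tool (which the cycle and depth rules imply) and that args templates use the ${param} placeholder syntax seen in the example above. The lookup_ticket tool and its id parameter are hypothetical.

```yaml
tools:
  custom:
    # Hop 1: lookup_ticket delegates to another composed tool.
    - name: lookup_ticket
      description: Look up a ticket by ID
      parameters:
        type: object
        properties:
          id: { type: string }
        required: [id]
        additionalProperties: false
      implementation:
        primitive: call_support_api   # a different tool, so no self-delegation
        args:
          ticket_id: ${id}
    # Hop 2: call_support_api delegates to the http_request primitive.
    - name: call_support_api
      description: Fetch ticket JSON
      parameters:
        type: object
        properties:
          ticket_id: { type: string }
        required: [ticket_id]
        additionalProperties: false
      implementation:
        primitive: http_request
        args:
          method: GET
          url: https://api.example.com/tickets/${ticket_id}
```

Pointing call_support_api back at lookup_ticket would create a cycle, which the rules above say fails validation.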

Mock implementations

Set implementation.type: mock to skip primitive resolution—useful for dry-run packs or policy-only testing. Mocks bypass cycle detection.
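A mocked tool might be declared as in the sketch below. Only implementation.type: mock is documented here; any fields that configure a canned response are left out because this page does not name them.

```yaml
tools:
  custom:
    - name: call_support_api
      description: Fetch ticket JSON (mocked)
      parameters:
        type: object
        properties:
          ticket_id: { type: string }
        required: [ticket_id]
        additionalProperties: false
      implementation:
        type: mock   # no primitive is resolved for this tool
```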

Workspace tools vs pack tools

Pack tools are not the same records as API tools resources—they are bundle-local contracts interpreted entirely inside the worker.

Secret placeholders

Composed args may reference ${secrets.NAME}, which resolves through workspace secret stores; never place secret material inline. Sandbox env_vars must be literal strings: any value containing a ${...} placeholder is rejected at validation, and ${secrets.*} in particular errors out (enforced by validateEnvVarLiterals in backend/internal/engine/executor_sandbox.go). Environment variables leak too easily, so inject credentials through http_request headers instead.
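The literal-only rule for env_vars can be sketched as follows. The sandbox parent key and both variable names are assumptions made for illustration; check the bundle reference for the actual location of env_vars.

```yaml
sandbox:            # assumed parent key, for illustration only
  env_vars:
    LOG_LEVEL: debug                          # literal string: accepted
    SUPPORT_TOKEN: ${secrets.SUPPORT_TOKEN}   # placeholder: rejected at validation
```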

Provider visibility

The model only ever sees tools that pass both the tool policy and the pack manifest—the provider-specific tool definitions (OpenAI, Anthropic, etc.) are built from the registry's visible map (buildToolRegistry).

See also