Challenge Pack Tools Sandbox Skill

Use when defining AgentClash challenge pack tool access, sandbox runtime needs, filesystem expectations, network policy, command execution, and secret references.
Canonical source: web/content/agent-skills/challenge-pack-skills/agentclash-challenge-pack-tools-sandbox/SKILL.md
Markdown export: /docs-md/agent-skills/challenge-pack-skills/agentclash-challenge-pack-tools-sandbox
Full SKILL.md

---
name: agentclash-challenge-pack-tools-sandbox
description: Use when defining AgentClash challenge pack tool access, sandbox runtime needs, filesystem expectations, network policy, command execution, and secret references.
metadata:
  agentclash.role: challenge-pack-tools
  agentclash.version: "1"
  agentclash.requires_cli: "true"
---

# AgentClash Challenge Pack Tools And Sandbox

## Purpose
Define the native execution surface a challenge pack needs: pack-defined custom tools, broad tool policy, sandbox network/package/env settings, and safe secret references.

Use this skill only when a pack truly needs native files, tools, network, packages, or sandbox behavior. Keep the runtime surface narrow enough that failures are attributable to the agent, not to an over-broad environment.

## Use When
- A challenge pack needs top-level `tools.custom`.
- The pack needs `version.tool_policy.allowed_tool_kinds`.
- The pack needs `version.sandbox` for network access, CIDR allowlists, environment variables, apt packages, or a sandbox template.
- A coding agent needs exact source-backed YAML shapes without reading the AgentClash source repo.
- A reviewer needs to check that no raw secrets or unsupported tool kinds are being introduced.

## Do Not Use When
- The pack is `prompt_eval` and only needs prompt/final-output evaluation.
- The task is workspace infrastructure setup with `agentclash infra tool ...`; use `agentclash-runtime-resources-setup`.
- The task is artifact declaration, scoring validators, LLM judges, validation/publish, or eval running; use the focused downstream skills.

## Environment
Use hosted production for CLI examples unless the user intentionally targets a local or self-hosted backend.

```bash
export AGENTCLASH_API_URL="https://api.agentclash.dev"
```

## Validation Commands
Validate after adding or changing tools, tool policy, or sandbox settings.

```bash
agentclash challenge-pack validate path/to/pack.yaml
agentclash challenge-pack validate path/to/pack.yaml --json
```

Human output prints `Challenge pack is valid` or `Challenge pack has errors`. Use `--json` for structured `valid` and `errors` fields.

## Execution Mode Rules
`prompt_eval` packs cannot use challenge-pack tools or sandbox settings.

Do not include these in `prompt_eval`:
- top-level `tools`
- `version.tool_policy`
- `version.sandbox`

Use `native` when the task needs files, tool calls, network policy, extra packages, sandbox templates, file validators, directory checks, or code execution.

```yaml
version:
  number: 1
  execution_mode: native
```

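For contrast, a `prompt_eval` version block carries none of the native-only keys. This is a minimal sketch, assuming the mode name is written verbatim as the `execution_mode` value (the text names the mode but does not show it in YAML):

```yaml
# Sketch of a prompt_eval pack version.
# Assumption: `prompt_eval` is the literal execution_mode value.
# Note what is absent: no top-level `tools`, no `version.tool_policy`,
# and no `version.sandbox`.
version:
  number: 1
  execution_mode: prompt_eval
```
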
## Tool Policy
`version.tool_policy.allowed_tool_kinds` accepts only these broad kinds:

```yaml
version:
  tool_policy:
    allowed_tool_kinds:
      - browser
      - build
      - data
      - file
      - network
```

Supported values are exactly `browser`, `build`, `data`, `file`, and `network`. Do not use `shell`; the current validator rejects it.

Use the narrowest set possible:
- `file` for reading/writing workspace files.
- `build` for build/test style operations.
- `network` for outbound HTTP or API access.
- `browser` for browser interaction.
- `data` for structured data access tools.

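As an illustration of keeping the set narrow, a hypothetical pack whose agent only edits workspace files and runs a test suite would list two kinds and nothing else. The scenario is an assumption for the example; the keys and values are the ones listed above.

```yaml
# Sketch: narrow tool policy for a hypothetical "fix the failing test" pack.
version:
  execution_mode: native
  tool_policy:
    allowed_tool_kinds:
      - file    # read/write workspace files
      - build   # build/test style operations
```

Leaving out `network`, `browser`, and `data` here means a failure cannot be blamed on an unexpected outbound call or browser session.
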
## Pack-Defined Custom Tools
Challenge-pack custom tools live at top-level `tools.custom`, not under `version`.

```yaml
tools:
  custom:
    - name: check_inventory
      description: Check inventory for a SKU.
      parameters:
        type: object
        properties:
          sku:
            type: string
        required:
          - sku
      implementation:
        primitive: http_request
        args:
          method: GET
          url: "https://api.example.com/inventory/${sku}"
          headers:
            Authorization: "Bearer ${secrets.INVENTORY_API_KEY}"
```

Source-backed fields:
- `tools.custom[]` entries are the supported pack-defined tool shape.
- `name` should be stable and unique in the pack.
- `parameters` must be valid JSON Schema when provided. If omitted, validation defaults to an empty object schema, but authoring explicit parameters is clearer.
- `implementation` is required.
- Non-`mock` implementations require `implementation.primitive`.
- Non-`mock` implementations require `implementation.args`, and `args` must be a JSON/YAML object.
- `implementation.primitive` cannot equal the tool's own `name`.
- Tool delegation cycles are rejected, and delegation depth greater than 8 is rejected.

Mock tools are the only exception to primitive/args validation:

```yaml
tools:
  custom:
    - name: fake_lookup
      parameters:
        type: object
      implementation:
        type: mock
```

## Template Placeholders And Secrets
Template placeholders are validated inside `implementation.args`.

Allowed placeholder forms:
- `${sku}` or `${sku.id}` when `sku` is declared in `parameters.properties`.
- `${parameters}` for the full parameters object.
- `${secrets.INVENTORY_API_KEY}` for a runtime secret reference.

Rejected placeholder forms:
- `${missing}` when `missing` is not declared in `parameters.properties`.
- `${}` empty placeholders.
- unclosed placeholders such as `${sku`.

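The allowed and rejected forms above can be seen side by side in one tool definition. This is a sketch under stated assumptions: the tool name `reserve_stock`, the URL path, and the `body` arg key are hypothetical and are not confirmed for the `http_request` primitive; only `method`, `url`, and `headers` appear in the source example.

```yaml
tools:
  custom:
    - name: reserve_stock            # hypothetical tool name
      description: Reserve stock for a SKU.
      parameters:
        type: object
        properties:
          sku:
            type: object
            properties:
              id:
                type: string
          quantity:
            type: integer
        required:
          - sku
          - quantity
      implementation:
        primitive: http_request
        args:
          method: POST
          # ${sku.id} is allowed because `sku` is declared in parameters.properties.
          url: "https://api.example.com/inventory/${sku.id}/reservations"
          headers:
            # Secret referenced by name only; the value never appears in YAML.
            Authorization: "Bearer ${secrets.INVENTORY_API_KEY}"
          # Assumption: `body` is an illustrative key, used here only to show ${parameters}.
          body: "${parameters}"
          # Rejected forms, for comparison:
          #   "${warehouse}"  -> `warehouse` is not declared in parameters.properties
          #   "${}"           -> empty placeholder
          #   "${sku"         -> unclosed placeholder
```
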
Never paste raw secret values into YAML, chat, commits, or examples. Use secret names only. If a secret value is not already configured, ask the user to set it through the workspace secret flow without revealing the value in chat.

## Sandbox Settings
`version.sandbox` is valid only for `native` packs.

```yaml
version:
  execution_mode: native
  sandbox:
    network_access: true
    network_allowlist:
      - 203.0.113.0/24
    env_vars:
      DATASET_MODE: fixture
      API_BASE_URL: https://api.example.com
    additional_packages:
      - jq
      - python3-venv
    sandbox_template_id: codex
```

Source-backed sandbox fields:
- `network_access`: boolean.
- `network_allowlist`: list of CIDR ranges. Hostnames such as `api.example.com` are not valid allowlist entries.
- `env_vars`: string map. Keys must match `[A-Za-z_][A-Za-z0-9_]*`.
- `additional_packages`: apt-style package names.
- `sandbox_template_id`: optional template identifier string.

Keep `network_access: false` or omit sandbox network settings unless the case truly needs outbound network. If network is needed, use the smallest CIDR allowlist available.

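Two narrower variants of the sandbox block illustrate that guidance. Both are sketches built from the fields listed above; the single address `203.0.113.10` is a documentation-range placeholder, not a real service.

```yaml
# Variant 1: offline sandbox. Packages and env vars only, no outbound network.
version:
  execution_mode: native
  sandbox:
    network_access: false
    env_vars:
      DATASET_MODE: fixture
    additional_packages:
      - jq
---
# Variant 2: network needed, but limited to a single address.
# A /32 CIDR range covers exactly one IPv4 host.
version:
  execution_mode: native
  sandbox:
    network_access: true
    network_allowlist:
      - 203.0.113.10/32
```
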
## Filesystem Expectations
`version.filesystem` exists as a raw map on the bundle model, but the current challenge-pack validator does not define a source-backed schema for it. Do not invent `version.filesystem` subfields in a skill-authored pack. Prefer explicit assets, case inputs, sandbox package/env settings, and scoring file evidence until the user or product docs provide an exact filesystem contract.

Use file-related behavior through:
- `version.assets` and case `inputs[].artifact_key`.
- `version.tool_policy.allowed_tool_kinds: [file]`.
- scoring file validators that target `file:<post_execution_check_key>`.
- `version.evaluation_spec.post_execution_checks` for file or directory capture.

## Compatibility Checklist
Before validating:

- Execution mode is `native` if `tools`, `tool_policy`, or `sandbox` are present.
- `allowed_tool_kinds` contains only values from `browser`, `build`, `data`, `file`, and `network`.
- No `shell` tool kind is present.
- Every custom tool has a stable `name`, parameter schema, `implementation.primitive`, and object `implementation.args`, unless it is a deliberate `type: mock` tool.
- Every `${...}` placeholder in tool args is declared as a parameter, is `${parameters}`, or is a `${secrets.<NAME>}` reference.
- No raw secret values are present.
- `network_allowlist` uses CIDR ranges.
- `env_vars` keys are valid environment variable names.
- `additional_packages` names are valid apt package names.
- Native settings are backed by a smoke case that proves the environment actually works.

## Common Validation Failures
- A `prompt_eval` pack includes `tools`, `version.tool_policy`, or `version.sandbox`.
- `version.tool_policy.allowed_tool_kinds` includes `shell`, `code`, or provider-specific tool names.
- `allowed_tool_kinds` is not an array of strings.
- A non-mock custom tool omits `implementation.primitive` or `implementation.args`.
- `implementation.args` is a string/list instead of an object.
- Tool args use unknown placeholders such as `${order_id}` without declaring `order_id` in `parameters.properties`.
- A tool delegates to itself or creates a delegation cycle.
- `network_allowlist` contains a hostname instead of CIDR.
- `env_vars` contains a key like `api-key` that is not a valid environment variable name.
- `additional_packages` includes an invalid apt package name.

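Several of the failures above are easier to recognize in context. The fragment below is deliberately invalid, and each comment names the rule it breaks. The tool name `lookup_order` and the URL are hypothetical, and no exact validator error text is implied.

```yaml
# Deliberately INVALID fragment, for review practice only.
tools:
  custom:
    - name: lookup_order
      parameters:
        type: object
        properties:
          sku:
            type: string
      implementation:
        primitive: lookup_order          # primitive equals the tool's own name
        args:
          url: "https://api.example.com/orders/${order_id}"  # order_id is not declared
version:
  execution_mode: native
  tool_policy:
    allowed_tool_kinds:
      - shell                            # unsupported tool kind
  sandbox:
    network_access: true
    network_allowlist:
      - api.example.com                  # hostname, not a CIDR range
    env_vars:
      api-key: fixture                   # key is not a valid environment variable name
```
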
## Authoring Procedure
1. Confirm whether `prompt_eval` is enough. If yes, omit tools and sandbox.
2. If native behavior is required, set `version.execution_mode: native`.
3. Add only the needed `allowed_tool_kinds`.
4. Define `tools.custom` only for pack-defined tools; use workspace infra skills for reusable workspace tools.
5. Write explicit JSON Schema parameters for each custom tool.
6. Use `${parameter}` and `${secrets.KEY}` placeholders in `implementation.args`; never raw secrets.
7. Add `version.sandbox` only for real network/env/package/template requirements.
8. Add a smoke case that proves the tool or sandbox dependency is reachable.
9. Run `agentclash challenge-pack validate ... --json` and fix every returned field error.
10. Hand off to artifacts, scoring, or validation/publish skills.

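Steps 2 through 7 might combine into a fragment like the one below. It is a sketch only: pack metadata, cases (including the smoke case from step 8), and scoring are omitted. Two things are assumptions made for the example: pairing this HTTP tool with the `network` kind, and the documentation-range CIDR covering the example API host.

```yaml
# Fragment only; not a complete challenge pack.
tools:
  custom:                          # step 4: pack-defined tools only
    - name: check_inventory
      description: Check inventory for a SKU.
      parameters:                  # step 5: explicit JSON Schema
        type: object
        properties:
          sku:
            type: string
        required:
          - sku
      implementation:
        primitive: http_request
        args:                      # step 6: placeholders, never raw secrets
          method: GET
          url: "https://api.example.com/inventory/${sku}"
          headers:
            Authorization: "Bearer ${secrets.INVENTORY_API_KEY}"
version:
  number: 1
  execution_mode: native           # step 2
  tool_policy:
    allowed_tool_kinds:            # step 3: only what the task needs
      - network
  sandbox:                         # step 7: real requirements only
    network_access: true
    network_allowlist:
      - 203.0.113.0/24             # assumed range for the example API
```
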
## Report Back Format
```text
Execution mode:
Tool policy:
Custom tools:
- name:
  primitive:
  parameters:
  secret references:
Sandbox:
Network:
Packages:
Filesystem/artifact dependencies:
Smoke case:
Validation command:
Validation result:
Ready for scoring/publish: <yes/no>
Open issues:
```

## Related Skills
- `agentclash-runtime-resources-setup`
- `agentclash-challenge-pack-yaml-author`
- `agentclash-challenge-pack-input-sets`
- `agentclash-challenge-pack-artifacts`
- `agentclash-challenge-pack-scoring-validators`
- `agentclash-challenge-pack-validation-publish`