Write a Challenge Pack
Author a challenge-pack bundle in YAML, validate it against the current parser, and publish it into a workspace.
Goal: write a pack that the current AgentClash parser, validator, and publish flow will accept.
Prerequisites:
- You have the CLI installed and logged in.
- You selected a workspace with agentclash workspace use <WORKSPACE_ID>.
- You know whether the pack should be prompt_eval or native.
1. Start from the current minimum shape
This is the smallest honest starting point based on the current bundle parser and tests:
pack:
  slug: support-eval
  name: Support Eval
  family: support
version:
  number: 1
  execution_mode: prompt_eval
  evaluation_spec:
    name: support-v1
    version_number: 1
    judge_mode: deterministic
    validators:
      - key: exact
        type: exact_match
        target: final_output
        expected_from: challenge_input
    scorecard:
      dimensions: [correctness]
challenges:
  - key: ticket-1
    title: Ticket One
    category: support
    difficulty: medium
    instructions: |
      Read the request and produce the final answer.
input_sets:
  - key: default
    name: Default Inputs
    cases:
      - challenge_key: ticket-1
        case_key: sample-1
        inputs:
          - key: prompt
            kind: text
            value: hello
        expectations:
          - key: answer
            kind: text
            source: input:prompt
This is not a glamorous pack. It is a good pack skeleton because it matches the current parser shape.
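Before reaching for the CLI, a quick local sanity check can catch obvious shape mistakes. The sketch below is not the AgentClash parser. It assumes the bundle has already been loaded into a Python dict, treats the four top-level sections from the example above as expected, and flags cases whose challenge_key does not match a declared challenge. The function name check_bundle_shape is hypothetical, and the real validator may apply different rules.

```python
# Hypothetical local pre-check; the real parser and validator are authoritative.
REQUIRED_SECTIONS = ("pack", "version", "challenges", "input_sets")


def check_bundle_shape(bundle):
    """Return a list of problem strings; an empty list means none were found."""
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name not in bundle
    ]
    # Collect the challenge keys the bundle declares.
    declared = {challenge.get("key") for challenge in bundle.get("challenges", [])}
    # Every case should point at one of those keys.
    for input_set in bundle.get("input_sets", []):
        for case in input_set.get("cases", []):
            if case.get("challenge_key") not in declared:
                problems.append(
                    f"case {case.get('case_key')!r} references unknown "
                    f"challenge {case.get('challenge_key')!r}"
                )
    return problems


# A dict mirroring part of the minimal example above.
bundle = {
    "pack": {"slug": "support-eval", "name": "Support Eval", "family": "support"},
    "version": {"number": 1, "execution_mode": "prompt_eval"},
    "challenges": [{"key": "ticket-1", "title": "Ticket One"}],
    "input_sets": [
        {
            "key": "default",
            "cases": [{"challenge_key": "ticket-1", "case_key": "sample-1"}],
        }
    ],
}

print(check_bundle_shape(bundle))
```

A check like this only narrows the feedback loop; it does not replace the validate command.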
2. Add execution policy only when you need it
If the pack is native, you can add runtime sections like tool_policy, sandbox, and tools.
Example:
version:
  number: 2
  execution_mode: native
  tool_policy:
    allowed_tool_kinds:
      - file
      - shell
      - network
  sandbox:
    network_access: true
    network_allowlist:
      - 203.0.113.0/24
  evaluation_spec:
    name: support-v2
    version_number: 2
    judge_mode: hybrid
    validators:
      - key: exact
        type: exact_match
        target: final_output
        expected_from: challenge_input
    scorecard:
      dimensions: [correctness]
tools:
  custom:
    - name: check_inventory
      description: Check inventory by SKU
      parameters:
        type: object
        properties:
          sku:
            type: string
      implementation:
        primitive: http_request
        args:
          method: GET
          url: https://api.example.com/inventory/${sku}
          headers:
            Authorization: Bearer ${secrets.INVENTORY_API_KEY}
Use these sections deliberately.
- tool_policy decides what kinds of tools are even available.
- sandbox.network_access and network_allowlist control outbound networking.
- tools.custom defines the tool contract the agent sees.
- implementation.primitive picks the executor primitive that actually runs.
3. Add assets when inputs should point at files
If the pack needs files, declare them as assets instead of hardcoding mystery paths all over the bundle.
Example:
version:
  number: 1
  execution_mode: native
  assets:
    - key: fixtures
      path: fixtures/workspace.zip
      media_type: application/zip
You can also back an asset with an uploaded artifact by setting artifact_id instead of relying only on a repository path.
Then cases and expectations can refer to those assets by key.
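Because assets are addressed by key, duplicate or incomplete declarations are easy to miss. The following is a minimal sketch of a local check, assuming the assets list has already been parsed into Python dicts. It assumes that every asset needs a unique key and at least one of path or artifact_id; the real validator may enforce different rules. The helper name check_assets and the second asset ("dataset") are hypothetical.

```python
# Hypothetical local check; it does not call AgentClash.
def check_assets(assets):
    """Return a list of problem strings for a parsed assets list."""
    problems = []
    seen = set()
    for index, asset in enumerate(assets):
        key = asset.get("key")
        if not key:
            problems.append(f"asset #{index} has no key")
            continue
        if key in seen:
            problems.append(f"duplicate asset key: {key}")
        seen.add(key)
        # Assumption: an asset is backed by a repository path or an artifact_id.
        if not asset.get("path") and not asset.get("artifact_id"):
            problems.append(f"asset {key!r} has neither path nor artifact_id")
    return problems


assets = [
    {
        "key": "fixtures",
        "path": "fixtures/workspace.zip",
        "media_type": "application/zip",
    },
    # Hypothetical artifact-backed asset; the ID is a placeholder.
    {"key": "dataset", "artifact_id": "<ARTIFACT_ID>"},
]

print(check_assets(assets))
```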
4. Validate before you publish
The current CLI command is:
agentclash challenge-pack validate support-eval.yaml
This calls the workspace-scoped validation endpoint and checks the same parser and validation logic the publish path uses.
Typical failures the current code will catch early:
- unknown placeholders like ${missing}
- invalid CIDR entries in network_allowlist
- self-referencing or cyclic composed tools
- invalid tool parameter schemas
- unknown artifact keys or nonexistent stored artifact IDs
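Some of these failures can be spotted locally before the bundle ever reaches the endpoint. As one illustration, the sketch below screens network_allowlist entries with Python's standard ipaddress module. It is an approximation, not the validator's actual logic: it assumes every entry must carry an explicit prefix length and it tolerates host bits being set, either of which the real check may treat differently. The function name invalid_cidrs is hypothetical.

```python
import ipaddress


def invalid_cidrs(allowlist):
    """Return the entries that do not parse as IPv4 or IPv6 CIDR blocks."""
    bad = []
    for entry in allowlist:
        try:
            # Assumption: a bare address without "/prefix" is not acceptable.
            if "/" not in entry:
                raise ValueError("missing prefix length")
            # strict=False accepts entries such as 203.0.113.5/24.
            ipaddress.ip_network(entry, strict=False)
        except ValueError:
            bad.append(entry)
    return bad


print(invalid_cidrs(["203.0.113.0/24", "10.0.0.0/33", "example.com"]))
```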
5. Publish the bundle
Once validation passes:
agentclash challenge-pack publish support-eval.yaml
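If you script this step, it helps to make publish conditional on validate. The sketch below wraps the two commands shown in this guide. It assumes the CLI exits with a nonzero status when validation or publishing fails, which this guide does not confirm, so check that behavior before relying on it. The function name and the injectable run parameter are illustrative choices that make the logic testable without the CLI installed.

```python
import subprocess
from typing import Callable, Optional, Sequence


def _run_cli(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(cmd)).returncode


def validate_then_publish(
    bundle_path: str,
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> bool:
    """Publish only if validate succeeds. Returns True when both steps pass."""
    runner = run or _run_cli
    # Assumption: a nonzero exit status signals failure.
    if runner(["agentclash", "challenge-pack", "validate", bundle_path]) != 0:
        return False
    return runner(["agentclash", "challenge-pack", "publish", bundle_path]) == 0


# Usage (requires the CLI to be installed and logged in):
#   ok = validate_then_publish("support-eval.yaml")
```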
The publish response returns concrete IDs, including:
- challenge_pack_id
- challenge_pack_version_id
- evaluation_spec_id
- input_set_ids
- optional bundle_artifact_id
Those IDs matter later because run creation asks for a pack version, not a filename.
6. Confirm the workspace can see it
agentclash challenge-pack list
If the pack published cleanly, it should show up in the workspace list with its versions.
Verification
You should now have:
- a bundle YAML file the current parser accepts
- a successful validate result
- a published pack version ID you can use in run creation
Troubleshooting
Validation says a tool placeholder is unknown
Your implementation.args template references a variable that is declared neither in the tool's parameter schema nor in the available template context.
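To find the offending name yourself, you can list every ${...} placeholder in a tool's args and compare it against the declared parameters. This is a rough sketch, not the validator's implementation. It assumes placeholders use the ${name} form seen in this guide and that the only extra template context is the secrets. namespace from the earlier example; the real context may include more. The helper names are hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def iter_strings(value):
    """Yield every string nested inside dicts and lists."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def unknown_placeholders(tool, context_prefixes=("secrets.",)):
    """Return placeholder names that match no parameter or context prefix."""
    declared = set(tool.get("parameters", {}).get("properties", {}))
    args = tool.get("implementation", {}).get("args", {})
    unknown = []
    for text in iter_strings(args):
        for name in PLACEHOLDER.findall(text):
            if name in declared or name.startswith(tuple(context_prefixes)):
                continue
            if name not in unknown:
                unknown.append(name)
    return unknown


tool = {
    "name": "check_inventory",
    "parameters": {"type": "object", "properties": {"sku": {"type": "string"}}},
    "implementation": {
        "primitive": "http_request",
        "args": {
            "method": "GET",
            "url": "https://api.example.com/inventory/${sku}",
            "headers": {"Authorization": "Bearer ${secrets.INVENTORY_API_KEY}"},
        },
    },
}

print(unknown_placeholders(tool))
```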
Validation says a tool references itself or forms a cycle
Your composed tool graph is recursive. Break the cycle and delegate to a primitive or a non-cyclic tool chain.
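If the cycle is not obvious by inspection, a depth-first search over the delegation graph will surface it. The sketch below assumes you have already extracted, for each composed tool, the names of the tools it delegates to. This guide does not show how composed tools are declared in the bundle, so that mapping and the tool names used here are hypothetical.

```python
def find_tool_cycle(delegates):
    """Return one cycle as a list of tool names, or None if there is none.

    `delegates` maps a tool name to the tool names it calls. Names that are
    not keys in the mapping are treated as leaves (for example, primitives).
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNSEEN for name in delegates}
    stack = []

    def visit(name):
        state[name] = IN_PROGRESS
        stack.append(name)
        for target in delegates.get(name, []):
            status = state.get(target, DONE)  # unknown names are leaves
            if status == IN_PROGRESS:
                # The target is already on the current path, so this closes a cycle.
                return stack[stack.index(target):] + [target]
            if status == UNSEEN:
                found = visit(target)
                if found:
                    return found
        stack.pop()
        state[name] = DONE
        return None

    for name in delegates:
        if state[name] == UNSEEN:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical composed tools: lookup_order -> check_stock -> lookup_order.
print(find_tool_cycle({"lookup_order": ["check_stock"], "check_stock": ["lookup_order"]}))
```

The returned path names every tool on the loop, which shows where to break it.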
Validation says an artifact key is missing
A case or expectation references an asset or artifact key that the pack never declares.
The pack needs internet access
Do not assume that adding an http_request tool is enough. The pack's tool_policy and sandbox settings (network_access and network_allowlist) must also permit the traffic, and the runtime path must honor them.
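One way to catch this early is to cross-check the three sections against each other. The sketch below is illustrative only. It assumes that a custom tool with the http_request primitive needs "network" in tool_policy.allowed_tool_kinds and a true sandbox.network_access, an inference from the example in this guide rather than a documented rule. The function name is hypothetical, and the sections are passed in separately so the sketch does not depend on where they sit in the bundle.

```python
def network_prerequisite_gaps(tool_policy, sandbox, custom_tools):
    """Return missing network prerequisites for packs that use http_request."""
    uses_http = any(
        tool.get("implementation", {}).get("primitive") == "http_request"
        for tool in custom_tools
    )
    if not uses_http:
        return []
    gaps = []
    # Assumption: http_request tools count as the "network" tool kind.
    if "network" not in tool_policy.get("allowed_tool_kinds", []):
        gaps.append("tool_policy.allowed_tool_kinds does not include 'network'")
    if not sandbox.get("network_access"):
        gaps.append("sandbox.network_access is not true")
    return gaps


custom_tools = [
    {"name": "check_inventory", "implementation": {"primitive": "http_request"}}
]

# A pack that declares the tool but leaves networking switched off.
print(network_prerequisite_gaps({"allowed_tool_kinds": ["file"]}, {}, custom_tools))
```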