Hub Skill

Use when starting any AgentClash eval, CLI, or challenge-pack task. Load this skill first for the full workflow map, skill dependency order, product UI links, hosted defaults, and pointers to every other AgentClash skill.
Canonical source: web/content/agent-skills/agentclash-hub/SKILL.md
Markdown export: /docs-md/agent-skills/agentclash-hub
Full SKILL.md

---
name: agentclash-hub
description: Use when starting any AgentClash eval, CLI, or challenge-pack task. Load this skill first for the full workflow map, skill dependency order, product UI links, hosted defaults, and pointers to every other AgentClash skill.
metadata:
  agentclash.role: hub
  agentclash.version: "1"
  agentclash.requires_cli: "false"
---

# AgentClash Hub

## Purpose
Give coding agents maximum context to run AgentClash evals through the CLI and guide humans to the right web UI pages, without reading the AgentClash source repository.

## Use When
- A user asks to evaluate agents, run evals, compare models, or use AgentClash for the first time.
- You need to pick the right downstream skill before acting.
- You need hosted defaults, UI links, or the end-to-end eval workflow in one place.

## Do Not Use When
- A narrower skill already matches (e.g. only CLI auth repair → `agentclash-cli-setup`).
- The task is only to edit AgentClash product source code in the monorepo.

## Environment
Use production unless the user explicitly runs a local stack:

```bash
export AGENTCLASH_API_URL="https://api.agentclash.dev"
agentclash auth login --device
agentclash link
agentclash quickstart
```

Install the CLI: `npm i -g agentclash` or see `/docs-md/getting-started/quickstart`.

Portable bundle install (copy skills to another agent host without the integration command): export them with `agentclash skills export --dir ./bundle --host <agent>`. Canonical source repo: https://github.com/agentclash/agentclash

## Procedure
1. Load this hub to pick the next skill.
2. Run `agentclash quickstart` if CLI readiness is unknown.
3. Follow dependency order for setup → pack → run → review → regression → CI.
4. Send the user to the matching UI page when they need a visual surface.

## Drive The CLI Like An Agent
When you (a coding agent) drive AgentClash non-interactively, lean on the machine contract:

- **Ask for machine output.** Pass `--json` (or `-o json`) on every command. Data goes to
  **stdout**; human progress and diagnostics go to **stderr**, so parse stdout only.
- **Discover the surface first.** `agentclash schema --json` returns the full command tree, flags,
  and documented exit codes. Prefer it over scraping `--help` prose.
- **Branch on structured errors, never on prose.** On failure the CLI prints a JSON envelope:
  `error.code` is a stable machine key to branch on; `error.next_step` is the remediation to act
  on; `error.details` carries specifics (quota `limit`/`used`/`remaining`, `reset_at`, `plan_key`).
- **Gate readiness on `agentclash doctor`.** It exits non-zero when the environment isn't ready.
  `agentclash quickstart --json` is informational and exits 0 even when `ready:false`, so use `doctor`
  as the hard gate in CI/agent flows.
- **Stay headless.** Set `AGENTCLASH_TOKEN`, `AGENTCLASH_WORKSPACE`, and `AGENTCLASH_API_URL` in the
  environment so no command needs an interactive prompt.
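The structured-error rule above can be sketched as a small Python helper. This is an illustration under stated assumptions, not the CLI's documented behavior: it assumes the failure envelope arrives on stdout with a top-level `error` object, and the code value `quota_exceeded`, the action names, and the helper name `handle_cli_output` are all hypothetical. Real code values should come from `agentclash schema --json` or an observed envelope.

```python
import json

# Hypothetical error code, used only for illustration.
WAIT_CODES = {"quota_exceeded"}

# Hypothetical envelope shaped after the fields this skill names:
# error.code, error.next_step, error.details.
SAMPLE_FAILURE = json.dumps({
    "error": {
        "code": "quota_exceeded",
        "next_step": "Wait for the quota reset or change plan",
        "details": {"limit": 100, "used": 100, "remaining": 0,
                    "reset_at": "2030-01-01T00:00:00Z", "plan_key": "free"},
    }
})


def handle_cli_output(stdout_text, exit_code):
    """Return (action, info) from a command's stdout and exit code.

    Only stdout is parsed; stderr is treated as human-facing diagnostics.
    """
    try:
        payload = json.loads(stdout_text) if stdout_text.strip() else {}
    except json.JSONDecodeError:
        # Never fall back to matching prose; stop and surface the problem.
        return ("abort", {"reason": "stdout was not valid JSON"})

    error = payload.get("error") if isinstance(payload, dict) else None
    if exit_code == 0 and error is None:
        return ("continue", {"data": payload})

    error = error if isinstance(error, dict) else {}
    info = {
        "code": error.get("code", "unknown"),
        "next_step": error.get("next_step"),
        "details": error.get("details", {}),
    }
    # Branch on the stable machine key, never on the message text.
    action = "wait" if info["code"] in WAIT_CODES else "remediate"
    return (action, info)
```

Calling `handle_cli_output(SAMPLE_FAILURE, 1)` yields the `"wait"` action together with the quota details, so the caller can report `reset_at` to the user instead of retrying blindly.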

## End-To-End Eval Workflow (CLI)

```text
1. agentclash-cli-setup              → auth, workspace, doctor
2. agentclash-quickstart             → readiness checks + next command
3. agentclash-runtime-resources-setup → provider, model alias, runtime profile, secrets
4. agentclash-agent-build-author     → build spec + ready build version
5. agentclash-agent-deployment-setup → deployment ID for runs
6. challenge-pack skills             → plan, YAML, validate, publish pack
7. agentclash-eval-runner            → eval start / run create / follow / sessions / series
8. agentclash-scorecard-reader       → rankings, scorecards, replay, artifacts
9. agentclash-compare-and-triage     → baseline, compare latest/gate, replay triage
10. agentclash-regression-flywheel   → promote failures, suite-only reruns
11. agentclash-ci-release-gate       → CI manifest + gate (optional)
```

Optional branches (load when the workflow applies):

```text
• agentclash-multi-turn-operator     → human takeover during multi_turn runs
• agentclash-dataset-workflows       → dataset eval, gate, traces, regression sync
• agentclash-prompt-eval-playground  → prompt-eval YAML + playground experiments
• agentclash-agent-harness-setup     → E2B coding-agent harness tasks and suites
• agentclash-workspace-admin         → org/workspace CRUD and membership (teams)
• agentclash-security-evaluation     → client-side security stress harnesses
```

Human-friendly shortcut after setup:

```bash
agentclash quickstart
agentclash eval start --follow
agentclash baseline set
agentclash eval scorecard
agentclash compare latest --gate
agentclash replay triage
```

## Skill Dependency Order
Read skills in this order when multiple apply:

1. `agentclash-hub` (this file)
2. `agentclash-cli-setup`
3. `agentclash-quickstart`
4. `agentclash-runtime-resources-setup`
5. `agentclash-agent-build-author`
6. `agentclash-agent-deployment-setup`
7. `agentclash-challenge-pack-planner`
8. `agentclash-challenge-pack-yaml-author`
9. `agentclash-challenge-pack-input-sets`
10. `agentclash-challenge-pack-tools-sandbox`
11. `agentclash-challenge-pack-artifacts`
12. `agentclash-challenge-pack-scoring-validators`
13. `agentclash-challenge-pack-llm-judges`
14. `agentclash-challenge-pack-validation-publish`
15. `agentclash-eval-runner`
16. `agentclash-scorecard-reader`
17. `agentclash-compare-and-triage`
18. `agentclash-regression-flywheel`
19. `agentclash-ci-release-gate`
20. `agentclash-agent-harness-setup`
21. `agentclash-multi-turn-operator`
22. `agentclash-dataset-workflows`
23. `agentclash-prompt-eval-playground`
24. `agentclash-workspace-admin`
25. `agentclash-security-evaluation`
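When several skills apply at once, the order above can be applied mechanically. The following is a minimal Python sketch; the list is copied from this section, while the helper name `load_order` and the choice to reject unknown names are assumptions made for illustration.

```python
# Dependency order copied from the numbered list in this section.
DEPENDENCY_ORDER = [
    "agentclash-hub",
    "agentclash-cli-setup",
    "agentclash-quickstart",
    "agentclash-runtime-resources-setup",
    "agentclash-agent-build-author",
    "agentclash-agent-deployment-setup",
    "agentclash-challenge-pack-planner",
    "agentclash-challenge-pack-yaml-author",
    "agentclash-challenge-pack-input-sets",
    "agentclash-challenge-pack-tools-sandbox",
    "agentclash-challenge-pack-artifacts",
    "agentclash-challenge-pack-scoring-validators",
    "agentclash-challenge-pack-llm-judges",
    "agentclash-challenge-pack-validation-publish",
    "agentclash-eval-runner",
    "agentclash-scorecard-reader",
    "agentclash-compare-and-triage",
    "agentclash-regression-flywheel",
    "agentclash-ci-release-gate",
    "agentclash-agent-harness-setup",
    "agentclash-multi-turn-operator",
    "agentclash-dataset-workflows",
    "agentclash-prompt-eval-playground",
    "agentclash-workspace-admin",
    "agentclash-security-evaluation",
]


def load_order(applicable):
    """Return the applicable skills, deduplicated, in dependency order."""
    rank = {name: index for index, name in enumerate(DEPENDENCY_ORDER)}
    wanted = set(applicable)
    unknown = sorted(wanted - set(rank))
    if unknown:
        # Hypothetical policy: fail loudly rather than guess a position.
        raise ValueError(f"not in the dependency order: {unknown}")
    return sorted(wanted, key=rank.__getitem__)
```

For example, `load_order(["agentclash-eval-runner", "agentclash-cli-setup", "agentclash-hub"])` returns the hub first, then CLI setup, then the eval runner.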

> To author or change skills, browse the web catalog at
> https://agentclash.dev/docs/agent-skills. It is documentation, not an
> installable skill.

Each skill folder name matches its `name` in frontmatter. When a skill lists **Related Skills**, load those before mutating remote state.
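The folder-name rule can be checked with a few lines of Python. This sketch assumes the frontmatter is the simple `key: value` block shown at the top of this file and reads only the top-level `name` key; it is not a YAML parser, and the helper names are hypothetical.

```python
from pathlib import PurePosixPath


def frontmatter_name(skill_md_text):
    """Return the top-level `name` value from a SKILL.md frontmatter block."""
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        # An exact match on the unstripped key skips indented (nested) keys.
        if sep and key == "name":
            return value.strip().strip("\"'")
    return None


def folder_matches_name(skill_md_path, skill_md_text):
    """True when the parent folder of SKILL.md equals the frontmatter name."""
    folder = PurePosixPath(skill_md_path).parent.name
    return folder == frontmatter_name(skill_md_text)
```

Applied to this file's canonical path, `web/content/agent-skills/agentclash-hub/SKILL.md`, the folder `agentclash-hub` matches the `name` field.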

## All Skills In The Catalog

| Skill folder | When to load |
| --- | --- |
| `agentclash-hub` | First: workflow map and UI links |
| `agentclash-quickstart` | Readiness checks and suggested next command |
| `agentclash-cli-setup` | Install, auth, workspace, doctor |
| `agentclash-runtime-resources-setup` | Provider accounts, models, runtime profiles, secrets |
| `agentclash-agent-build-author` | Agent build specs and build versions |
| `agentclash-agent-deployment-setup` | Create/select deployments |
| `agentclash-challenge-pack-planner` | Plan a pack before YAML |
| `agentclash-challenge-pack-yaml-author` | Write pack YAML |
| `agentclash-challenge-pack-input-sets` | Cases and input sets |
| `agentclash-challenge-pack-tools-sandbox` | Tools and sandbox policy |
| `agentclash-challenge-pack-artifacts` | Assets and artifact refs |
| `agentclash-challenge-pack-scoring-validators` | Validators |
| `agentclash-challenge-pack-llm-judges` | LLM judges |
| `agentclash-challenge-pack-validation-publish` | Validate and publish |
| `agentclash-eval-runner` | Start and follow evals, sessions, series |
| `agentclash-scorecard-reader` | Interpret results |
| `agentclash-compare-and-triage` | Baselines, compare, replay triage |
| `agentclash-regression-flywheel` | Promote failures to regression suites |
| `agentclash-ci-release-gate` | CI/CD gates |
| `agentclash-agent-harness-setup` | E2B coding-agent harness tasks, suites, failure review |
| `agentclash-multi-turn-operator` | Human takeover turns in multi_turn packs |
| `agentclash-dataset-workflows` | Dataset eval, CI gate, traces, regression sync |
| `agentclash-prompt-eval-playground` | Prompt eval YAML and playground experiments |
| `agentclash-workspace-admin` | Org/workspace CRUD and membership administration |
| `agentclash-security-evaluation` | Security pack stress-run and vault harnesses |

Nested folders: `agent-build-skills/` and `challenge-pack-skills/` mirror the table rows above.

## Product UI: Where To Send The User

Base URL: **https://agentclash.dev**

| User goal | UI path |
| --- | --- |
| Sign in / account | https://agentclash.dev |
| Docs home | https://agentclash.dev/docs |
| Quickstart | https://agentclash.dev/docs/getting-started/quickstart |
| First eval walkthrough | https://agentclash.dev/docs/getting-started/first-eval |
| Agent skills (web catalog) | https://agentclash.dev/docs/agent-skills |
| CLI reference | https://agentclash.dev/docs/reference/cli |
| Challenge packs guide | https://agentclash.dev/docs/guides/write-a-challenge-pack |
| Multi-turn packs | https://agentclash.dev/docs/challenge-packs/multi-turn |
| Interpret results | https://agentclash.dev/docs/guides/interpret-results |
| CI/CD gates | https://agentclash.dev/docs/guides/ci-cd-agent-gates |
| Workspace runs (after login) | App dashboard → Runs list |
| Live run events | Run detail page while status is running |
| Scorecards & comparisons | Run detail → scorecard / ranking views after completion |

When you create a run via CLI, tell the user:

```text
Open https://agentclash.dev and navigate to your workspace runs, or search for run ID <RUN_ID> after signing in.
```

## AgentClash Concepts (30-Second Model)

- **Challenge pack**: versioned eval workload (cases, scoring, tools policy).
- **Input set**: which cases run in a given eval.
- **Agent build / deployment**: the agent under test (model + runtime + tools).
- **Run**: one execution of pack × input set × deployments.
- **Eval session**: repeated runs (`eval start --repetitions N` or `run series create`).
- **Scorecard**: structured results, comparisons, release gate input.
- **Baseline bookmark**: workspace default run/agent for `compare latest`.
- **Regression suite**: promoted failures for suite-only reruns.

## Expected Output
After loading this skill you can name the next skill, 1–3 CLI commands, and the UI page the human should open.

## Failure Modes
- Skipping `agentclash-cli-setup` when auth or workspace is unset → commands fail with workspace errors.
- Running evals before pack publish → no runnable pack version.
- Using a localhost API URL by mistake → empty workspace or auth failures against the wrong backend.
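Two of these failure modes (unset auth or workspace, and an accidental localhost API URL) can be caught before any command runs. Below is a minimal Python sketch of such a preflight. The variable names and the production URL come from this skill; the helper name `preflight`, the `allow_local` switch, and the list of local hostnames are assumptions. It complements `agentclash doctor` rather than replacing it, and it reports variable names only, never their values.

```python
from urllib.parse import urlparse

# Environment variables this skill names for headless use.
REQUIRED_VARS = ("AGENTCLASH_TOKEN", "AGENTCLASH_WORKSPACE", "AGENTCLASH_API_URL")

# Assumed set of hostnames that indicate a local stack.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def preflight(env, allow_local=False):
    """Return a list of problems found in the environment mapping."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    host = urlparse(env.get("AGENTCLASH_API_URL", "")).hostname
    if host in LOCAL_HOSTS and not allow_local:
        problems.append("AGENTCLASH_API_URL points at a local stack")
    return problems
```

A typical call is `preflight(dict(os.environ))` after `import os`; pass `allow_local=True` only when the user explicitly runs a local stack.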

## Safety Notes
- Confirm before production-scale evals, publishes, or CI runs that spend budget.
- Never paste tokens, secrets, or customer data into chat.
- Prefer `agentclash doctor` and read-only list commands before writes.
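The "doctor before writes" habit can be sketched in Python as a guard around any mutating step. This relies only on the behavior stated in this skill, that `agentclash doctor` exits non-zero when the environment is not ready; the helper names and the decision to treat a missing binary as "not ready" are assumptions.

```python
import subprocess
import sys


def environment_ready(argv=("agentclash", "doctor", "--json")):
    """Return True only when the readiness command exits with status 0."""
    try:
        result = subprocess.run(list(argv), capture_output=True, text=True)
    except FileNotFoundError:
        # Assumption: a missing CLI binary counts as "not ready".
        return False
    return result.returncode == 0


def guarded_write(write_fn, ready_check=environment_ready):
    """Run write_fn only after the readiness gate passes."""
    if not ready_check():
        raise RuntimeError("environment not ready; run `agentclash doctor` and fix its findings first")
    return write_fn()
```

The gate checks the exit status, not the JSON body, which matches the guidance that `quickstart --json` can exit 0 while reporting `ready:false`. For a dry run without the CLI installed, pass any command, such as `(sys.executable, "-c", "raise SystemExit(0)")`.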

## Report Back Format
```text
Hub loaded: yes
Next skill: <skill-folder-name>
CLI status: <auth/workspace/doctor summary>
UI for user: <https://agentclash.dev/...>
Next commands: <1-3 commands>
```
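An agent that emits this report repeatedly may want to render it from structured values. The following Python sketch fills in the template above; the function name `report_back` and the `"; "` separator between commands are assumptions, since the template does not say how multiple commands are joined.

```python
def report_back(next_skill, cli_status, ui_url, next_commands):
    """Render the hub report; next_commands must hold 1 to 3 commands."""
    if not 1 <= len(next_commands) <= 3:
        raise ValueError("expected 1-3 next commands")
    return "\n".join([
        "Hub loaded: yes",
        f"Next skill: {next_skill}",
        f"CLI status: {cli_status}",
        f"UI for user: {ui_url}",
        # Assumed separator; the template only says "<1-3 commands>".
        "Next commands: " + "; ".join(next_commands),
    ])
```

For instance, `report_back("agentclash-cli-setup", "auth missing", "https://agentclash.dev/docs/getting-started/quickstart", ["agentclash auth login --device", "agentclash doctor"])` produces the five labeled lines in the order shown in the template.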

## Related Skills
Load skills from **Skill Dependency Order** as needed; start with `agentclash-cli-setup` if the CLI is not configured.

## Related Docs
- `/docs-md/agent-skills`
- `/docs-md/agent-skills/agentclash-hub`
- `/docs-md/guides/use-with-ai-tools`
- `/docs-md/getting-started/quickstart`
- `/docs-md/getting-started/first-eval`