# Challenge Pack LLM Judges Skill

Use when configuring AgentClash LLM-as-judge scoring, judge prompts, rubrics, dimensions, evidence inputs, abstention behavior, and judge result interpretation.

Canonical source: web/content/agent-skills/challenge-pack-skills/agentclash-challenge-pack-llm-judges/SKILL.md

Markdown export: /docs-md/agent-skills/challenge-pack-skills/agentclash-challenge-pack-llm-judges

## Full SKILL.md

---
name: agentclash-challenge-pack-llm-judges
description: Use when configuring AgentClash LLM-as-judge scoring, judge prompts, rubrics, dimensions, evidence inputs, abstention behavior, and judge result interpretation.
metadata:
  agentclash.role: challenge-pack-judging
  agentclash.version: "1"
  agentclash.requires_cli: "true"
---

# AgentClash Challenge Pack LLM Judges

## Purpose
Add LLM judges when deterministic validators alone cannot capture the full evaluation.

## Use When
- Quality depends on reasoning, style, relevance, or nuanced task completion.
- A deterministic validator would be brittle or incomplete.
- The scorecard needs judge rationale tied to replay evidence.

## Inputs Needed
- Dimension being judged.
- Rubric with pass, partial, and fail examples.
- Evidence fields available to the judge.
- Desired numeric, boolean, or categorical output mode.
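The inputs above can be collected into a plain data structure before writing the judge prompt. This is a hypothetical shape for illustration only, not a format AgentClash prescribes:

```python
# Hypothetical rubric record for one judged dimension. The field names here
# are illustrative assumptions, not an AgentClash schema.
rubric = {
    "dimension": "answer_relevance",
    "output_mode": "categorical",  # numeric | boolean | categorical
    "evidence_fields": ["final_answer", "task_prompt"],
    "examples": {
        "pass": "Answer directly addresses every part of the task prompt.",
        "partial": "Answer addresses the task but omits one requested item.",
        "fail": "Answer is off-topic or restates the prompt without answering.",
    },
}

# Sanity-check that all three graded examples are present before use.
assert set(rubric["examples"]) == {"pass", "partial", "fail"}
```

Keeping the rubric as data makes it easy to render the same pass/partial/fail examples into every judge prompt for that dimension.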

## Procedure
1. Use LLM judges for subjective dimensions only.
2. Keep judge prompts narrow and evidence-bound.
3. Specify the expected output schema.
4. Define abstention behavior when evidence is insufficient.
5. Pair judges with deterministic validators for hard constraints.
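The steps above can be sketched as a thin judge wrapper. The `call_llm` stub and all field names are hypothetical stand-ins, not AgentClash APIs; the sketch only shows an evidence-bound prompt, a fixed output vocabulary, and explicit abstention:

```python
ALLOWED = {"pass", "partial", "fail"}

def call_llm(prompt: str) -> str:
    """Hypothetical model call; swap in a real client."""
    return "partial"  # canned reply for illustration

def judge(dimension, rubric, evidence, required_fields):
    # Step 4: abstain when required evidence is missing.
    missing = [f for f in required_fields if not evidence.get(f)]
    if missing:
        return {"verdict": "abstain", "reason": f"missing evidence: {missing}"}
    # Steps 2-3: narrow, evidence-bound prompt with a fixed output schema.
    prompt = (
        f"Judge only the dimension '{dimension}'.\nRubric: {rubric}\n"
        + "".join(f"{f}: {evidence[f]}\n" for f in required_fields)
        + "Answer with exactly one word: pass, partial, or fail."
    )
    verdict = call_llm(prompt).strip().lower()
    # Enforce the schema: out-of-vocabulary replies become abstentions.
    return {
        "verdict": verdict if verdict in ALLOWED else "abstain",
        "reason": "model verdict",
    }

# Insufficient evidence triggers abstention rather than a guess.
print(judge("relevance", "pass/partial/fail", {}, ["final_answer"]))
# → {'verdict': 'abstain', 'reason': "missing evidence: ['final_answer']"}
```

Hard constraints (step 5) stay in deterministic validators; the judge only emits the subjective verdict plus a reason the scorecard can surface.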

## Output Shape
```text
Judge name:
Dimension:
Evidence:
Rubric:
Output mode:
Abstention rule:
Failure examples:
```
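A filled-in example of the shape above, with hypothetical values for a relevance judge:

```text
Judge name: relevance-judge-v1
Dimension: answer relevance
Evidence: final answer text, original task prompt
Rubric: pass = fully addresses prompt; partial = misses one requested item; fail = off-topic
Output mode: categorical (pass | partial | fail)
Abstention rule: abstain when the final answer field is empty or truncated
Failure examples: restating the prompt; answering a different question
```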

## Related Skills
- `agentclash-challenge-pack-scoring-validators`
- `agentclash-scorecard-reader`