Security packs, multi-turn eval, and replay polish
Security-family challenge packs, stress-run CLI, and vault boundary harnesses established AgentClash as a security eval surface. Multi-turn evaluation with human takeover and case templating extended the engine beyond single-shot runs.
- Security eval
- Multi-turn eval
- Case templating
- Replay polish
What shipped
Added
- Multi-turn evaluation — user simulators, hybrid executor, human takeover, calibration reviews, and post-run arena.
- Case template renderer and bundle validation for code-execution challenge packs.
- Multi-turn conversation events with transcript helpers in the event model.
Improved
- Replay trace output upgraded with IDE-level syntax highlighting and rich text rendering.
- Planted secrets from security packs wired into sandbox provisioning.
Security
- SecurityPolicy schema and SecurityScore dimension for security-family challenge packs.
- Canonical secret-hygiene and prompt-injection challenge packs shipped.
- CLI `stress-run` subcommand with Anthropic Messages provider and --no-system-guard leak surfacing.
- agent-vault-stress harness with real Vault SDK, function calling, campaign mode, and bundled HTTP mock.
- Infisical and HashiCorp Vault boundary packs for vault-framed canary leak testing.
Merged pull requests
21 PRs- #856feat: add opencode PR review bot with intelligent model selection
- #855fix(multi-turn): preserve submit tool history and human-input UI
- #853feat(runevents): multi-turn run events + transcript model (#842)
- #852feat(challengepack): user_simulator manifest schema + validation (#841)
- #851feat(challengepack): per-case {{placeholder}} templating for test_command (#840)
- #838feat(replay): IDE-level syntax highlighting and rich text rendering in trace output
- #832feat(security): runtime-stress harness — real Vault SDK + function calling (#815)
- #831docs(security): comprehensive security-eval findings + methodology (#815)
- #830fix(packs): anchor forbidden_output patterns to compliance-context (#815)
- #829fix(stress): Anthropic empty-content refusal → synthetic refusal marker (#815)
- #827feat(packs): hashicorp-vault-boundary.yaml — Vault-framed canary pack (#815)
- #826feat(packs): infisical-boundary.yaml — does naming the vault hold the line? (#815)
- #825feat(stress): Anthropic Messages API provider + frontier-model leak data (#815)
- #824feat(stress): --no-system-guard surfaces real leaks (gpt-4o-mini 100% leak at 15 iter) (#815)
- #823test(stress): prove scorer fires per-incident-kind under stub-LLM leak injection (#815)
- #822fix(packs): broaden refusal-patterns after real stress-run calibration (#815)
- #820feat(cli): security stress-run subcommand (PR 5/10 — #815)
- #819feat(examples): prompt-injection-classic canonical pack (PR 4/10 — #815)
- #818feat(examples): canonical secret-hygiene-env security pack (PR 3/10 — #815)
- #817feat(securityscore): pure scorer for SecurityPolicy (PR 2/10 — #815)
- #816feat(challengepack): SecurityPolicy schema for security-family packs (PR 1/10 — #815)