Race context, CLI distribution, and a redesigned public site
Live race standings injected into running agents, the CLI shipped through npm with production defaults, and the marketing site got a full redesign — /why, pricing, docs foundation, and arena-style run views.
- Race context
- CLI & npm
- Public site redesign
- Docs foundation
What shipped
Added
- Live race context — standings injection at step boundaries, newswire formatting, and token split between agents vs race context.
- CLI `--race-context` flags and UI toggle for race standings during live runs.
- Workflow-first eval commands in the CLI.
- Dedicated /why manifesto page and pricing section with tier cards and trial CTA.
- Released CLI binaries now default to https://api.agentclash.dev.
Improved
- AgentClash-branded auth with liquid-glass login, starfield hero, and 3D clash mark on landing.
- Workspace navigation made instant with client-side prefetching.
- Run detail and replay pages restyled with instrument-panel aesthetics.
- Playground comparison workspace refactored for clearer side-by-side evals.
Merged pull requests
97 PRs- #579fix(billing): stringify Dodo metadata values
- #578[codex] fix billing Dodo error bodies
- #576fix(billing): hardcode Dodo checkout quantity to 1
- #574fix(ci): preserve cwd for action source fallback
- #573fix(ci): harden action CLI resolution
- #572Finish AgentClash CI setup profiles and conflict handling
- #571Create CI setup pull requests from the workspace UI
- #570fix: unblock Claude harness pull requests
- #569Build CI setup UI generator
- #567fix: use reachable app URL in AgentClash PR comments
- #566feat: link AgentClash PR comments to app views
- #564feat: add AgentClash CI PR comments
- #563fix: accept CI default branch metadata
- #561feat(web): surface CI curation metadata
- #559feat(cli): preserve ci curation links
- #557feat(cli): add ci failure taxonomy metadata
- #555fix(releasegate): distinguish invalid scorecards
- #553feat(scoring): surface evaluator validity
- #552fix(cli): render json errors
- #551fix(cli): stream json follow run creation
- #550fix(challenge-packs): allow reused evaluation spec names
- #549feat(regressions): surface remediation hints
- #548fix(regressions): batch suite case counts
- #547feat(regressions): add proposed case queue
- #546feat(ci): auto-detect GitHub labels
- #545feat(failures): classify dependency resolution regressions
- #544feat(regression): expose proposed validation runs in web
- #543feat(regression): validate proposed cases explicitly
- #542feat(regression): capture production failures
- #541feat(ci): surface failure taxonomy
- #540docs: expand CI release gate skill
- #539feat(regression): surface maintenance status
- #538docs: add regression flywheel skill
- #537feat(ci): surface failure cluster trends
- #536docs: add scorecard reader skill
- #535docs: add eval runner skill
- #534feat(regression): surface validation signal
- #533docs: add challenge pack validation publish skill
- #532docs: expand challenge pack llm judges skill
- #531feat(regression): surface failure provenance
- #530docs: expand challenge pack scoring validators skill
- #529feat(ci): filter failures by cluster key
- #528docs: expand challenge pack artifacts skill
- #527feat(web): surface failure cluster rollups
- #526Skills: challenge pack tools and sandbox
- #524Add Claude Agent Harness runner
- #522Skills: challenge pack input sets
- #518docs(ci): refresh CI release gate skill
- #517feat(ci): add reusable GitHub Actions gate
- #516feat(ci): propose regression candidates from failing gates
- #515feat(ci): publish gate summaries and artifacts
- #514feat(ci): attach GitHub metadata to CI runs
- #513feat(ci): run manifest gates from the CLI
- #512feat: complete Dodo billing integration
- #511feat(ci): validate manifest resource IDs against API
- #510feat(ci): define baseline resolution flow
- #509docs(ci): add real agent workload recipes
- #507feat(ci): add manifest should-run precheck
- #495fix: materialize artifact-backed pack assets in native sandbox
- #494fix(challengepack): reject mixed-challenge input sets
- #493fix(scoring): support run tool call count collector
- #492fix(api): include challenge pack slug in list response
- #486feat(cli): add --repetitions flag to eval start
- #485Skills: challenge pack YAML authoring
- #484Docs: expand challenge pack planner skill
- #483Internal: run blind AgentClash harnesses for skill changes
- #482feat(cli): add AgentClash CI manifest contract
- #481Skills: agent deployment setup
- #480Skills: agent build author
- #479feat(docs): expand runtime resources skill
- #478feat(docs): expand CLI setup skill
- #477[codex] Let GitHub harnesses open draft PRs
- #476feat: add GitHub App repo picker foundation
- #474Fix Agent Harness diff capture and validator workdirs
- #473feat(e2b): add Codex fullstack template build
- #472fix(agent-harnesses): allow long E2B process streams
- #471fix(agent-harnesses): prevent Codex execution activity timeout
- #470fix: harden Codex Agent Harness execution on E2B
- #469fix: simplify Agent Harness creation
- #464[codex] Complete agent harness executions
- #461feat: add Agent Harnesses for Codex on E2B
- #460Docs: challenge pack depth, typography, and non-Mermaid diagrams
- #459feat(docs): add skill catalog contract
- #439[codex] Add frontend billing UX
- #438[codex] Add copyable AgentClash skills catalog
- #435feat(cli): expose regression and artifact surfaces
- #434[codex] Implement billing entitlements and Dodo gates
- #431Replace failures-evals lens with particle flywheel animation
- #429[codex] Fix same-run agent scorecard comparison
- #428[codex] Fix CLI auth identity label source
- #422[codex] Refactor playground comparison workspace
- #420[codex] Restore semantic lucide icons
- #419[codex] Replace web icons with Nourico vectors
- #418feat(web): clash streak login shader with cursor + gyro reactivity
- #417[codex] Brand login as AgentClash while preserving WorkOS AuthKit
- #415fix(web): redirect /workspaces/{id} in middleware to avoid React #310
- #414[codex] Add workflow-first CLI eval commands