Landing page for CLAUDE_MEASURE.md §3 metrics M2–M5.
Definitions stay in the repo root file; this page lists how to run rollups from
DATA_DIR trace / agent-project data. M1 (tool diversity) is covered in
docs/hermes/post_cutover_week1_2026-04-23.md and the scripts below.
GET /trace/timeline JSON
trace_events.jsonl / trace_archive/
Composite from homepage diffs (novel_words_ratio, section drift, link density). Planned instrument: backend/app/scripts/measure_homepage_quality.py (per CLAUDE_MEASURE; add when T3 ships). Data: agent_project_revisions/ under DATA_DIR.
Correlation of judge_self vs. external score. Instrument: judge_self_calibration.jsonl (T4.12); compute a weekly Pearson correlation, reported only when N ≥ 10 pairs.
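A minimal sketch of the weekly Pearson rollup described above, using only the standard library. The record field names (`ts`, `judge_self`, `external`) and the ISO-week bucketing are assumptions for illustration; the actual schema of judge_self_calibration.jsonl is defined elsewhere and may differ.

```python
import json
import math
from collections import defaultdict
from datetime import datetime

MIN_PAIRS = 10  # threshold from the text: report only when N >= 10 pairs


def pearson(xs, ys):
    """Pearson r for two equal-length sequences; None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def weekly_pearson(lines):
    """Bucket JSONL records by (ISO year, ISO week) and compute r per bucket."""
    buckets = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        # "ts", "judge_self", "external" are hypothetical field names.
        iso = datetime.fromisoformat(rec["ts"]).isocalendar()
        buckets[(iso[0], iso[1])].append((rec["judge_self"], rec["external"]))
    result = {}
    for week, pairs in sorted(buckets.items()):
        if len(pairs) < MIN_PAIRS:
            result[week] = None  # too few pairs this week
            continue
        xs, ys = zip(*pairs)
        result[week] = pearson(xs, ys)
    return result
```

Weeks with fewer than ten pairs map to `None` rather than being dropped, so gaps stay visible in the rollup.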
Join agent_budget.jsonl with trace_events.jsonl by agent/timestamp; compare tool-cost distribution in low- vs high-budget windows.
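One way to sketch this join is an as-of lookup: each trace event is matched to the most recent budget snapshot for the same agent, then bucketed as low- or high-budget. The field names (`agent`, `ts`, `remaining`, `cost`) and the `low_threshold` parameter are hypothetical, and timestamps are assumed to be mutually comparable (for example epoch seconds, or ISO strings in one format).

```python
import bisect
from collections import defaultdict


def split_costs_by_budget(budget_rows, event_rows, low_threshold):
    """Return {"low": [...], "high": [...]} tool costs by budget window."""
    # Collect budget snapshots per agent, then sort them by timestamp.
    snaps = defaultdict(list)
    for row in budget_rows:
        snaps[row["agent"]].append((row["ts"], row["remaining"]))
    times, remaining = {}, {}
    for agent, rows in snaps.items():
        rows.sort(key=lambda pair: pair[0])
        times[agent] = [t for t, _ in rows]
        remaining[agent] = [r for _, r in rows]

    costs = {"low": [], "high": []}
    for ev in event_rows:
        agent_times = times.get(ev["agent"])
        if not agent_times:
            continue  # no budget data for this agent
        # Index of the latest snapshot taken at or before the event.
        idx = bisect.bisect_right(agent_times, ev["ts"]) - 1
        if idx < 0:
            continue  # event precedes the first snapshot
        window = "low" if remaining[ev["agent"]][idx] < low_threshold else "high"
        costs[window].append(ev["cost"])
    return costs
```

The two resulting lists can then be compared with whatever summary fits the question, for example `statistics.median` or quantiles per window.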
Planned script: backend/app/scripts/measure_cost_per_approval.py (per CLAUDE_MEASURE). Roll up weekly medians; attribute LLM/GPU/ai$ costs across job lifetimes.
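Until the planned script exists, the weekly-median rollup might look roughly like the sketch below. It assumes one record per job with a hypothetical `approved_at` ISO timestamp and a hypothetical `costs` mapping of cost source to the amount accrued over the job's lifetime; neither name comes from CLAUDE_MEASURE.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def weekly_median_cost(jobs):
    """Map (ISO year, ISO week) -> median total cost of jobs approved that week."""
    buckets = defaultdict(list)
    for job in jobs:
        if not job.get("approved_at"):
            continue  # unapproved jobs are skipped in this sketch
        iso = datetime.fromisoformat(job["approved_at"]).isocalendar()
        # Sum every cost source accrued over the job's lifetime.
        total = sum(job.get("costs", {}).values())
        buckets[(iso[0], iso[1])].append(total)
    return {week: median(vals) for week, vals in sorted(buckets.items())}
```

Skipping unapproved jobs is a simplification: it ignores spend on work that never reached approval, which a real cost-per-approval metric may need to fold in.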
Live numbers are not rendered here yet; run the scripts against the prod DATA_DIR on the host, or pull traces locally.