Monthly feature-hunting challenges with leaderboards and prizes. Kaggle for mechanistic interpretability. 'Find the feature that makes Qwen3.6 hallucinate on medical QA. $5k to the best discovery.'
Season 1 challenges (Q3 2026)
Month 1: Hallucination hunter (Qwen3.6-27B medical)
Month 2: Jailbreak fingerprint (Gemma-2-9B)
Month 3: Reasoning-loop detector (open thinking model)
Month 4: Sycophancy eliminator (cross-model, composable recipe)
Judging
Submissions are scored automatically on causal-validation metrics computed on held-out data: AUROC, robustness across seeds, and semantic coherence as rated by an LLM judge. Ties are broken by public vote and an expert panel.
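As a rough illustration of how such a score could be assembled, here is a minimal Python sketch. It is not the official scoring code. The function names, the use of the standard deviation of per-seed AUROC as the robustness signal, and the 0.6 / 0.2 / 0.2 weights are all assumptions made for this example; the LLM-judge coherence rating is taken as an already-computed number in [0, 1].

```python
from statistics import mean, pstdev


def auroc(scores, labels):
    """AUROC as the chance a random positive outscores a random negative.

    Ties count as half a win. labels are 1 (behaviour present) or 0 (absent).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def score_submission(per_seed_scores, labels, coherence, weights=(0.6, 0.2, 0.2)):
    """Combine the three judging signals into one leaderboard number.

    per_seed_scores: one list of feature activations per seed, all evaluated
                     on the same held-out examples.
    coherence:       LLM-judge rating in [0, 1], computed elsewhere.
    weights:         hypothetical (AUROC, robustness, coherence) weights.
    """
    aurocs = [auroc(scores, labels) for scores in per_seed_scores]
    mean_auroc = mean(aurocs)
    seed_std = pstdev(aurocs)  # low spread across seeds = more robust
    w_auroc, w_robust, w_coherence = weights
    total = (
        w_auroc * mean_auroc
        + w_robust * (1.0 - seed_std)
        + w_coherence * coherence
    )
    return {
        "mean_auroc": mean_auroc,
        "seed_std": seed_std,
        "coherence": coherence,
        "total": total,
    }
```

A call would pass one activation list per seed, for example score_submission([seed_a, seed_b], labels, coherence=0.8). Penalising the spread of AUROC across seeds is only one way to express robustness; a worst-seed AUROC would be a stricter alternative.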
Prizes
Sponsored by participating labs and safety teams. Prize pools start at $5k per challenge and scale with sponsor participation. Winners get public Atlas entries with their names attached, citable on their CVs.