12-MONTH ROADMAP

Built in public, quarter by quarter.

Each milestone ships something usable. No vaporware, no roadmap promises without a prototype. Every quarter ends with a demo that someone uses.

NOW

Q2 2026

MCP infrastructure + probes + paper-1 in review

QUARTER 1 / 4

→openinterp-mcp v0.1.0 live · 8 typed tools · bring-your-own-agent · privacy-first
→FabricationGuard v0.2.0 live · 0.88 AUROC cross-task hallucination · ~1ms
→agent-probe-guard v0.1 · capability + thinking detection · skip-21% @ 86% accuracy
→ProbeBench v0.0.1 · 5 reference probes · 7-axis ProbeScore · anti-Goodhart norms
→ICML MI Workshop paper-1 submitted · "Hallucination-Induction, Not Calibration"

NEXT

Q3 2026

More probes + integrations

QUARTER 2 / 4

→DeceptionGuard v0.1 · Apollo methodology applied to Qwen3.6
→CoTGuard v2 · causal-mediation methodology (Lanham 2023 truncation)
→BehaviorGuard · CoT-vs-action consistency for agentic systems
→vLLM plugin · inference-time probe scoring (~5ms overhead)
→NeurIPS MI Workshop submission · paper-2 (grokking) + paper-3 (multi-probe GRPO)

COVERAGE

Q4 2026

Cross-substrate + cross-model

QUARTER 3 / 4

→ProbeBench v0.1 · register Qwen-Scope, Gemma Scope SAEs as upstream substrates
→Cross-model probe transfer · FabricationGuard methodology on Llama / DeepSeek
→Probe registry API · external probes can register against any substrate
→Auto-interp pipeline · Claude/GPT-4 feature labels on Qwen-Scope features
→ICLR 2027 paper submission

DEPLOY

Q1 2027

Regulated industries + revenue

QUARTER 4 / 4

→Medical adapter · EU AI Act Article 14 + FDA SaMD compliance probe pack
→Financial adapter · audit trail + immutable feature activation logs
→Cursor / Cline / agent integrations · BehaviorGuard for tool-use
→Pro tier $99/mo · auto-tuning, custom benches, SLA
→First revenue in · OSS tier permanently funded

Help shape the next milestone.

Roadmap priorities shift based on what the community uses and what Watchtower partners need. Open a GitHub issue, comment on the manifesto, or email us — loud feedback moves items forward.
