12-MONTH ROADMAP

Built in public, quarter by quarter.

Each milestone ships something usable. No vaporware, no roadmap promises without a prototype. Every quarter ends with a demo that someone uses.

NOW
Q2 2026

MCP infrastructure + probes + paper-1 in review

QUARTER 1 / 4
  • openinterp-mcp v0.1.0 live · 8 typed tools · bring-your-own-agent · privacy-first
  • FabricationGuard v0.2.0 live · 0.88 AUROC cross-task hallucination · ~1ms
  • agent-probe-guard v0.1 · capability + thinking detection · skip-21% @ 86% accuracy
  • ProbeBench v0.0.1 · 5 reference probes · 7-axis ProbeScore · anti-Goodhart norms
  • ICML MI Workshop paper-1 submitted · "Hallucination-Induction, Not Calibration"
NEXT
Q3 2026

More probes + integrations

QUARTER 2 / 4
  • DeceptionGuard v0.1 · Apollo methodology applied to Qwen3.6
  • CoTGuard v2 · causal-mediation methodology (Lanham 2023 truncation)
  • BehaviorGuard · CoT-vs-action consistency for agentic systems
  • vLLM plugin · inference-time probe scoring (~5ms overhead)
  • NeurIPS MI Workshop submission · paper-2 (grokking) + paper-3 (multi-probe GRPO)
COVERAGE
Q4 2026

Cross-substrate + cross-model

QUARTER 3 / 4
  • ProbeBench v0.1 · register Qwen-Scope, Gemma Scope SAEs as upstream substrates
  • Cross-model probe transfer · FabricationGuard methodology on Llama / DeepSeek
  • Probe registry API · external probes can register against any substrate
  • Auto-interp pipeline · Claude/GPT-4 feature labels on Qwen-Scope features
  • ICLR 2027 paper submission
DEPLOY
Q1 2027

Regulated industries + revenue

QUARTER 4 / 4
  • Medical adapter · EU AI Act Article 14 + FDA SaMD compliance probe pack
  • Financial adapter · audit trail + immutable feature activation logs
  • Cursor / Cline / agent integrations · BehaviorGuard for tool-use
  • Pro tier $99/mo · auto-tuning, custom benches, SLA
  • First revenue booked · OSS tier permanently funded

Help shape the next milestone.

Roadmap priorities shift based on what the community uses and what Watchtower partners need. Open a GitHub issue, comment on the manifesto, or email us — loud feedback moves items forward.