Latency target
p95 < 25ms added to your LLM call. We hit this with pre-loaded SAE weights on dedicated GPUs, batched forward passes, and INT4-quantized SAE inference (lossless on our benchmarks). Your user-facing latency stays within SLA.
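For intuition, here is a minimal sketch of a batched, quantized SAE encoder pass. It is illustrative only: the dimensions and names are made up, and the dequantize-then-matmul structure stands in for real packed-INT4 kernels.

```python
# Minimal sketch of a batched SAE encoder pass with INT4-range weights.
# All dimensions and names are illustrative, not a production config.
import torch

D_MODEL, N_FEATURES, TOP_K = 1024, 16384, 32

# Per-feature symmetric quantization: int4-range values in int8 containers.
w_fp = torch.randn(N_FEATURES, D_MODEL) * 0.02      # stand-in encoder weights
scale = w_fp.abs().amax(dim=1, keepdim=True) / 7.0  # map max |w| to int4 max
w_q = torch.clamp((w_fp / scale).round(), -8, 7).to(torch.int8)

def encode(resid: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Batched forward pass: dequantize once, one matmul for the whole batch."""
    w = w_q.to(resid.dtype) * scale                 # dequantize on the fly
    acts = torch.relu(resid @ w.T)                  # (batch, N_FEATURES)
    return torch.topk(acts, TOP_K, dim=-1)          # activation values, feature IDs

vals, ids = encode(torch.randn(8, D_MODEL))         # 8 tokens in one batch
```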
Low-latency streaming API. Your production LLM calls pass through unchanged; per-token feature activations are emitted over a WebSocket. ClickHouse-backed dashboards. Pricing from $2 per 1M tokens monitored. This is the enterprise tier that funds the OSS platform.
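A hypothetical consumer of that stream, sketched with the websockets library. The endpoint URL, run ID, and message schema are assumptions for illustration, not the real API contract; auth is omitted.

```python
# Sketch of a per-token feature-activation consumer (pip install websockets).
# URL and message schema are hypothetical; see the API docs for the real contract.
import asyncio
import json

import websockets

async def monitor(run_id: str) -> None:
    url = f"wss://api.example.com/v1/streams/{run_id}"  # hypothetical endpoint
    async with websockets.connect(url) as ws:
        async for raw in ws:                  # one message per generated token
            msg = json.loads(raw)
            for feat in msg["features"]:      # top-K features for this token
                print(msg["token"], feat["id"], feat["value"], feat.get("tag"))

asyncio.run(monitor("run-123"))
```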
Cloud (we host, you pipe) · VPC-peered (low latency inside AWS/GCP) · On-prem Docker (regulated industries). Watchlist updates are pushed via MQTT, so you never need to redeploy.
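A sketch of the receiving side of those pushes, assuming a JSON payload of feature IDs on an MQTT topic (paho-mqtt >= 2.0). The broker address, topic name, and payload schema are illustrative.

```python
# Sketch: subscribe to watchlist updates over MQTT (pip install paho-mqtt).
# Broker, topic, and payload schema are hypothetical.
import json

import paho.mqtt.client as mqtt

WATCHLIST: set[int] = set()  # feature IDs currently being monitored

def on_message(client, userdata, msg):
    # Swap the in-memory watchlist on each push; no redeploy needed.
    update = json.loads(msg.payload)
    WATCHLIST.clear()
    WATCHLIST.update(update["feature_ids"])
    print(f"watchlist now has {len(WATCHLIST)} features")

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("broker.example.com", 1883)  # hypothetical broker
client.subscribe("org/watchlist/updates")   # hypothetical topic
client.loop_forever()
```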
Per-token: top-K feature activations with IDs, values, and semantic tags from Atlas. Aggregated: hourly feature histograms, anomaly scores, and watchlist trigger counts. Everything is queryable via SQL on ClickHouse.
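For example, a hypothetical aggregation run over ClickHouse's HTTP interface; the table and column names are assumptions, not the real schema.

```python
# Sketch: hourly watchlist trigger counts via ClickHouse's HTTP interface,
# which accepts the query as the POST body on port 8123. Schema is hypothetical.
import requests

SQL = """
SELECT toStartOfHour(ts) AS hour,
       feature_id,
       count() AS triggers
FROM feature_activations
WHERE feature_id IN (SELECT feature_id FROM watchlist)
GROUP BY hour, feature_id
ORDER BY hour DESC, triggers DESC
LIMIT 20
FORMAT TSVWithNames
"""

resp = requests.post("http://clickhouse.internal:8123/", data=SQL)
print(resp.text)
```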
We prioritize researchers, educators, and safety teams who will use it publicly. Tell us what you want to build; we'll reach out when the beta opens.