For researchers

Use the tools — and the knowledge.

Everything behind the arc is open and runnable. Replicate a paper in one command, run probe-causality experiments from your own agent on your own GPU, or pull the methodology as agent skills. We never see your model, data, or keys.


one command

Reproduce a paper

Every paper in the arc replicates from a single command on the Colab CLI, with the verdict auto-checked against the published numbers.

$ oilab replicate lever-is-late


openinterp-mcp v0.1.0

Run your own experiments

An MCP server that lets any agent — Claude Code, Cursor, Cline — run probe-causality experiments on your own GPU session. We never see your model, data, or keys.

$ pip install "openinterp-mcp[server]"


9 Claude Code skills

The operational knowledge, as skills

The methodology is packaged as agent skills: how to capture activations, decompose them into named SAE features, steer a direction, and run the four causality checks — including the structure-matched control + naming gate — that separate a real result from a confounded or epiphenomenal one.


self-contained

Notebooks

A ladder of runnable notebooks, from your first SAE in 30 minutes to the full-stack experiments behind the papers — each opens in Colab.


The methodology, packaged

Nine Claude Code skills.

Drop these into any agent and it inherits the operational knowledge, including the four causality checks (with the structure-matched control + naming gate) that separate a real result from a confounded or epiphenomenal one. Eight of the skills each map to a typed openinterp-mcp tool; the ninth, openinterp-lab, maps to oilab in openinterp-lab.

Install all 9 into your terminal

$ curl -fsSL https://openinterp.org/install-skills.sh | sh

Downloads each SKILL.md into ~/.claude/skills. The script writes only markdown and runs no code. For a repo-local ./.claude/skills, append -s -- --project after sh. To inspect the script first, download it with the same curl command without piping it to sh.

colab-attach

Attach to your running Colab / vast.ai openinterp session via its public HTTPS URL — the first step of any run.

tool: colab_attach() openinterp-mcp

colab-status

Health of the active session — loaded model, probes, and captures in memory — before you spend a forward pass.

tool: colab_status() openinterp-mcp

capture-acts

Capture residual-stream activations at chosen layers and token positions during a forward pass.

tool: capture_acts() openinterp-mcp

list-probes

Inventory the probes loaded in the backend — model, layer, position, source — so you know what to evaluate or steer.

tool: list_probes() openinterp-mcp

probe-eval

Apply a loaded linear probe to a stored capture; returns per-sample scores and AUROC when labels are given.

tool: probe_eval() openinterp-mcp
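
As a rough illustration of what evaluating a linear probe involves, here is a minimal NumPy sketch. It is not the openinterp-mcp implementation: the function names `probe_scores` and `auroc`, the toy activations, and the probe weights are all hypothetical. It assumes a probe is a weight vector plus bias applied to each captured activation, and computes AUROC with the standard rank-based (Mann-Whitney) formula.

```python
import numpy as np

def probe_scores(acts, weight, bias=0.0):
    """Score each sample with a linear probe: score = acts @ weight + bias."""
    return acts @ weight + bias

def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U); tied scores share their average rank."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos, n_neg = labels.sum(), (~labels).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    # Assign ranks 1..n in ascending score order.
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    # Replace the ranks of tied scores with their mean rank.
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2
    return float(u / (n_pos * n_neg))

# Toy capture: 4 samples, d_model = 2 (hypothetical data).
acts = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 2.0]])
weight = np.array([1.0, -1.0])
labels = [1, 0, 1, 0]
scores = probe_scores(acts, weight)
print(scores, auroc(scores, labels))
```

An AUROC of 0.5 means the probe ranks positives no better than chance, which is why labels are needed before the number means anything.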

sae-lookup

Decompose a captured activation into its top-K SAE features and read their names — the bridge from a residual vector to human-readable concepts (full-stack SAE on Qwen3.6-27B).

tool: sae_lookup() openinterp-mcp
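
To make the "top-K features" step concrete, here is a small sketch under stated assumptions. It assumes a plain ReLU sparse autoencoder with an encoder matrix, an encoder bias, and a decoder bias subtracted before encoding, which is one common SAE layout; the SAE behind sae_lookup may use a different architecture. The weights, the `sae_top_k` helper, and the feature names are hypothetical toy values.

```python
import numpy as np

def sae_top_k(x, w_enc, b_enc, b_dec, k=3):
    """Encode x with a ReLU SAE and return the k most active (index, activation) pairs.

    x:     (d_model,) residual-stream vector
    w_enc: (d_model, n_features) encoder weights
    Features with zero activation are dropped, so fewer than k may be returned.
    """
    pre = (x - b_dec) @ w_enc + b_enc
    feats = np.maximum(pre, 0.0)          # ReLU
    top = np.argsort(-feats)[:k]          # indices sorted by descending activation
    return [(int(i), float(feats[i])) for i in top if feats[i] > 0]

# Toy SAE: d_model = 2, n_features = 3 (hypothetical weights and names).
w_enc = np.array([[1.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0]])
b_enc = np.zeros(3)
b_dec = np.zeros(2)
feature_names = {0: "toy feature A", 1: "toy feature B", 2: "toy feature C"}

x = np.array([2.0, 1.0])
top_features = sae_top_k(x, w_enc, b_enc, b_dec, k=3)
for idx, act in top_features:
    print(idx, feature_names[idx], act)
```

The third feature has a negative pre-activation, so the ReLU zeroes it and only two features are reported even though k is 3.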

steer

Inject direction×alpha into the residual stream and observe the behavioral effect — causal, not correlational.

tool: steer() openinterp-mcp
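
The arithmetic behind "direction×alpha" can be sketched in a few lines. This is an illustration only, not the steer() tool: the `steer_residual` helper is hypothetical, and normalizing the direction to unit length before scaling is an assumption that the real tool may not share. In a live model the same addition would be applied inside a forward pass at the chosen layer.

```python
import numpy as np

def steer_residual(resid, direction, alpha, positions=None):
    """Return a copy of resid (seq_len, d_model) with alpha * unit(direction) added.

    positions: optional list of token positions to steer; None steers every position.
    """
    unit = direction / np.linalg.norm(direction)
    steered = resid.copy()
    if positions is None:
        steered += alpha * unit
    else:
        steered[positions] += alpha * unit
    return steered

# Toy residual stream: 3 tokens, d_model = 2 (hypothetical values).
resid = np.zeros((3, 2))
direction = np.array([3.0, 4.0])

all_tokens = steer_residual(resid, direction, alpha=5.0)
last_only = steer_residual(resid, direction, alpha=5.0, positions=[2])
print(all_tokens)
print(last_only)
```

With a unit-norm direction, alpha is exactly the amount by which each steered token's projection onto the direction changes, which makes sweeps over alpha comparable across directions.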

causality-protocol

The four mandatory checks — random-feature baseline, control-token norm, structural-rigidity α-sweep, and the structure-matched control + naming gate — that separate a causal probe from a confounded or epiphenomenal one.

tool: causality_protocol() openinterp-mcp
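
One way to read the random-feature baseline is as a null distribution: steer along random directions of the same norm and ask how often they move the behavior as much as the probe direction does. The sketch below follows that reading, which is an assumption and not the causality_protocol() implementation. The `random_direction_baseline` helper and the toy `effect` function are hypothetical; in practice the effect function would run a steered forward pass and measure a behavioral metric.

```python
import numpy as np

def random_direction_baseline(effect_fn, direction, n_random=100, seed=0):
    """Compare the effect of `direction` with norm-matched random directions.

    Returns (real_effect, null_effects, p), where p is the fraction of directions
    (counting the real one) whose absolute effect is at least the real effect.
    """
    rng = np.random.default_rng(seed)
    norm = np.linalg.norm(direction)
    real = effect_fn(direction)
    null = []
    for _ in range(n_random):
        r = rng.standard_normal(direction.shape)
        r *= norm / np.linalg.norm(r)      # match the probe direction's norm
        null.append(effect_fn(r))
    null = np.asarray(null)
    p = (1 + np.sum(np.abs(null) >= abs(real))) / (1 + n_random)
    return real, null, float(p)

# Toy effect: behavior responds only to the component along one axis (hypothetical).
target = np.zeros(64)
target[0] = 1.0

def effect(d):
    return float(d @ target)

real, null, p = random_direction_baseline(effect, 4.0 * target)
print(real, float(np.abs(null).max()), p)
```

If random directions of equal norm produce effects as large as the probe direction, the steering result says more about perturbation size than about the probe, which is the confound this check is meant to expose.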

openinterp-lab

Operate a full mech-interp lab from the terminal — provision Colab GPUs via the Google Colab CLI, run the loops, replicate the papers.

tool: oilab() openinterp-lab

Install: pip install "openinterp-mcp[server]" (v0.1.0) · point your agent's MCP config at it · the skills live in each repo's skills/.

Build on it — and tell us what breaks.

Everything is Apache-2.0 and reproducible. Extend a probe, replicate a result, or disagree with one — the methodology is built to be argued with.
