Finding · Qwen/Qwen3.6-27B-Instruct · 2026-05-11 · by caiovicentino

Saturation-direction principle — 5 empirical classes of probe causality (Qwen3.6-27B)

This finding groups 8 probes into 5 causality classes. The saturation-direction principle states that a probe exerts causal leverage in the direction of the baseline residual saturation. In Phase 11e, the L55 probe reversed from pushdown to pushup when the saturation flipped, which strongly supports the principle.

Numbers

probes_mapped: 8
classes_identified: 5
cross_distribution_sites_validated: 4
phase_11e_l55_reversal_confirmed: true

Methodology check

Output of the causality_protocol primitive when it was run on this artifact. See paper-6 for the 3-baseline methodology and the 5-class verdict spec.

baselines_run: random_direction_random_acts, random_direction_real_acts, shuffled_labels
control_token_normalization: ✓ yes
structural_rigidity_sweep: ✓ yes

Artifacts

phase11d_codeforces.json
phase11e_multisite.json
phase12_persona_falsifier.json

Cite

The sha256 below covers the manifest content only. To verify it, re-hash the JSON manifest with manifest_sha256 set to null and keys sorted (sort_keys=True); the result should match the digest below. A Zenodo DOI is pending.

manifest_sha256: 03a6e70bfd06a5336ef881c9942a1e19fe000acb9cb6c44c57c5cc07671797d0
Atlas URL: https://openinterp.org/atlas/03a6e70bfd
Raw manifest: https://raw.githubusercontent.com/OpenInterpretability/registry/main/atlas/2026/03a6e70bfd.json
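
The verification described under Cite can be sketched in a few lines of Python. This is a minimal illustration, not the registry's own tooling: the helper names content_sha256 and verify_manifest are hypothetical, and the serialization details beyond "manifest_sha256 set to null, sort_keys=True" (default json.dumps separators, default ensure_ascii, UTF-8 encoding) are assumptions. If the digest does not match on the real manifest, those assumed defaults are the first thing to check. The toy manifest below is made up for the demonstration and is not the actual entry.

```python
import hashlib
import json


def content_sha256(manifest: dict) -> str:
    """Hash a manifest with its own digest field nulled out.

    Assumption: json.dumps defaults (separators, ensure_ascii) and UTF-8
    encoding; the entry itself only specifies null + sort_keys=True.
    """
    stripped = dict(manifest)            # shallow copy, leave the input untouched
    stripped["manifest_sha256"] = None   # serialized as JSON null
    canonical = json.dumps(stripped, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_manifest(manifest: dict) -> bool:
    """Return True if the stored digest matches the recomputed one."""
    return content_sha256(manifest) == manifest.get("manifest_sha256")


# Toy manifest (hypothetical fields) to show the round trip.
toy = {"title": "example entry", "numbers": {"probes_mapped": 8}}
toy["manifest_sha256"] = content_sha256(toy)

print(verify_manifest(toy))        # digest was just computed from this content
toy["numbers"]["probes_mapped"] = 9
print(verify_manifest(toy))        # content changed after the digest was stored
```

To check the real entry, the same verify_manifest call could be applied to the parsed JSON fetched from the Raw manifest URL.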

Reproduce this in your agent

In an agent session attached to your Colab via openinterp-mcp:

from openinterp_mcp.atlas import load_entry

entry = load_entry("03a6e70bfd")
print(entry.methodology_check)

# Re-run the causality protocol against the linked HF artifact:
# (no HF artifact attached — replicate from methodology alone)