Finding
Qwen/Qwen3.6-27B-Instruct
2026-05-11 · by caiovicentino
Probe-detected grokking in multi-probe DPO (Qwen3.6-27B nb37 v2)
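Fresh-probe AUROC shows a phase transition (ratio 2.596) across 11 nb37 v2 checkpoints, moving from 0.472 before the transition to 0.528 after it. The original FG/RG probes show zero effect, which suggests that DPO learning is orthogonal to the task-probe axes. The overall pattern is construct-then-compress.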
Numbers
phase_transition_ratio: 2.596
fresh_probe_auroc_pre: 0.472
fresh_probe_auroc_post: 0.528
original_probe_effect: 0
checkpoints: 11
Artifacts
nb37_v2_checkpoints
nb41_v2_grokking_extended
These files live in the linked HF dataset.
Cite
The content-only SHA-256 digest is listed below. To verify it, re-hash the JSON manifest with manifest_sha256 set to null and keys sorted (sort_keys=True); the result should match the listed digest. Zenodo DOI pending.
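The verification step described above can be sketched in Python as follows. This is a minimal sketch, not the registry's own tooling: the helper names `content_digest`, `verify`, and `fetch_manifest` are hypothetical, and it assumes that `manifest_sha256` is a top-level key, that the manifest is serialized with the default `json.dumps` separators and `ensure_ascii` settings, and that the bytes hashed are UTF-8. If the registry canonicalizes differently, the digest will not match.

```python
import hashlib
import json
from urllib.request import urlopen


def content_digest(manifest: dict) -> str:
    """SHA-256 hex digest of the manifest with its own hash field nulled."""
    # Shallow copy so the caller's dict is left untouched.
    stripped = dict(manifest)
    # ASSUMPTION: manifest_sha256 is a top-level key; None serializes as JSON null.
    stripped["manifest_sha256"] = None
    # ASSUMPTION: default separators and ensure_ascii, keys sorted, UTF-8 bytes.
    canonical = json.dumps(stripped, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(manifest: dict) -> bool:
    """True if the recomputed digest equals the one stored in the manifest."""
    return content_digest(manifest) == manifest.get("manifest_sha256")


def fetch_manifest(url: str) -> dict:
    """Download and parse a JSON manifest (hypothetical helper, needs network)."""
    with urlopen(url) as resp:
        return json.load(resp)
```

Under those assumptions, calling `verify(fetch_manifest(url))` with the raw manifest URL listed in this section should return `True`. Nulling the field before hashing is what makes the digest "content-only": the stored hash does not feed into its own computation.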
manifest_sha256: 7019cff91255b679077964591a24794705ec2b20bb58374d2f265af010ca886c
Atlas URL: https://openinterp.org/atlas/7019cff912
Raw manifest: https://raw.githubusercontent.com/OpenInterpretability/registry/main/atlas/2026/7019cff912.json
Reproduce this in your agent
In an agent session attached to your Colab via openinterp-mcp:
from openinterp_mcp.atlas import load_entry
from openinterp_mcp.judge import reproduce

# Load this atlas entry by its short ID (the first 10 hex characters of manifest_sha256).
entry = load_entry("7019cff912")
print(entry.methodology_check)

# Re-run the causality protocol against the linked HF artifact:
reproduce(entry, hf_repo_id="caiovicentino1/openinterp-37v2-multiprobe-dpo-extended")