Feature packs

Catalog

Each pack is a set of helpful and harmful SAE features discovered via contrastive correctness analysis. To be listed as Validated, a pack must pass Stage Gate 1 (Spearman ρ ≥ 0.30 on held-out data); packs that have not yet run the gate are marked Pending G1.
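As a rough illustration of what a contrastive correctness analysis can look like, the sketch below computes Cohen's d per feature between activations on correct and incorrect responses, then keeps the largest positive and negative effects. This is not the catalog's actual pipeline: the pooled-standard-deviation form of Cohen's d, the function names `cohens_d` and `split_pack`, and the input layout (a dict mapping a feature id to a list of per-response activations) are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(correct_acts, incorrect_acts):
    """Pooled-SD Cohen's d. Positive means the feature is more active on correct responses."""
    n1, n2 = len(correct_acts), len(incorrect_acts)
    pooled_var = (
        (n1 - 1) * variance(correct_acts) + (n2 - 1) * variance(incorrect_acts)
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        # No spread in either group: treat the effect as undefined and report 0.0.
        return 0.0
    return (mean(correct_acts) - mean(incorrect_acts)) / sqrt(pooled_var)


def split_pack(correct, incorrect, k=10):
    """Return (helpful_ids, harmful_ids, d_by_feature).

    `correct` and `incorrect` are hypothetical inputs: dicts mapping a
    feature id to its activations on correct / incorrect responses.
    """
    d = {f: cohens_d(correct[f], incorrect[f]) for f in correct}
    ordered = sorted(d, key=d.get, reverse=True)
    helpful = [f for f in ordered[:k] if d[f] > 0]        # largest positive d
    harmful = [f for f in ordered[::-1][:k] if d[f] < 0]  # most negative d
    return helpful, harmful, d
```

With `k=10` this yields up to 10 helpful and 10 harmful features, matching the pack sizes listed below. Under this sign convention, helpful features have positive d and harmful features have negative d.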

qwen3.5-4b/reasoning_pack

Qwen/Qwen3.5-4B · GSM8K (math reasoning)

Validated
Spearman ρ: 0.540
Pearson r: 0.726
n (held-out): 100
Features: 10 helpful + 10 harmful
Cohen's d range: helpful [+2.06, +2.16] / harmful [−2.47, −2.06]
Discovered on: 50 GSM8K responses (raw Q/A)

qwen3.6-35b-a3b/reasoning_pack

Qwen/Qwen3.6-35B-A3B · SuperGPQA (science/engineering)

Validated
Spearman ρ: 0.522
Pearson r: 0.537
n (held-out): 100
Features: 10 helpful + 10 harmful
Cohen's d range: helpful [+1.72, +1.84] / harmful [−1.73, −1.36]
Discovered on: 50 SuperGPQA responses (thinking mode)

gemma-4-e4b/reasoning_pack

Google/Gemma-4-E4B · GSM8K (pending)

Pending G1
Spearman ρ: pending
Pearson r: pending
n (held-out): pending
Features: pending
Cohen's d range: pending contrastive discovery
Discovered on: pending

Contribute a pack

If you train an SAE on a new model and architecture and run Stage Gate 1 on a labeled benchmark, open a PR against the catalogs/ directory. Packs that meet the ρ ≥ 0.30 threshold on an independent held-out set will be merged and appear here. See the pack template for required fields.
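Before opening a PR, a contributor may want to check the threshold locally. The sketch below is one way to do that with only the standard library. It assumes the gate correlates a per-response score derived from the pack with a correctness label on the held-out set; the page does not spell out the exact quantities, so treat `scores`, `labels`, and the helper names as hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def passes_gate(scores, labels, threshold=0.30):
    """True if rho on the held-out set meets the Stage Gate 1 threshold."""
    return spearman(scores, labels) >= threshold
```

For example, `passes_gate(held_out_scores, held_out_labels)` would be called with data that was not used during feature discovery, since the threshold applies to an independent held-out set.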