SAE Registry
Trained SAEs
Every SAE we ship is a TopK SAE trained on the residual stream. Activations are read through the standard HuggingFace output_hidden_states=True flag, so there is no TransformerLens dependency and the SAEs work on hybrid architectures that TransformerLens does not support.
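As a rough illustration of the hook-free access described above, the sketch below reads a residual-stream activation from the hidden_states tuple that HuggingFace models return. The helper names residual_post_layer and demo are hypothetical. Two things are assumptions rather than facts from this registry: that "post-L18" refers to a 0-based block index, and that the base model loads through AutoModelForCausalLM.

```python
def residual_post_layer(model, inputs, layer):
    """Return the residual stream after decoder block `layer`.

    Assumption: `layer` is a 0-based block index. HuggingFace models return
    hidden_states as a tuple whose entry 0 is the embedding output and whose
    entry i + 1 is the output of block i.
    """
    out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer + 1]


def demo(model_id="Qwen/Qwen3.5-4B", layer=18):
    """Hypothetical end-to-end usage; downloads the base model when called."""
    # Imported lazily so the helper above has no hard dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tok("Sparse autoencoders decompose activations.", return_tensors="pt")
    with torch.no_grad():
        h = residual_post_layer(model, inputs, layer)
    return h  # shape: (batch, seq_len, d_model)
```

If the SAEs were trained with a 1-based layer convention instead, the index would be `layer` rather than `layer + 1`; comparing reconstruction quality at both indices is one way to check.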
Qwen/Qwen3.6-27B
Dense reasoning-tuned · 3 layers in parallel (L11/L31/L55)
Paper-grade 3-layer SAE on Qwen3.6-27B · held-out var_exp: L11 0.843 · L31 0.714 · L55 0.816 · AuxK
Layer: Residual post-L11 · L31 · L55
d_sae: 65,536
k (TopK): 128
Expansion: 13×
Training tokens: 200M
var_exp: 0.843 (L11)
G1 Spearman ρ: not reported
huggingface.co/caiovicentino1/qwen36-27b-sae-papergrade
Qwen/Qwen3.5-4B
Hybrid Gated DeltaNet
First TopK residual-stream SAE for hybrid GDN
Layer: Residual post-L18
d_sae: 40,960
k (TopK): 128
Expansion: 16×
Training tokens: 200M
var_exp: 0.866
G1 Spearman ρ: 0.540
huggingface.co/caiovicentino1/Qwen3.5-4B-SAE-L18-topk
Google/Gemma-4-E4B
Ensemble MoE
First public SAE for Gemma-4 ensemble-MoE
Layer: Residual post-L21
d_sae: 32,768
k (TopK): 128
Expansion: 16×
Training tokens: 1B
var_exp: 0.939
G1 Spearman ρ: not reported
huggingface.co/caiovicentino1/Gemma-4-E4B-SAE-L21-topk
Qwen/Qwen3.6-35B-A3B
Triple-hybrid (MoE + GDN + Gated Attention)
First public SAE on a triple-hybrid MoE + GDN + Gated-Attention model. We know of no precedent in the literature.
Layer: Residual post-L23
d_sae: 32,768
k (TopK): 128
Expansion: 16×
Training tokens: 92M (WIP)
var_exp: 0.835
G1 Spearman ρ: 0.522
huggingface.co/caiovicentino1/Qwen3.6-35B-A3B-SAE-L23-topk-wip
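The d_sae, k (TopK), and Expansion fields in the cards above are related by simple arithmetic. The sketch below works through it with the listed values. The helper names are hypothetical. The "implied width" is only a back-calculation from the listed expansion factor, which appears to be rounded, so treat it as an estimate and not as the base model's actual hidden size.

```python
# Values copied from the registry cards above.
registry = {
    "Qwen3.6-27B": {"d_sae": 65_536, "k": 128, "expansion": 13},
    "Qwen3.5-4B": {"d_sae": 40_960, "k": 128, "expansion": 16},
    "Gemma-4-E4B": {"d_sae": 32_768, "k": 128, "expansion": 16},
    "Qwen3.6-35B-A3B": {"d_sae": 32_768, "k": 128, "expansion": 16},
}


def implied_d_model(d_sae, expansion):
    """Residual width implied by d_sae = expansion * d_model (an estimate)."""
    return d_sae / expansion


def active_fraction(k, d_sae):
    """Fraction of dictionary features that can be non-zero per token."""
    return k / d_sae


for name, card in registry.items():
    width = implied_d_model(card["d_sae"], card["expansion"])
    frac = active_fraction(card["k"], card["d_sae"])
    print(f"{name}: implied d_model ~ {width:.0f}, active features <= {frac:.3%}")
```

With k fixed at 128, a larger dictionary means a sparser code: at most 128 of 32,768 features fire per token in the smaller dictionaries, and at most 128 of 65,536 in the largest.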
Quickstart loading
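import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hub (cached locally after the first call).
ckpt = hf_hub_download(
    repo_id="caiovicentino1/Qwen3.5-4B-SAE-L18-topk",
    filename="sae_final.pt",
)

# Fall back to CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
state = torch.load(ckpt, map_location=device, weights_only=True)
W_enc, W_dec = state["W_enc"], state["W_dec"]
b_enc, b_dec = state["b_enc"], state["b_dec"]
k = int(state["k"])

def encode(h):
    """Return sparse TopK features for residual activations h of shape (..., d_model)."""
    # Centre on the decoder bias, then project into feature space.
    pre = (h - b_dec) @ W_enc + b_enc
    # Keep the k largest pre-activations per position and zero the rest.
    topv, topi = torch.topk(pre, k, dim=-1)
    out = torch.zeros_like(pre).scatter_(-1, topi, topv)
    # ReLU clears any negative values that survived the TopK cut.
    return torch.relu(out)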
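The quickstart stops at encoding. The sketch below shows one plausible way to decode features back to the residual stream and compute a variance-explained score. It runs on small random weights so that it does not need the checkpoint. Several things here are assumptions: that W_dec has shape (d_sae, d_model), that reconstruction is z @ W_dec + b_dec, and that var_exp is the usual 1 - SSE/SST ratio. The registry does not state how its var_exp figures were computed, and the function names are hypothetical.

```python
import torch


def topk_encode(h, W_enc, b_enc, b_dec, k):
    """TopK encoder with the same arithmetic as the quickstart's encode()."""
    pre = (h - b_dec) @ W_enc + b_enc
    topv, topi = torch.topk(pre, k, dim=-1)
    return torch.relu(torch.zeros_like(pre).scatter_(-1, topi, topv))


def decode(z, W_dec, b_dec):
    """Assumed decoder: linear map back to the residual stream plus bias."""
    return z @ W_dec + b_dec


def variance_explained(h, h_hat):
    """1 - (squared reconstruction error / total variance around the mean)."""
    resid = (h - h_hat).pow(2).sum()
    total = (h - h.mean(dim=0, keepdim=True)).pow(2).sum()
    return 1.0 - (resid / total).item()


# Toy dimensions and random weights, for shape checking only.
torch.manual_seed(0)
d_model, d_sae, k = 64, 1024, 8
W_enc = torch.randn(d_model, d_sae) / d_model**0.5
W_dec = torch.randn(d_sae, d_model) / d_sae**0.5
b_enc = torch.zeros(d_sae)
b_dec = torch.zeros(d_model)

h = torch.randn(32, d_model)                 # stand-in for residual activations
z = topk_encode(h, W_enc, b_enc, b_dec, k)   # (32, d_sae), at most k non-zeros per row
h_hat = decode(z, W_dec, b_dec)              # (32, d_model)
score = variance_explained(h, h_hat)         # not meaningful for random weights
```

With a real checkpoint, W_enc, W_dec, b_enc, b_dec, and k would come from the loaded state dict, and h from the base model's residual stream at the layer listed on the card. If the stored W_dec turns out to be (d_model, d_sae), transpose it before decoding.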