SAE Registry

Trained SAEs

Every SAE we ship is a TopK sparse autoencoder trained on the residual stream, and its input activations can be read with the standard Hugging Face output_hidden_states=True flag. There is no TransformerLens (TL) dependency, so the SAEs work on hybrid architectures that TL doesn't support.
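As a rough illustration of the output_hidden_states=True route, the sketch below picks residual-stream activations out of a single forward pass. The helper name residual_after_layers is hypothetical, and the indexing rests on an assumption: layer labels such as "post-L18" are read as 0-indexed block numbers. Hugging Face models return the embedding output at hidden_states[0], so block i's output sits at index i + 1. If a checkpoint was trained with a different convention, the offset would need adjusting.

```python
def residual_after_layers(model, layers, **inputs):
    """Return {layer: activations} for the residual stream after each block.

    Hypothetical helper. Assumption: `layers` are 0-indexed block numbers,
    so "post-L18" is read as the output of block 18.
    """
    # One forward pass yields every layer's hidden state.
    out = model(**inputs, output_hidden_states=True)
    # hidden_states[0] is the embedding output; block i's output sits at i + 1.
    return {layer: out.hidden_states[layer + 1] for layer in layers}
```

In practice, model would be something loaded with transformers' AutoModelForCausalLM.from_pretrained, inputs would come from the matching tokenizer called with return_tensors="pt", and the call would normally be wrapped in torch.no_grad(). Passing several layers at once avoids repeated forward passes when an SAE covers more than one layer.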

Qwen/Qwen3.6-27B

Dense reasoning-tuned · 3 layers in parallel (L11/L31/L55)

Paper-grade 3-layer SAE on Qwen3.6-27B · held-out variance explained (var_expl): L11 0.843 · L31 0.714 · L55 0.816 · AuxK

Residual post-L11 · L31 · L55

Training tokens

huggingface.co/caiovicentino1/qwen36-27b-sae-papergrade

Qwen/Qwen3.5-4B

Hybrid Gated DeltaNet

First TopK residual-stream SAE for hybrid GDN

Residual post-L18

Training tokens

huggingface.co/caiovicentino1/Qwen3.5-4B-SAE-L18-topk

Google/Gemma-4-E4B

Ensemble MoE

First public SAE for Gemma-4 ensemble-MoE

Residual post-L21

Training tokens

huggingface.co/caiovicentino1/Gemma-4-E4B-SAE-L21-topk

Qwen/Qwen3.6-35B-A3B

Triple-hybrid (MoE + GDN + Gated Attention)

Training in progress

First public SAE for a triple-hybrid (MoE + GDN + Gated Attention) model, with no precedent in the literature that we know of.

Residual post-L23

Training tokens

huggingface.co/caiovicentino1/Qwen3.6-35B-A3B-SAE-L23-topk-wip

Quickstart loading

import torch
from huggingface_hub import hf_hub_download

ckpt = hf_hub_download(
    repo_id="caiovicentino1/Qwen3.5-4B-SAE-L18-topk",
    filename="sae_final.pt",
)
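# Load onto the GPU when one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
state = torch.load(ckpt, map_location=device, weights_only=True)
W_enc, W_dec = state["W_enc"], state["W_dec"]
b_enc, b_dec = state["b_enc"], state["b_dec"]
k = int(state["k"])  # number of latents kept active per token

def encode(h):
    # h: residual-stream activations; the last dimension is the model width.
    pre = (h - b_dec) @ W_enc + b_enc
    # Keep the k largest pre-activations per token and zero the rest.
    topv, topi = torch.topk(pre, k, dim=-1)
    out = torch.zeros_like(pre).scatter_(-1, topi, topv)
    # ReLU clamps any selected value that is still negative.
    return torch.relu(out)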
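The quickstart stops at encoding. A minimal sketch of the other half (decoding, plus a variance-explained check) might look like the following. Two things here are assumptions rather than facts from the checkpoints: W_dec is taken to have shape (n_latents, d_model), and variance explained is computed as one minus the squared reconstruction error over the total variance around the mean activation. The var_expl figures reported above may use a different definition. The toy identity weights are only there to make the example self-contained.

```python
import torch

def topk_encode(h, W_enc, b_enc, b_dec, k):
    # Same encoder as the quickstart, with the weights passed in explicitly.
    pre = (h - b_dec) @ W_enc + b_enc
    topv, topi = torch.topk(pre, k, dim=-1)
    return torch.relu(torch.zeros_like(pre).scatter_(-1, topi, topv))

def decode(z, W_dec, b_dec):
    # Assumption: W_dec has shape (n_latents, d_model).
    return z @ W_dec + b_dec

def variance_explained(h, recon):
    # Assumed definition: 1 - residual sum of squares / total sum of squares.
    h2 = h.reshape(-1, h.shape[-1])
    r2 = recon.reshape(-1, recon.shape[-1])
    resid = (h2 - r2).pow(2).sum()
    total = (h2 - h2.mean(dim=0, keepdim=True)).pow(2).sum()
    return 1.0 - (resid / total).item()

# Toy check with identity weights (illustrative values, not real SAE weights).
d = 4
W_enc, W_dec = torch.eye(d), torch.eye(d)
b_enc, b_dec = torch.zeros(d), torch.zeros(d)
h = torch.tensor([[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]])

full = decode(topk_encode(h, W_enc, b_enc, b_dec, k=4), W_dec, b_dec)
sparse = decode(topk_encode(h, W_enc, b_enc, b_dec, k=2), W_dec, b_dec)
```

With all four latents kept, the toy reconstruction matches the input exactly. With k=2, only the two largest entries per row survive, which shows how the sparsity constraint trades reconstruction quality for fewer active latents.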