SAE Registry

Trained SAEs

Every SAE we ship is a TopK sparse autoencoder trained on the residual stream. Activations are read through the standard HuggingFace output_hidden_states=True flag, so there is no TransformerLens dependency and the SAEs work on hybrid architectures that TransformerLens does not support.
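
As a rough illustration of the hook-free access described above, the sketch below pulls one layer's hidden state out of a HuggingFace-style forward pass. The helper name residual_at_layer is hypothetical, and the mapping from an SAE's layer number (for example the L18 in a repo name) to a hidden_states index is an assumption that should be checked against the individual SAE's card.

```python
import torch

def residual_at_layer(model, inputs, layer):
    """Return hidden_states[layer] from a HuggingFace-style model call.

    Hypothetical helper. Assumes the SAE's layer number is used directly
    as the hidden_states index, which the source does not confirm.
    """
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # hidden_states is a tuple: index 0 is the embedding output,
    # index i is the output of the i-th transformer block.
    return out.hidden_states[layer]
```

With a real model, inputs would be the dict returned by the tokenizer (called with return_tensors="pt"), and the returned tensor has shape (batch, sequence, d_model).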

Quickstart loading

import torch
from huggingface_hub import hf_hub_download

ckpt = hf_hub_download(
    repo_id="caiovicentino1/Qwen3.5-4B-SAE-L18-topk",
    filename="sae_final.pt",
)
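# Load onto the GPU when one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
state = torch.load(ckpt, map_location=device, weights_only=True)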
W_enc, W_dec = state["W_enc"], state["W_dec"]
b_enc, b_dec = state["b_enc"], state["b_dec"]
k = int(state["k"])

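def encode(h):
    """Map residual activations h (..., d_model) to sparse TopK features."""
    # Center on the decoder bias, then project into feature space.
    pre = (h - b_dec) @ W_enc + b_enc
    # Keep only the k largest pre-activations per position.
    topv, topi = torch.topk(pre, k, dim=-1)
    out = torch.zeros_like(pre).scatter_(-1, topi, topv)
    # Zero any selected values that are negative.
    return torch.relu(out)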
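
To see what encode produces without downloading a checkpoint, here is a minimal sketch on random stand-in weights. The toy sizes, the decode function, and the (n_features, d_model) orientation of W_dec are assumptions for illustration; the released checkpoints may store the decoder differently.

```python
import torch

torch.manual_seed(0)
d_model, n_features, k = 16, 64, 4  # toy sizes, not those of a released SAE

# Random stand-ins for the tensors loaded from the checkpoint.
W_enc = torch.randn(d_model, n_features) * 0.1
W_dec = torch.randn(n_features, d_model) * 0.1  # assumed orientation
b_enc = torch.zeros(n_features)
b_dec = torch.zeros(d_model)

def encode(h):
    pre = (h - b_dec) @ W_enc + b_enc
    topv, topi = torch.topk(pre, k, dim=-1)
    out = torch.zeros_like(pre).scatter_(-1, topi, topv)
    return torch.relu(out)

def decode(z):
    # Assumed decoder: linear map back to the residual stream plus b_dec.
    return z @ W_dec + b_dec

h = torch.randn(2, 5, d_model)   # (batch, sequence, d_model)
z = encode(h)                    # (batch, sequence, n_features), sparse
h_hat = decode(z)                # reconstruction, same shape as h
active = (z != 0).sum(dim=-1)    # active features per position, at most k
```

Because the ReLU is applied after the TopK selection, a position can end up with fewer than k active features whenever some of its k largest pre-activations are negative.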