SAEs for use with the SAELens library

This repository contains the following SAEs (the names appear to encode the base model, hook point, SAE width, TopK k, learning rate, random seed, training dataset, and context length):

  • Llama-3.2-3B_blocks.21.hook_resid_pre_18432_topk_64_0.0001_49_fineweb_512
  • Llama-3.2-3B_blocks.21.hook_resid_pre_18432_topk_64_0.0001_49_faithful-llama3.2-3b_512
  • Llama-3.2-3B_blocks.21.hook_resid_pre_18432_topk_64_0.0001_42_fineweb_512
  • Llama-3.2-3B_blocks.21.hook_resid_pre_18432_topk_64_0.0001_42_faithful-llama3.2-3b_512

Load these SAEs with SAELens as shown below, substituting one of the SAE names listed above for <sae_id>:

from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained("seonglae/Llama-3.2-3B-sae", "<sae_id>")
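
As a rough end-to-end sketch, here is how one of these SAEs could be run on the activations it was trained on. This assumes a recent transformer_lens release with Llama-3.2 support; the prompt is illustrative, and sae.encode / sae.decode are the standard SAELens entry points for feature activations and reconstructions.

from sae_lens import SAE
from transformer_lens import HookedTransformer

# Load the base model and one SAE from this repository (seed 49, fineweb).
model = HookedTransformer.from_pretrained("meta-llama/Llama-3.2-3B")
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "seonglae/Llama-3.2-3B-sae",
    "Llama-3.2-3B_blocks.21.hook_resid_pre_18432_topk_64_0.0001_49_fineweb_512",
)

# Cache the residual stream entering block 21, the hook point this SAE was trained on.
tokens = model.to_tokens("The quick brown fox jumps over the lazy dog.")
_, cache = model.run_with_cache(tokens)
acts = cache["blocks.21.hook_resid_pre"]

feature_acts = sae.encode(acts)            # sparse feature activations (TopK, k=64)
reconstruction = sae.decode(feature_acts)  # reconstructed residual stream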

Citation

@inproceedings{cho2025faithfulsae,
  title={Faithful{SAE}: Towards Capturing Faithful Features with Sparse Autoencoders without External Datasets Dependency},
  author={Seonglae Cho and Harryn Oh and Donghyun Lee and Luis Eduardo Rodrigues Vieira and Andrew Bermingham and Ziad El Sayed},
  booktitle={ACL 2025 Student Research Workshop},
  year={2025},
  url={https://openreview.net/forum?id=tBn9ChHGG9}
}