SAEs for use with the SAELens library
This repository contains batch top-k Matryoshka SAEs for Gemma-2-9b, trained on the post-block residual stream at every layer. All SAEs have a width of 32,768 latents and were trained with k=60 on 750M tokens from the Pile using SAELens. Each SAE is trained with nested Matryoshka levels of width 128, 512, 2048, 8192, and 32768, so each prefix of the dictionary can also serve as a smaller standalone SAE (a sketch of both ideas follows the table below).

The individual SAEs are:
layer | path | width | L0 | explained variance |
---|---|---|---|---|
0 | blocks.0.hook_resid_post | 32768 | 60 | 0.942129 |
1 | blocks.1.hook_resid_post | 32768 | 60 | 0.900656 |
2 | blocks.2.hook_resid_post | 32768 | 60 | 0.869154 |
3 | blocks.3.hook_resid_post | 32768 | 60 | 0.84077 |
4 | blocks.4.hook_resid_post | 32768 | 60 | 0.816605 |
5 | blocks.5.hook_resid_post | 32768 | 60 | 0.826656 |
6 | blocks.6.hook_resid_post | 32768 | 60 | 0.798281 |
7 | blocks.7.hook_resid_post | 32768 | 60 | 0.796018 |
8 | blocks.8.hook_resid_post | 32768 | 60 | 0.790385 |
9 | blocks.9.hook_resid_post | 32768 | 60 | 0.775052 |
10 | blocks.10.hook_resid_post | 32768 | 60 | 0.756327 |
11 | blocks.11.hook_resid_post | 32768 | 60 | 0.741264 |
12 | blocks.12.hook_resid_post | 32768 | 60 | 0.718319 |
13 | blocks.13.hook_resid_post | 32768 | 60 | 0.714065 |
14 | blocks.14.hook_resid_post | 32768 | 60 | 0.709635 |
15 | blocks.15.hook_resid_post | 32768 | 60 | 0.706622 |
16 | blocks.16.hook_resid_post | 32768 | 60 | 0.687879 |
17 | blocks.17.hook_resid_post | 32768 | 60 | 0.695821 |
18 | blocks.18.hook_resid_post | 32768 | 60 | 0.691723 |
19 | blocks.19.hook_resid_post | 32768 | 60 | 0.690914 |
20 | blocks.20.hook_resid_post | 32768 | 60 | 0.684599 |
21 | blocks.21.hook_resid_post | 32768 | 60 | 0.691355 |
22 | blocks.22.hook_resid_post | 32768 | 60 | 0.705531 |
23 | blocks.23.hook_resid_post | 32768 | 60 | 0.702293 |
24 | blocks.24.hook_resid_post | 32768 | 60 | 0.707655 |
25 | blocks.25.hook_resid_post | 32768 | 60 | 0.721022 |
26 | blocks.26.hook_resid_post | 32768 | 60 | 0.721717 |
27 | blocks.27.hook_resid_post | 32768 | 60 | 0.745809 |
28 | blocks.28.hook_resid_post | 32768 | 60 | 0.753267 |
29 | blocks.29.hook_resid_post | 32768 | 60 | 0.76466 |
30 | blocks.30.hook_resid_post | 32768 | 60 | 0.763025 |
31 | blocks.31.hook_resid_post | 32768 | 60 | 0.765932 |
32 | blocks.32.hook_resid_post | 32768 | 60 | 0.760822 |
33 | blocks.33.hook_resid_post | 32768 | 60 | 0.73323 |
34 | blocks.34.hook_resid_post | 32768 | 60 | 0.746912 |
35 | blocks.35.hook_resid_post | 32768 | 60 | 0.738031 |
36 | blocks.36.hook_resid_post | 32768 | 60 | 0.730805 |
37 | blocks.37.hook_resid_post | 32768 | 60 | 0.722875 |
38 | blocks.38.hook_resid_post | 32768 | 60 | 0.715494 |
39 | blocks.39.hook_resid_post | 32768 | 60 | 0.7044 |
40 | blocks.40.hook_resid_post | 32768 | 60 | 0.711277 |
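As a rough sketch of the two design choices described above: batch top-k keeps the k × batch_size largest latent pre-activations across a whole batch (rather than exactly k per example), and the Matryoshka loss trains each prefix of the dictionary to reconstruct on its own, so truncating the encoder and decoder to the first m latents yields a smaller working SAE. The helpers below illustrate both; the attribute names (W_enc, b_enc, W_dec, b_dec) follow SAELens conventions, but these are hand-written illustrations, not the library's implementation.

```python
import torch

def batch_topk(pre_acts: torch.Tensor, k: int = 60) -> torch.Tensor:
    """Batch top-k: keep the k * batch_size largest pre-activations across
    the whole [batch, d_sae] tensor and zero the rest. This is a training-time
    sparsity rule; inference code often substitutes a calibrated threshold."""
    n_keep = k * pre_acts.shape[0]
    threshold = pre_acts.flatten().topk(n_keep).values.min()
    return pre_acts * (pre_acts >= threshold)

def matryoshka_prefix_recon(sae, acts: torch.Tensor, m: int, k: int = 60) -> torch.Tensor:
    """Hypothetical helper: treat the first m latents (m in {128, 512, 2048,
    8192, 32768}) as a smaller standalone SAE by truncating the encoder and
    decoder. Uses per-example top-k for simplicity; shapes assume SAELens
    conventions (W_enc: [d_in, d_sae], W_dec: [d_sae, d_in])."""
    pre = (acts - sae.b_dec) @ sae.W_enc[:, :m] + sae.b_enc[:m]
    top = pre.relu().topk(min(k, m), dim=-1)
    feats = torch.zeros_like(pre).scatter_(-1, top.indices, top.values)
    return feats @ sae.W_dec[:m] + sae.b_dec
```

Because sparsity is enforced across the batch rather than per example, individual examples can activate more or fewer than k latents while the average stays at k=60, which is why the L0 column reads 60 for every layer.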
Load these SAEs with SAELens as follows:

```python
from sae_lens import SAE

# sae_id: use a path from the table above, e.g. "blocks.12.hook_resid_post"
sae, cfg_dict, sparsity = SAE.from_pretrained("chanind/gemma-2-9b-batch-topk-matryoshka-saes-w-32k-l0-60", "<sae_id>")
```
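A fuller end-to-end sketch, assuming TransformerLens is used to collect activations (the hook point names match the path column in the table; `blocks.12.hook_resid_post` is just an example):

```python
import torch
from sae_lens import SAE
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gemma-2-9b")
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "chanind/gemma-2-9b-batch-topk-matryoshka-saes-w-32k-l0-60",
    "blocks.12.hook_resid_post",
)

tokens = model.to_tokens("The quick brown fox jumps over the lazy dog")
with torch.no_grad():
    _, cache = model.run_with_cache(tokens)
    acts = cache["blocks.12.hook_resid_post"]  # [batch, seq, d_model]
    feats = sae.encode(acts)                   # sparse latents, width 32768
    recon = sae.decode(feats)                  # reconstructed residual stream

# One common flavor of explained variance (exact definitions vary):
fvu = (recon - acts).pow(2).sum() / (acts - acts.mean(dim=(0, 1))).pow(2).sum()
print(f"explained variance ~ {1 - fvu.item():.3f}")
```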