tim-lawson/sae-pythia-160m-deduped-x64-k32-layers-11


tim-lawson committed on Dec 2, 2024

Commit 2d4eb0c · verified · 1 Parent(s): aecd78c

Push model using huggingface_hub.

Files changed (3)

README.md +41 -3
config.json +1 -1
model.safetensors +1 -1

README.md CHANGED

@@ -1,9 +1,47 @@
 ---
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]

 ---
+language: en
+library_name: mlsae
+license: mit
 tags:
+- arxiv:2409.04185
 - model_hub_mixin
 - pytorch_model_hub_mixin
 ---
+# Model Card for tim-lawson/sae-pythia-160m-deduped-x64-k32-layers-11
+A Multi-Layer Sparse Autoencoder (MLSAE) trained on the residual stream activation
+vectors from [EleutherAI/pythia-160m-deduped](https://huggingface.co/EleutherAI/pythia-160m-deduped) with an
+expansion factor of R = 64 and sparsity k = 32, over 1 billion
+tokens from [monology/pile-uncopyrighted](https://huggingface.co/datasets/monology/pile-uncopyrighted).
+This model is a PyTorch TopKSAE module, which does not include the underlying
+transformer.
+### Model Sources
+- **Repository:** <https://github.com/tim-lawson/mlsae>
+- **Paper:** <https://arxiv.org/abs/2409.04185>
+- **Weights & Biases:** <https://wandb.ai/timlawson-/mlsae>
+## Citation
+**BibTeX:**
+```bibtex
+@misc{lawson_residual_2024,
+  title         = {Residual {{Stream Analysis}} with {{Multi-Layer SAEs}}},
+  author        = {Lawson, Tim and Farnik, Lucy and Houghton, Conor and Aitchison, Laurence},
+  year          = {2024},
+  month         = oct,
+  number        = {arXiv:2409.04185},
+  eprint        = {2409.04185},
+  primaryclass  = {cs},
+  publisher     = {arXiv},
+  doi           = {10.48550/arXiv.2409.04185},
+  urldate       = {2024-10-08},
+  archiveprefix = {arXiv}
+}
+```
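
The new model card describes a TopKSAE with sparsity k = 32. In a top-k autoencoder, as the term is generally used, only the k largest latent pre-activations are kept for each input vector and the rest are set to zero. The following is a minimal pure-Python sketch of that selection step alone. The function name `top_k_sparsify` and the toy input are illustrative assumptions; this is not the mlsae library's implementation, which operates on PyTorch tensors.

```python
import heapq


def top_k_sparsify(pre_activations, k):
    """Keep the k largest values of a vector and zero out the rest.

    Hypothetical helper for illustration only; not part of mlsae.
    """
    if k <= 0:
        return [0.0] * len(pre_activations)
    # Indices of the k largest entries of the vector.
    keep = set(
        heapq.nlargest(
            k,
            range(len(pre_activations)),
            key=pre_activations.__getitem__,
        )
    )
    return [v if i in keep else 0.0 for i, v in enumerate(pre_activations)]


# Toy example with 5 latents and k = 2 (the model card uses k = 32).
latents = top_k_sparsify([0.1, 2.0, -0.5, 1.5, 0.3], k=2)
print(latents)  # [0.0, 2.0, 0.0, 1.5, 0.0]
```

With the configuration in this commit, the same step would leave at most 32 of the 49152 latents non-zero per activation vector.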

config.json CHANGED

@@ -5,5 +5,5 @@
   "k": 32,
   "n_inputs": 768,
   "n_latents": 49152,
-  "standardize": true
 }

   "k": 32,
   "n_inputs": 768,
   "n_latents": 49152,
+  "standardize": false
 }
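
The values shown in this hunk can be cross-checked against the model card and the reported file size. The sketch below confirms that `n_latents / n_inputs` matches the stated expansion factor of 64, then estimates the weight storage. It assumes one encoder and one decoder weight matrix, each of shape `n_inputs × n_latents` and stored as 4-byte floats. The diff does not show the tensor layout or dtype, so the byte estimate is an assumption rather than a statement about the actual file.

```python
# Values copied from the config.json hunk and the model.safetensors pointer.
config = {"k": 32, "n_inputs": 768, "n_latents": 49152}
lfs_size_bytes = 301993232

# Expansion factor: ratio of latent dimension to input dimension.
expansion_factor = config["n_latents"] / config["n_inputs"]
print(expansion_factor)  # 64.0

# ASSUMPTION: two float32 weight matrices (encoder and decoder).
bytes_per_float32 = 4
weight_params = 2 * config["n_inputs"] * config["n_latents"]
weight_bytes = weight_params * bytes_per_float32
print(weight_params)  # 75497472
print(weight_bytes)   # 301989888

# Whatever is left would have to hold the file header and any small tensors.
remainder_bytes = lfs_size_bytes - weight_bytes
print(remainder_bytes)  # 3344
```

Under these assumptions the two weight matrices account for all but 3,344 bytes of the reported size, which suggests the estimate is at least the right order of magnitude.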

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:47837fa0b51039b32e6da02f5b839d6a92c0f0e750f16cbfb27ba8271ad6b096
 size 301993232

 version https://git-lfs.github.com/spec/v1
+oid sha256:044c358d9def910f9138a1d8e880211eb8b06d493e24e456d93fdd0461d21d2a
 size 301993232
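
The model.safetensors entry is a Git LFS pointer file, not the weights themselves: it records the spec version, a SHA-256 object ID, and the size in bytes. The sketch below parses the old and new pointers from this diff and shows one way to check a downloaded file against a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, introduced here for illustration.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so large weight files are not read into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:47837fa0b51039b32e6da02f5b839d6a92c0f0e750f16cbfb27ba8271ad6b096
size 301993232
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:044c358d9def910f9138a1d8e880211eb8b06d493e24e456d93fdd0461d21d2a
size 301993232
""")

# Same size, different hash: the contents changed but the byte count did not.
print(old["size"] == new["size"], old["digest"] != new["digest"])  # True True
```

After downloading the weights, `matches_pointer("model.safetensors", new)` would confirm that the local file is the version introduced by this commit; the local path is an assumption.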