ZymScope Release

Sparse Autoencoders

ZymScope is a set of Sparse Autoencoders (SAEs) trained on ZymCTRL, a protein language model (pLM) used for the generation of artificial enzymes. The SAEs were trained until convergence on the same data mixture used for the base model. The resulting SAEs faithfully reconstruct the model's activations while preserving sparsity over the latents.

Model Description

All the SAEs in ZymScope are based on the BatchTopK SAE architecture and use k = 32, a good trade-off between reconstruction accuracy and sparsity. Each SAE was trained on a single layer of ZymCTRL: we selected layers {5, 10, 15, 25, 30, 35} and fitted each SAE to reconstruct the residual stream before the attention block.
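
For reference, below is a minimal sketch of a BatchTopK SAE forward pass, using the dimensions stated in this card (model dimension 1280, expansion factor 12, k = 32). The class name and initialization details are illustrative assumptions, not the exact training code of this release:

```python
import torch
import torch.nn as nn

class BatchTopKSAE(nn.Module):
    """Sketch of a BatchTopK sparse autoencoder (dims follow the card:
    d_model=1280, expansion factor 12 -> 15360 latents, k=32)."""

    def __init__(self, d_model: int = 1280, expansion: int = 12, k: int = 32):
        super().__init__()
        d_sae = d_model * expansion
        self.k = k
        self.W_enc = nn.Parameter(torch.randn(d_model, d_sae) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.W_dec = nn.Parameter(torch.randn(d_sae, d_model) * 0.01)
        self.b_dec = nn.Parameter(torch.zeros(d_model))

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-activations, then keep only the top (k * batch_size) values
        # across the whole batch, rather than k per sample as in plain TopK.
        pre = (x - self.b_dec) @ self.W_enc + self.b_enc
        acts = torch.relu(pre)
        flat = acts.flatten()
        n_keep = self.k * x.shape[0]
        topk = torch.topk(flat, n_keep)
        mask = torch.zeros_like(flat)
        mask[topk.indices] = 1.0
        return (flat * mask).view_as(acts)

    def forward(self, x: torch.Tensor):
        z = self.encode(x)
        recon = z @ self.W_dec + self.b_dec
        return recon, z

# Usage: reconstruct a batch of residual-stream activations.
sae = BatchTopKSAE()
x = torch.randn(64, 1280)
recon, z = sae(x)
mse = ((recon - x) ** 2).mean()
```

Because the budget of k active latents is shared across the batch, BatchTopK lets individual samples use more or fewer latents while keeping the average sparsity fixed.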

Each SAE was trained until convergence, ranging from 100M tokens for earlier layers to 1B tokens for later layers.

Dead neurons are a common problem in Sparse Autoencoders; we found that the number of dead neurons varies widely with the expressivity of the layer.
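
As a rough illustration, dead latents can be estimated by streaming activation batches through the encoder and counting latents that never fire. The helper below is a hypothetical sketch (count_dead_latents is not part of this release), assuming an SAE with an encode method like the one sketched above:

```python
import torch

@torch.no_grad()
def count_dead_latents(sae, activation_batches, threshold: float = 0.0) -> int:
    # Hypothetical helper: a latent is considered "dead" if it never
    # exceeds `threshold` on any sample in the stream.
    fired = None
    for x in activation_batches:
        z = sae.encode(x)                        # (batch, d_sae) sparse codes
        batch_fired = (z > threshold).any(dim=0)  # which latents fired here
        fired = batch_fired if fired is None else (fired | batch_fired)
    return int((~fired).sum().item())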

Training specs

ZymScope was trained on multiple A100 GPUs. All the SAEs have an expansion factor of 12, mapping the model dimension of 1280 to 15360 latents.

The optimizer used was Adam (beta1 = 0.9, beta2 = 0.999) with a learning rate of 5e-5.
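
In PyTorch terms, this corresponds to roughly the following configuration (sae stands in for any of the SAEs above; this is a sketch of the stated hyperparameters, not the released training script):

```python
import torch

# Adam with the betas and learning rate stated in this card.
optimizer = torch.optim.Adam(
    sae.parameters(),
    lr=5e-5,
    betas=(0.9, 0.999),
)
```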

Contact

We are the AI for Protein Design group at the Centre for Genomic Regulation (https://www.aiproteindesign.com/).
