Model Card for esm3-sm-open-v1

esm3-sm-open-v1 is trained on 2.78 billion natural proteins. After synthetic data augmentation, the training set comprises 3.15 billion protein sequences, 236 million protein structures, and 539 million proteins with function annotations, totaling 771 billion tokens. esm3-sm-open-v1 is a generative model capable of designing proteins conditioned on partial prompts of sequence, structure, and function.

Safety is an important part of our model: data related to viruses has been removed from the training dataset, as have some proteins belonging to organisms on the USDA Select Agents and Toxins list. The function decoder has also been filtered for potentially harmful keywords.

Usage

Using ESM3 requires the esm package:

pip install esm

Please refer to the readme and notebooks in the esm repository for details on how to use the model.
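
As a rough illustration of what a "partial prompt" of sequence can look like, here is a minimal sketch. The helper `mask_prompt` and the example sequence are hypothetical and are not part of the esm package. The calls inside `generate_sequence` (`ESM3.from_pretrained`, `ESMProtein`, `GenerationConfig`, `model.generate`) and the `"esm3_sm_open_v1"` model name follow the esm repository readme as best recalled and may differ between package versions, so treat them as assumptions and check the readme. Downloading the weights is also assumed to require accepting the license and authenticating with the Hugging Face Hub.

```python
MASK = "_"  # assumed mask character for unknown residues in a sequence prompt


def mask_prompt(sequence: str, keep: range) -> str:
    """Hypothetical helper: keep residues at indices in `keep`, mask the rest."""
    return "".join(aa if i in keep else MASK for i, aa in enumerate(sequence))


def generate_sequence(prompt: str, device: str = "cpu", num_steps: int = 8):
    """Fill in the masked positions of `prompt` (sketch; API names are assumptions)."""
    # Imported lazily so mask_prompt can be used without esm installed.
    from esm.models.esm3 import ESM3
    from esm.sdk.api import ESMProtein, GenerationConfig

    model = ESM3.from_pretrained("esm3_sm_open_v1").to(device)
    protein = ESMProtein(sequence=prompt)
    config = GenerationConfig(track="sequence", num_steps=num_steps, temperature=0.7)
    return model.generate(protein, config)


# Build a prompt that fixes residues 3..6 of a made-up 10-residue sequence.
prompt = mask_prompt("MKTAYIAKQR", range(3, 7))
print(prompt)
```

Only the prompt construction runs here; calling `generate_sequence(prompt)` would download the model weights and run generation on the sequence track.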

License

This repository is under a custom non-commercial license.
