---
license: apache-2.0
---

## About

RNA-FM (RNA Foundation Model) is a state-of-the-art **pretrained language model for RNA sequences**, serving as the foundation for an integrated RNA research ecosystem. Trained on **23+ million non-coding RNA (ncRNA) sequences** via self-supervised learning, RNA-FM extracts comprehensive structural and functional information from RNA sequences *without* relying on experimental labels. Consequently, RNA-FM generates **general-purpose RNA embeddings** suitable for a broad range of downstream tasks, including but not limited to secondary and tertiary structure prediction, RNA family clustering, and functional RNA analysis.

**[mRNA-FM](https://arxiv.org/abs/2204.00300)** is a direct extension of RNA-FM, trained exclusively on 45 million mRNA coding sequences (CDS). It is specifically designed to capture information unique to mRNA and has demonstrated excellent performance on mRNA-related tasks.

The full code is available on GitHub: https://github.com/ml4bio/RNA-FM.

## Citation

If you use the model in your research, please cite the following papers.

```
@article{chen2022interpretable,
  title={Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions},
  author={Chen, Jiayang and Hu, Zhihang and Sun, Siqi and Tan, Qingxiong and Wang, Yixuan and Yu, Qinze and Zong, Licheng and Hong, Liang and Xiao, Jin and Shen, Tao and others},
  journal={arXiv preprint arXiv:2204.00300},
  year={2022}
}

@article{shen2024accurate,
  title={Accurate RNA 3D structure prediction using a language model-based deep learning approach},
  author={Shen, Tao and Hu, Zhihang and Sun, Siqi and Liu, Di and Wong, Felix and Wang, Jiuming and Chen, Jiayang and Wang, Yixuan and Hong, Liang and Xiao, Jin and others},
  journal={Nature Methods},
  pages={1--12},
  year={2024},
  publisher={Nature Publishing Group US New York}
}

@article{chen2020rna,
  title={RNA secondary structure prediction by learning unrolled algorithms},
  author={Chen, Xinshi and Li, Yu and Umarov, Ramzan and Gao, Xin and Song, Le},
  journal={arXiv preprint arXiv:2002.05810},
  year={2020}
}
```
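
## Example: comparing sequences by pooled embeddings

The About section lists RNA family clustering among the downstream uses of RNA-FM embeddings. The sketch below illustrates one common way such embeddings could be compared: average the per-nucleotide vectors into a single sequence-level vector, then score pairs of sequences by cosine similarity. It does not load or call RNA-FM. The embedding values are made-up placeholders, and the function names `mean_pool` and `cosine_similarity` are hypothetical helpers introduced only for this illustration. For the actual model-loading code, refer to the GitHub repository linked above.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_pool(token_embeddings: Sequence[Sequence[float]]) -> Vector:
    """Average per-nucleotide vectors into one fixed-size sequence vector."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    for vec in token_embeddings:
        if len(vec) != dim:
            raise ValueError("all token embeddings must share one dimension")
        for i, value in enumerate(vec):
            totals[i] += value
    return [total / len(token_embeddings) for total in totals]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder per-nucleotide embeddings (made-up numbers, NOT RNA-FM output).
# Each inner list stands in for the vector of one nucleotide; sequences may
# differ in length, but every vector has the same dimension (3 here).
seq_a = [[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]]
seq_b = [[2.0, 0.0, 1.0], [2.0, 0.0, 1.0], [2.0, 0.0, 1.0]]
seq_c = [[0.0, 4.0, 0.0]]

emb_a = mean_pool(seq_a)
emb_b = mean_pool(seq_b)
emb_c = mean_pool(seq_c)

# Higher similarity suggests the two sequences sit closer in embedding space.
sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

Mean pooling is used here only because it yields a fixed-size vector regardless of sequence length. Other pooling strategies may suit a given task better, and the resulting pairwise similarities could be passed to any standard clustering method.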