Caprara, M. G. & Nilsen, T. W. RNA: versatility in form and function. Nat. Struct. Biol. 7, 831–833 (2000).
Holbrook, S. R. RNA structure: the long and the short of it. Curr. Opin. Struct. Biol. 15, 302–308 (2005).
Chen, Z., Ain, N. U., Zhao, Q. & Zhang, X. From tradition to innovation: conventional and deep learning frameworks in genome annotation. Brief. Bioinform. 25, bbae138 (2024).
Vaswani, A. et al. Attention is all you need. In Adv. Neural Inf. Process. Syst. 30, 5998–6008 (NIPS, 2017).
Min, B. et al. Recent advances in natural language processing via large pre-trained language models: a survey. ACM Comput. Surv. 56, 30 (2023).
Chen, J. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. Preprint at https://arxiv.org/abs/2204.00300 (2022).
Zhang, Y. et al. Multiple sequence alignment-based RNA language model and its application to structural inference. Nucleic Acids Res. 52, e3 (2024).
Wang, N. et al. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nat. Mach. Intell. 6, 548–557 (2024).
Shen, T. et al. Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nat. Methods 21, 2287–2298 (2024).
Wang, X. et al. Uni-RNA: universal pre-trained models revolutionize RNA research. Preprint at https://www.biorxiv.org/content/10.1101/2023.07.11.548588v1 (2023).
Penić, R. J., Vlašić, T., Huber, R. G., Wan, Y. & Šikić, M. RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks. Nat. Commun. 16, 5671 (2025).
Dao, T., Fu, D., Ermon, S., Rudra, A. & Ré, C. FlashAttention: fast and memory-efficient exact attention with IO-awareness. In Adv. Neural Inf. Process. Syst. 35, 16344–16359 (NeurIPS, 2022).
Dao, T. FlashAttention-2: faster attention with better parallelism and work partitioning. In International Conference on Learning Representations (ICLR, 2024).
Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
Nguyen, E. et al. Sequence modeling and design from molecular to genome scale with Evo. Science 386, eado9336 (2024).
Sun, Y. et al. Retentive Network: a successor to Transformer for large language models. Preprint at https://arxiv.org/abs/2307.08621 (2023).
Sweeney, B. et al. RNAcentral 2021: secondary structure integration, improved sequence search and new member databases. Nucleic Acids Res. 49, D212–D220 (2021).
Kirk, J. M. et al. Functional classification of long non-coding RNAs by k-mer content. Nat. Genet. 50, 1474–1482 (2018).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Wen, M., Cong, P., Zhang, Z., Lu, H. & Li, T. DeepMirTar: a deep-learning approach for predicting human miRNA targets. Bioinformatics 34, 3781–3787 (2018).
Pla, A., Zhong, X. & Rayner, S. miRAW: a deep learning-based approach to predict microRNA targets by analyzing whole microRNA transcripts. PLoS Comput. Biol. 14, e1006185 (2018).
Gu, T., Zhao, X., Barbazuk, W. B. & Lee, J. miTAR: a hybrid deep learning-based approach for predicting miRNA targets. BMC Bioinform. 22, 96 (2021).
Akiyama, M. & Sakakibara, Y. Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning. NAR Genom. Bioinform. 4, lqac012 (2022).
Bartel, D. P. MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233 (2009).
Li, J.-H., Liu, S., Zhou, H., Qu, L.-H. & Yang, J.-H. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 42, D92–D97 (2014).
Tan, Z., Fu, Y., Sharma, G. & Mathews, D. H. TurboFold II: RNA structural alignment and secondary structure prediction informed by multiple homologs. Nucleic Acids Res. 45, 11570–11581 (2017).
Sloma, M. F. & Mathews, D. H. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures. RNA 22, 1808–1818 (2016).
Danaee, P. et al. bpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Res. 46, 5381–5394 (2018).
Szikszai, M., Wise, M., Datta, A., Ward, M. & Mathews, D. H. Deep learning models for RNA secondary structure prediction (probably) do not generalize across families. Bioinformatics 38, 3892–3899 (2022).
Fu, L. et al. UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Res. 50, e14 (2022).
Gruber, A. R., Lorenz, R., Bernhart, S. H., Neuböck, R. & Hofacker, I. L. The Vienna RNA websuite. Nucleic Acids Res. 36, W70–W74 (2008).
Kerpedjiev, P., Hammer, S. & Hofacker, I. L. Forna (force-directed RNA): simple and effective online RNA secondary structure diagrams. Bioinformatics 31, 3377–3379 (2015).
Zuker, M. & Stiegler, P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9, 133–148 (1981).
Frankish, A. et al. GENCODE 2021. Nucleic Acids Res. 49, D916–D923 (2021).
Kang, Y. et al. CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features. Nucleic Acids Res. 45, W12–W16 (2017).
Wang, L. et al. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res. 41, e74 (2013).
Subramanian, K., Payne, B., Feyertag, F. & Alvarez-Ponce, D. The codon statistics database: a database of codon usage bias. Mol. Biol. Evol. 39, msac157 (2022).
Dominguez, D. et al. Sequence, structure, and context preferences of human RNA binding proteins. Mol. Cell 70, 854–867 (2018).
Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).
Ba, J. L., Kiros, J. R. & Hinton, G. E. Layer normalization. Preprint at https://arxiv.org/abs/1607.06450 (2016).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).
Sun, Y. et al. A length-extrapolatable Transformer. In Proc. 61st Annual Meeting of the Association for Computational Linguistics 14590–14604 (ACL, 2023).
Fan, Q., Huang, H., Chen, M., Liu, H. & He, R. RMT: Retentive Networks meet Vision Transformers. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR, 2024).
Hendrycks, D. & Gimpel, K. Gaussian error linear units (GELUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. NAACL-HLT 2019 Vol. 1 (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).
Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR, 2019).
Ning, W. CatIIIIIIII/RNAErnie: v.1.0. Zenodo https://doi.org/10.5281/zenodo.10847621 (2024).
Nowakowski, J. & Tinoco, I. RNA structure and stability. Semin. Virol. 8, 153–165 (1997).
Chen, X., Li, Y., Umarov, R., Gao, X. & Song, L. RNA secondary structure prediction by learning unrolled algorithms. In International Conference on Learning Representations (ICLR, 2020).
Kuhn, H. The Hungarian method for the assignment problem. Nav. Res. Logist. 52, 7–21 (2005).
Ventola, G. M. M. et al. Identification of long non-coding transcripts with feature selection: a comparative study. BMC Bioinform. 18, 187 (2017).
Cock, P. J. A. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Shen, Y. RNAret – Datasets and Model Weights [Data set]. Zenodo https://doi.org/10.5281/zenodo.18313475 (2026).
Shen, Y. DrBlackZJU/RNAret: Retentive Network promotes efficient RNA language modeling of long sequences (v1.0). Zenodo https://doi.org/10.5281/zenodo.18271233 (2026).