Retentive Network promotes efficient RNA language modeling of long sequences



  • Caprara, M. G. & Nilsen, T. W. RNA: versatility in form and function. Nat. Struct. Biol. 7, 831–833 (2000).

  • Holbrook, S. R. RNA structure: the long and the short of it. Curr. Opin. Struct. Biol. 15, 302–308 (2005).

  • Chen, Z., Ain, N. U., Zhao, Q. & Zhang, X. From tradition to innovation: conventional and deep learning frameworks in genome annotation. Brief. Bioinform. 25, bbae138 (2024).

  • Vaswani, A. et al. Attention is all you need. In Adv. Neural Inf. Process. Syst. 30, 5998–6008 (NIPS, 2017).

  • Min, B. et al. Recent advances in natural language processing via large pre-trained language models: a survey. ACM Comput. Surv. 56, 30 (2023).

  • Chen, J. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. Preprint at https://arxiv.org/abs/2204.00300 (2022).

  • Zhang, Y. et al. Multiple sequence alignment-based RNA language model and its application to structural inference. Nucleic Acids Res. 52, e3 (2024).

  • Wang, N. et al. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nat. Mach. Intell. 6, 548–557 (2024).

  • Shen, T. et al. Accurate RNA 3D structure prediction using a language model-based deep learning approach. Nat. Methods 21, 2287–2298 (2024).

  • Wang, X. et al. Uni-RNA: universal pre-trained models revolutionize RNA research. Preprint at https://www.biorxiv.org/content/10.1101/2023.07.11.548588v1 (2023).

  • Penić, R. J., Vlašić, T., Huber, R. G., Wan, Y. & Šikić, M. RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks. Nat. Commun. 16, 5671 (2025).

  • Dao, T., Fu, D., Ermon, S., Rudra, A. & Ré, C. FlashAttention: fast and memory-efficient exact attention with IO-awareness. In Adv. Neural Inf. Process. Syst. 35, 16344–16359 (NIPS, 2022).

  • Dao, T. FlashAttention-2: faster attention with better parallelism and work partitioning. In International Conference on Learning Representations (ICLR, 2024).

  • Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).

  • Nguyen, E. et al. Sequence modeling and design from molecular to genome scale with Evo. Science 386, eado9336 (2024).

  • Sun, Y. et al. Retentive Network: a successor to Transformer for large language models. Preprint at https://arxiv.org/abs/2307.08621 (2023).

  • Sweeney, B. et al. RNAcentral 2021: secondary structure integration, improved sequence search and new member databases. Nucleic Acids Res. 49, D212–D220 (2021).

  • Kirk, J. M. et al. Functional classification of long non-coding RNAs by k-mer content. Nat. Genet. 50, 1474–1482 (2018).

  • van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).

  • Wen, M., Cong, P., Zhang, Z., Lu, H. & Li, T. DeepMirTar: a deep-learning approach for predicting human miRNA targets. Bioinformatics 34, 3781–3787 (2018).

  • Pla, A., Zhong, X. & Rayner, S. miRAW: a deep learning-based approach to predict microRNA targets by analyzing whole microRNA transcripts. PLoS Comput. Biol. 14, e1006185 (2018).

  • Gu, T., Zhao, X., Barbazuk, W. B. & Lee, J. miTAR: a hybrid deep learning-based approach for predicting miRNA targets. BMC Bioinform. 22, 96 (2021).

  • Akiyama, M. & Sakakibara, Y. Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning. NAR Genom. Bioinform. 4, lqac12 (2022).

  • Bartel, D. P. MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233 (2009).

  • Li, J., Liu, S., Zhou, H., Qu, L. & Yang, J. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 42, D92–D97 (2014).

  • Tan, Z., Fu, Y., Sharma, G. & Mathews, D. H. TurboFold II: RNA structural alignment and secondary structure prediction informed by multiple homologs. Nucleic Acids Res. 45, 11570–11581 (2017).

  • Sloma, M. F. & Mathews, D. H. Exact calculation of loop formation probability identifies folding motifs in RNA secondary structures. RNA 22, 1808–1818 (2016).

  • Danaee, P. et al. bpRNA: large-scale automated annotation and analysis of RNA secondary structure. Nucleic Acids Res. 46, 5381–5394 (2018).

  • Szikszai, M., Wise, M., Datta, A., Ward, M. & Mathews, D. H. Deep learning models for RNA secondary structure prediction (probably) do not generalize across families. Bioinformatics 38, 3892–3899 (2022).

  • Fu, L. et al. UFold: fast and accurate RNA secondary structure prediction with deep learning. Nucleic Acids Res. 50, e14 (2022).

  • Gruber, A. R., Lorenz, R., Bernhart, S. H., Neuböck, R. & Hofacker, I. L. The Vienna RNA websuite. Nucleic Acids Res. 36, W70–W74 (2008).

  • Kerpedjiev, P., Hammer, S. & Hofacker, I. L. Forna (force-directed RNA): simple and effective online RNA secondary structure diagrams. Bioinformatics 31, 3377–3379 (2015).

  • Zuker, M. & Stiegler, P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9, 133–148 (1981).

  • Frankish, A. et al. GENCODE 2021. Nucleic Acids Res. 49, D916–D923 (2021).

  • Kang, Y. et al. CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features. Nucleic Acids Res. 45, W12–W16 (2017).

  • Wang, L. et al. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res. 41, e74 (2013).

  • Subramanian, K., Payne, B., Feyertag, F. & Alvarez-Ponce, D. The codon statistics database: a database of codon usage bias. Mol. Biol. Evol. 39, msac157 (2022).

  • Dominguez, D. et al. Sequence, structure, and context preferences of human RNA binding proteins. Mol. Cell 70, 854–867 (2018).

  • Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).

  • Ba, J. L., Kiros, J. R. & Hinton, G. E. Layer normalization. Preprint at https://arxiv.org/abs/1607.06450 (2016).

  • He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).

  • Sun, Y. et al. A length-extrapolatable Transformer. In 61st Annual Meeting of the Association for Computational Linguistics 14590–14604 (ACL, 2023).

  • Fan, Q., Huang, H., Chen, M., Liu, H. & He, R. RMT: Retentive Networks meet Vision Transformers. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR, 2024).

  • Hendrycks, D. & Gimpel, K. Gaussian error linear units (GELUs). Preprint at https://arxiv.org/abs/1606.08415 (2016).

  • Kenton, J. & Toutanova, L. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. NAACL-HLT 2019 Vol. 1 (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).

  • Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR, 2019).

  • Ning, W. CatIIIIIIII/RNAErnie: v.1.0. Zenodo https://doi.org/10.5281/zenodo.10847621 (2024).

  • Nowakowski, J. & Tinoco, I. RNA structure and stability. Semin. Virol. 8, 153–165 (1997).

  • Chen, X., Li, Y., Umarov, R., Gao, X. & Song, L. RNA secondary structure prediction by learning unrolled algorithms. In International Conference on Learning Representations (ICLR, 2020).

  • Kuhn, H. The Hungarian Method for the assignment problem. Nav. Res. Logist. 52, 7–21 (2005).

  • Ventola, G. M. M. et al. Identification of long non-coding transcripts with feature selection: a comparative study. BMC Bioinform. 18, 187 (2017).

  • Cock, P. J. A. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).

  • Pedregosa, F. et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

  • Shen, Y. RNAret – Datasets and Model Weights [Data set]. Zenodo https://doi.org/10.5281/zenodo.18313475 (2026).

  • Shen, Y. DrBlackZJU/RNAret: Retentive Network promotes efficient RNA language modeling of long sequences (v1.0). Zenodo https://doi.org/10.5281/zenodo.18271233 (2026).


