BioMap Unveils RNAGenesis: A Foundation Model Powering the Next Generation of RNA Therapeutics

Jul 11, 2025 20:09 CST Updated 20:09
BioMap

Developer of Innovative Drug R&D Platform

BioMap Technology Vice President Zhang Xiaoming: Tackling major challenges in life sciences increasingly relies on the deep integration of bio-computation and bio-experimentation. BioMap firmly believes that only through open collaboration can we drive the continuous advancement of biological intelligence. Building on years of expertise in large-scale models, BioMap's foundational life science large model xTrimo V3 surpasses 210 billion parameters and covers seven mainstream modalities in life sciences. From the ultra-large xTrimo-100B to lightweight inference-optimized versions, the xTrimo series reflects our systematic thinking and engineering capabilities in the intelligent transformation of life sciences. We understand well that the complexity of life sciences far exceeds what any single institution can fully master. Therefore, the xTrimo model adheres to the principles of openness, sharing, and collaborative innovation, and we sincerely invite global research institutes, universities, and industry partners to participate. By leveraging cutting-edge AI technologies, we aim to drive a paradigm shift in life science research and build a new future for "AI for Life Science."



Recently, BioMap collaborated with Professor Cong Le from Stanford University, Professor Wang Mengdi from Princeton University, postdoctoral fellow Zhang Zaixi, and multiple teams to jointly release RNAGenesis, an RNA foundation model that integrates sequence understanding, structure prediction, and de novo design. The model has successfully designed efficient aptamer molecules with a target affinity of 4.02 nM and improved CRISPR gene editing efficiency by up to 2.5 times, providing a new paradigm for RNA drug development and helping make RNA drug R&D more efficient and faster. The research, titled "RNAGenesis: A Generalist Foundation Model for Functional RNA Therapeutics", was published on the preprint platform bioRxiv.



Breaking the Deadlock: The "Intelligent Engine" of RNA Drug Design


Non-coding RNAs play an important role in gene regulation, but the complex relationship among their sequence, structure, and function has long constrained rational design. Traditional methods are time-consuming and labor-intensive, akin to "finding a needle in a haystack." Building on BioMap's xTrimo large model platform, which deeply integrates AI design with wet-lab experimental validation, BioMap and its collaborators have constructed RNAGenesis, a universal foundation model that enables rational design of RNA molecules:

  • Ranked first on 11 of the 13 tasks across the three major categories of the BEACON benchmark, comprehensively surpassing existing models.
  • Built RNATx-Bench, comprising over 100,000 experimental data points and covering RNA drug modalities such as siRNA, circRNA, shRNA, and ASO. With 1 billion parameters, RNAGenesis surpasses models such as Evo2 40B.
  • Outperforms expert models such as RhoDesign+ and AlphaFold3 in structure-related tasks, including structure prediction and structure-based sequence prediction.
  • The aptamer targeting IGFBP-3 generated by RNAGenesis achieved an affinity of 4.02 nM, compared with 11.6 nM for molecules obtained through traditional experimental screening.
  • Optimized sgRNA backbones boost CRISPR base/prime editing efficiency by up to 2.5 times.


Technical Core: Hybrid Tokenization Cracks the RNA Code


RNA has only four types of nucleotides, and this small vocabulary limits representational capacity when NLP models are transferred to RNA. RNAGenesis pioneered a hybrid N-gram tokenization technique that uses multi-scale convolutional kernels to capture both single-nucleotide features and conserved functional modules (3-5 nt). Compared to traditional methods, the model converges faster and achieves higher prediction accuracy on key tasks: APA site prediction R²=89.03, non-coding RNA classification accuracy 97.82%, and ribosome load prediction R²=85.83.
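To make the idea of multi-scale tokenization concrete, the sketch below embeds an RNA sequence with convolution kernels of width 1, 3, and 5 and concatenates the results per position. It is an illustration only, not the RNAGenesis implementation: the function names, the channel count, and the random (untrained) kernel weights are all assumptions made for this example.

```python
import random

NUCLEOTIDES = "ACGU"


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-dim one-hot rows (unknown bases -> all zeros)."""
    seq = seq.upper().replace("T", "U")
    return [[1.0 if nt == ref else 0.0 for ref in NUCLEOTIDES] for nt in seq]


def conv1d_same(x, kernel):
    """Zero-padded 1D convolution that keeps the sequence length.

    x:      L rows of C_in values
    kernel: nested list of shape k x C_in x C_out (k is assumed odd)
    """
    k = len(kernel)
    c_in = len(x[0])
    c_out = len(kernel[0][0])
    pad = k // 2
    zero = [0.0] * c_in
    padded = [zero] * pad + x + [zero] * pad
    out = []
    for i in range(len(x)):
        row = [0.0] * c_out
        for j in range(k):              # position inside the window
            for c in range(c_in):       # input channel (nucleotide)
                value = padded[i + j][c]
                if value:
                    for o in range(c_out):
                        row[o] += value * kernel[j][c][o]
        out.append(row)
    return out


def hybrid_ngram_embed(seq, kernel_sizes=(1, 3, 5), channels=4, seed=0):
    """Concatenate per-position features from several kernel widths.

    Width 1 sees single nucleotides; widths 3 and 5 see short motifs.
    Weights are random here (hypothetical); a real model would learn them.
    """
    x = one_hot(seq)
    if not x:
        return []
    rng = random.Random(seed)
    per_scale = []
    for k in kernel_sizes:
        kernel = [[[rng.gauss(0.0, 1.0) for _ in range(channels)]
                   for _ in range(len(NUCLEOTIDES))]
                  for _ in range(k)]
        per_scale.append(conv1d_same(x, kernel))
    # Join the feature rows of all scales for each position.
    return [sum((scale[i] for scale in per_scale), []) for i in range(len(x))]


embedding = hybrid_ngram_embed("GGGAAACUUCGG")
print(len(embedding), len(embedding[0]))
```

Each position ends up with one feature vector per kernel width, so the downstream model receives both nucleotide-level and motif-level signals despite the four-letter alphabet.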




Application Breakthrough: From Algorithm to Therapy


  • Aptamer Design: Stability and Affinity Combined


Aptamer molecules, with their programmability and high affinity for targeting proteins, have become powerful tools in therapeutics, diagnostics, and synthetic biology. Compared to natural aptamers, the aptamer sequences designed by RNAGenesis exhibit higher sequence homology, lower minimum free energy, and optimized GC content. Among them, RGen-aptamer-8 and RGen-aptamer-9 show binding affinities for the IGFBP-3 target protein as low as 4.02 nM and 6.06 nM, respectively, significantly outperforming molecules obtained through traditional experimental screening (11.6 nM).
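As a quick check of the figures above, the short sketch below computes the fold improvement in binding affinity relative to the 11.6 nM baseline, along with a simple GC-content calculation. The helper names and the example sequence are hypothetical and are not taken from the paper; only the three affinity values come from the text.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(nt in "GC" for nt in seq) / len(seq)


def fold_improvement(kd_reference_nm, kd_candidate_nm):
    """How many times lower (tighter) the candidate's affinity value is than the reference."""
    return kd_reference_nm / kd_candidate_nm


# Affinity values reported in the text, in nM.
BASELINE_KD_NM = 11.6
CANDIDATES_KD_NM = {"RGen-aptamer-8": 4.02, "RGen-aptamer-9": 6.06}

for name, kd in CANDIDATES_KD_NM.items():
    fold = fold_improvement(BASELINE_KD_NM, kd)
    print(f"{name}: {kd} nM, {fold:.2f}-fold tighter than baseline")

# Hypothetical sequence, used only to show the GC-content calculation.
print(f"GC content: {gc_content('GGCCAUAUGC'):.2f}")
```

Under these numbers, RGen-aptamer-8 binds roughly 2.9 times more tightly than the traditionally screened molecule, and RGen-aptamer-9 roughly 1.9 times.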





  • Gene Editing: sgRNA Efficiency Leap


Through strict screening of the generated sequences, RNAGenesis produces higher-quality candidate sequences compared to other models, significantly enhancing the gene editing efficiency of the CRISPR-Cas9 system.

The top-ranked backbone (scaffold) sequences generated by RNAGenesis were validated experimentally. In validation at endogenous sites, the RGen-6 backbone achieved more effective knockout of the B2M and AAVS1 genes under various sgRNA dosage conditions; for example, at AAVS1 under medium-dose conditions, efficiency increased approximately 2-fold. In prime editing, which requires more complex RNA designs, RGen-6 improved efficiency by up to 1.2-fold compared to wild-type pegRNA.







Universal Breakthrough: Comprehensive Enhancement of Cutting-edge Gene Editing Technology

The power of RNAGenesis is not limited to the CRISPR-Cas9 system. Its designed RNA scaffold also performs exceptionally well in more advanced and complex gene-editing technologies, achieving "zero-shot generalization": the model's design principles remain applicable even without targeted training. For instance, in base editing, RGen-6 increased the efficiency of cytosine base editors (CBE) by more than 2.5 times compared to wild-type sgRNA. It also achieved robust efficiency improvements in adenine base editors (ABE).

  • RNATx-Bench



The project also builds RNATx-Bench, which includes more than 100,000 experimental data points covering RNA drug modalities such as siRNA, circRNA, shRNA, and ASO. With 1 billion parameters, RNAGenesis outperforms models such as Evo2 40B on downstream task prediction metrics; for instance, it increases the prediction accuracy for shRNA drug candidates by over 10%. In potency prediction for ASO and siRNA, it demonstrates superior capability to all baseline methods across nearly all key clinical targets.


The success of the RNAGenesis model suggests that, in RNA therapy design, optimizing model architecture and deeply understanding the biological characteristics of RNA may matter more than simply increasing model parameters. RNAGenesis achieves better predictive performance at a lower computational cost, demonstrating strong potential to guide and accelerate the design and development of next-generation RNA drugs.


In the future, BioMap will continue to deepen the integration and innovation of bio-computation and experimental technologies. With the xTrimo series of large models as the cornerstone, we will empower breakthrough explorations in the global life sciences field. We look forward to collaborating with more like-minded research institutions and industry partners to jointly address the complex challenges in the life sciences domain.

Contact for cooperation: info@biomap.com


Learn more about the project:

  • bioRxiv:

    https://www.biorxiv.org/content/10.1101/2024.12.30.630826v3

  • GitHub:

    https://github.com/zaixizhang/RNAGenesis

  • Professor Mengdi Wang: https://ece.princeton.edu/people/mengdi-wang

  • Professor Cong Le: https://profiles.stanford.edu/186687

  • Dr. Zhang Zaixi: https://zaixizhang.github.io/