
Developer of Innovative Drug R&D Platform
Recently, BioMap, in collaboration with Professor Cong Le of Stanford University, Professor Wang Mengdi of Princeton University, postdoctoral fellow Zhang Zaixi, and several other teams, released RNAGenesis, an RNA foundation model that integrates sequence understanding, structure prediction, and de novo design. The model has designed high-affinity aptamers with a binding affinity of 4.02 nM and improved CRISPR gene editing efficiency by up to 2.5-fold, offering a new paradigm for RNA drug development and helping make RNA drug R&D faster and more efficient. The study, titled "RNAGenesis: A Generalist Foundation Model for Functional RNA Therapeutics", was published on the preprint platform bioRxiv.
Breaking the Deadlock: The "Intelligent Engine" of RNA Drug Design
Non-coding RNAs play an important role in gene regulation, but the complex relationship among their sequence, structure, and function has long constrained rational design. Traditional methods are time-consuming and labor-intensive, akin to "finding a needle in a haystack." Building on BioMap's xTrimo large model platform, which deeply integrates AI design with wet-lab experimental validation, BioMap and its collaborators built RNAGenesis, a general-purpose foundation model for the rational design of RNA molecules.
Technical Core: Hybrid Tokenization Cracks the RNA Code
RNA has only four types of nucleotides, and this small vocabulary limits representational capacity when NLP models are transferred to RNA. RNAGenesis pioneered a hybrid N-gram tokenization technique that uses multi-scale convolutional kernels to capture both single-nucleotide features and conserved functional modules (3-5 nt). Compared to traditional methods, the model converges faster and shows marked gains in prediction accuracy on key tasks: APA site prediction R²=89.03, non-coding RNA classification accuracy 97.82%, and ribosome load prediction R²=85.83.
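The idea of combining single-nucleotide and short-motif views of a sequence can be illustrated with a small sketch. The code below is not the RNAGenesis implementation. It is a minimal pure-Python illustration that assumes one-hot inputs, randomly initialized (untrained) kernels, and kernel widths of 1, 3, and 5. The function names and the `dim_per_width` parameter are hypothetical.

```python
import random

NUCLEOTIDES = "ACGU"

def one_hot(seq):
    """Encode an RNA string as a list of 4-dimensional one-hot rows."""
    return [[1.0 if nt == n else 0.0 for n in NUCLEOTIDES] for nt in seq]

def conv1d_same(x, kernel):
    """Zero-padded 1D convolution; kernel[k][c][d] must have odd width."""
    width, dim = len(kernel), len(kernel[0][0])
    channels = len(NUCLEOTIDES)
    pad = width // 2
    zero = [0.0] * channels
    padded = [zero] * pad + x + [zero] * pad
    out = []
    for pos in range(len(x)):
        row = [0.0] * dim
        for k in range(width):
            for c in range(channels):
                v = padded[pos + k][c]
                if v:  # one-hot input: skip zero entries
                    for d in range(dim):
                        row[d] += v * kernel[k][c][d]
        out.append(row)
    return out

def hybrid_ngram_embed(seq, widths=(1, 3, 5), dim_per_width=8, seed=0):
    """Concatenate per-position features from kernels of several widths.

    Width 1 sees single nucleotides; widths 3 and 5 see short n-gram
    contexts. Kernels are random here (assumption); a real model learns them.
    """
    rng = random.Random(seed)
    x = one_hot(seq)
    per_width = []
    for w in widths:
        kernel = [[[rng.gauss(0, 1) for _ in range(dim_per_width)]
                   for _ in range(len(NUCLEOTIDES))]
                  for _ in range(w)]
        per_width.append(conv1d_same(x, kernel))
    # Join the feature vectors of all widths at each position.
    return [sum((feats[pos] for feats in per_width), [])
            for pos in range(len(seq))]

emb = hybrid_ngram_embed("GGGAAACCCUUU")  # toy sequence, not from the paper
print(len(emb), len(emb[0]))  # 12 24
```

In this sketch the width-1 features depend only on the nucleotide at each position, while the wider kernels make the same nucleotide look different depending on its neighbors, which is the intuition behind enlarging a four-letter vocabulary.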
Application Breakthrough: From Algorithm to Therapy
Aptamer Design: Stability and Affinity Combined
Aptamers, with their programmability and high affinity for target proteins, have become powerful tools in therapeutics, diagnostics, and synthetic biology. Compared to natural aptamers, the aptamer sequences designed by RNAGenesis exhibit higher sequence homology, lower minimum free energy, and optimized GC content. Among them, RGen-aptamer-8 and RGen-aptamer-9 show binding affinities for the IGFBP-3 target protein as low as 4.02 nM and 6.06 nM, respectively, significantly outperforming molecules obtained through traditional experimental screening (11.6 nM).
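Two of the quantities mentioned above are easy to compute by hand, and a short sketch may make them concrete. The helper names below are hypothetical, the example sequence is a toy string rather than a published aptamer, and the fold calculation assumes the reported nanomolar values behave like dissociation constants, where a lower value means tighter binding.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in an RNA or DNA string."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(nt in "GC" for nt in seq) / len(seq)

def fold_improvement(baseline_nm, designed_nm):
    """Ratio of baseline to designed affinity value (lower nM = tighter)."""
    return baseline_nm / designed_nm

print(gc_content("GGGAAACCCUUU"))              # 0.5 (toy sequence)
print(round(fold_improvement(11.6, 4.02), 2))  # 2.89 (RGen-aptamer-8)
print(round(fold_improvement(11.6, 6.06), 2))  # 1.91 (RGen-aptamer-9)
```

Under that assumption, the two designed aptamers bind roughly 2.9 and 1.9 times more tightly than the 11.6 nM molecule obtained by traditional screening.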
Gene Editing: sgRNA Efficiency Leap
Through strict screening of the generated sequences, RNAGenesis produces higher-quality candidate sequences compared to other models, significantly enhancing the gene editing efficiency of the CRISPR-Cas9 system.
The top-ranked backbone sequences generated by RNAGenesis were validated experimentally. In validation at endogenous sites, the RGen-6 backbone achieved more effective knockout of the B2M and AAVS1 genes under various sgRNA dosage conditions; at AAVS1 under the medium-dose condition, for example, efficiency increased approximately 2-fold. In prime editing, which requires more complex RNA designs, RGen-6 improved efficiency by up to 1.2-fold compared to wild-type pegRNA.
RNATx-Bench
The project also built RNATx-Bench, a benchmark of more than 100,000 experimental data points covering RNA drug modalities such as siRNA, circRNA, shRNA, and ASO. With 1 billion parameters, RNAGenesis outperforms models such as Evo2 40B on downstream task prediction metrics; for instance, it improves the prediction accuracy for shRNA drug candidates by more than 10%. In potency prediction for ASO and siRNA, it outperforms all baseline methods on nearly all key clinical targets.
In the future, BioMap will continue to deepen the integration of bio-computation and experimental technologies. With the xTrimo series of large models as its cornerstone, BioMap will support breakthrough explorations in the global life sciences field. We look forward to collaborating with more like-minded research institutions and industry partners to jointly address the complex challenges in the life sciences domain.
Contact for cooperation: info@biomap.com
Learn more about the project:
bioRxiv:
https://www.biorxiv.org/content/10.1101/2024.12.30.630826v3
GitHub:
https://github.com/zaixizhang/RNAGenesis
Professor Mengdi Wang: https://ece.princeton.edu/people/mengdi-wang
Professor Cong Le: https://profiles.stanford.edu/186687
Dr. Zhang Zaixi: https://zaixizhang.github.io/