Last week, Evo2, the biology AI model jointly released by Stanford University, NVIDIA, and other institutions in the United States, attracted widespread attention and was hailed as the "biological version of DeepSeek." As researchers around the world debated this breakthrough, a special note in the paper revealed the strength of China's AI: the xTrimo series of large models from the Chinese company BioMap was listed by the Evo2 researchers as a "competitor with a larger parameter scale but not yet open source," revealing the rise of China's large biological models.
In fact, BioMap has long been recognized by the U.S. market as a pioneer in foundation models for life sciences, having proactively laid out its strategy in this field since 2020. In October 2024, BioMap announced the launch of its revolutionary product, the full-modality biological large model xTrimo V3, which at 210 billion parameters set a new record for the world's largest AI foundation model in the life sciences. This full-modality biological language model builds high-quality AI task models with lower data and cost requirements and, for the first time, models biological data from DNA, RNA, and proteins up to the cellular level. It also supports the parsing of 128K ultra-long DNA sequences, redefining the competitive rules for large biological models.
More importantly, BioMap will also open source the 100 billion parameter version in the near future, surpassing Evo2 to become the world's largest open-source biological model. This also means that, in the global arms race to decode the secrets of life, China is becoming a leader.
AI Large Model that Understands "Life Language" Better
Dual Leap in Parameter Scale and Modality
As is well known, the number of parameters plays a crucial role in model development, and its scale directly affects a model's learning ability. Compared with Evo2's 40 billion parameters, xTrimo V3's scale advantage of more than 5 times builds up a "super brain" for life science.
Since 2020, BioMap has been constructing a large-scale data map dedicated to life sciences. By integrating protein interaction networks, single-cell sequencing, genomics, clinical data, and other multi-dimensional, multi-modal biological information, it has formed a structured knowledge base covering over a million species and tens of billions of biological entities, with a data scale more than a hundred times the industry benchmark. Notably, in the protein modality alone, its single model already has 10 billion parameters and is cited in the Evo2 paper as "xTrimo large," demonstrating its lead in model scale.
Figure: Comparison of different model parameters in the Evo2 paper
A solid data foundation also sets BioMap apart from overseas models like ESM and Evo, which primarily focus on a single modality of protein or DNA sequences. By contrast, xTrimo V3 can comprehensively model various kinds of biological data, ranging from molecules and metabolic networks to cells and even multicellular levels. This means that xTrimo can analyze the underlying laws of life systems through cross-modal alignment technology, breaking the limitations of a single data modality and achieving full-chain modeling from molecules to biological systems. Today, xTrimo V3 covers seven mainstream modalities: DNA, RNA, proteins, cells, compound-protein interactions, protein-protein interactions, and life systems, achieving full-scale modeling from base pairs to cell clusters.
Figure: xTrimo Foundation Model Family
Deepening Biological Algorithms, Unlocking the Potential of Scaling Laws
If the number of parameters is the crucial fuel that determines model performance, then the model's algorithms and architecture are like the internal combustion engine, directly deciding how efficiently the power of data is used. In terms of technical architecture, the xTrimo series fully considers the unique characteristics of bioinformatics data, constructing a large-scale, multi-modal, multi-scale model system. Generally speaking, the more parameters a model has, the more likely it is to memorize small errors during training, leading to abnormal performance on new problems. BioMap's MoE architecture and biology-knowledge-guided training framework further unleash the potential of ultra-large-scale data, enabling the model to keep learning efficiently even as its parameter count expands.
To better leverage the Scaling Law, BioMap no longer solely pursues larger model scale in the design of its DNA modality models. Instead, it adopts a knowledge-guided heuristic design based on biological insights, raising the intelligence of "small models" through more rational network structures. This bridges the gap between machine learning technology and biological understanding and significantly boosts downstream application performance. In the model architecture, BioMap has introduced a multi-window scale attention mechanism and native double-stranded DNA modeling to address the large differences in sequence length among genes and their regulatory elements. Unlike the reverse-complement data augmentation strategy of Evo2, this model supports modeling of the DNA double helix structure directly at the architectural level and adds a local perception module to capture 3D spatial information. These DNA-specific network designs show great advantages under the Scaling Law.
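The contrast drawn above between reverse-complement data augmentation and strand-aware modeling can be illustrated with a minimal Python sketch. It is not BioMap's or Evo2's implementation: the function names, the toy scoring function `g_fraction_toy`, and the averaging wrapper `strand_symmetric` are all hypothetical, chosen only to show what it means for a score to be identical on both strands of the double helix.

```python
import random

# Complement table for the four DNA bases plus the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def augment_with_reverse_complement(seqs, p=0.5, rng=None):
    """Data augmentation: swap each sequence for its reverse complement
    with probability p. The model must then learn strand symmetry from data."""
    rng = rng or random.Random(0)
    return [reverse_complement(s) if rng.random() < p else s for s in seqs]


def strand_symmetric(score_fn):
    """Wrap a scoring function so that a sequence and its reverse complement
    always receive the same score (symmetry by construction)."""
    def wrapped(seq: str) -> float:
        return 0.5 * (score_fn(seq) + score_fn(reverse_complement(seq)))
    return wrapped


def g_fraction_toy(seq: str) -> float:
    """Hypothetical strand-dependent toy score: fraction of G bases."""
    return seq.count("G") / max(len(seq), 1)


symmetric_score = strand_symmetric(g_fraction_toy)

if __name__ == "__main__":
    forward = "GGGA"
    reverse = reverse_complement(forward)  # "TCCC"
    print(g_fraction_toy(forward), g_fraction_toy(reverse))    # differ
    print(symmetric_score(forward), symmetric_score(reverse))  # equal
```

Augmentation leaves the model to learn strand symmetry from examples, whereas the wrapper guarantees it regardless of training. A natively double-stranded architecture is presumably designed to provide the second kind of guarantee inside the network itself.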
With the same amount of training data and computation, these innovations enable xTrimoDNA to demonstrate stronger learning capabilities. Data shows that in core tasks such as gene mutation scanning, the 1 billion parameter xTrimoDNA outperforms Evo1/Evo2.
Figure: a) The scaling law of large models: the relationship between total computational FLOPs and evaluation perplexity (PPL) under different architectures. The green solid line indicates that our improved multi-scale Transformer architecture consistently outperforms Transformer, Mamba (Caduceus), and StripedHyena (Evo) across different computational scales. b) Zero-shot performance of different models on DNA/RNA DMS tasks. c) Zero-shot performance of different models on protein DMS tasks.
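For readers unfamiliar with the perplexity (PPL) metric used in the scaling comparison, the standard definition is the exponential of the mean negative log-likelihood per token, so lower is better. The sketch below is a generic illustration of that definition only. The function name and the toy input are assumptions, not values taken from the figure.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum(log p_i)); lower values mean the model
    assigns higher probability to the observed sequence.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that guesses uniformly over the four DNA bases assigns
    # probability 0.25 to every token, so its perplexity equals 4.
    uniform_guess = [math.log(0.25)] * 8
    print(perplexity(uniform_guess))
```

On a four-letter DNA alphabet, a perplexity of 4 corresponds to uniform guessing, so any useful model should score below that on held-out sequence.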
First to Achieve Value Transformation, with Over 400 Users Globally
If the development of large models cannot be effectively transformed into practical applications, their technical value becomes an empty concept. In this regard, BioMap positions itself as "a world-leading provider of AI models in the life sciences." While the vast majority of foundation models are still at the laboratory stage, BioMap is taking the lead in exploring the commercialization of AI large model platform infrastructure and application scenarios.
This technical system has already generated significant value at the industry level. Across more than 200 task models in fields such as AI target discovery, protein design, and strain modification, the xTrimo platform has supported customers in achieving over 20 validated antibody/enzyme designs and more than 10 innovative target licenses, among other breakthrough results.
In the field of large models for life sciences, BioMap has also initiated the first benchmark cooperation, gaining endorsement from top international pharmaceutical companies. In November 2023, BioMap announced a large-scale strategic agreement with the multinational pharmaceutical company Sanofi, under which the two parties will jointly develop cutting-edge models for biopharmaceutical discovery based on BioMap's large life science model. In this collaboration, BioMap will receive a prepayment of $10 million, with a total transaction amount exceeding $1 billion. This is the first collaboration in the life sciences industry based on a foundation model, with milestones tied to model development rather than drug research progress. It marks the first time that a Chinese AI biology model has entered the core link of the global biopharmaceutical industry chain as a "basic research tool."
So far, BioMap has served more than 400 global users and 60 universities in the QS top 100, and the potential value of its signed orders is nearly 2 billion US dollars. Its clients include top pharmaceutical companies, research institutions, and biomanufacturing enterprises, covering fields such as drug research and development, agrochemicals, and environmental protection. This shows that BioMap can export its technical capabilities globally and that its AI capabilities have been turned into commercially viable, scalable, and replicable solutions. Its innovations have made breakthroughs in multiple fields, with their value standing out in three major directions:
Antibody and cell and gene therapy drugs: BioMap integrates structure prediction algorithms with generative design technology to establish a full-process design platform covering peptides, small proteins, and nanobodies. For the world-class challenge of de novo design of nanobodies targeting given epitopes, the team has made breakthrough progress in GPCR epitope design without antigen-antibody complex crystal structures: the positive rate of the designed sequences is more than 3 times higher than that of the open-source method. Verified by N-glycan scanning, the resulting VHH antibody exhibits nanomolar affinity. This achievement marks China's entry into the international forefront of computational antibody design.
Target discovery: Based on its self-developed large cell system model, BioMap has constructed an intelligent discovery pathway from omics data analysis to target validation. By deeply mining disease-related multi-omics data, the model can precisely identify the core regulatory genes that drive cell state transitions, significantly improving the efficiency of target screening. Relying on a high-throughput protein drug generation platform, multiple immune combination targets and tumor-specific targets have been successfully validated and licensed, with some projects already entering the preclinical research stage.
Microbiology research: BioMap and its partners have deeply integrated the xTrimoDNA large model with a million-scale microbial genome database to develop a large microbial model application that demonstrates excellent predictive capabilities in multiple aspects. After fine-tuning, it also shows outstanding fitting capabilities in gene annotation, metabolic pathway analysis, phenotype prediction, and other tasks. This microbial large model technology is expected to assist research in the microbiome and biomanufacturing fields, enabling targeted strain modification and significantly shortening the development cycle.
Opening a New Era in Life Sciences
From billions of parameters to full modality coverage, from target discovery to industrial strain modification, BioMap's xTrimo V3 is undoubtedly an important milestone for AI in the field of life sciences. Led by DeepSeek, the open-source trend is shifting the competitive landscape of large models from "technological exclusivity" to "ecosystem co-construction." As the open-source version with 100 billion parameters approaches, BioMap's xTrimo series also contributes an important Chinese force to global life science research. By building competitive advantages through ecosystem collaboration, we believe that it will surely spark a new wave of enthusiasm for life science research on a global scale.
BioMap positions itself as a platform-based company, and the xTrimo foundation model has cross-domain knowledge transfer characteristics. The underlying technology not only accelerates breakthroughs in traditional fields such as drug discovery and precision medicine but also extends to emerging areas like materials science and environmental governance. Synthetic biology and biomanufacturing currently have broad market prospects in China, and BioMap is expected to provide corresponding innovative services and solutions for customers of different scales and needs.
In the future, AI will no longer be confined to the "high walls and deep courtyards" of a few fields, but will become an inclusive tool for decoding the secrets of life, benefiting the treatment of rare diseases and precision medicine. This will not only accelerate the development of industries such as drug research and development and biomanufacturing, bringing higher returns to enterprises, but more importantly, it is expected to open up broader prospects for the health and well-being of all humanity, allowing the progress of life sciences to benefit everyone.