
Developer of Innovative Drug R&D Platform

In 2020, Li Yanhong led the founding of BioMap.
Having focused on AI-empowered life sciences for nearly 30 years, Li Yanhong firmly believes that growing computing power and falling gene sequencing costs will open up new possibilities for life science research.
At the time, many in the industry watched this "unconventional" company with interest, but few truly understood it.
To the surprise of many, it is this very startup that, in just a few short years, has created a miracle in AI for Life Science, in China and even worldwide.
In 2022 and 2023, the company successively released xTrimo V1 and V2, the world's first life science foundation model with tens of billions of parameters. Its parameter count was more than 7 times that of the runner-up, and it reached SOTA level on dozens of tasks, surpassing the rest of the industry.
In 2024, xTrimo V3, an all-new generation of life science foundation models, was born. Its parameter count expanded to 210 billion, covering seven major mainstream life science modalities, including proteins, DNA, and RNA, and it reached SOTA level on more than 200 tasks.
And the legend continues.
In the recently concluded Virtual Cell Challenge, the first of its kind worldwide, BioMap's independently developed model xTrimoSCPerturb stood out from more than 1,200 teams across 114 countries and won the championship.
Responding to this outcome, Zhang Xiaoming, Vice President of BioMap and Head of AI R&D, said he was "not particularly surprised, as BioMap had already laid out its virtual cell strategy four years ago. This championship win is the fruit of many years of deep accumulation."
"We didn't put in much extra effort for this competition," Zhang Xiaoming said.
Virtual cells are not a new concept, but in recent years, with the rapid growth of multi-omics data and breakthroughs in AI technology, the field has been building toward a new round of explosive growth.
Looking ahead, Zhang Xiaoming believes that the "AlphaFold moment" for virtual cells is expected to arrive within the next 3 to 5 years. Like AI for proteins, the field is expected to change the research paradigm and has the potential to contend for a Nobel Prize.
From a broader perspective, the multi-scale, cross-modal life science foundation model system constructed by BioMap is becoming increasingly sophisticated. Zhang Xiaoming compared it to "lighting up a Christmas tree": scaling up progressively from the molecular level to cells, tissues, and organs.
This is an extremely imaginative feat, accompanied by immense challenges.
In reality, high-quality data is extremely scarce, cross-scale mechanisms are still unclear, and the transformation chain from algorithm innovation to experimental validation is both lengthy and complex, causing many tech giants to back away.
BioMap is exploring and addressing these fundamental challenges one by one. Its solutions have already been validated in multiple specific scenarios and have won high recognition from customers such as multinational pharmaceutical companies and biotech firms.
The development journey of BioMap is a case study of Chinese technology enterprises driving paradigm shifts through long-term technical dedication in cutting-edge interdisciplinary fields.
Through this 10,000-character interview, the contours of a cutting-edge technology, an innovative company, and an emerging industry become clearer.

The following is our conversation with Zhang Xiaoming, Vice President of BioMap and Head of AI R&D.
Zhang Shichen: First of all, congratulations to BioMap for winning the championship of the world's first Virtual Cell Challenge. What is the biggest gain for the team from participating in this competition? Was winning the championship within your expectations?
Zhang Xiaoming: To be honest, winning the championship exceeded our expectations, as there were many uncertainties in the competition and the global competitors were quite strong.
But we were confident of achieving good results from the outset. Although the Virtual Cell Challenge was held for the first time this year, BioMap's technical accumulation in this direction began four years ago. As early as 2021, we launched development of a single-cell pre-trained large model and began building a dedicated model for target perturbation prediction. We also established a specialized cell laboratory to carry out data production and experimental validation, forming a closed-loop capability for model iteration.
More than the accolade itself, our greatest gain is further validation of our judgment on the overall technical strategy, along with a clearer understanding of where our technology stands globally. It proves that BioMap has a technical team capable of tackling cutting-edge research challenges even while running multiple projects in parallel. If I had to sum up this experience in one phrase, it would be "technical confidence."
Zhang Shichen: When it comes to virtual cells, many people may still feel very unfamiliar with the concept, but in fact, virtual cells are not a new idea. What do you think is different this time around that has made this field hot again?
Zhang Xiaoming: A virtual cell is a runnable, simulatable, and predictable cellular system constructed through computational methods. The concept is extremely popular now, and the driving force behind it is AI.
The AI Virtual Cell (AIVC) approach works as follows: it integrates multi-omics data such as the genome, transcriptome, and proteome to build a unified multi-modal, multi-scale model. With this model, various perturbations can be simulated, such as applying a drug or knocking out a gene, and the cell's response can then be observed. More importantly, it can also combine mechanistic reasoning with generative AI to "predict" cell states that have not yet been observed experimentally.
In simple terms, it uses AI to simulate cells. And once we can simulate cells precisely, the next step is to simulate tissues and organs.
BioMap's technical layout in large models is in fact highly compatible with the concept of the AI virtual cell, which is precisely why we are investing in virtual cells.
Zhang Shichen: What impact can virtual cells have on biomedicine?
Zhang Xiaoming: With virtual cells, I believe there will be a fundamental paradigm shift in drug development, bringing several disruptive impacts.
First, the drug R&D cycle will be significantly shortened.
Because virtual cells can complete tens of millions of cell perturbations and mechanism validations in computational space, they directly break through the throughput and cost bottlenecks of traditional experiments. Experimental verification that might previously have taken several months can now be simulated in a few days or even hours, at significantly lower cost.
Second, the ability to discover innovative drugs will be greatly enhanced.
With such a highly efficient "search engine," we can explore the vast chemical and biological molecular space, and even find molecules that do not exist in nature. In this way, not only can new drugs be found more quickly, but candidate drugs with entirely new mechanisms of action and great value can also be discovered.
Third, more complex and precise innovative therapies can be designed.
The system framework of virtual cells supports the design of therapeutic strategies across scales and time sequences. In the past, we could only intervene at a single molecular or cellular level, but now we can design drug combinations across different scales and time sequences. Ultimately, treatment strategies can evolve from the original "single-point effect" into joint interventions that span pathways, cells, and even time and space.
Fourth, and particularly crucial: it can help bridge the "valley of death" between the preclinical and clinical stages of drug development.
As virtual cells develop further into virtual tissues and virtual organs, many problems that originally could only be verified at the clinical stage, such as toxicity and drug resistance, can in fact be rehearsed in advance in the virtual system.
This is equivalent to letting AI run through the clinical trial first, significantly improving the quality of drugs entering the real clinical stage and thereby markedly reducing late-stage failure rates and overall R&D costs.
Ultimately, all these changes are aimed at enabling more diseases to be cured.
Zhang Shichen: In 2024, BioMap released scFoundation, the academic version of its large-scale single-cell foundation model. What is the significance of this achievement for BioMap and the industry?
Zhang Xiaoming: The release of scFoundation is a crucial piece of the overall plan for BioMap's large model family. It marks our official upgrade from pre-training at the molecular scale (such as proteins) to large-scale pre-training at the cellular level.
scFoundation also lays the foundation for the subsequent construction of a unified multi-modal, cross-scale model architecture. It has become our technical foundation for AI virtual cells. Building on it, we have rapidly constructed a batch of leading downstream task models in the cell domain, including perturbation prediction, and have made good progress in practical business scenarios such as target discovery.
At the same time, we have combined model capabilities with experimental validation, using a "dry-wet closed loop" for continuous iteration and upgrading.
For the industry as a whole, we have made scFoundation's inference capability publicly available and open-sourced its inference code.
In this way, we can help more teams move more quickly into "the era of single-cell pretraining," so that the rapidly growing body of single-cell multi-omics data can be analyzed more efficiently and utilized more fully.
scFoundation has become an important achievement in the development of AI virtual cells, and it appears in many areas of cutting-edge research.
Zhang Shichen: What are the core innovations of xTrimoSCPerturb, the model BioMap entered in this competition? What is its most significant breakthrough compared with other models?
Zhang Xiaoming: This is a very good question. The model we competed with this time is called xTrimoSCPerturb. It is a perturbation prediction model based on single-cell pretraining.
A key reason for its strong performance is that it uses two entirely new foundation models from xTrimo v4, which BioMap has not yet released publicly.
One is version 2.0 of scFoundation. Compared with the previous generation, it has been comprehensively optimized in model architecture, training data scale, and training strategy, and therefore represents cell states and gene expression more strongly.
The other is xTrimoProteinNext, our self-developed next-generation protein pre-trained large model. It provides a deeper understanding of the protein sequences involved in a perturbation.
Based on these two models, xTrimoSCPerturb in effect constitutes a cross-modal perturbation prediction system.
At the same time, we have also carried out more refined and stricter quality control on the perturbation dataset to ensure that the model can fully absorb the key signals in a large amount of high-quality perturbation data, and use this to guide the training process.
It can be said that these innovations constitute the core difference between us and the other participating teams.
Zhang Shichen: How long did it take BioMap to train xTrimoSCPerturb?
Zhang Xiaoming: We didn't put in much extra effort for this competition. An important reason is that we had already accumulated solid technical foundations beforehand, such as scFoundation 2.0 and xTrimoProteinNext.
These high-quality, high-precision underlying representation models were already prepared before we entered the competition. Therefore, during the competition, we only needed to quickly build and train a perturbation prediction model on these existing foundations.
Zhang Shichen: Among the top three teams in this challenge, except for BioMap, all are from universities. How do you view the differences in thinking between academia and industry in the field of virtual cells? What are the different focuses?
Zhang Xiaoming: If we look at it from the perspective of development stages, virtual cells are indeed still in an intermediate stage that is mainly focused on cutting-edge academic exploration and secondarily on industrial implementation. Interestingly, when it comes to the underlying technical principles and the construction of foundational models, the understanding between academia and industry is increasingly converging.
Because industry has clear business implementation needs, it places a stronger emphasis on business orientation throughout the entire chain of task definition, data selection, and model training. As a result, industry usually prioritizes finding scenarios that can form small closed loops, running them end to end first and then iterating. Through rapid validation and optimization of local models, capabilities are gradually solidified, eventually pushing the entire virtual cell chain toward a truly implementable and scalable application state.
Academia, on the other hand, tends to prioritize rapidly expanding the modeling capabilities of virtual cells across different scales and modalities, even if the mechanistic derivations between certain stages are not yet fully coherent or cannot yet form a complete closed loop. This strategy of "expanding first, then filling in the gaps" helps seize the initiative in early-stage research.
Zhang Shichen: Do you think that the current virtual cells are still in the divergent stage in terms of technical routes, or have they started to show some convergent trends?
Zhang Xiaoming: I think virtual cells actually present different states at different scales.
For example, at the molecular scale, everyone's technical path is relatively clear, the concepts are relatively similar, and there is a high degree of consensus.
At the cellular scale, although the overall direction remains within the same broad technological field, some differentiation has begun to emerge, with various attempts being made in specific sub-directions.
However, once beyond the single-cell level and entering the realms of intercellular interactions and microenvironments, uncertainty significantly increases, and approaches begin to show marked differences.
Further up, at the tissue or even organ level, there is currently almost no clear path for constructing large models. It can be said that, from the bottom up, the technical route becomes more divergent the higher it goes, and exploration proceeds gradually amid uncertainty.
Zhang Shichen: How does BioMap define the "AlphaFold moment" for virtual cells, and how far are we from it?
Zhang Xiaoming: AlphaFold solved protein folding, a challenge that had puzzled the biology community for decades. Not only is its prediction highly efficient, but more remarkably, it reaches experimental-level accuracy.
So when we speak of an "AlphaFold moment," we mean this: it was the first time AI achieved a qualitative breakthrough on a clear, quantifiable, and verifiable biological task, truly reshaping the research and engineering paradigm of an entire field.
So for virtual cells, I think that when a virtual cell model can continuously and reproducibly simulate, in a computer, the real behavior of cells in key biological processes, and these simulation results can be systematically verified in experiments, then the "AlphaFold moment" for virtual cells can be said to have arrived.
At present, whether in the academic circle or the industrial circle, everyone has a high degree of consensus on this vision and is actively investing. Moreover, the technology itself is also rapidly evolving.
From an algorithmic perspective, the capabilities of large foundation models in the life sciences are continuously improving, rapidly covering different scales from molecules and cells to tissues, as well as multimodal data. The performance of these models is also steadily increasing. Moreover, there is a shift away from relying solely on purely data-driven large models. Instead, real biological mechanisms are being integrated into model architectures and training processes, allowing AI to better "understand" biology.
From the data perspective, life science data is still growing exponentially, which is very different from other fields (such as general language models). We have a very solid data foundation.
More importantly, the industry's attitude toward virtual cells has shifted from "want to give it a try" to "a strategic must-have."
Of course, there are still few cases of full-chain industrial deployment of virtual cells, but there have been successful validation cases in some local scenarios, which is a great encouragement for the entire field.
If the current trend continues, we predict that the "AlphaFold moment" for virtual cells is expected to arrive within the next 3 to 5 years.
Of course, if we set our sights further ahead, for example on ultimately achieving virtual organs, that may take even longer.
AlphaFold won the 2024 Nobel Prize in Chemistry, and we also have reason to believe that once virtual cells truly reach their "AlphaFold moment," they too can lead to Nobel Prize-level breakthroughs.
Zhang Shichen: BioMap is one of the very few companies in the world that has implemented virtual cell technology in industrial applications. What were the biggest obstacles in the process from exploration to implementation? And what efforts did BioMap make?
Zhang Xiaoming: We have summarized that there are mainly two major challenges:
The first is the complexity of the data itself.
Life science data is inherently multi-omics, including genomics, transcriptomics, proteomics, and more. It is also multi-scale, from molecules and cells to tissues and organs, and it even includes temporal dimensions and spatial locations. Fully aligning data from different sources and scales in both time and space within such a high-dimensional, heterogeneous, and sparse system is currently almost impossible.
Although locally aligned datasets may exist, full-scale, high-quality, large-scale aligned data is still extremely scarce. Systematic perturbation data in particular is not only scarce but also very sparsely distributed.
This constitutes the most fundamental and realistic bottleneck for the implementation of virtual cell technology.
The second issue is that the chain of transformation from technology to application is too long.
The entire technical chain of virtual cells is already quite long, and once it needs to be integrated into real industry scenarios, such as drug discovery, this chain becomes even longer.
It is hard to expect a one-step transition from scientific research to application. Therefore, it is necessary to first build a small closed-loop validation path, completing iterations of "prediction → experimental validation → feedback optimization" within a limited scale, and then gradually expand to more complex modalities and scales.
In response to these two challenges, BioMap has made two key attempts in the past few years:
First, at the data level, we have constructed a cross-modal, cross-omics, and cross-scale life science knowledge graph.
Through this graph, data from different sources, even if the original formats are completely different, can be interconnected based on biological relationships, logically forming a "universally interoperable" panoramic network. At the same time, we have also established a multi-omics, multi-modal high-dimensional vector index matrix, allowing different types of data to be aligned, retrieved, sorted, and associated within a unified representation space. This is equivalent to laying a computable and scalable data foundation for virtual cells.
Secondly, at the system level, we have built a "dry-wet closed loop" capability.
Specifically, it involves using large models for perturbation prediction, followed by immediate high-throughput experimental validation through our in-house cell lab, and then feeding the results back into the model for iterative optimization. This closed loop currently operates mainly at the single-cell level, but we are gradually expanding to cross-scale, cross-modal scenarios.
It is through this strategy of "starting with a small closed loop and continuously expanding" that we can gradually transform virtual cells from a scientific research concept into a technical engine that truly drives industrial value such as drug development.
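For readers who want a concrete picture of the "prediction → experimental validation → feedback optimization" cycle described above, the following is a minimal toy sketch in Python. It is not BioMap's system: the least-squares line stands in for the large model, and `hidden_cell_response`, `fit_line`, and `closed_loop` are hypothetical names invented here, with a simulated function playing the role of the wet lab.

```python
import random


def hidden_cell_response(dose):
    """Hypothetical stand-in for the wet lab: the true response, unknown to the model."""
    return 2.0 * dose + 1.0


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (toy replacement for model training)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x


def closed_loop(candidates, rounds, batch_size, seed=0):
    """Run predict -> validate -> feed back; return final parameters and per-round error."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    xs, ys = [], []
    a, b = 0.0, 0.0  # untrained model: predicts 0 for everything
    errors = []
    for _ in range(rounds):
        batch, pool = pool[:batch_size], pool[batch_size:]
        if not batch:
            break
        # 1. Prediction: the "dry" model scores perturbations it has not seen yet.
        predicted = [a * x + b for x in batch]
        # 2. Experimental validation: the simulated lab measures the same perturbations.
        measured = [hidden_cell_response(x) + rng.gauss(0, 0.05) for x in batch]
        errors.append(sum(abs(p - m) for p, m in zip(predicted, measured)) / len(batch))
        # 3. Feedback optimization: measurements flow back and the model is refit.
        xs.extend(batch)
        ys.extend(measured)
        a, b = fit_line(xs, ys)
    return (a, b), errors


params, errors = closed_loop([i / 10 for i in range(30)], rounds=3, batch_size=10)
print("mean prediction error per round:", [round(e, 3) for e in errors])
```

In this toy setting the prediction error falls sharply after the first round of feedback, which is the only point the sketch is meant to make; a real loop would involve far richer models, experiments, and candidate selection.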
Zhang Shichen: When BioMap empowers multinational pharmaceutical companies and biotech firms with virtual cells, how do clients react?
Zhang Xiaoming: In terms of response, the change is quite significant.
At first, when we communicated with customers, we mostly explained what virtual cells are, the business value this technology can bring, and its potentially transformative impact on key areas such as target discovery and drug design. We would also walk through the path from technology to implementation in the context of specific scenarios.
But as the technology has continued to mature, and especially after we successfully ran several validation loops in real projects, customers' perceptions and attitudes have shifted significantly. Now, many customers at home and abroad are very much looking forward to the deployment prospects of this technology, and some have even stated explicitly that they hope to use virtual cells to build a brand-new R&D engine for themselves.
Accordingly, the way we communicate has also changed. The question is no longer "to use or not to use," but "why us." Customers ask more in-depth questions: where exactly does your technology lead? This reflects a key change: customers increasingly recognize virtual cell technology itself.
Zhang Shichen: From proteins, DNA, and RNA to virtual cells, building a cross-modal foundation model system that covers multiple biological levels is an extremely complex capability, and one that is extremely scarce globally. How did BioMap systematically plan and gradually implement this technological system?
Zhang Xiaoming: BioMap's large-scale model system construction is actually carried out step by step according to different biological scales and different data modalities. At the same time, during the advancement process, we always align with the actual pace of industrial implementation to ensure that the technology is both cutting-edge and practical.
Our starting point was protein. Because proteins have the most direct business value in antibody drug R&D scenarios, we first built our first large-scale pre-trained model at the molecular level, xTrimoProtein, which is also the largest protein model in the industry.
Subsequently, we expanded to xTrimoDNA, a large genomic model that supports ultra-long sequences, and xTrimoRNA, a large model that integrates understanding and generation.
Because of the central dogma, modeling proteins, DNA, and RNA together helps deepen our understanding of the different modalities at the molecular scale and improves performance on downstream tasks.
At the cellular scale, we built the single-cell pre-trained large model scFoundation and, on this basis, developed the perturbation prediction model xTrimoSCPerturb.
It can be said that we have preliminarily built a cross-scale, cross-modal foundational model system that spans from molecules to cells, from sequences to functions, and from static characterization to dynamic perturbations. In this process, we have also deeply integrated an increasing number of biological mechanisms into the models, significantly enhancing their understanding and reasoning capabilities.
Zhang Shichen: Looking ahead, how much room for imagination does this foundation model system for life sciences offer? For instance, can we anticipate a truly integrated "unified biological large model" that covers all biomolecules and even entire living systems?
Zhang Xiaoming: I think that is entirely foreseeable. To draw an analogy, the entire large model system of life sciences is like a Christmas tree being lit up. The bottom layer is the molecular scale, where we have gradually built large pre-trained models for DNA, RNA, and proteins. The next level up is the cellular scale, where single-cell large models such as scFoundation allow us to begin understanding the state and behavior of cells. Moving forward, we will continue to build models of cell-to-cell interactions, followed by models of tissues and organs.
This is a process of lighting up layer by layer, from the bottom to the top. In this process, we not only integrate each single-modal model with its specific application scenarios within the same scale to form a small, implementable closed loop, but also promote collaboration between different modalities at the same scale. For example, by integrating genomic, transcriptomic, and proteomic data, we build multi-modal fusion closed-loop capabilities. Furthermore, we are also exploring cross-scale and cross-modal integration through a closed loop that combines dry and wet labs, closely aligned with real industry needs, to truly transform technology into value.
Zhang Shichen: In the field of biology, cross-scale modeling is quite challenging. What are some good experiences from BioMap?
Zhang Xiaoming: Our model in this Virtual Cell (VC) Challenge is actually a typical example of cross-modal, cross-scale modeling.
Our model, xTrimoSCPerturb, integrates two key components. On the one hand, it invokes a protein pre-trained large model (xTrimoProtein) to deeply characterize, at the molecular scale, the target proteins involved in the perturbation; on the other hand, it builds on the single-cell large model scFoundation to perform simulation at the cellular scale.
In other words, this model essentially builds a bridge between the molecular scale and the cellular scale. By combining the representational capabilities of two different scales and modalities, it achieves more accurate perturbation predictions.
We believe that this kind of "bridge-type" model, which establishes effective connections between scales, is a key approach to solving the challenge of cross-scale modeling in biology. In the future, we will continue to explore more cross-scale collaborative methods along this direction.
Zhang Shichen: BioMap's past practice has verified that the Scaling Law is effective in the life sciences. Recently, there has been more and more discussion in the AI community about the Scaling Law hitting bottlenecks. Have you observed a similar trend in the life sciences? How is BioMap responding?
Zhang Xiaoming: The discussion about the Scaling Law has indeed become very pressing in the general large model field, but in the life sciences, I think the situation is quite different.
Because the data in the life sciences is still growing explosively. Behind this is the rapid advancement of a new generation of omics and sequencing technologies. For instance, the cost of single-cell sequencing has dropped from a few dollars per cell to just a few cents over the past decade, while the overall data volume has increased nearly ten thousand times.
This means that we are still very far from "data saturation." In biology, the Scaling Law bottleneck will arrive later and is unlikely to appear in the foreseeable future.
However, the reflection on Scaling Law bottlenecks in the general large model field has given us an early warning. Therefore, BioMap's current strategy is a two-pronged one:
On the one hand, we are still actively embracing the dividend of the Scaling Law by continuing to expand data scale and model parameters; as long as data keeps growing rapidly, this law remains valid. On the other hand, we are also actively exploring a "paradigm upgrade" of the Scaling Law, no longer relying solely on "bigger and more" training, but gradually shifting the focus from the training stage to the inference stage.
More importantly, biological data itself is naturally structured and high-dimensional; unlike the information in natural language, it has not been compressed and simplified through human language.
The world that general large models see is actually a version "translated" into human language, with information loss. Our models, by contrast, directly process raw biological signals such as DNA sequences, RNA expression profiles, and protein structures. This is the most authentic language of living systems.
Therefore, what we are actually trying to do is something more fundamental: to construct a world model of living organisms that is as complete as possible on an ultra-high dimensional microscopic scale.
Zhang Shichen: If we divide problems into scientific problems and engineering problems, virtual cells currently appear to be more of a scientific problem. What about AI for proteins? Has it become an engineering problem?
Zhang Xiaoming: I think the answer is yes.
Compared with virtual cells, the R&D pathway for AI proteins has become very clear, and the pace of implementation is increasingly mature. The value AI can deliver here is both specific and measurable.
For example, we can now use a protein generative large model for de novo (from scratch) protein design. Many of these designed molecules lie far beyond the space of known natural proteins, meaning we can discover a vast number of entirely new drug candidates that traditional experimental methods could never reach.
This is just the beginning. On this basis, AI can also simultaneously predict multiple key attributes for each designed protein sequence, such as affinity, stability, and expression level. Through this multi-objective joint optimization, we filter out a large number of low-potential molecules before the experimental stage and send only a small portion with the best overall performance into wet-lab experiments.
This solves a long-standing problem in traditional drug discovery: In the past, a great deal of time was often spent optimizing affinity, only to discover later that the expression level was too low and the drug-likeness poor, forcing the process to start over. Now, we take these engineering attributes into account at the design stage, significantly improving the overall success rate.
Not only that, but the high-quality feedback data generated from experimental validation can also flow back into the model, driving the next round of iteration and forming an efficient, repeatable, and scalable "dry-wet closed loop."
So overall, AI for proteins is no longer a question of "whether it can be done," but an engineering practice of "how to do it faster, better, and more efficiently."
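The multi-property screening described in this answer can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the candidate names, the predicted scores, the per-property floors, and the `select_for_wet_lab` helper are all hypothetical, and a simple "floor on every property, then rank by average" rule stands in for whatever joint optimization is actually used.

```python
# Properties the interview mentions as being predicted for each designed sequence.
PROPERTIES = ("affinity", "stability", "expression")


def select_for_wet_lab(candidates, floors, top_k):
    """Drop designs that fail any property floor, then keep the top_k by mean score."""
    passing = [
        c for c in candidates
        if all(c[p] >= floors[p] for p in PROPERTIES)
    ]
    passing.sort(
        key=lambda c: sum(c[p] for p in PROPERTIES) / len(PROPERTIES),
        reverse=True,
    )
    return passing[:top_k]


# Hypothetical predicted scores in [0, 1]; not real model output.
candidates = [
    {"name": "design_A", "affinity": 0.95, "stability": 0.60, "expression": 0.10},
    {"name": "design_B", "affinity": 0.80, "stability": 0.80, "expression": 0.70},
    {"name": "design_C", "affinity": 0.60, "stability": 0.70, "expression": 0.60},
    {"name": "design_D", "affinity": 0.70, "stability": 0.90, "expression": 0.80},
]
floors = {"affinity": 0.5, "stability": 0.5, "expression": 0.5}

chosen = select_for_wet_lab(candidates, floors, top_k=2)
print([c["name"] for c in chosen])
```

Here design_A has the highest predicted affinity but is removed before any experiment because its predicted expression is too low, which mirrors the failure mode the answer describes: optimizing affinity first and discovering poor expression only later.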
Zhang Shichen: The top three teams in this challenge are all Chinese teams. How do you view this phenomenon? What factors have contributed to the success of Chinese teams inAI+Leading Strength in Life Sciences?
Zhang Xiaoming: Indeed, the top three teams in this competition are all Chinese teams. Although the competition results have a certain degree of偶然性, I believe that this is not purely coincidental but rather an inevitable outcome of long-term accumulation and沉淀, with several contributing factors:
First, China's investment in life science data and cutting-edge technology is very solid. In recent years, China's capability to collect large-scale omics data (such as single-cell sequencing and spatial transcriptomics) has improved rapidly, and infrastructure represented by high-throughput experimental platforms has laid solid groundwork for building a high-quality, large-scale biological data foundation.
Secondly, China has strong momentum in AI large models and a deep talent pool. This can be seen at top international AI conferences, where the proportion of Chinese authors continues to rise.
At the same time, China's biotechnology industry is developing rapidly, and fields such as bio-manufacturing and biopharmaceuticals have strong demand for a new AI-driven paradigm, which provides a natural fast track for AI + life sciences.
Finally, and most importantly, the rise of AI-native interdisciplinary talent is accelerating. Many of the younger generation now joining the front lines of research and industry are both familiar with AI and have systematically studied biology.
This compound background, with an understanding of both AI and biology, allows them to truly bridge the languages of the two domains, continuously driving breakthroughs in scientific exploration and engineering implementation. I believe this advantage will continue to grow in the future.
Zhang Shichen: BioMap has always adhered to the value of technology inclusiveness, doing a lot of work in promoting technology open source and ecological co-construction. What measures will BioMap take in the future?
Zhang Xiaoming: Since BioMap began building the foundational large model for life sciences, we have firmly believed in one principle: these underlying capabilities must become accessible to all.
Only when the entire industry innovates based on the technical foundation of large models can upper-layer applications accelerate their development; and feedback from the application end will, in turn, nourish the continuous iteration of the underlying models, forming a positive cycle that can truly empower the industry.
To this end, we have conducted extensive open-source practices: whether it's large protein models or single-cell large models, we have open-sourced the code and inference capabilities. These open-source projects have received positive feedback from numerous developers and research teams, also helping us continuously optimize model performance.
Many clients start by using our open-source models, and after verifying the results in internal scenarios, they gradually build trust in BioMap's technology and further expand into business cooperation. It can be said that open source is not only a form of sharing but also the starting point of trust and collaboration.
Looking to the future, we want to create an "operating system" for life science discovery. On top of this system, we will build full-chain, closed-loop capability across data, model, and experiment, and try to create an open ecosystem driven by intelligent agents (Agent).
In this ecosystem, researchers can conveniently call various components while also contributing their wisdom to enrich the ecosystem together.
We are particularly looking forward to collaborating more deeply with global partners on technology and ecosystem development in scenarios such as antibody/protein design and synthetic biology. Through joint construction and sharing, we can together advance life science discovery into a smarter and more efficient new phase.
Zhang Shichen: Research intelligent agents are also a very hot field at present. How do you view its development?
Zhang Xiaoming: I think its potential is enormous, but to truly put it into practice, it must be built on top of a high-quality, full-chain AI system grounded in biological mechanisms.
For example, we previously released "Discovery Assistant" for the knowledge domain, which can help users efficiently search through a vast amount of literature, deeply analyze the research potential of a specific target, and even automatically generate bioinformatics analysis code. This essentially provides the capability for in-depth insights at the forefront of research.
There is still a lot of work to be done in the future. For example, to achieve agent-driven protein design, the agent not only needs to invoke powerful generative large models to design molecules but also perform multi-objective collaborative optimization on multiple key attributes. Moreover, it must be able to automatically plan and drive subsequent wet-lab experiments, obtain high-quality validation data, and use this feedback for closed-loop iteration of the model.
The entire process involves the close collaboration of a large number of sub-agents, and the underlying models that each sub-agent relies on must possess sufficiently high accuracy and reliability to ultimately support an end-to-end, truly intelligent life science discovery process.
At present, many agents are still at the stage of using language to drive tools, but in the life sciences field, the intelligence of the tools themselves is the key.
Zhang Shichen: In the past, life science was 90% wet lab and 10% computation, which gave rise to a vast market for biological reagents and laboratory equipment. Some argue that this proportion will reverse in the future, becoming 90% computation + 10% wet lab. How do you view this trend?
Zhang Xiaoming: I agree that this shift is taking place. In the past, life sciences were primarily focused on wet-lab experiments, while AI or computation played more of a supporting role, providing assistance in only a few steps, which is why the pattern was "90% wet lab + 10% computation."
But with the rapid development of AI, the entire R&D paradigm has shifted fundamentally. Nowadays, the core purpose of wet-lab experiments is increasingly to validate the effect of AI designs. More often than not, our wet-lab experiments are not aimed at directly producing the final answers, but rather at obtaining high-quality feedback data to iterate and optimize the model.
In the long run, as virtual cell technology continues to advance, we may be able not only to simulate individual cells but also to construct virtual tissues and virtual organs. By that time, even preclinical and some clinical-stage validation could be completed within digital systems. In this way, the role of wet-lab experiments in the entire R&D chain will gradually shift from "dominant" to "auxiliary," and its supporting role will become increasingly evident.
Therefore, the shift in balance from experimentation toward computation is an inevitable direction of evolution. This process may take time, but it will undoubtedly occur.
Zhang Shichen: Looking back at your time in AI + life sciences, what is the greatest insight you have gained?
Zhang Xiaoming: I moved from general AI into AI for Life Science in 2022, and it has been almost three years. Looking back on this journey, my biggest takeaway is that our understanding of AI has gone through a typical "growth curve."
At the beginning, everyone was full of enthusiasm, feeling that AI could solve all scientific problems. But reality soon taught us a lesson: in the face of the extreme complexity of living systems, AI struggled to overcome those challenges directly, so we entered a phase of reflection and even something of a low point.
Today, we are finally beginning to know precisely what problems AI can solve, to what extent it can solve them, and, more importantly, what technical groundwork to lay next so that AI can truly deliver value.
In simple terms, it went from "thinking AI can do everything," to "suspecting AI can do almost nothing," to today's "knowing what it can do and knowing how to make it do more."

—END—




