BioLab: Enabling AI as a True Scientist – An End-to-End Autonomous Platform for Hypothesis-to-Experiment Life Science Research

Oct 23, 2025 10:30 CST Updated 10:30

BioMap

Developer of Innovative Drug R&D Platform

In September 2025, BioMap and research teams from Princeton University, Stanford University, and Zhejiang University published a landmark paper on bioRxiv: "BioLab: End-to-End Autonomous Life Sciences Research with Multi-Agents System Integrating Biological Foundation Models".

The study proposes BioLab, a multi-agent AI platform capable of "conducting scientific research independently". Starting from a vague research goal, such as "Design an antibody targeting tumor-associated macrophages", it can independently plan research routes, retrieve literature, invoke models for molecular prediction, generate experimental protocols, and optimize strategies based on experimental results.

In other words, BioMap has for the first time given AI the full-chain research capability of a human scientist, from conception and reasoning through to verification. This breakthrough marks the arrival of the era of "AI-native scientific research" (AI-native Science): AI is no longer just a tool for scientific research, but a participant, organizer, and driver of the research process. Its significance lies in:

A leap in research efficiency: Traditional drug design often takes months or even years, while BioLab can complete the entire process from hypothesis to optimized molecules within days.
A revolution in knowledge integration: It spans five levels (DNA, RNA, proteins, cells, and chemical molecules), achieving multi-scale integration of and reasoning over biological information.
A transformation of the research model: BioLab not only "helps scientists conduct experiments" but also "does scientific research itself", an important step toward a fully automated research system.

At the same time, another breakthrough of BioLab is that it is no longer just a "scientific research toolbox," but an intelligent underlying operating system for scientific research. Traditional scientific research often resembles piecing together fragments: models run on one side, experiments are conducted on the other, and data needs to be manually integrated—inefficient and with fragmented processes. BioLab has rewritten this logic. Through a modular multi-agent architecture, it integrates the "Design-Build-Test-Learn" closed loop of scientific research into one system. All modules work collaboratively, making the research process as smooth as an operating system running applications.

More crucially, as an underlying research operating system, BioLab boasts high adaptability and scalability, supporting cross-task reuse of experimental templates and components, enabling the inheritance of knowledge accumulation during the research process at the system level. Its modules, models, and knowledge bases can be shared, replaced, and expanded. In the future, researchers will even be able to directly reuse others' experimental modules, just like downloading apps on a computer, and run, improve, and republish them on BioLab. We hope that BioLab can become the underlying system and innovative foundation of the scientific intelligence era, just as Windows and iOS are to the digital world.

Paper title: BioLab: End-to-End Autonomous Life Sciences Research with Multi-Agents System Integrating Biological Foundation Models
Paper link: https://www.biorxiv.org/content/10.1101/2025.09.03.674085v1

One. Research Question: Why Do the Life Sciences Need an "AI Lab"?

The life sciences are in an era of data explosion.

Every day, massive amounts of data are generated in the fields of genomics, proteomics, and cytomics, but the research process remains fragmented across different stages: modeling, experimentation, analysis, and validation — these tasks are completed by researchers from different disciplines, lacking efficient collaboration. The paper points out that the three major bottlenecks in current scientific research are:

1. Fragmented process: Bioinformatics tools are isolated from one another and cannot work together.

2. Fragmented models: Existing AI models often focus on a single task (such as structure prediction or sequence generation) and cannot cover the complete research process.

3. Disconnect between computation and experimentation: The loop from "model prediction" to "experimental validation" still relies heavily on manual operations.

At the same time, even the most advanced general-purpose large language models (such as GPT-4o and Gemini 2.5) still have obvious shortcomings in scientific research: they understand language but not biology; they can write experimental reports, yet cannot truly design experiments.

This is the background to BioLab's inception: a system that attempts to break down the boundary between computation and experimentation and make AI itself the "research subject".

Two. Research Method: Reconstructing the Scientific Discovery Process with Eight Intelligent Agents

The core of BioLab is a multi-agent architecture. The research team used eight collaborating "virtual researchers" to build a system capable of thinking, executing, and evaluating on its own. The system integrates a modular collaboration framework of specialized agents, a retrieval-augmented generation (RAG) knowledge base, and a dedicated toolkit of computational models built on the core foundation models into a unified framework. It can propose hypotheses, retrieve and synthesize evidence, invoke predictive models, and, more crucially, generate executable experimental protocols and feed the results back to form a closed loop.

1. Agent Division

Planner Agent (planning agent): As the chief designer, it is the first step in BioLab's operation. It breaks research objectives down into executable subtasks and assigns them to the appropriate downstream agents.

Reasoner Agent (reasoning agent): The core of BioLab's modular intelligence system. After receiving subtasks from the Planner, it carries out scientific reasoning and formulates a detailed, feasible experimental strategy.

Memory Agent (memory agent): Saves process information to support long-term learning and strategy improvement.

RAG Agent (retrieval-augmented agent): Searches for information across hundreds of millions of documents, databases, and knowledge graphs.

xBio-Tools Agent (tool agent): Invokes specific bio-computation tools to execute tasks.

Code Agent (code execution agent): Automatically generates and runs scientific research code.

Critic Agent (review agent): Checks the scientific soundness of reasoning and results to prevent errors from accumulating.

Reporter Agent (report agent): Organizes and outputs experimental reports and research summaries.

This system mimics the organizational structure of a real research team: someone raises questions, someone designs solutions, and someone verifies conclusions. The difference is that all members of the BioLab team are AI.

Figure 1: BioLab Multi-Agent System, Coordinating Foundation Models for Automated Scientific Discovery

As a virtual laboratory driven by a modular multi-agent system, BioLab can autonomously coordinate the entire scientific discovery lifecycle (Fig. 1a). The architecture follows three principles: division of labor and collaboration, memory continuity, and scientific reliability. It connects computational modeling seamlessly with wet-lab validation, automatically completes the "Design-Build-Test-Learn (DBTL)" closed loop, and continuously refines hypotheses with new evidence.

The core operation of BioLab is autonomous workflow generation, coordinated by the Planner Agent (Fig. 1b), which breaks down task objectives into clear subtask sequences, assembles the required agents and computational tools into functional modules, and dynamically orchestrates them into a coherent multi-step workflow. The Memory Agent records intermediate conclusions and context in a structured manner, supporting consistency and strategy adjustments across multiple rounds of exploration.
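The decomposition-and-dispatch pattern described here can be illustrated with a minimal Python sketch. It is not BioLab's implementation: the fixed plan, the role names, and the stub agents are all hypothetical, chosen only to show a planner handing subtasks to role-specific agents while a memory object records each intermediate result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Subtask:
    description: str
    agent: str  # name of the agent role expected to handle this subtask


@dataclass
class Memory:
    """Structured log of intermediate results, standing in for the Memory Agent."""
    records: List[dict] = field(default_factory=list)

    def store(self, subtask: Subtask, result: str) -> None:
        self.records.append(
            {"agent": subtask.agent, "task": subtask.description, "result": result}
        )


def plan(goal: str) -> List[Subtask]:
    """Hypothetical fixed decomposition; a real planner would derive it from the goal."""
    return [
        Subtask(f"retrieve evidence for: {goal}", "rag"),
        Subtask(f"run predictive tools for: {goal}", "tools"),
        Subtask(f"reason over evidence and predictions for: {goal}", "reasoner"),
        Subtask(f"review the reasoning for: {goal}", "critic"),
        Subtask(f"write the report for: {goal}", "reporter"),
    ]


def make_stub(role: str) -> Callable[[Subtask, Memory], str]:
    """Build a placeholder agent that only reports what it was given."""
    def agent(subtask: Subtask, memory: Memory) -> str:
        seen = len(memory.records)
        return f"[{role}] handled '{subtask.description}' with {seen} prior records"
    return agent


# Registry mapping each (assumed) role name to its stub agent.
AGENTS: Dict[str, Callable[[Subtask, Memory], str]] = {
    role: make_stub(role) for role in ("rag", "tools", "reasoner", "critic", "reporter")
}


def run_workflow(goal: str) -> Memory:
    """Execute the planned subtasks in order, logging every result to memory."""
    memory = Memory()
    for subtask in plan(goal):
        result = AGENTS[subtask.agent](subtask, memory)
        memory.store(subtask, result)
    return memory


if __name__ == "__main__":
    for record in run_workflow("design an antibody targeting macrophages").records:
        print(record["result"])
```

Because every agent receives the shared memory, later steps can see what earlier steps concluded, which is the property the article attributes to the Memory Agent.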

To ensure all reasoning is based on reliable evidence, BioLab has configured a RAG agent based on a multimodal biomedical knowledge base (Figure 1c), which consists of three collaborative components: an integrated literature corpus, real-time web retrieval, and a proprietary biomedical knowledge graph. The RAG is responsible for query preprocessing, parallel retrieval, and evidence fusion, outputting a traceable evidence package relevant to the task.

2. Cross-scale Biological Modeling Platform: xTrimo Universe

BioLab's "Research Toolbox" consists of six foundational models: xTrimoChem, xTrimoProtein, xTrimoRNA, xTrimoDNA, xTrimoCell, and xTrimoText. These models cover five biological scales—chemical structures, proteins, RNA, DNA, and cells—forming a vast "biological model universe."

On this basis, the research team developed xBio-Tools, a collection of 219 validated computational tools (Fig. 1d), capable of performing high-fidelity tasks such as structure prediction and molecular simulation on demand. This means that BioLab can invoke almost any kind of bio-computing tool.

3. Knowledge Retrieval and Reasoning

To anchor all reasoning in reliable, verifiable scientific evidence, BioLab has integrated a highly specialized Retrieval-Augmented Generation (RAG) agent, which links BioLab's core reasoning to a customized multimodal knowledge base. RAG adopts a phased process: first, preprocessing and standardizing the query; then conducting parallel hybrid retrieval across multi-source data such as papers, public databases, knowledge graphs, and internal results; finally, deduplicating, reranking, and synthesizing candidate evidence to produce a task-oriented "evidence package." This knowledge base covers multiple modalities and supports cross-referencing, providing reliable and contextually relevant support for subsequent steps.
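The phased process (normalize the query, retrieve from several sources in parallel, then deduplicate and rerank) can be sketched in Python as follows. This is only an illustration under stated assumptions: the three in-memory "sources", the keyword-overlap score, and all function names are hypothetical stand-ins, not BioLab's retrieval stack.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpora standing in for papers, public databases, and a knowledge graph.
SOURCES = {
    "papers": [
        "PD-1 is an immune checkpoint receptor on T cells",
        "Denosumab is an antibody that binds RANKL",
    ],
    "databases": [
        "PDCD1 encodes the PD-1 receptor",
        "PD-1 is an immune checkpoint receptor on T cells",  # duplicate of a paper hit
    ],
    "knowledge_graph": ["TNFSF11 encodes RANKL"],
}


def preprocess(query: str) -> str:
    """Normalize case and whitespace."""
    return " ".join(query.lower().split())


def search(source_name: str, query: str) -> list:
    """Score each document by the number of query terms it shares (assumed metric)."""
    terms = set(query.split())
    hits = []
    for doc in SOURCES[source_name]:
        overlap = len(terms & set(doc.lower().split()))
        if overlap:
            hits.append({"source": source_name, "text": doc, "score": overlap})
    return hits


def build_evidence_package(query: str, top_k: int = 3) -> dict:
    normalized = preprocess(query)
    # Parallel retrieval: one worker per source; results keep the source order.
    with ThreadPoolExecutor() as pool:
        per_source = list(pool.map(lambda name: search(name, normalized), SOURCES))
    # Deduplicate on text, keeping the first hit unless a later one scores higher.
    best = {}
    for hits in per_source:
        for hit in hits:
            key = hit["text"].lower()
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit
    # Rerank by score (descending), breaking ties alphabetically for determinism.
    ranked = sorted(best.values(), key=lambda h: (-h["score"], h["text"]))
    return {"query": normalized, "evidence": ranked[:top_k]}


if __name__ == "__main__":
    package = build_evidence_package("  PD-1 checkpoint   T cells ")
    for item in package["evidence"]:
        print(item["score"], item["source"], item["text"])
```

Each evidence item keeps its `source` field, which is one simple way to keep the resulting package traceable.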

Figure 2: Performance benchmarking of BioLab and its xTrimo Universe against current best-in-class capabilities

To validate BioLab's key capabilities in overcoming research fragmentation, the research team conducted a quantitative evaluation against existing SOTA AI systems. The results showed that, on four high-difficulty biomedical question-answering benchmarks (PubMedQA, MMLU-Pro/Biology, GPQA-diamond), BioLab's deep scientific reasoning consistently outperforms general-purpose large models such as GPT-4o, Gemini-2.5, DeepSeek-R1, and Qwen3-235B-A22B (Figure 2a). In the BioResearchQA evaluation, BioLab ranked first on three expert metrics: multimodal planning, tool invocation, and full-process scientific reasoning, demonstrating the ability to orchestrate high-level goals into executable multi-step plans (Figure 2b).

Most of BioLab's capabilities come from xTrimo Universe — a collection of base models and task-specific models (Fig. 2c). After systematically comparing the downstream model suite with SOTA models across various fields, the results show that BioLab achieves SOTA performance in multiple tasks, ranging from 91.30% to 100% (Fig. 2d).

(Figure 2e provides a fine-grained comparison of several representative tasks, further demonstrating robust performance across multi-scale tasks involving chemistry, proteins, RNA, DNA, and cells.)

4. Automated Research Loop

When a user inputs a task goal (for example, "Design an antibody for T-cell immunotherapy"), the Planner generates a research roadmap and invokes xBio-Tools for molecular modeling and performance prediction; the Reasoner synthesizes the results to propose hypotheses; the Critic and Memory agents continuously monitor and learn; and finally the Reporter outputs experimental plans, predictive data, and executable experimental scripts.

This "self-driven" cycle forms a true closed loop for AI research: "Design, Build, Test, Learn".

Three. Case Validation: AI-Driven Autonomous Scientific Discovery

The paper verifies BioLab's scientific research capabilities through two core cases.

Case 1: Fully Automatic Design of Macrophage-Targeting Antibodies

BioLab received the task: "Design an antibody targeting macrophages for cancer treatment". The system automatically executed five phases:

1. Mine candidate targets from the literature and databases;

2. Use xBio-Tools to run predictions on the candidate targets and screen them;

3. Prioritize the candidate targets and single out the key target, TNFSF11;

4. On this basis, select the known antibody Denosumab as the design starting point (Fig. 3a–b);

5. Perform multi-objective optimization (MOO) on this antibody to improve the candidates' overall predicted performance across key dimensions such as affinity, sequence naturalness, structural stability, yield, and structural integrity, resulting in an optimized version of Denosumab (Fig. 3c).
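One common way to compare candidates across several objectives at once is Pareto dominance, sketched below in Python. The article does not say which MOO method BioLab uses, so this is a generic illustration rather than the paper's algorithm; the candidate names and scores are invented placeholders, with higher values assumed to be better on every objective.

```python
def dominates(a: tuple, b: tuple) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: dict) -> dict:
    """Keep only the candidates that no other candidate dominates."""
    front = {}
    for name, scores in candidates.items():
        dominated = any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
        if not dominated:
            front[name] = scores
    return front


# Hypothetical predicted scores: (affinity, stability, yield). Not data from the paper.
CANDIDATES = {
    "parent": (0.5, 0.5, 0.5),
    "variant_1": (0.7, 0.6, 0.5),
    "variant_2": (0.6, 0.8, 0.4),
    "variant_3": (0.4, 0.4, 0.4),
}

if __name__ == "__main__":
    for name, scores in pareto_front(CANDIDATES).items():
        print(name, scores)
```

Here `variant_1` beats the parent on every axis, while `variant_2` trades some yield for stability, so both stay on the front. Choosing between such candidates is a trade-off decision.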

Figure 3: BioLab Enables End-to-End De Novo Design and Optimization of Targeted Macrophage Antibodies

To explain the performance improvement, BioLab further provides mechanistic evidence at the molecular dynamics (MD) level: the binding interface between the optimized antibody and TNFSF11 exhibits a more stable interaction network (tighter hydrophobic packing, more stable hydrogen bonds and salt bridges), consistent with the predicted increase in affinity (Fig. 3d). Overall, BioLab can screen suitable targets, optimize molecules, and propose mechanistic hypotheses without human intervention, and it provides computable results and explanations that can be used directly in subsequent experimental verification.

This result indicates that BioLab can not only "design" antibodies but also "explain" the molecular mechanism behind the improvement.

Case 2: Computational-Experimental Closed Loop for T Cell Target Discovery and Antibody Optimization

The second case is more groundbreaking, verifying BioLab's capabilities in a complete wet lab/dry lab closed loop.

Figure 4: Closed-loop integration of BioLab and wet lab experiments in T cell target discovery and antibody optimization

After receiving the instruction "Design a T cell-based cancer immunotherapy strategy", BioLab established a fully integrated "Design-Build-Test-Learn (DBTL)" cycle (Fig. 4a). For this high-level goal, BioLab combined knowledge of immune checkpoints with model evidence and nominated PD-1 as the priority target (Fig. 4b).

To translate this conclusion directly into experiments, the system automatically generated an executable protocol: a CRISPR-Cas9 knockout of PDCD1 in primary human T cells. The wet-lab experiments carried out according to the protocol showed a significant positive correlation between the model's activity predictions and IFN-γ secretion readouts, functionally supporting the target and strategy (Fig. 4c).

After target validation was complete, and with the researchers' approval, BioLab advanced the discovery cycle into antibody optimization, generating two optimized variants, Pem-MOO-1 and Pem-MOO-2, both of which showed stronger pathway-blocking capability than the parental antibody in dose-dependent experiments. Notably, the IC₅₀ of Pem-MOO-2 improved nearly 3-fold (0.01 nM vs. 0.027 nM for the parent; a lower IC₅₀ means higher potency), demonstrating that this significant enhancement in bioactivity directly results from the computational optimization process (Fig. 4d and left panel of Fig. 4e).
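The "nearly 3-fold" figure follows directly from the two reported IC₅₀ values, as the short Python check below shows. The function name is illustrative. The only inputs are the 0.027 nM and 0.01 nM values quoted above, and a lower IC₅₀ is taken to mean higher potency.

```python
def fold_improvement(parent_ic50_nm: float, variant_ic50_nm: float) -> float:
    """Return how many times lower the variant's IC50 is than the parent's."""
    if parent_ic50_nm <= 0 or variant_ic50_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return parent_ic50_nm / variant_ic50_nm


if __name__ == "__main__":
    fold = fold_improvement(parent_ic50_nm=0.027, variant_ic50_nm=0.01)
    print(f"Pem-MOO-2 vs. parent: {fold:.1f}-fold lower IC50")
```

The ratio is 0.027 / 0.01 = 2.7, which the article rounds to "nearly 3-fold".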

Finally, the researchers conducted a comprehensive multi-parameter evaluation experiment on the optimized antibody to assess its overall therapeutic potential and ensure that functional improvements did not compromise developability. The experimental results showed that Pem-MOO-2 not only had better potency and binding affinity but also maintained developability characteristics comparable to the parental antibody—such as yield, purity, and thermal stability (Fig. 4e, right). This efficacy enhancement was achieved after making an acceptable and known trade-off in terms of hydrophobicity.

This means that BioLab not only proposed a hypothesis but also created stronger antibodies, which were then validated in real-world experiments.

Four. Major Discoveries and Innovations

1. A Research-Team-Style AI System Architecture

BioLab demonstrates for the first time that AI no longer just "helps scientists analyze data" but can also complete scientific discoveries independently. The division of labor and collaboration among the eight agents also makes BioLab an AI system with a "scientific research organizational structure." This "AI research team" model provides a prototype for future AI research institutions.

2. Deep Integration of Dry and Wet Experiments

Traditional scientific research relies on manual switching between computation and experimentation. BioLab achieves the automatic connection of these two, converting computational results into experimental operations and feeding experimental data back into the model. This means the iterative cycle of scientific research will be shortened from months to days.

In the case of T cell target discovery and antibody optimization, BioMap successfully achieved a closed-loop integration of computation and experimentation, surpassing pure computational simulation to truly guide the execution of physical experiments. The system can not only autonomously identify PD-1 as the primary therapeutic target but also generate detailed and actionable experimental protocols to guide successful wet lab gene knockout operations.

Five. Conclusion: The Dawn of AI-Native Scientific Research

The emergence of BioLab is a crucial step in the integration of life sciences and artificial intelligence. It is not only an automated system but also a new paradigm for scientific research:

AI leaps from "laboratory assistant" to "research collaborator", and evolves from "analysis tool" into "discoverer".

In the future, labs may no longer be places where scientists work around instruments; instead, AI could continuously perform hypothesis testing in the background, leaving humans to make directional decisions and interpret results. As the last sentence of the paper states, BioLab showcases a future where AI will no longer be just a tool, but a reliable scientific research collaborator in the process of scientific discovery.

This is not only a technological breakthrough, but also a conceptual revolution in the history of human scientific research. From now on, science may truly have a new researcher: the AI scientist.
