Two Mega-Scale Billion-Cell Datasets Unveiled as Tech and Pharma Giants Double Down on AI-Driven Drug Discovery

Jan 20, 2026 18:37 CST Updated 18:37

On January 13, genomics giant Illumina announced the launch of the Billion Cell Atlas, the largest cell-level CRISPR gene perturbation dataset to date.

The dataset contains response data from 1 billion cells across more than 200 disease-related cell lines.

In addition, Illumina plans to build a 5 billion-cell atlas within three years, which is expected to become the most comprehensive map of human disease biology to date.

Pharmaceutical giants such as AstraZeneca, Merck, and Eli Lilly are also participating in the initiative, which aims to advance drug target validation, train advanced artificial intelligence models on a large scale, and promote research into fundamental disease mechanisms that were previously difficult to explore.

Coincidentally, on the same day, Tahoe Therapeutics, Arc Institute, and Biohub officially announced a major collaboration to create the largest perturbation dataset for virtual cell models.

The plan builds on the previously released Tahoe-100M, billed as the world's first billion-scale single-cell perturbation dataset.

The perturbation richness of the new dataset will be more than 4× higher than that of Tahoe-100M, representing one of the most ambitious biological data generation efforts to date, supporting AI-driven virtual biology.

With breakthroughs in large AI models and the rapid growth of high-throughput sequencing, a more ambitious vision has emerged: the AI Virtual Cell (AIVC).

To realize this vision, many companies have entered the field, building datasets and foundation models that are greatly advancing its development.

Figure: Key datasets and models for virtual cells

For example, AI giant NVIDIA and the Chan Zuckerberg Initiative (CZI) announced an expanded partnership aimed at revolutionizing life sciences by advancing virtual cell models.

With the participation of sequencing giants, multinational pharmaceutical companies, top technology companies, and AI companies, AI-driven virtual cell research is entering a boom period.

Milestone technologies are arriving at an accelerating pace.
