Authors: Wu Yinggang, Li Yuning
On August 13, the Ministry of Industry and Information Technology officially released the "Typical Application Cases of Artificial Intelligence in the Field of Bio-manufacturing (First Batch)."
The cases involve a total of 16 enterprises and research institutes, including Beijing Joinn Laboratories, Shanghai Tianwu Technology, BioMap, and Shanghai Zhiyu Biotech, all of which are well-known entities in the field of biomanufacturing.
Recently, AI-driven biomanufacturing has emerged as a major industry trend, with a steady stream of national-level policies being rolled out.
Previously, the Ministry of Industry and Information Technology and the General Office of the Chinese Academy of Sciences jointly issued the “Notice on Launching the ‘Open Competition’ Mechanism to Identify Leading Teams for Innovative Tasks in High-Performance Bioreactors,” which prominently highlighted intelligent industrial operating systems. On July 31, the State Council Executive Meeting reviewed and approved the “Opinions on Deepening the Implementation of the ‘AI+’ Action.”
The industrial sector is equally bustling. On June 25, MegaRobo, an autonomous intelligent agent company, announced its plan to list in Hong Kong, strategically positioning itself in synthetic biology. On June 26, Zhong Shanshan, formerly China’s richest man, partnered with Jinbo BioPharm with a RMB 3.4 billion investment, betting on the convergence of AI and synthetic biology.
AI and biomanufacturing seem to be a natural fit. Yet, amid the current AI hype, it is worth considering: Which scenarios in biomanufacturing can AI truly empower? What is feasible, what is not, what holds significant promise, and what still warrants caution?
In its research and interviews, VCBeat learned that the applications of AI in the field of biomanufacturing mainly include: enzyme mining (novel enzyme discovery), enzyme engineering (enzyme sequence optimization), de novo enzyme design, metabolic optimization of cell factories, laboratory automation, and process optimization.
Among these, AI has demonstrated particularly notable efficacy in enzyme mining, enzyme engineering, and process optimization. For instance, in the realm of enzyme engineering, most enterprises can observe tangible and rapid results within six months to a year. However, leveraging AI to design enzymes for specific reactions remains highly challenging, with a level of difficulty potentially comparable to that of Nobel Prize-worthy research.
Some companies are already exploring the use of AI to optimize metabolism in cell factories; however, because cellular metabolic networks are extremely complex, high-quality virtual cell models are still lacking. The main challenge for AI applications in laboratory settings lies in non-standardized workflows, which require frequent manual judgment and decision-making, tasks that AI is not yet capable of taking over.
01.
Enzyme Mining and Engineering: Results Visible Within Six Months to One Year
Enzyme mining and enzyme engineering are relatively mature areas of AI application.
According to Dr. Wang Sheng, Chairman and CEO of Zhiyu Bio, the current relatively mature applications of AI in the field of biomanufacturing are the “three key strategies”: pathway design (designing biosynthetic routes), enzyme mining, and enzyme engineering. In the future, if AI becomes sufficiently powerful, it may even be possible to directly create enzymes that do not exist in nature based on chemical reactions.
Among these, enzyme mining refers to the process of identifying corresponding enzymes from known protein sequence databases based on specific reaction steps, functioning similarly to an “intelligent search engine.”
According to Wang Sheng, the number of known protein sequences has now exceeded one billion. In enzyme mining, AI is used to extract a limited number of protein sequences with the potential to perform a specified function from this vast database; DNA synthesis and molecular cloning are then used to verify, in biochemical experiments, how efficiently each candidate catalyzes the target reaction.
Enzyme engineering is employed when existing or newly discovered enzymes exhibit suboptimal performance or fail to meet industrial production requirements, such as insufficient substrate conversion rates, low catalytic activity, or unsatisfactory properties like thermal stability and soluble expression levels. In more challenging scenarios, it may even be necessary to alter the enzyme’s product or substrate selectivity. In such cases, AI-driven computational biology techniques can be utilized to optimize enzymes according to the desired target attributes.
“The specific approach involves using AI algorithms to elucidate the ‘sequence–structure–function’ relationship, identifying key amino acids that influence enzymatic functional properties, and performing targeted mutagenesis. Although traditional directed evolution, which primarily relies on extensive saturation mutagenesis, remains effective and widely adopted for enzyme engineering, it suffers from low efficiency and long development cycles. The integration of AI can significantly enhance efficiency,” pointed out Wang Sheng.
Regarding enzyme engineering, Luo Zhaohui, Chief Expert of Biological Manufacturing Solutions at BioMap, further explained that traditional laboratory-based enzyme engineering typically involves two rounds: the first round focuses on directed modification to enhance enzymatic activity, while the second round aims to improve industrial application properties such as heat resistance, acid tolerance, and alkali tolerance.
However, modifying the enzyme’s amino acid sequence or protein structure in the second round may undo the gains of the first round, ultimately yielding enzymes with strong industrial properties but very low activity, or with neither. In such cases, traditional approaches require repeated rounds of trial and experimentation.
Therefore, the primary challenge in enzyme engineering lies in determining whether an optimal pathway can be identified through minimal experimental exploration.
“The advantage of AI lies in its ability to simultaneously integrate attributes such as enzyme activity and heat resistance to optimize and score different pathway models, thereby identifying pathways with higher comprehensive scores,” said Luo Zhaohui.
“Large AI models can learn the sequences or structures of all currently known proteins, as well as the characteristics and scaffold design of different enzymes. Moreover, multiple large models can be integrated into the design tools of protein-design agents, enabling the simultaneous invocation of different models to score enzyme engineering pathways.” Luo Zhaohui further pointed out that companies can then conduct laboratory tests on these pathways to more efficiently obtain enzymes with balanced properties, a process that typically yields promising practical results within six months to a year.
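The scoring idea Luo Zhaohui describes, weighing several properties at once rather than optimizing them one round at a time, can be illustrated with a minimal Python sketch. Everything in it is a hypothetical stand-in: the `Variant` fields, the equal weights, and the example values are assumptions for illustration, not BioMap's models or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A candidate enzyme variant with model-predicted properties (hypothetical)."""
    name: str
    activity: float         # predicted relative activity, scaled to 0..1
    thermostability: float  # predicted heat-resistance score, scaled to 0..1


def composite_score(v: Variant, w_activity: float = 0.5, w_stability: float = 0.5) -> float:
    """Weighted sum of the two properties; the weights are assumptions."""
    return w_activity * v.activity + w_stability * v.thermostability


def rank_variants(variants, top_k=2):
    """Return the top_k variants by composite score, best first."""
    return sorted(variants, key=composite_score, reverse=True)[:top_k]


# Made-up predictions: A is active but fragile, C is stable but weak,
# B is balanced on both properties.
candidates = [
    Variant("A", activity=0.9, thermostability=0.2),
    Variant("B", activity=0.7, thermostability=0.7),
    Variant("C", activity=0.3, thermostability=0.9),
    Variant("D", activity=0.6, thermostability=0.5),
]

best = rank_variants(candidates, top_k=2)
print([v.name for v in best])
```

In this toy setup the balanced variant outranks the one with the highest activity alone, which is the outcome a joint score is meant to favor. In practice the predictions would come from trained models and the shortlisted variants would still go to the laboratory for testing.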
However, enzyme engineering appears to be a relatively rudimentary application in both biomanufacturing and AI.
In Wang Sheng’s view, the current value of AI in enzymology is akin to that of a “search engine,” with its potential underutilized. In the future, it must evolve from the “search era” to the “enzyme creation era,” meaning that given a chemical reaction, AI should be able to design an enzyme capable of catalyzing it. Only then can it bring about transformative changes to the industry.
02.
Enzyme Design: Difficulty Comparable to Nobel Prize-Level Work
“If it were merely a matter of randomly engineering an enzyme, many companies could do it. However, designing catalytic enzymes for any given chemical reaction is a feat achievable by only a handful of people worldwide, and such approaches are largely impractical for industrial-scale application. Therefore, we must settle for a more feasible alternative: leveraging AI to discover and engineer enzymes,” said Wang Sheng.
According to Wang Sheng, designing enzymes from scratch is extremely challenging. Current chemical synthesis technologies are highly mature, to the point that “virtually anything can be chemically synthesized.” Given the molecular formula of a target compound, chemists can achieve its synthesis through various methods; however, many such reactions suffer from low efficiency or cause significant pollution.
Can AI be leveraged to design enzymes in a targeted manner, thereby enabling greener and more efficient reactions?
Wang Sheng pointed out that the number of naturally occurring enzymes capable of catalysis is currently limited. Although enzymes have evolved over millions or even hundreds of millions of years to become highly elegant and efficient, most function only under ambient temperature and pressure conditions.
“Humans are expected to directly design enzymes by leveraging a series of chemical reaction equations. These enzymes may not necessarily be composed of the 20 standard amino acids and might not even be proteins in the traditional sense, yet they would still be capable of catalyzing reactions,” said Wang Sheng.
“However, the challenge of designing enzymes is immense, with a level of difficulty that could be worthy of a Nobel Prize,” Wang Sheng pointed out.
However, in Luo Zhaohui’s view, high-throughput technologies can, to some extent, enable de novo enzyme design while ensuring a certain degree of success.
Luo Zhaohui noted that, given the current understanding of proteins, it is highly challenging for large AI models to generate entirely new protein sequences and structures with specific functions, but the achievements made so far are nonetheless worthy of recognition.
According to reports, the success rate of enzyme engineering can even reach 30%–40%, as AI continuously learns from feedback loops, enabling the design of enzymes that are more optimized and precise in subsequent iterations. “However, de novo enzyme design requires the validation of thousands of targets to allow the AI to gradually learn and iterate, thereby yielding superior outcomes.”
According to Wang Sheng, to truly achieve enzyme design, further breakthroughs are still needed in the following four areas:
First, theoretical chemistry still requires further research, though not an excessive amount.
Second, the essence of enzyme catalysis is related to quantum mechanics, yet scientific work on integrating quantum chemistry with biocatalysis remains insufficient.
Third, AI models must be developed that can characterize the most fundamental information of chemical reactions, including dynamic reaction potential energy, electron transfer, bond formation and cleavage, and transition states. These aspects may require quantum chemistry for accurate description, but no existing algorithm can yet express and interpret these steps in the language of AI.
Fourth, validation remains difficult. Even after an initial AI model is developed and hundreds of thousands of protein sequences are designed for a specific chemical reaction, it is likely that only one sequence will effectively catalyze the reaction, so validating candidates efficiently is a significant challenge.
Although the task is challenging, Luo Zhaohui believes that large AI models make de novo enzyme design more achievable.
Before the advent of large AI models, designing an enzyme from scratch was nearly impossible due to the immense engineering workload.
“When designing protein sequences and structures from scratch, AI must screen over 1,000 protein sequences and structures from millions of hypothetical mutations. After laboratory validation, perhaps only five to ten may barely pass. At this point, the AI must leverage insights from the previous round and repeat the process of filtering five to ten candidates from millions of possibilities, thereby progressively improving the success rate. Such a workflow, and the sheer scale of engineering involved, would have been impossible for humans to carry out by manual design,” said Luo Zhaohui.
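The round-by-round workflow in this quote (generate a large virtual pool, shortlist a few candidates, validate them in the lab, feed the results back) can be sketched as a toy simulation. All of it is assumed for illustration: `lab_assay` is a made-up stand-in for wet-lab validation, designs are reduced to a single number, and "learning" is mocked by halving the model's prediction noise each round. It is not a description of any company's actual pipeline.

```python
import random


def lab_assay(design: float) -> float:
    """Stand-in for wet-lab validation; the hidden optimum at 0.8 is made up."""
    return 1.0 - abs(design - 0.8)


def run_campaign(n_rounds=5, pool_size=10_000, shortlist=10, seed=0):
    """Simulate repeated design/test rounds and return the best score after each."""
    rng = random.Random(seed)
    model_noise = 0.5  # how wrong the toy model is before any lab feedback
    best_so_far = float("-inf")
    history = []
    for _ in range(n_rounds):
        # 1. Generate a large virtual pool of candidate designs.
        pool = [rng.random() for _ in range(pool_size)]
        # 2. The toy model predicts each candidate's fitness with some error.
        predicted = [(lab_assay(x) + rng.gauss(0.0, model_noise), x) for x in pool]
        # 3. Only the top-predicted few are sent to the lab.
        predicted.sort(reverse=True)
        chosen = [x for _, x in predicted[:shortlist]]
        measured = [lab_assay(x) for x in chosen]
        best_so_far = max(best_so_far, max(measured))
        history.append(best_so_far)
        # 4. Crude stand-in for retraining on the new lab data.
        model_noise *= 0.5
    return history


history = run_campaign()
print(history)
```

The point of the sketch is the shape of the loop rather than the numbers: each round spends lab effort on only a handful of candidates, and the value of the next round depends on how much the model improves from the previous round's feedback.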
In addition to the volume of work, high costs also pose a significant challenge.
According to Luo Zhaohui’s analysis, the testing and validation of thousands of protein candidates screened by AI already incur substantial costs. However, this process may need to be repeated for 5–10 rounds before AI can truly design high-value proteins, resulting in enormous overall costs.
If the industry accumulates sufficient data in the future, it will be possible to reduce costs from tens of millions to one or two million yuan. Meanwhile, as AI models become increasingly precise, they will not only be capable of designing novel enzyme molecules but also ensure industrial performance metrics such as enzymatic activity and thermal stability. Only then can these solutions gain market favor.
However, in Luo Zhaohui’s view, the data environment for designing novel enzymes is already improving.
Previously, such data largely originated from pharmaceutical companies, as their early-stage R&D laboratories featured higher throughput and relatively comprehensive high-yield equipment, enabling the production of thousands of distinct proteins in a single run. Most attempts at de novo enzyme design have also been undertaken by pharmaceutical companies.
However, starting in 2024, an increasing number of traditional biomanufacturing companies in the agriculture, animal husbandry, and food sectors have also begun to establish synthetic biology laboratories, equipping them with a growing array of new instruments. Both investment amounts and throughput are rising significantly, leading to substantial accumulation of data.
“In short, the emergence of large AI models has ushered in a new future for biomanufacturing, making it possible to create novel protein structures that do not exist in nature,” said Luo Zhaohui.
03.
Cell Factories: Metabolic Networks Are Extremely Complex, and No High-Quality AI Models Have Yet Emerged
AI also has broad application prospects in cell factories.
Cell factories are microorganisms that, after engineering of the chassis cells, can produce specific chemicals and proteins.
Wang Sheng stated that many synthetic biology products are manufactured using cell factories; however, the optimization and engineering of these cell factories still rely on traditional methods. These methods involve continuously knocking out different DNA fragments, introducing various exogenous sequences, or inducing extensive random mutations combined with high-throughput screening. Although these approaches have considerable scientific basis, they are essentially akin to relying on continuous trial and error to arrive at the correct solution.
AI appears well suited to making this “continuous trial and error” more efficient, but no vertical large model for cell factories has emerged yet.
According to Wang Sheng’s analysis, the core of a cell factory lies in its metabolic networks and pathways. While there has long been a desire to achieve breakthroughs in cell factory technology, most efforts have remained at the academic level. A notable example is the recently popular concept of the “virtual cell,” which essentially aims to leverage AI models to understand, map, and predict cellular metabolic networks.
“AlphaFold2 enabled humans to master the relationship between protein sequences and their structures and functions. In the field of cell factories, by contrast, corresponding algorithms and models are still lacking or remain far from the level of AlphaFold2, which makes it difficult to predict the metabolic networks of cell factories,” said Wang Sheng.
Luo Zhaohui also stated that the construction of whole-cell virtual models is the next stage of AI research and development.
“Large models for DNA, proteins, and biomanufacturing processes are already available. We can combine these models according to customer needs to create customized vertical-domain AI agents. For instance, to increase the yield of a specific product, we can integrate models for DNA, proteins, and systems engineering to help design an end-to-end workflow.”
In fact, the combinatorial modeling approach can also be applied to cell factories to some extent.
According to Luo Zhaohui’s analysis, building agents for optimizing cellular metabolic pathways requires significant time to call upon and combine different models for optimization, necessitating reconstruction each time.
Therefore, while the aim is to reduce costs through AI, the development of agents for cell factories heavily relies on specialized R&D teams, requiring collaborative discussion and customization by AI scientists, biological scientists, and other experts.
“Therefore, the optimal approach is to launch a large-scale model for cell factories as a standalone product. The prerequisite is the 100% completion of a virtual cell model; once established, it can function independently as an intelligent agent to optimize cellular metabolic states and meet diverse customer requirements. Fortunately, our company has also achieved significant progress in the development of virtual cell models.”
It is understood that, unlike individual enzymes, the metabolic networks of cell factories are extremely complex. Metabolism may involve more than ten different enzymes, as well as DNA transcription, reverse transcription, and other metabolic processes, constituting a systems engineering model. Moreover, the optimization goals for cell factories often aim to maximize output while minimizing material input and processing time, which further increases the complexity and difficulty.
Only by fully understanding this network can AI build large models to truly empower cell factories.
Wang Sheng pointed out that a cell factory is virtually equivalent to an entire living organism, whose metabolic network encompasses not only protein–protein interaction networks but also networks of interactions between proteins and nucleic acids, as well as between proteins and small molecules.
Moreover, metabolic networks change in response to external factors. The metabolic states of cells differ significantly across small-scale, 100-liter, and 100-ton fermenters, depending on various factors such as cell density, ambient temperature, pH, agitation speed, and oxygen distribution. This is precisely why scaling up cell factories is challenging.
“Only AI can understand and predict the metabolic networks of cell factories, but AI’s enabling capability in this area is still insufficient due to a lack of robust modeling,” said Wang Sheng.
It is precisely for this reason that Wang Sheng believes AI applications in enzymology will certainly outperform cell engineering in the short term, offering greater controllability and efficiency.
“Since enzymes perform consistently across scales—from one-liter to 10,000-liter fermenters—the enzymes we have discovered and engineered achieved a 99.8% conversion rate within two hours in small-scale reactions, and demonstrated the same efficacy in 10-ton production-scale fermenters. As enzymes are cell-derived catalysts, they rarely encounter the scale-up challenges commonly associated with chemical catalysts.”
“However, if we want to achieve lower costs for biomanufacturing compared to chemical methods, we can almost only rely on cell factories. But currently, the improvement of cell factories may still need to be accomplished through traditional trial-and-error methods,” said Wang Sheng.
04.
Laboratory Scenario: Low Level of Intelligence, Difficult to Replace Human Decision-Making
Laboratory intelligence is also one of the important application scenarios of AI.
According to VCBeat’s industry research report, laboratory automation and digital-intelligent development are still in their early stages. While there is a long way to go before realizing the ultimate vision of “lights-out laboratories,” the field has already demonstrated its potential and is poised to become a key driver propelling the industry’s leapfrog upgrading.
So, exactly how significant a role can AI play in this context?
Hao Youyou, Founder and Chairman of Shanghai Mansen Biotechnology Co., Ltd., stated that it is essential to first distinguish between automation and intelligence, as these are two distinct concepts.
Laboratory automation replaces repetitive manual labor and can be categorized into three levels: the first level is single-function automation, which addresses only a specific action; the second level involves workstations, which automate combined actions; and the third level comprises production lines, which automate the entire workflow and still require further refinement and development.
In contrast, the key to laboratory intelligence lies in replacing cognitive labor by automatically performing tasks that require thinking and judgment. However, automation is a prerequisite for intelligence, and intelligent systems may rely on automation technologies for support.
In Hao Youyou’s view, the current levels of automation and intelligence in the laboratory are not particularly high.
According to Hao Youyou’s analysis, while the current level of automation in factory production is high, laboratory automation remains significantly inadequate. Particularly in the field of biomanufacturing, laboratory personnel far outnumber production line staff, and most operations within laboratories are still performed manually.
The current issue is that every step of laboratory automation still requires manual intervention, and staff spend a significant share of each month on hands-on operations. The level of intelligence is even lower than that of automation.
Because laboratory procedures are highly non-standardized compared to manufacturing processes, decisions on whether and how to proceed to the next step after completing a current one rely entirely on human judgment. This necessitates that laboratory personnel hold at least a bachelor’s degree. The core challenge in laboratory automation lies precisely here: replacing researchers in making judgments and decisions.
Recently, the renowned U.S. technology research firm Lux Research also published an article pointing out that, with the assistance of AI, laboratory automation can indeed effectively improve throughput and reproducibility, but it typically still relies on human supervision, data interpretation, and planning for next steps.
Moreover, many professionals believe that the core of experimental design lies in human interpretation and creativity, a decision-making process that AI cannot currently replicate. Therefore, current AI is not an essential prerequisite for laboratory automation.
In Hao Youyou’s view, for AI to align with laboratory scenarios, the most critical issue to address is the talent problem.
“Over the past decade, the industry has poured substantial capital into laboratory automation and intelligence, yet many such initiatives have ended in failure. The critical bottleneck lies not in technology or funding, but in deeply understanding specific laboratory scenarios and then translating that understanding into automated and intelligent solutions.”
“Therefore, the key talent capable of achieving laboratory automation and intelligence may no longer be automation specialists, but rather interdisciplinary professionals with extensive experience and deep insights into laboratory operations, who can translate laboratory requirements into specifications for hardware and software.”
Additionally, automation and intelligence must be data-centric.
Hao Youyou pointed out that the laboratory’s goal is not merely high throughput and high efficiency. “It is not about how many tests are completed in a day, but rather about making data more reliable and accurate, while freeing up human resources. Moreover, the application of AI in laboratories must be data-driven; smart manufacturing essentially involves acquiring multi-dimensional data from operational processes and using this data to drive subsequent operations and decision-making.”
05.
Process Optimization: Some Applications May Be More Mature Than Enzyme Engineering
Process optimization is a critical step in biomanufacturing, potentially determining whether a product can be scaled up and mass-produced. The application of AI in process optimization mainly focuses on two aspects.
On one hand, AI is leveraged to design biosynthetic pathways to obtain the final product, a process Wang Sheng refers to as “pathfinding.”
Wang Sheng stated that the design of biosynthetic pathways must leverage AI, as relying solely on traditional database searches makes it difficult to transcend previously established synthetic routes; such methods can only identify pathways that have already been discovered by humans.
“In contrast, AI possesses powerful design capabilities. By leveraging training data and learning molecular features, it can creatively devise a synthesis route from a given starting point to an endpoint. Although this route may never have appeared in the literature, experts may well consider it feasible.”
On the other hand, AI is applied to process optimization during scale-up.
Luo Zhaohui believes that AI applications are gradually shifting from early-stage R&D phases, such as protein design and cell factory development, to later-stage processes including pilot-scale trials and large-scale manufacturing.
Traditional physical and chemical reaction pathways are relatively well-defined, allowing optimization through models built using conventional machine learning methods. However, biomanufacturing involves large-scale biological processes that rely on cellular expression. Cellular metabolism itself is fraught with uncertainties, resembling a “black box” characterized by a wide variety of complex parameters, massive data volumes, and significant fluctuations. Moreover, the underlying physicochemical reactions are less direct. These characteristics make biomanufacturing particularly amenable to AI intervention.
Luo Zhaohui further pointed out that AI was previously considered difficult to apply in production because traditional biomanufacturing facilities lacked sufficient data to train AI models. Moreover, the industry heavily relied on senior engineers and experienced fermentation masters, who adjusted parameters based on their empirical knowledge. Their decision-making processes resembled a “black box,” governed by personal subjective experience rather than transparent, standardized logic.
“But now, there are increasingly more modern factories that can even enhance the categories, volume, and quality of data collection through real-time online monitoring equipment. Such data is sufficient for training AI models, which can continuously learn and iterate using data from successful or failed experiments, thereby providing recommendations such as parameter adjustments to optimize production efficiency. This will become a future application trend,” said Luo Zhaohui.
Wang Sheng also believes that AI can play a significant role in the systematic management of production processes. With sensors for pressure, temperature, pH, and dissolved oxygen, fermentation processes and metabolite levels can be monitored in real time, which is a key focus area currently under development at Zhiyu Bio.
According to Wang Sheng’s analysis, AI can play two major roles in the separation and purification process:
First, during small-scale and pilot trials, AI is used to calculate potential optimal process solutions, but final determination still requires experimental validation.
Second, in actual production processes, AI performs real-time monitoring and provides intelligent decision-making. In the event of a malfunction during unattended operations, as long as sensor configuration is sufficiently comprehensive, AI can minimize hazards and losses at the earliest opportunity. For instance, if excessive pressure or overheating could lead to uncontrollable reactions, conditional rules can be programmed into the AI to shut down the machinery immediately upon detecting such anomalies. These applications may place lower demands on AI capabilities compared to enzyme engineering, and thus may be more mature in their practical implementation.
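The conditional shutdown rules mentioned here are simple enough to sketch in a few lines of Python. This is a minimal illustration under stated assumptions: the `Reading` fields, the threshold values, and the `monitor` function are hypothetical and not taken from any plant or vendor system, and a real interlock would typically sit in dedicated control hardware rather than in a script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One sensor snapshot from a fermenter (fields are hypothetical)."""
    pressure_bar: float
    temperature_c: float


# Assumed safety limits, chosen only for illustration.
MAX_PRESSURE_BAR = 2.0
MAX_TEMPERATURE_C = 40.0


def check_interlock(reading: Reading) -> list:
    """Return the list of rule violations for one reading (empty if safe)."""
    reasons = []
    if reading.pressure_bar > MAX_PRESSURE_BAR:
        reasons.append("overpressure")
    if reading.temperature_c > MAX_TEMPERATURE_C:
        reasons.append("overheating")
    return reasons


def monitor(stream):
    """Scan readings in order; stop at the first violation.

    Returns (index, reasons) for the reading that would trigger a shutdown,
    or None if every reading is within limits.
    """
    for index, reading in enumerate(stream):
        reasons = check_interlock(reading)
        if reasons:
            return index, reasons
    return None


readings = [
    Reading(pressure_bar=1.0, temperature_c=37.0),
    Reading(pressure_bar=1.2, temperature_c=37.5),
    Reading(pressure_bar=2.4, temperature_c=38.0),  # exceeds the pressure limit
]
result = monitor(readings)
print(result)
```

Rules like these need no learned model at all, which is consistent with the observation that such applications place lower demands on AI capabilities than enzyme engineering does.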
There are already documented cases of AI applications in process optimization; for instance, the Pow.Bio platform leverages AI-enhanced analytics to identify the sources of bioprocess variability and provide early warnings of potential mutations caused by changes in bioreactor conditions.
According to Lux Research, in the process optimization phase of biomanufacturing, AI can detect subtle correlations within large datasets, generalize across operational ranges, and adjust control strategies in real time, thereby enhancing the effectiveness of traditional tools.
The catch, however, is that for most industrial applications, bioprocess optimization can be carried out effectively using sensors without the need for AI, especially when operating conditions are stable or vary within known ranges.
AI will only become indispensable to biomanufacturing processes if its advancements move beyond pattern recognition to evolve into holistic, proactive control systems capable of detecting and addressing challenges such as contamination or metabolic issues.
“AI for bioprocess optimization is likely to become increasingly necessary in the next two to three years,” noted Lux Research.