Will AI-Driven Drug Discovery Trigger an Industry Boom? Perspectives from Big Pharma and AI Experts

May 14, 2021 14:00 CST Updated 14:00

To what extent can AI technology influence the birth of a new drug? How will AI substantially disrupt the life sciences sector over the next decade?

On May 10, the “First China Bioinformatics Computing Conference” kicked off by Jinji Lake in Suzhou. Experts from industry, academia, research institutions, and investment sectors convened around the theme of “BT & IT” to explore the definition and boundaries of bioinformatics computing, as well as the significance of AI for biological data, each drawing on their respective fields and perspectives. The conference was co-hosted by Biotree, China’s first life sciences company driven by bioinformatics computing technology, and Bohe Innovation, an innovation incubation center dedicated to the cross-disciplinary integration of IT and BT.

At the special session on new incubation in biocomputing from an international perspective, two in-depth and highly informative roundtable discussions were held. One of them brought together five participants: Liu Wei, Co-founder & CEO of BioMap, who hosted; Zhang Lianshan, Senior Vice President and Head of Global R&D at Hengrui Medicine; Yang Qing, Co-CEO of WuXi AppTec; Le Song, Deputy Director of the Machine Learning Center at Georgia Institute of Technology; and Ma Weiying, Chief Scientist at the Institute for Intelligent Industry, Tsinghua University. VCBeat has curated the panelists’ insightful remarks without altering their original meaning for the benefit of our readers:

Liu Wei: First, please introduce yourselves briefly, drawing on your respective explorations in the field of biological computing.

Zhang Lianshan: I am from Jiangsu Hengrui, where I am primarily responsible for the company's R&D. Our company got an early start in the drug design field, having engaged with many companies at least five years ago. In the past, drug discovery relied on a "trial-and-error" approach, which identified numerous therapeutic targets and delivered substantial benefits to society and patients. However, the landscape has become increasingly complex, making it difficult to identify novel targets. Therefore, we need to leverage AI to facilitate computer-aided drug discovery.

Many patients perceive new drugs as prohibitively expensive, a reality driven by the exorbitant costs of drug development, where bringing a single new drug to market can require $1 billion or even more. In fact, R&D costs continue to rise. Today, I aim to explore a new model through discussions with peers and interdisciplinary scientific researchers, striving to reduce the cost of drug development.

Ma Weiying: I am a newcomer to this field. For the past 20 years, I have been working in internet search, recommendation systems, and content generation, focusing on traditional AI domains such as computer vision, natural language processing, speech recognition, machine translation, and personalized recommendation algorithms.

Why did I become interested in the field of AI-driven drug discovery? In 2019, I began to notice the intersection between natural language processing and new drug discovery, and it was then that I entered this field. I believe that as life science data accumulates to a critical threshold, AI will experience a significant surge and robust growth in the fields of life sciences and biological computing over the next decade, comparable to the development of the internet over the past 20 years.

Therefore, I often tell my colleagues in the AI field to stop “wringing out towels” in red oceans. There is still room for innovation in both computer vision and natural language understanding, but in the life sciences sector, even existing tools can lead to significant breakthroughs. The AI-driven drug discovery market remains a blue ocean with tremendous opportunities.

With the advent of the era of personalized precision medicine, an increasing number of individualized treatment regimens, such as certain immunotherapies, are being introduced. However, these therapies remain prohibitively expensive for patients, and their precision is still insufficient. If AI intervention enhances the efficiency of new drug development, reduces the cost of personalized therapies and genetic sequencing, shortens the timeline for new drug R&D, and improves precision, it will undoubtedly disrupt the existing healthcare industry.

Song Le: I am Song Le, a consultant at BioMap. My background overlaps significantly with Professor Ma’s. For most of my career, I have worked in the fields of AI and the internet, focusing on AI-driven multimodal data analysis, including images, text, and complex networks. In academia, I have also conducted extensive research on biological data analysis, such as multi-omics data and small-molecule data.

Currently, there are numerous publicly available datasets on the internet, and a variety of methods exist to measure many different biological indicators. It appears that we have reached a turning point where integrating these public datasets through AI-driven approaches can facilitate new drug discovery.

Liu Wei: The executives from the ecosystem of large pharmaceutical companies present here all embrace biocomputing. We invited you to this conference precisely because we have observed this trend. Each of your companies has substantial internal teams dedicated to R&D in this area, and you have also made numerous attempts at external collaborations. As traditional drug development enterprises, what opportunities do you believe remain for external AI-driven drug discovery? What tasks are beyond the capabilities of traditional companies like yours, which you would prefer these external partners to undertake?

Zhang Lianshan: In drug development, we are highly interested in AI. The government and national authorities expect pharmaceutical companies to develop first-in-class therapies, but it is currently difficult to identify entirely novel therapeutic targets. Whether AI can be applied to address such issues, thereby improving the efficiency of drug development and reducing R&D costs, remains in the early stages of exploration.

Furthermore, we now advocate for precision diagnosis and precision treatment. A given medication may not be effective for every patient, and when it is not, the patient is exposed only to the drug's adverse effects. If AI technology can help us improve the efficiency of precision diagnosis and precision treatment while reducing costs, we will be able to allocate more resources to more critical endeavors.

Liu Wei: Are you willing to share your data with partners in the AI field? This is a general question. We have also invited two AI experts to weigh in. If their answer is “yes,” what data would they most want to accelerate model development?

Zhang Lianshan: Our new drug development begins with a specific target, and this information is fully open for shared access. What I seek are the molecules themselves, while the underlying knowledge is also available for communal sharing.

Ma Weiying: Recently, I have noticed that several universities in the United States are engaged in an initiative called TDC (Therapeutics Data Commons), a large-scale public dataset for machine learning in biomedicine. TDC encompasses over 20 meaningful tasks and more than 70 high-quality datasets, covering areas ranging from target protein discovery to pharmacokinetics, safety, and drug manufacturing. These two dozen tasks were defined by specialized life science experts and feature standardized representations. Once a public dataset like this is released, many talented individuals will compete to improve accuracy. This development model is well worth referencing. If professionals in the life sciences sector collaborate with AI talent to jointly advance the field, I personally believe this is an excellent approach.

Another dimension, personal health management, is also worth considering. Currently, we have an increasing number of personal health management tools, including wearable devices. AI empowers personal health by encouraging individuals to proactively engage in health management and contribute their data. Adopting an open-model approach to drive the development of data-driven AI in healthcare and life sciences is a strategy well worth drawing on. Overall, AI still relies on data. I personally believe that, given time, data-related issues will be resolved.

Song Le: I strongly agree with some of the points raised by Professor Ma, including that if certain datasets and questions could be designed in the pharmaceutical industry, more people would be drawn in to leverage AI for mining and exploring new drug targets or addressing issues in drug design. Here, a critical aspect is the creation of a closed-loop system between data and AI models, enabling more researchers to leverage the system for discovering new drug targets or assessing druggability.

Leveraging AI to empower and upgrade experimental platforms shares many similarities with the evolution of the internet. For instance, user search functions and intelligent recommendation systems can complement each other. Once an AI-driven database platform is established, user interactions continuously update the AI models. These updated models then generate new recommendations for users, who in turn perform corresponding actions based on these suggestions, leading to the accumulation of increasing amounts of data on the platform. However, without a closed-loop system integrating data and AI models, it is difficult to achieve continuous iteration and improvement of the AI models.

Liu Wei: I largely agree with the perspectives shared by Professor Ma and Professor Song. At BioMap, we are also keen to build the closed-loop system described by Professor Song. We position ourselves as an “innovative pharmaceutical company” focused on certain areas of novel drug development, aiming to address the following key challenges. First, the scarcity of industry data. For instance, as Professor Dong Chen noted, data on immune targets, particularly in the field of autoimmune diseases, is scarce across the entire industry, and even large pharmaceutical companies have limited internal data. Second, establishing closed-loop validation capabilities. Whether through our own use of datasets for enhanced learning and validation, or by providing iterative feedback loops to more AI partners in the industry, we aim to enable the discovery of new possibilities from novel data mining efforts. These insights can then empower traditional large pharmaceutical companies to explore subsequent R&D opportunities based on such data. Early attempts involving new datasets and computational methods tend to have relatively high failure rates, requiring prolonged cycles of experimentation and validation to meet the standards of major pharmaceutical companies.

Zhang Lianshan: Currently, there are various types of data, for example multi-omics, genomics, and immunology data, but the challenge lies in leveraging AI to process seemingly unrelated data, uncovering correlations and identifying substantive insights. I believe this is the primary issue that must be addressed to foster development in this field.

Liu Wei: Yes. Many datasets exhibit batch effects, with insufficient standardization across disparate sources. This did not pose a substantial obstacle in the past, when data granularity was relatively coarse; however, as data granularity becomes increasingly fine, errors may obscure underlying patterns.

Ma Weiying: It is crucial to connect seemingly unrelated data and knowledge, especially in the field of life sciences, which demands extensive professional expertise. Current AI-driven drug discovery companies have only managed to derive small-molecule compounds and hand them off for subsequent stages; they still lack the capability to guide the search toward better regions of chemical space, leaving a significant gap in the current workflow.

I believe it is indeed necessary to integrate knowledge engines and knowledge graph rules into deep learning. Life sciences also provide a new foundation for AI to achieve further breakthroughs: they foster deeper integration of knowledge and symbolic logic with statistical, purely data-driven methods, bringing together the previously distinct model-driven and data-driven paradigms, while also bridging the gap with laboratory practice. I am confident that as this workflow becomes more seamless, our entire process will accelerate, driving faster innovation and more rapid scientific discoveries.

Liu Wei: There is now a term called “computational druggability” or “computational drug discovery.” Without biological computing, it would be difficult to develop such drugs, or the conversion rate would be too low for anyone to dare to invest in them. I would like to ask which specific subfield you are more optimistic about, where biological computing will lead to the emergence of a large number of innovative drugs?

Zhang Lianshan: I believe there will be some breakthroughs in the field of small molecules in China, and I am very confident that we can achieve this.

Ma Weiying: My personal understanding is that AI methods can be used to infer structures and predict functions based on existing targets. Furthermore, I believe a new opportunity lies ahead as AI-driven drug development becomes more precise and personalized.

Song Le: I strongly agree with Professor Ma’s viewpoint. Additionally, I have another perspective: how AI can facilitate current new drug development. Although some public datasets already exist in this field, they are scattered across various sources. To discover a new target or design a novel drug, can we integrate these complex datasets and leverage AI models for statistical analysis, interpretation, and inference, or employ AI to screen entities such as small-molecule structures and protein structures? If so, AI will facilitate advancements in traditional pharmaceutical R&D, bringing about breakthroughs.

Liu Wei: My own strategic choices differ somewhat from those of the two major pharmaceutical companies. BioMap will place greater emphasis on the design of macromolecular drugs. As Professor Ma just mentioned, proteins are essentially a form of language; their sequences and the advances in structure prediction play a supportive role in exploring the entire state space. We will focus our efforts on complex dynamic immune issues and complex programmable antibodies as key areas of exploration, aiming to develop precise immunotherapeutics targeted at dynamic immune assemblies in this process.

Due to time constraints, we will conclude today’s session here. Thank you all.