
AI Protein Design Platform Developer
Biological computing is emerging as the next jewel in the development of AI. On September 2, at the 2022 Shanghai Biological Computing Forum, held under the guidance of the 2022 World Artificial Intelligence Conference (WAIC), Professor Jinbo Xu, the "pioneer of AI protein folding" and founder and chief scientist of MoleculeMind, delivered a keynote speech titled "New Advances in AI Protein Research." In it he unveiled the latest research progress on MoleculeOS, an AI-driven platform for protein discovery, optimization, and design developed independently by MoleculeMind.
Professor Jinbo Xu said that in recent years, advances in artificial intelligence have enabled major breakthroughs in research on protein structure and function, as the field has moved rapidly from traditional physics-based and statistical methods to machine learning and, most recently, deep learning algorithms. The research paradigm in molecular biology has also shifted from sequence-based to structure-based, greatly improving the efficiency of de novo protein design. As a result, AI-driven protein discovery and design has gained momentum in industry and become a closely watched field worldwide.
However, because the mechanisms of protein molecules are extremely complex, many problems still require further exploration and resolution even with emerging AI methods. In addition, China has lacked a fully functional AI protein design and optimization platform to support both technical breakthroughs in the research community and deployment in industry.
MoleculeMind has built MoleculeOS, the industry's first fully functional AI protein prediction and design platform. The platform has two main functions: directly designing and generating the required proteins using data-driven deep learning methods; and helping industry experts quickly identify and generate the most suitable proteins by analyzing characteristics such as protein expressibility, stability, and druggability, thereby promoting the large-scale industrial application of laboratory research results. "MoleculeOS is a new engine for AI-driven protein design that MoleculeMind is striving to build. We hope to make it an infrastructure for China's bioeconomy era," said Professor Jinbo Xu.
MoleculeOS has world-leading capabilities in protein structure and property prediction as well as protein design. Across key algorithms and modules such as de novo protein design, protein optimization, antibody redesign, protein and complex structure prediction, protein-protein docking, protein side-chain prediction, protein function prediction, and protein language models, MoleculeMind has developed more than ten world-leading AI algorithms, with computational results surpassing the best publicly reported results in the literature.
For example, in protein structure prediction, research teams such as DeepMind and the Baker lab have in recent years introduced AI structure prediction models such as AlphaFold2 and RoseTTAFold. While these algorithms have driven tremendous progress in the biotech industry, they share a significant limitation: a heavy reliance on multiple sequence alignments (MSAs) of homologous proteins, and on the co-evolutionary information and sequence profiles derived from them, to predict protein structures. As a result, they cannot achieve high-precision structure predictions for proteins that lack homologous evolutionary information, such as orphan proteins. Against this backdrop, "AI-based protein prediction methods that do not use homologous sequences and co-evolutionary information" have become a new direction of exploration in the industry over the past two years. Building on the MoleculeOS platform, the MoleculeMind team has proposed RaptorX-Single, an AI-based single-sequence protein structure prediction algorithm. It predicts protein structures directly from the primary sequence without using MSAs, with performance surpassing methods such as DeepMind's AlphaFold2. Moreover, the RaptorX-Single model is more lightweight, with fewer than one-third as many parameters as Meta's ESMFold. The algorithm further improves the efficiency and extends the boundaries of protein structure prediction.
In de novo protein design, MoleculeMind's MoleculeOS platform has several world-leading capabilities. Its protein sequence design algorithm has achieved the highest NSR reported worldwide on four widely used datasets. Its protein backbone design algorithm achieved the world's first de novo design of complex protein backbone structures, enabling the creation of a variety of highly complex protein conformations that do not exist in nature and are more stable than natural proteins. The platform also pioneered a template-free protein-ligand generation algorithm, which can generate novel binding proteins that do not exist in nature.
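The sequence-design result above is stated in terms of NSR. Assuming NSR here means native sequence recovery (the article does not expand the abbreviation), the metric is the fraction of positions at which a designed sequence reproduces the native residue. The Python sketch below illustrates that definition only; the function name and example sequences are hypothetical and are not taken from MoleculeOS.

```python
def native_sequence_recovery(native: str, designed: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native one.

    Assumes both sequences are given for the same backbone, so they have
    equal length and position i in one corresponds to position i in the other.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Hypothetical six-residue example: one substitution (E -> Q) at position 4.
recovery = native_sequence_recovery("ACDEFG", "ACDQFG")
print(f"NSR = {recovery:.3f}")
```

A benchmark figure would typically be this value averaged over all proteins in a dataset; how MoleculeMind aggregates it is not stated in the article.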
Moreover, based on the MoleculeOS platform, MoleculeMind has also developed the world's first end-to-end flexible protein docking algorithm, which can achieve more precise docking.
In protein optimization, MoleculeMind has developed an AI algorithm that predicts the impact of single-point mutations on protein performance. It makes these predictions without requiring experimental data, and its performance significantly exceeds previous best results, making it the most accurate algorithm in this field worldwide. In antibody design, MoleculeMind has built the CDR-region reconstruction algorithm with the smallest error in the industry, which can be combined with MoleculeMind's protein optimization module to optimize the CDR regions of antibodies. In protein structure prediction, MoleculeMind's protein and complex structure prediction algorithm far surpasses DeepMind's AlphaFold-Multimer in tests on public datasets. MoleculeMind has also developed the world's first end-to-end protein side-chain prediction algorithm that does not use a rotamer library; compared with the widely used SCWRL software, it has much smaller prediction errors for side-chain dihedral angles and is also faster. In protein function prediction, MoleculeMind's approach, which uses graph neural networks and predicted structural information, leads all publicly available protein function prediction algorithms by 10-30%. In protein language models, MoleculeMind used only 5.7% of the training data that Facebook used, yet its protein language model outperforms Facebook's in protein contact prediction.
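The side-chain comparison with SCWRL above is expressed as prediction error on dihedral angles. Because angles wrap around at 360 degrees, such an error is usually measured with a periodic difference rather than plain subtraction. The sketch below shows one common way to do that in Python; it is an illustration under that assumption, not MoleculeMind's evaluation code, and the function names and sample angles are hypothetical.

```python
from typing import Sequence


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def mean_dihedral_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute periodic error between predicted and reference angles."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("angle lists must be non-empty and of equal length")
    total = sum(angular_difference(p, r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical angles: 170 and -170 degrees are only 20 degrees apart,
# even though plain subtraction would give 340.
print(mean_dihedral_error([170.0, 10.0], [-170.0, 20.0]))
```

This sketch ignores residue-specific symmetries (for example, the ring flip of phenylalanine), which a full evaluation would normally account for.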
"In the past few years, significant progress has been made in the field of AI protein structure prediction, disrupting the research paradigm in protein studies and unlocking tremendous potential in biotechnology. However, there are still many unsolved challenges in protein research," said Professor Jinbo Xu. For instance, the accuracy of AI predictions of protein interactions, particularly antibody-antigen interactions, is far from satisfactory. Problems such as predicting the structures of orphan proteins and predicting protein interactions with other molecules remain unresolved. "Our goal is to design proteins with real practical value, drive innovation in the biotech industry, and unleash new momentum in the field of biological computing."