Yitu Healthcare's NLP-Powered AI Diagnostic System Trained on Hundreds of Millions of Real-World Data Points Achieves Over 90% Accuracy in Pediatric Disease Diagnosis, Published in Nature Medicine

Feb 12, 2019 15:55 CST Updated 15:55

VCBeat (WeChat ID: vcbeat) has learned that at 00:14 Beijing time on February 12, the internationally renowned medical research journal Nature Medicine published online an article titled “Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence.” The study, a significant advance in AI-based diagnosis of pediatric diseases, was a collaboration among Professor Xia Huimin and Professor Kang Zhang (University of California, San Diego) of Guangzhou Women and Children’s Medical Center, Dr. Liang Huiying of the center’s Data Center, Sun Xin, Director of the Medical Affairs Department, and He Liya, Director of the Pediatric Internal Medicine Outpatient Clinic, together with industry research teams including Ni Hao’s team from Yitu Healthcare and Kangrui Intelligent Technology, as well as the Guangdong Provincial Key Laboratory of Regenerative Medicine. It is the first time worldwide that research on intelligent clinical diagnosis applying natural language processing (NLP) to Chinese-language text electronic medical records (EMR) has been published in a top-tier medical journal.


This marks another significant milestone achieved by the team in less than a year since their paper on AI-based image diagnosis was featured on the cover of Cell, highlighting their progress in implementing AI technology in healthcare. It signals the dawn of an era in which AI simulates human physicians to perform disease diagnoses.

Not only can it “see” images to interpret medical imaging, but it can also “read” text to understand medical records.

In recent years, AI has demonstrated impressive performance in diagnostic tools based on medical imaging, though it has generally remained limited to relatively standardized static image data. In this latest scientific achievement, building upon image recognition, artificial intelligence automatically learns the diagnostic logic embedded in medical record text data (reflecting physicians’ knowledge and language), thereby gradually acquiring a certain capacity for clinical analysis and reasoning. This enables AI to better interpret and analyze complex cases, suggesting that artificial intelligence may soon be capable of “thinking” like a physician.

Researchers trained AI to comprehend clinical feature data within massive volumes of electronic health records (EHRs), encompassing patient chief complaints, symptoms, personal history, physical examination findings, laboratory test results, imaging reports, and medication information. Leveraging Yitu Healthcare’s natural language processing (NLP) technology, the research team developed an intelligent medical record analysis system to deeply mine and analyze information from medical texts, transforming unstructured textual EHR data into normalized, standardized, and structured data, thereby enabling AI to accurately and comprehensively “interpret” medical records. To achieve this, physicians, scientists, and technologists collaborated closely; an expert team comprising over 30 senior pediatricians and more than 10 informatics researchers manually annotated over 6,000 charts in the electronic health records and continuously validated and iterated the model.
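To make the idea of turning free-text records into structured data concrete, here is a minimal sketch. It is not the study's method: the article describes a learned NLP system, whereas this stand-in uses hand-written regular expressions. The field names (`fever_days`, `temperature_c`, `cough`), the patterns, and the sample note are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns for illustration only. The system described in the
# article relies on learned NLP models, not hand-written rules like these.
PATTERNS = {
    "fever_days": re.compile(r"fever for (\d+) days?"),
    "temperature_c": re.compile(r"temperature (\d+(?:\.\d+)?)\s*c\b"),
    "cough": re.compile(r"\bcough\b"),
}


def structure_note(note):
    """Map one free-text note to a fixed set of typed fields."""
    text = note.lower()
    record = {}

    match = PATTERNS["fever_days"].search(text)
    record["fever_days"] = int(match.group(1)) if match else None

    match = PATTERNS["temperature_c"].search(text)
    record["temperature_c"] = float(match.group(1)) if match else None

    # Presence check only; this naive rule does not handle negation
    # such as "no cough".
    record["cough"] = bool(PATTERNS["cough"].search(text))
    return record


# Made-up sample note, not taken from any real record.
note = "Fever for 3 days, temperature 38.9 C, with cough."
print(structure_note(note))
```

The point of the sketch is the output shape: every note, however it is worded, ends up as the same set of named fields that a downstream diagnostic model can consume. Handling negation, synonyms, and Chinese clinical shorthand is where a learned model and expert annotation become necessary.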

The research team also developed an intelligent diagnostic recommendation system that simulates the clinical decision-making pathway of human physicians, performing stepwise assessments for target pediatric patients. Sun Xin, Director of the Medical Affairs Department at Guangzhou Women and Children’s Medical Center, stated, “The input of high-quality prior medical knowledge from professional pediatricians constitutes a key advantage of this system.” Specifically, the system first categorizes conditions into major systems, such as respiratory, gastrointestinal, and systemic diseases, and then further subdivides each category. For instance, in the most common category—respiratory diseases—the system initially distinguishes between upper and lower respiratory tract disorders, and then further classifies them into laryngitis, tracheitis, bronchitis, and pneumonia. Validation has shown that at each level, the preliminary diagnoses generated by AI achieve accuracy comparable to those made by examining physicians. For example, in acute upper respiratory tract infections, the most common condition among pediatric patients, the model achieved a diagnostic accuracy of 95%.

For certain severe, life-threatening conditions (such as acute asthma attacks and bacterial meningitis), the algorithm has also demonstrated robust diagnostic performance. He Liya, Director of the General Pediatrics Outpatient Department at Guangzhou Women and Children’s Medical Center, stated, “This holds significant importance for clinical application, as AI-assisted rapid triage enables the allocation of limited healthcare resources to the patients in greatest need.”

Yitu Healthcare proposed and tested a system framework specifically designed for data mining of electronic medical records, integrating medical knowledge with data-driven models. The model first annotates electronic medical records using natural language processing (NLP) and employs logistic regression to establish hierarchical diagnoses, achieving performance comparable to that of experienced pediatricians in diagnosing common pediatric diseases.
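The hierarchical approach described above (first pick an organ system, then narrow within it, with a logistic regression classifier at each level) can be sketched as follows. This is an illustrative assumption about the structure, not the published model: the tree, the feature names, and the weights are hand-picked placeholders, whereas a real system would learn its weights from annotated records.

```python
import math

# Hypothetical two-level tree with hand-picked weights, for illustration only.
# Each node maps candidate child labels to per-feature weights.
TREE = {
    "root": {
        "respiratory": {"cough": 2.0, "fever": 0.5},
        "gastrointestinal": {"vomiting": 2.0, "diarrhea": 2.0},
    },
    "respiratory": {
        "upper respiratory": {"sore_throat": 2.0, "runny_nose": 1.5},
        "lower respiratory": {"wheezing": 2.0, "crackles": 2.0},
    },
}


def softmax_scores(weights_by_label, features):
    """Multinomial logistic regression: softmax over linear scores."""
    logits = {
        label: sum(w * features.get(name, 0.0) for name, w in weights.items())
        for label, weights in weights_by_label.items()
    }
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(z - peak) for label, z in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def diagnose(features, node="root"):
    """Descend the tree, choosing the most probable child at each level."""
    path = []
    while node in TREE:
        probs = softmax_scores(TREE[node], features)
        node = max(probs, key=probs.get)
        path.append((node, round(probs[node], 3)))
    return path


# Made-up structured features for one patient.
patient = {"cough": 1, "fever": 1, "wheezing": 1}
print(diagnose(patient))
```

Each level only has to separate a handful of sibling categories, which keeps the individual classifiers simple and lets physicians inspect the decision path level by level.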

Ni Hao, CEO of Yitu Healthcare and co-first author of the paper, stated, “The core technology behind this achievement involves deconstructing electronic medical record (EMR) data through deep learning techniques and medical knowledge graphs, thereby constructing a high-quality intelligent disease database. This facilitates the subsequent establishment of various diagnostic models with relative ease. The diagnostic models demonstrate that AI-based systems can assist physicians in processing large-scale data and supporting auxiliary diagnosis, while providing clinical support in addressing diagnostic uncertainty and complexity. Pediatric diseases present with diverse symptoms that are challenging for clinicians to differentiate, making the diagnostic process time-consuming and labor-intensive; however, establishing a definitive diagnosis is critically important. Having an auxiliary diagnostic assistant comparable in capability to experienced pediatricians enables physicians to effectively reduce diagnostic time and significantly optimize the diagnostic workflow.”

Can be applied to the diagnosis of various common pediatric diseases, with accuracy comparable to that of experienced pediatricians

By automatically learning the diagnostic logic from 1.36 million high-quality electronic medical records of 567,000 pediatric patients, this AI is applied to diagnose various common pediatric diseases, with an accuracy comparable to that of experienced pediatricians. Researchers randomly selected 12,000 patient records and divided 20 participating pediatricians into five groups based on their seniority and clinical experience to determine which group’s performance was closest to that of the AI. The results showed that the AI model’s average score was higher than those of the two groups of less-experienced physicians and close to those of the three groups of more-experienced physicians.

Researchers explain that the AI system can obtain patient- or parent-reported text through human-computer interaction, including information such as chief complaints, symptoms, medical history, and medication history, to make a preliminary diagnosis and suggest a range of possible diseases. Through in-person consultations with physicians or remote internet-based consultations, detailed clinical information and features for differential diagnosis are collected; the model then recalculates based on this data to provide a specific and precise diagnosis. If laboratory test or imaging data are available, the AI model can further confirm its diagnostic results. More importantly, it possesses incremental learning capabilities: in practice, it reinforces memory for adopted outcomes, while for non-adopted outcomes, it enhances its capabilities through continued learning after verification.

Dr. Liang Huiying, Director of the Data Center at Guangzhou Women and Children’s Medical Center and first author of the paper, revealed that after three months of refinement and iteration following its launch, the system, known as “Auxiliary Diagnosis Bear,” recorded over 30,000 calls in the first quarter of 2019. He emphasized that data from these calls serve as a compass for evaluating the system’s practical performance and guiding targeted capability enhancements.

There is still much foundational work to be done, and the future may hold even broader prospects.

Xia Huimin, Director of Guangzhou Women and Children’s Medical Center, stated, “The national government’s vigorous promotion of its artificial intelligence roadmap has revealed significant opportunities. By leveraging high-quality medical big data generated through digitalization to implement AI technologies and platforms, we can, to a certain extent, address the shortfall in medical service capacity while enhancing the equity and accessibility of healthcare services. We hope that in the near future, this technology will be widely demonstrated and promoted, providing auxiliary diagnostic and treatment support for primary-care and junior pediatricians, offering intelligent self-diagnosis services and authoritative second opinions to parents of pediatric patients, thereby mitigating medical risks associated with misdiagnosis and missed diagnoses.”

According to the research team, this AI-assisted diagnostic system can be applied in clinical settings through various approaches. First, it can serve as a triage tool. For instance, when patients arrive at the emergency department, nurses can input their vital signs, basic medical history, and physical examination data into the model, allowing the algorithm to generate predictive diagnoses that help physicians prioritize which patients require immediate attention. Another potential application is assisting physicians in diagnosing complex or rare diseases. In this way, physicians can leverage AI-generated diagnoses to broaden their differential diagnoses and consider diagnostic possibilities that may not be immediately apparent.

Regarding the future of AI-assisted diagnostic systems for individuals, Professor Xia Huimin stated, “This study will serve as a significant milestone in the clinical implementation of AI technology in healthcare. Its greatest contribution lies in enabling AI not only to ‘interpret images’ but also to ‘read text,’ thereby deciphering disease-related information embedded within textual data, much like humans do.”

By systematically learning from textual medical records, artificial intelligence may be able to diagnose a wider range of diseases. However, it is crucial to recognize that substantial foundational work remains to be solidified. For instance, the integration of high-quality data is a long-term endeavor, as the collection and analysis of big data require close collaboration among multidisciplinary experts, including algorithm engineers, clinicians, and epidemiologists. Furthermore, even after being trained on massive datasets, the accuracy of AI-driven diagnostic results still needs to be validated and benchmarked against broader datasets.

>>>>

Introduction to Guangzhou Women and Children's Medical Center

Guangzhou Women and Children’s Medical Center is the largest Grade A tertiary hospital for women and children in South China, with 1,700 open beds. In 2018, it recorded approximately 4.63 million outpatient and emergency visits, nearly 140,000 inpatients, 32,000 deliveries, and 87,000 surgical procedures. It has ranked among the top 100 hospitals in the Fudan University Hospital Rankings for seven consecutive years. The center has three National Health Commission Clinical Key Specialties, six Guangdong Provincial Clinical Key Specialties (Disciplines), and two Guangdong Provincial Key Medical Laboratories designated during the “12th Five-Year Plan” period. It houses a biobank built to international standards that can store one million samples and has established one of the world’s largest cohort study platforms for women and children. The center has set up a postdoctoral research workstation, established 11 independent laboratories, recruited 13 Principal Investigators (PIs) globally, and currently hosts more than 70 postdoctoral fellows. In 2018, it secured 30 projects funded by the National Natural Science Foundation of China, published 195 SCI-indexed papers, and achieved a total impact factor score of 665.891. In January 2019, it was selected in the second batch of key hospitals under the Guangdong Provincial High-Level Hospital Development Program. It has passed the Level 6 evaluation of the National Electronic Medical Record System Grading and the Level 5, Grade B assessment of the National Healthcare Information Interoperability and Standardization Maturity. Additionally, it has achieved HIMSS EMRAM Stage 7 certification for both inpatient and outpatient services, becoming the first smart hospital in China to implement mobile registration, comprehensive appointment-based registration for non-emergency cases, and a “treatment-first, payment-later” model. These achievements have been reported by CCTV’s Xinwen Lianbo (News Broadcast) and the People’s Daily, and the hospital was recognized as a National Demonstration Hospital for Leveraging Information Advantages to Deliver Quality Services from 2015 to 2017.

>>>>

Introduction to Yitu Healthcare

As a leading enterprise in medical artificial intelligence, Yitu Healthcare pioneered the large-scale implementation of AI solutions in China and is currently the only medical AI company in the country to offer full-chain medical intelligence. Its product portfolio spans multiple domains, including intelligent medical imaging, intelligent clinical big data, intelligent outpatient optimization, and intelligent quality control. The company’s care.ai series of AI applications has been deployed in over 100 Grade-A tertiary hospitals across China and is extending to county- and city-level hospitals, effectively enhancing medical efficiency and empowering primary healthcare. Yitu Healthcare is actively exploring innovations to advance the development of smart hospitals and improve the nation’s health standards.

>>>>

Introduction to Kangrui Intelligence

Guangzhou Kangrui Intelligent Technology Co., Ltd. is a high-tech enterprise engaged in the research, development, and commercialization of medical artificial intelligence technologies. It was selected in the first batch of innovation and entrepreneurship teams at the Guangdong Provincial Laboratory of Regenerative Medicine and Health. With a strong roster of professional physicians and top-tier AI experts, the company has addressed core challenges in big data computation within machine learning processes and pioneered supercomputing approaches based on medical imaging and comprehensive electronic health record systems. The company holds multiple international patents. Several of its research findings have been published in prestigious international journals such as Cell and Nature Medicine, earning recognition from peer experts worldwide.