Ernst & Young’s “Life Sciences 4.0 Report” once used the formula FV = I^D to describe the future value of life sciences: future value (FV) equals innovation (I) raised to the power of data (D). Because data drives exponential growth in value, it also shapes how scientific research results are obtained.
According to the survey results on the application of hospital big data in the “National Health Informatics Survey Report,” in 2021, the average application rate of medical big data in China’s tertiary hospitals was less than 20%, and in secondary hospitals it was less than 5%. Even for clinical data, which attracted the greatest interest, only one-fifth of hospitals attempted to conduct research.

Status of Big Data Applications in Various Hospitals (Data Source: "National Health Informatics Survey Report")
To break the bleak status quo of medical big data applications and assist physicians in uncovering the latent value within diverse medical datasets, Guangzhou Zhongkang Digital Technology Co., Ltd. has leveraged artificial intelligence technologies such as Baidu’s PaddlePaddle deep learning framework and the Wenxin large language model. By integrating these AI capabilities with its proprietary data acquisition networks, big data processing technologies, and an ecosystem-oriented health industry platform, the company has developed the “AI Clinical Research Big Data Platform Based on the Wenxin Large Model.”
AI Clinical Research Big Data Platform Built on the Wenxin Large Language Model
Recently, at the “Deep Learning and Large Model Industry Applications” special session of the 4th OpenI/O Qizhi Developer Conference, hosted by Baidu PaddlePaddle, VCBeat conducted an in-depth interview with Huang Yining, Director of Digital Medical AI Technology Products at Zhongkang Technology. What kind of impact will the application of large models in healthcare actually create within the medical field?
Generally, developers can leverage deep learning techniques to process various types of data, including text, images, and multimodal text-image data, with multimodal data being particularly common in the healthcare sector.
“We previously collaborated with a cardiology expert on a research project focused on AI-based early warning for cardiac arrest. Unlike common scenarios such as pulmonary nodules or pneumonia, cardiac arrest is characterized by its sudden onset; therefore, the application of AI technology emphasizes prediction rather than diagnosis. To achieve the most accurate possible early warning for cardiac arrest, we need to not only process patients’ clinical data but also analyze electrocardiogram (ECG) data, laboratory test results, and even hydrological and meteorological data from the environment where the patient experienced the event. Theoretically, the richer the data sources, the more accurate the model’s predictions,” explained Huang Yining.
To effectively leverage multimodal big data, Zhongkang Technology has built an AI-powered clinical research big data platform based on PaddlePaddle. Specifically, Zhongkang utilizes the PaddleNLP natural language processing model library and employs the general information extraction technology Wenxin ERNIE-UIE for medical data structuring. Furthermore, it uses the Chinese medical pre-trained model Wenxin ERNIE-Health as its foundation to support a wider range of downstream tasks, such as medical text understanding and analysis.
Beyond NLP, technologies from other fields, such as computer vision (CV), are also indispensable. Zhongkang Technology uses PaddlePaddle’s PaddleDetection object detection model library, employing RetinaNet and SSD for object detection. It also uses USAD and SCINet from the PaddleTS time-series modeling library to extract features from temporal data. Together, these enable the integrated processing, analysis, and understanding of multimodal data, including patient clinical data, ECG images, ECG time series, and hydrological and meteorological data.
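One way to picture how features from several modalities can feed a single early-warning score is late fusion by concatenation. The sketch below is only an illustration under stated assumptions: the article does not say which fusion method the platform uses, and the feature values, weights, and the helper names `fuse_features` and `risk_score` are all hypothetical.

```python
import math


def fuse_features(modalities):
    """Concatenate per-modality feature vectors in a fixed (sorted-by-name) order."""
    fused = []
    for name in sorted(modalities):
        fused.extend(modalities[name])
    return fused


def risk_score(features, weights, bias=0.0):
    """Logistic score in (0, 1) computed over the fused feature vector."""
    if len(features) != len(weights):
        raise ValueError("feature/weight length mismatch")
    z = bias + sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical feature vectors; in practice each would come from a separate
# model (text, image, time series) rather than being written by hand.
patient = {
    "clinical_text": [0.2, 0.7],
    "ecg_image": [0.9],
    "ecg_series": [0.4, 0.1],
    "weather": [0.3],
}

fused = fuse_features(patient)
weights = [0.5] * len(fused)  # placeholder weights, not learned values
score = risk_score(fused, weights, bias=-1.0)
print(f"fused dimension: {len(fused)}, risk score: {score:.3f}")
```

Sorting the modality names keeps the fused vector layout stable from patient to patient, which any downstream model would depend on.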
For patients experiencing cardiac arrest, every minute of treatment time after onset is invaluable. If risks can be detected in advance through early warning systems, it is believed that more lives can be saved. With a new technical solution integrating multi-modal data, the platform’s early warning performance has significantly improved, enabling the prediction of patient cardiac arrest 5–10 hours in advance and substantially enhancing the quality of medical services. After effective governance, rich and diverse medical data can further provide clinical decision support for physicians and lay a solid foundation for subsequent clinical research.

Application Structure of the Cardiac Arrest Clinical Research Project (link: End-to-End Practice of General Information Extraction)
Insufficient computing power to process large volumes of high-dimensional data is another common challenge encountered by physicians during scientific research. For instance, in the development of ultrasound AI, researchers need to extract key information from the high-dimensional space of ultrasound images; however, hospitals with limited infrastructure often struggle to perform highly complex model training and prediction. To address this issue, the AI Clinical Research Big Data Platform, powered by the Wenxin large language model, provides physicians with robust computational capabilities, thereby facilitating high-dimensional deep learning modeling that was previously difficult to achieve.
To illustrate the value of high-dimensional data processing, Zhongkang Technology described a research project carried out in collaboration with the director of a rehabilitation department, titled “Development of an AI-Based Recognition Model for Range of Motion in the Four Limbs.” In essence, the project uses video-based assessment to evaluate a person’s mobility, replacing traditional questionnaire-based methods. This helps patients track every change during rehabilitation and make decisions accordingly, ultimately shortening rehabilitation time and improving rehabilitation efficiency.
“When using questionnaires to assess mobility, people often introduce subjective biases during completion, leading to deviations in the final assessment results,” Huang Yining told VCBeat. “By requiring users to perform specified postural movements, video-based detection can address this issue, enabling a more objective and comprehensive grading of user mobility.”
The first step of this project involves the acquisition and analytical processing of human body posture. In this phase, Zhongkang Technology leveraged the PaddleDetection visual detection model library from Baidu’s PaddlePaddle ecosystem, including models such as HRNet, DarkPose, and SWAHR, to automatically identify key human joint points. Subsequently, a temporal graph convolutional neural network was constructed using the PaddlePaddle deep learning framework. By analyzing information such as joint movement trajectories, range of motion, and velocity, the system evaluates users’ mobility levels and provides more precise, personalized diagnosis and treatment services tailored to different mobility grades.
Process of Recognizing Human Posture and Actions with Baidu AI Technology and Grading Users’ Mobility Levels
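The step from detected joint keypoints to measures such as range of motion and velocity can be sketched with plain geometry. This is a minimal illustration, not the platform’s method: the article says grading is done by a temporal graph convolutional network, whereas the code below only computes hand-crafted features. The coordinates, frame rate, and function names are hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("degenerate joint: coincident keypoints")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))


def motion_summary(frames, fps):
    """Per-frame joint angles, range of motion, and peak angular velocity (deg/s)."""
    angles = [joint_angle(a, b, c) for a, b, c in frames]
    rom = max(angles) - min(angles)
    speeds = [abs(nxt - cur) * fps for cur, nxt in zip(angles, angles[1:])]
    return {
        "angles": angles,
        "range_of_motion": rom,
        "peak_velocity": max(speeds) if speeds else 0.0,
    }


# Hypothetical (shoulder, elbow, wrist) pixel coordinates for three frames.
frames = [
    ((0, 0), (1, 0), (2, 0)),  # arm straight
    ((0, 0), (1, 0), (1, 1)),  # elbow bent to a right angle
    ((0, 0), (1, 0), (2, 0)),  # arm straight again
]
summary = motion_summary(frames, fps=30)
print(summary)
```

In a real pipeline the keypoints would come from a pose estimation model, and noisy detections would usually be smoothed before angles are differenced.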
In addition to the two common issues in medical research mentioned above, Zhongkang will continue to deepen data governance based on PaddlePaddle, addressing the complexities of hospital data.
For example, according to feedback from a department director, previously compiling a disease-specific database for 700 patients with 600 data fields required five clinicians to manually organize the data during their time off work over the course of an entire year, resulting in significant delays in data utilization.
By leveraging PaddleNLP’s Wenxin ERNIE-UIE, key fields can be automatically extracted from disease-specific corpora to generate structured data. ERNIE-UIE features efficient zero-shot extraction and few-shot fine-tuning capabilities; it achieves exceptionally high accuracy after fine-tuning with only a small number of annotated samples. Furthermore, PaddleNLP provides an end-to-end solution for information extraction, covering every stage from “data annotation” and “fine-tuning” to “performance acceleration via model distillation” and “deployment.” This makes it highly accessible to healthcare professionals who may not be deeply familiar with the underlying technical principles of NLP.
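To make “extract key fields and generate structured data” concrete, the sketch below turns one extraction result into one database row. It assumes the extractor returns, for each schema field, a list of candidate spans with `text`, `start`, `end`, and `probability` keys, which is the shape PaddleNLP’s information extraction output generally takes. The clinical note, field names, probabilities, and the helper `to_record` are hypothetical, and no model is actually called here.

```python
def to_record(extraction, fields, min_prob=0.5):
    """Keep the most probable span per field; fields with no confident span become None."""
    record = {}
    for field in fields:
        spans = [s for s in extraction.get(field, []) if s["probability"] >= min_prob]
        best = max(spans, key=lambda s: s["probability"], default=None)
        record[field] = best["text"] if best else None
    return record


# Hand-written example standing in for a model's output on this note.
note = "Patient presents with atrial fibrillation; ejection fraction 45%."
fields = ["diagnosis", "ejection_fraction", "medication"]
extraction = {
    "diagnosis": [
        {"text": "atrial fibrillation", "start": 22, "end": 41, "probability": 0.97},
    ],
    "ejection_fraction": [
        {"text": "45%", "start": 61, "end": 64, "probability": 0.88},
    ],
    "medication": [
        # Low-confidence false positive, dropped by the threshold.
        {"text": "fraction", "start": 52, "end": 60, "probability": 0.12},
    ],
}

record = to_record(extraction, fields)
print(record)
```

Leaving unfilled fields as `None` rather than guessing keeps gaps visible, so a clinician can review only the fields the extractor was unsure about.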
Overall, Baidu PaddlePaddle and the ERNIE large language model helped Zhongkang Technology achieve a three-level enhancement in its data governance capabilities.
Level 1: Compared with traditional manual operations, the natural language processing capabilities of the research platform can improve time efficiency by approximately 10 times (shorter duration). Level 2: Few-shot learning based on the Wenxin large model requires only one-tenth of the original data volume to complete modeling, boosting efficiency by another 10-fold (reduced data volume). Level 3: Standardized and normalized data governance enables a single disease-specific database to serve multiple research projects, further enhancing efficiency by approximately 5 times, thereby achieving an overall efficiency improvement of about 500-fold.

Tiered Enhancement of Data Governance Enabled by Baidu PaddlePaddle and the ERNIE Large Language Model
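The overall figure follows from multiplying the three per-level gains, which assumes the gains are independent and compound. A short check of that arithmetic, with hypothetical labels for the three levels:

```python
import math

# Approximate per-level gains as stated in the article; the keys are just labels.
gains = {
    "nlp_automation": 10,      # Level 1: shorter processing time
    "few_shot_learning": 10,   # Level 2: one-tenth of the data volume
    "database_reuse": 5,       # Level 3: one database serving several projects
}

overall = math.prod(gains.values())
print(f"overall efficiency gain: about {overall}x")
```

If the levels overlapped rather than compounded, the combined gain would be lower than this product.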
“The AI Clinical Research Big Data Platform Based on the Wenxin Large Model” has been implemented in practical applications at numerous renowned hospitals across China and has received widespread recognition from department directors. A director at a provincial-level hospital stated, “Previously, patient data meeting inclusion criteria for clinical research projects could only be obtained through manual compilation and screening. However, since the adoption of artificial intelligence technology, the difficulty and time required for this task have been significantly reduced, providing substantial support to our research efforts.”
The AI-powered big data platform for clinical research, built on the Wenxin large language model, is leveraging cutting-edge AI technology to help clients advance their research projects and further promote the development of academic disciplines.

Comparison of the Application Effects of Data Governance: Manual Operations vs. Platform-Enabled Approaches
However, the rapid improvement in efficiency is not the sole objective of building an AI-powered big data platform for clinical research. Currently, Zhongkang Technology has outlined a specific roadmap to further expand the application boundaries of its big data platform based on PaddlePaddle and the ERNIE large language model.
According to Huang Yining, Zhongkang will leverage its extensive data accumulation in the healthcare sector to conduct domain-adaptive large model training for ERNIE-Health, thereby applying it to various NLP tasks in the medical field.
The AI-powered big data platform for clinical research will further perform information extraction and alignment of Chinese medical terminology on content such as drug labels and medical records, thereby automatically constructing a medical knowledge graph.
This means that the AI-powered big data platform for clinical research, previously limited to clinical data, will expand beyond that scope and gradually bring hospital-wide big data into its governance framework.
Dr. Tang Keke, CTO of Zhongkang Technology (Sinohealth), stated that the company and Baidu PaddlePaddle have built a bridge for collaboration through joint technology R&D and ecosystem co-construction, to the benefit of both sides. Looking ahead, Zhongkang Technology aims to establish a closer partnership with Baidu PaddlePaddle, combining the AI strengths of PaddlePaddle and the ERNIE large language model with its own leading position in health industry big data and its substantial technical expertise in medical research, to drive joint innovation in products and solutions. Both parties look forward to comprehensive, multi-domain, and in-depth exchanges that continue to empower medical research in China and jointly write a new chapter in the life sciences.
At the end of the interview, Huang Yining discussed the beginning of the collaboration with Baidu PaddlePaddle.
Huang Yining, Digital Medical AI Technology Product Director at Zhongkang Technology, is also a student in the 6th cohort of Baidu’s AICA Chief AI Architect Training Program. It was this experience that enabled Huang Yining to recognize the value of integrating PaddlePaddle with medical big data.
As AI moves toward large-scale industrial production, cultivating interdisciplinary AI talent becomes crucial. The Baidu AICA Chief AI Architect Training Program, jointly launched by the National Engineering Research Center for Deep Learning Technology and Application and Baidu, aims to cultivate Chief AI Architects who combine the abilities to “analyze business problems, master model algorithms, and implement practical applications.”
As artificial intelligence is increasingly applied in the medical field, the future value of Baidu’s AICA Chief AI Architect Training Program becomes even more promising. As more talents enter the field of medical AI and drive the application of big data in operations, health, and other areas, we will be able to witness a smarter healthcare system that brings new hope to more patients.
About Baidu PaddlePaddle
PaddlePaddle, developed by Baidu on the basis of its years of deep learning research and business applications, is China’s first independently developed, open-source, feature-rich, industrial-grade deep learning platform. With PaddlePaddle, deep learning development takes on the hallmarks of large-scale industrial production (standardization, automation, and modularization), which continuously lowers the barrier to adoption and allows artificial intelligence to be applied efficiently and conveniently across industries.
PaddlePaddle, rooted in industrial practice, is committed to deep integration with industry. It is now widely used across a broad range of sectors, including energy, transportation, manufacturing, and finance. In the medical field, it has been deployed in numerous scenarios, such as AI-powered intelligent consultation, CT image screening, and smart hospital security systems.