Opening the Data Pathway: A Panoramic Analysis of Big Data's Value in Clinical Medicine

Mar 22, 2018 08:00 CST Updated 08:00

The issue of clinicians bearing substantial medical research responsibilities alongside their daily clinical duties remains a subject of considerable debate within the industry. Currently, physicians at tertiary teaching hospitals devote the majority of their energy to patient care, self-education, and medical teaching, leaving minimal time for conducting research projects. Meanwhile, physicians at primary-care hospitals face relative deficiencies in research capabilities and limited opportunities to encounter valuable clinical cases, resulting in less-than-satisfactory research outcomes.

 

Should clinicians prioritize clinical practice or academic research? In theory, the two should complement and reinforce each other, jointly advancing the quality of both patient care and scientific inquiry. In practice, striking this balance continues to trouble physicians, healthcare administrators, and society at large.

 

With the technological advancements and rapid practical application of big data in healthcare, can this contradiction be resolved, enabling mutual reinforcement and promotion between the two? In light of this question, VCBeat engaged in dialogues with healthcare professionals and big data experts, aiming to explore new perspectives and directions in this evolving landscape.

 

I. Disconnect Between Clinical Diagnosis and Treatment and Scientific Research: Data Processing Is the Bottleneck


At present, medical resources in China remain concentrated in large tertiary hospitals, resulting in an extremely high clinical workload for physicians. Integrating research activities into clinical practice places immense pressure on the allocation of time and energy.

 

Professor Ma Jianqun from the Department of Thoracic Surgery at Harbin Medical University told reporters, “Personally, I devote approximately 30% of my time to scientific research, and most of my effort goes to clinical work. Next year, our hospital plans to recruit a dedicated researcher to improve our department’s research efficiency. It is clearly unrealistic to expect clinicians to produce high-quality research with only about 30% of their time. In the past, writing a scientific paper required us to first review the literature, attend external academic conferences, keep abreast of international academic developments, and settle on a research direction. We then had to design the study protocol, collect clinical case data, review pathology slides and other test reports, or carry out patient treatment and follow-up.”


Professor Ma Jianqun, Department of Thoracic Surgery, Harbin Medical University, specializes in the diagnosis and treatment of lung cancer and esophageal cancer.


Case reports involve only a few cases, so they can generally be analyzed and written up within one to two weeks. Retrospective papers summarizing clinical experience take longer, usually one to two months. Research projects demand far more time: data collection, organization, analysis, and statistical processing alone take at least six months, and the entire study often spans years. In terms of scientific value, however, prospective randomized studies offer stronger guidance than the other two types.

 

Professor Liu Hui from the Department of Radiation Oncology at Harbin Medical University stated, “Conducting clinical research requires follow-up as a core component. Evaluations such as overall survival (OS), five-year survival rate, and adverse reactions all rely on long-term follow-up data. Without the accumulation of follow-up data, the value of clinical research cannot be discussed.”
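
To illustrate why endpoints such as overall survival depend on follow-up, the sketch below estimates a five-year survival rate with the standard Kaplan-Meier method. It is a minimal illustration rather than any hospital's actual workflow: the function names and the five-patient `demo_follow_up` list are hypothetical and invented for the example.

```python
from typing import List, Tuple

def kaplan_meier(follow_up: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate of the survival curve.

    follow_up holds one (months_observed, died) pair per patient.
    died=False means the patient was alive at last contact (censored).
    Returns (time, survival probability) at each time a death occurred.
    """
    ordered = sorted(follow_up)
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        deaths = 0
        leaving = 0
        # Group every patient whose follow-up ends at the same time t.
        while i < len(ordered) and ordered[i][0] == t:
            deaths += 1 if ordered[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve

def survival_at(curve: List[Tuple[float, float]], months: float) -> float:
    """Read the estimated survival probability at a given time."""
    prob = 1.0
    for time, value in curve:
        if time > months:
            break
        prob = value
    return prob

# Hypothetical follow-up records: (months observed, died).
demo_follow_up = [(12, True), (30, False), (48, True), (60, False), (72, False)]
five_year_survival = survival_at(kaplan_meier(demo_follow_up), 60)
```

A patient lost to follow-up early contributes little to the estimate, because that patient leaves the risk set without an observed outcome. This is one way to see why incomplete follow-up weakens a study.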


Professor Liu Hui, Department of Radiation Oncology, Harbin Medical University, specializes in radiotherapy for head and neck tumors.


“In the past, our research efforts were primarily coordinated through hospital medical records departments. We would submit applications along with patient names, and the medical records department would handle the follow-ups. Even for a small-scale study, at least 50 medical records were required, while larger studies might involve hundreds of records. Merely retrieving the medical record numbers could take a considerable amount of time. Only after the records were retrieved would the medical records department begin the follow-up process. If the information obtained from following up with these hundreds of patients was incomplete, additional follow-ups would be necessary, further prolonging the time required.”

 

Two obstacles thus stand out. First, clinical duties occupy most of physicians’ time, leaving them limited energy for scientific research. Second, collecting and retrieving medical record data significantly prolongs the research cycle. Because electronic health records (EHRs) historically lacked structured data entry, information was recorded predominantly as free text. As a result, hospitals’ historical EHRs contain large unstructured sections and blank fields, which makes the data difficult to use for scientific research.

 

Therefore, efficiently and accurately structuring these existing data assets to establish a “data pipeline” between clinical practice and medical research may be the key to solving the problem.


II. From High-Quality Data Acquisition to the Development of Clinical Research Solutions


Data is undoubtedly the foundation of medical research. The question that must be answered clearly is what kind of data physicians actually need.


In response, Ms. Yang Haiying, Chief Medical Officer of LinkDoc Technology, told reporters: “Currently, China is continuously accumulating experience in the standardized and normalized processing of clinical research data, aligning with and referencing a series of international standards. However, challenges persist, including inconsistent data standards and the inability to integrate and share data across hospitals or research institutions. Only high-quality, large-volume, and well-structured medical data can significantly enhance research efficiency and facilitate the generation of scientific achievements.”

 

How can physicians be helped to obtain high-quality data? According to Yang Haiying, this involves three aspects:

 

First, the data quality of hospital EMRs (Electronic Medical Records). “The reason why some papers in journals have been retracted is that there was fraud in the process of data processing. This is related to the standardization and self-discipline of doctors when writing EMRs.”

 

Therefore, when hospitals utilize clinical data capture systems, medical record documentation should be as complete and standardized as possible. By leveraging electronic medical records (EMR) as the platform, an integrated system for clinical research-oriented medical record documentation and data collection should be established, enabling research data acquisition without increasing the routine workload.

 

Second, the processing of hospital EMR data. The data structuring process in particular directly determines the quality of research data. Post-hoc structuring based on big data and artificial intelligence technologies is one of the key technical solutions to this problem.

 

Through post-hoc structuring, historical medical records are structured to meet the data collection needs of scientific research, while ensuring that the original medical record data entered by physicians remains traceable.
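
The idea of structuring a record after the fact while keeping it traceable can be sketched in a few lines. The example below is a deliberately simple stand-in: it uses a regular expression where a production system would presumably rely on trained language models, and the `ExtractedField` type, the field name `lesion_size_cm`, and the sample note are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

@dataclass
class ExtractedField:
    name: str
    value: float
    unit: str
    start: int  # character offset of the source text in the original note
    end: int

# Matches measurements such as "1.2 cm" or "8 mm" (illustrative pattern only).
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)

def extract_lesion_sizes(note: str) -> List[ExtractedField]:
    """Turn free-text size mentions into structured fields, normalized to cm.

    The original note is never modified. Each field stores the offsets of
    the text it came from, so a reviewer can trace it back to the source.
    """
    fields = []
    for match in SIZE_PATTERN.finditer(note):
        value = float(match.group(1))
        if match.group(2).lower() == "mm":
            value /= 10  # normalize millimetres to centimetres
        fields.append(
            ExtractedField("lesion_size_cm", value, "cm", match.start(), match.end())
        )
    return fields

# Hypothetical free-text note.
demo_note = "CT shows a 1.2 cm nodule in the right upper lobe; a second lesion measures 8 mm."
demo_fields = extract_lesion_sizes(demo_note)
```

Slicing the note with a field's offsets, for example `demo_note[demo_fields[0].start:demo_fields[0].end]`, returns the exact wording the physician entered.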

 

Third, data quality control, verification, and analysis. Medical big data companies need a professional quality control team to verify data through manual review, logical validation, and medical verification, and they must assess the competency of the relevant personnel annually.
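
Of the three verification steps, logical validation is the easiest to automate. The sketch below shows what such rule-based checks might look like. It assumes a structured record stored as a dictionary, and the field names (`diagnosis_date`, `last_follow_up_date`, `death_date`, `age_at_diagnosis`) and the rules themselves are hypothetical examples, not the checks any particular company runs.

```python
from datetime import date
from typing import Dict, List

def logical_checks(record: Dict) -> List[str]:
    """Return human-readable descriptions of logical inconsistencies."""
    problems = []
    diagnosis = record.get("diagnosis_date")
    last_seen = record.get("last_follow_up_date")
    death = record.get("death_date")
    age = record.get("age_at_diagnosis")

    if diagnosis is None:
        problems.append("missing diagnosis_date")
    if age is None or not 0 <= age <= 120:
        problems.append("age_at_diagnosis missing or out of range")
    # Dates must follow the order in which events can actually happen.
    if diagnosis and last_seen and last_seen < diagnosis:
        problems.append("last follow-up precedes diagnosis")
    if death and last_seen and last_seen > death:
        problems.append("follow-up recorded after death")
    return problems

# Hypothetical records: one consistent, one with several contradictions.
clean_record = {
    "diagnosis_date": date(2015, 3, 1),
    "last_follow_up_date": date(2017, 9, 1),
    "death_date": None,
    "age_at_diagnosis": 62,
}
flawed_record = {
    "diagnosis_date": date(2016, 5, 1),
    "last_follow_up_date": date(2016, 3, 1),
    "death_date": date(2016, 2, 1),
    "age_at_diagnosis": 150,
}
```

Records that fail such checks would then go to manual review and medical verification, which require clinical judgment that rules like these cannot replace.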

 

Furthermore, the statistical methods employed by researchers are a critical factor influencing data quality. International journals place significant emphasis on statistical methodology, including aspects such as participant selection and analytical techniques; therefore, it is essential to engage a professional statistical team to conduct rigorous data analysis.

 

Yang Haiying stated, “It is extremely challenging to establish this comprehensive framework, which encompasses standardized data acquisition, efficient and high-quality data structuring, as well as efficient manual review and logical verification provided by the central data platform. It is precisely based on this integrated intelligent data processing system that we ensure the quality of the massive volumes of data processed and uphold the standardization and rigor of the clinical research solutions provided to physicians.”

 

III. Exploration and Presentation of the Value of Big Data in Clinical Medicine


Having explored the challenges and solutions surrounding clinical research for physicians, we are compelled to ask: What value does high-quality, research-grade data bring to the healthcare industry as a whole? Based on practical case studies from healthcare institutions and enterprises, VCBeat has identified answers in the following four areas:

 

>>>>

Supporting the Expansion from RCTs to RWS


RWS (Real-World Study) originated from pragmatic clinical trials. It is typically a non-randomized, open-label study design that does not use placebos and is conducted under real-world clinical practice conditions, yielding results with high external validity.

 

Real-world studies draw on a wide range of data sources, encompassing vast amounts of data generated through multiple channels such as outpatient visits, hospitalizations, diagnostic tests, surgeries, pharmacy records, and wearable devices, thereby enabling the inclusion of patients with multiple comorbidities and complex clinical conditions.

 

Yang Haiying stated, “In clinical practice, new drugs and diagnostic and therapeutic methods emerge every year. Applying these new technologies to a broader population than the participants in clinical trials still faces many practical challenges, as the real-world population is far larger than the cohorts involved in the research phases of drugs and medical devices. Therefore, scaling up to Real-World Studies (RWS) remains fraught with difficulties.”

 

Issues such as the feasibility of combining medications with other therapeutic modalities, the duration of pharmacological treatment, and dosing regimens tailored to the Chinese population are challenges that clinicians face on a daily basis. This necessitates that physicians draw upon experience, data, and evidence from clinical practice to determine the most rational treatment plans for individual patients.

 

To effectively leverage clinical data, it is imperative to implement standardization throughout the processes of data generation and accumulation.

 

Professor Yao Chen, Deputy Director of the Oncology Department of the Wu Jieping Medical Foundation, noted that while RCTs (Randomized Controlled Trials) emphasize standardized treatment and are generally conducted at clinical trial institutions, RWR (Real-World Research, synonymous with RWS) places greater emphasis on real-world treatment practices and can be carried out across all healthcare institutions.


 

In late 2016, the U.S. Congress passed the 21st Century Cures Act, which directed the FDA to evaluate the use of “real-world evidence” to help support the approval of new indications for already-approved drugs, thereby underscoring the significant role of real-world studies (RWS) based on large-scale accumulated case data.

 

>>>>

Promoting the Development of Clinical Practice Guidelines in China


Clinical practice guidelines refer to optimal recommendations developed through the evaluation of evidence generated from research and an assessment of the benefits and harms of various alternative interventions. International guidelines are often based on extensive clinical evidence and are frequently adopted and referenced in China. However, due to differences in population characteristics and genetic profiles, treatment regimens vary across ethnic groups. Therefore, China must establish its own clinical guidelines grounded in an evidence system derived from the Chinese population.

 

At this juncture, the value of data becomes evident. Data allow physicians to analyze their clinical diagnostic and treatment experience professionally and systematically, so that the results of real-world studies based on case data, alongside those of controlled clinical studies, can serve as high-level clinical evidence.

  

Yang Haiying stated, “At present, there is insufficient accumulation of prospective clinical study evidence based on the Chinese population; however, this gap can be partially addressed by findings from real-world studies. We hope that such evidence will assist Chinese oncology experts in refining China’s guidelines, thereby guiding clinicians’ diagnostic and therapeutic practices.”

 

“Actually, by establishing a high-quality medical big data database and leveraging current technological means, the conflict between clinical practice and scientific research for physicians can be effectively resolved,” said Yang Haiying.

 

In September 2017, the “Collaborative Clinical Research on Big Data for Thoracic Tumors” project, led by the Oncology Department of the Wu Jieping Medical Foundation with ten participating hospitals including Tianjin Chest Hospital, West China Hospital of Sichuan University, Henan Cancer Hospital, and China-Japan Friendship Hospital, aggregated medical records spanning nearly four years from these ten institutions. The project accumulated more than 32,000 lung cancer cases and over 16,000 esophageal cancer cases. Using LinkDoc’s real-world database and its end-to-end data processing and research solutions, the team completed ten real-world studies within four and a half months. In early 2018, ten abstracts derived from these studies were submitted to the American Society of Clinical Oncology (ASCO), the world’s largest and most influential oncology organization.

 

Professor Zhang Xun, Chair of the Oncology Department of the Wu Jieping Medical Foundation, pointed out that aggregating data from different hospitals significantly enhances research efficiency and outcomes. Multi-center clinical research collaborations based on real-world data can rapidly improve the research capabilities and clinical proficiency of thoracic surgeons in China, thereby strengthening the voice of Chinese thoracic surgery experts on the international stage.

 

>>>>

Supporting AI-Assisted Clinical Diagnosis and Treatment


Based at a renowned Grade-A tertiary hospital in China, the Department of Thoracic Surgery at Harbin Medical University has made significant achievements in clinical applications. According to Professor Ma Jianqun, the department uses big data analytics tools to help physicians perform precise pulmonary segmentectomies. By running research analyses on the structured databases these tools provide, physicians can determine the respective advantages of segmentectomy and wedge resection.

 

“If, within a certain age group and for lesions of a specific size range (e.g., 1 cm), there is no difference in outcomes between precise segmentectomy and conventional wedge resection, then we may cease performing precise segmentectomy in such patients. This is because wedge resection is simpler, less time-consuming, more cost-effective, and causes less trauma to the patient.”

 

The application by the Department of Thoracic Surgery at Harbin Medical University demonstrates that the value of high-quality, multi-omics datasets is not limited to scientific research. Based on these data, deep learning can be leveraged to develop AI-assisted diagnostic and therapeutic tools, thereby helping physicians improve diagnostic and treatment efficiency.

 

According to Yang Haiying, AI products currently assisting in clinical diagnosis and treatment fall into two categories. The first category is based on knowledge bases (such as clinical guidelines and relevant literature) and can only provide fixed diagnostic and therapeutic recommendations for specific patient groups in accordance with these guidelines and the published evidence. This approach is limited because individual patient characteristics vary. The second category comprises AI-based clinical decision support products that combine real-world electronic health record (EHR) data with algorithmic models. These products build precise diagnostic and therapeutic models and provide data support, thereby guiding clinical decision-making for both typical and rare cases. Furthermore, using underlying data and training materials derived entirely from Chinese patient cases will make these tools more applicable and valuable for the Chinese population and for individual clinical cases.


>>>>

Accelerating New Drug Development and Expanding Drug Indications


Big data mining and artificial intelligence technologies can extract knowledge that advances drug discovery from vast amounts of dispersed global information, such as patents and clinical trial results, thereby generating new, testable hypotheses and accelerating the drug development process.

 

Furthermore, national regulations explicitly require pharmaceutical companies to submit drug safety monitoring data within five years after a drug’s market approval; failure to do so risks the drug’s withdrawal from the market. Leveraging real-world big data offers an effective approach for pharmaceutical companies to meet such regulatory requirements.

 

A pharmaceutical industry practitioner told VCBeat, “Traditionally, pharmaceutical companies could only discover new indications through expensive randomized controlled trials (RCTs), which took a long time. Moreover, not only are the costs of meeting regulatory standards extremely high, but the associated risks are also significant. In contrast, real-world studies (RWS) can help pharmaceutical companies and experts conduct analyses in advance and initiate trials earlier.”

 

On the other hand, physicians are expected to determine the scope of drug use based on the product labeling. In practice, off-label use is common in many hospitals, where doctors prescribe medications based on their clinical judgment. By leveraging an intelligent data processing platform, it is possible to control research costs effectively while exploring the clinical efficacy and safety of such use as fully as possible. Pharmaceutical companies can then use this evidence to decide whether to expand the approved indications for their products, thereby opening up broader therapeutic applications for their drugs.


IV. Big Data Empowering Clinical Medicine: Technical Capability as the Foundation, Professional Medical Expertise as the Core


In summary, it is evident that the key to healthcare big data enterprises empowering the entire healthcare industry lies in two aspects:

 

First is technical capability, including the establishment of real-world databases, the development of big data processing systems/platforms, and artificial intelligence platforms.

 

Second is medical expertise, including leading the design of disease databases, conducting standardized and normalized data processing and quality control, as well as real-world research design, data management, and rigorous statistical data analysis.

 

Currently, as major domestic medical big data companies such as ZeroCrunch Technology and Taimei Medical have successively secured Series C financing, the medical big data sector has quietly entered its second half. Since Roche announced earlier this year that it plans to acquire the oncology big data company Flatiron Health for $1.9 billion, signals of industry transformation have become increasingly pronounced.

 

Medical big data companies can assist physicians in conducting clinical research and development, providing valuable medical solutions that truly benefit doctors and even patients.


If such services are viewed as a form of empowerment, this empowerment represents value creation that integrates both technological and medical dimensions. It is grounded in professional technical and medical expertise, providing healthcare providers, physicians, pharmaceutical companies, insurance institutions, and other stakeholders with comprehensive healthcare solutions. Ultimately, it aims to facilitate patient treatment and enhance the patient care experience. In realizing this process, technological capability serves as the foundation, while professional medical expertise constitutes the core.