In recent years, big data research has expanded rapidly from biology and basic medical research into clinical disease data and population health information, and data collection through wearable devices could yield practically unlimited volumes of data. Recently, the Director of the U.S. National Institutes of Health (NIH) praised scientists for a major advance in identifying and characterizing subgroups of patients with type 2 diabetes using big data approaches (see References below). The research concepts and methods involved merit study and further investigation. In short, this article discusses big data in translational medicine. What exactly is big data research? Where can useful big data be found? These questions need to be clarified.
I. Big Data: Expanding from Basic Research to Clinical Informatics
In 2014, the 5th China-US International Forum on Clinical and Translational Medicine hosted three leading experts in big data from the biomedical field: Professor Lee Hood, President of the Institute for Systems Biology, founder of P4 (Predictive, Preventive, Personalized, and Participatory) Medicine, and a member of all three US National Academies; Professor He Fuchu, President of the Academy of Military Medical Sciences and a leading figure in Chinese proteomics; and Mr. Wang Jian, co-founder of the world’s largest genetic testing company. Each shared insights on big data research and its applications from his own professional domain. It should be noted that the big data research they discussed consists of successful cases in basic research and systems biology. By leveraging gene sequencing technologies, proteomics tools, and systems biology methods and perspectives, we have gained a better molecular-level understanding of DNA-encoded information related to human health and disease, epigenetic markers, metabolic signatures, and more. However, for human diseases, particularly major conditions such as cancer, cardiovascular disease, and diabetes, replicable success stories in clinical big data research are still lacking. A recent paper published in *Science Translational Medicine* serves as a typical example of big data research. Its authors identified correlations between clinical efficacy and disease progression in patients with type 2 diabetes by analyzing large-scale datasets from population health studies and disease registries.
II. What Have Big Data Studies on Subphenotypes of Type 2 Diabetes Revealed?
Researchers in China have also conducted clinical studies on the epidemiology and screening of diabetes. However, there are as yet no reports of studies that combine patients' electronic health record information with genomic big data to investigate type 2 diabetes. NIH-funded researchers used big data to analyze 11,000 volunteer patients and identified three distinct patient subgroups. This stratification of patients with type 2 diabetes mellitus (T2DM) not only provides an evidence base for precision clinical therapy but also sheds further light on the mechanisms underlying functional decline and mortality in diabetic patients.
Specific Approaches to Big Data Research: Based on patients’ electronic health records (EHRs), researchers categorized the 11,000 volunteer patients by race and socioeconomic status. These patients were also individuals with chronic diseases receiving follow-up care in hospitals or community general practitioner clinics. Rather than using disease symptoms or treatment outcomes as observational indicators, the researchers looked for commonalities across all patients’ EHR data, specifically shared features such as laboratory test results, blood pressure readings, height and weight measurements, and other routine data elements contained in the records. This approach was less a clinical research project than the construction of a virtual patient community, with members’ medical and health information serving as the guiding clues. When researchers mapped the “community distribution” of patients according to different types of baseline information, three subgroups emerged, that is, a refined classification of patients with type 2 diabetes (as shown in the figure).
Photo source: Dudley Lab, Icahn School of Medicine at Mount Sinai, NY
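The idea of grouping patients by shared routine measurements can be illustrated with a minimal sketch. It is not the method used in the paper, which applied topological analysis of patient similarity; here a simpler stand-in is assumed: features are standardized, each patient is linked to its nearest neighbours, and connected groups in the resulting graph are treated as subgroups. The patient records, the three features, and all function names are hypothetical and made up for illustration only.

```python
import math
from statistics import mean, pstdev

# Hypothetical, made-up records (not real patient data):
# (patient_id, HbA1c in %, systolic blood pressure in mmHg, BMI in kg/m^2)
PATIENTS = [
    ("p01", 9.1, 128, 36.0),
    ("p02", 9.3, 130, 35.2),
    ("p03", 8.9, 126, 37.1),
    ("p04", 7.0, 150, 27.0),
    ("p05", 7.2, 152, 26.4),
    ("p06", 6.9, 149, 27.5),
    ("p07", 7.8, 118, 22.0),
    ("p08", 8.0, 117, 21.5),
    ("p09", 7.9, 119, 22.4),
]


def zscore_columns(rows):
    """Rescale each feature column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in columns]
    return [
        [(v - m) / s if s else 0.0 for v, (m, s) in zip(row, stats)]
        for row in rows
    ]


def similarity_graph(vectors, k):
    """Link each patient to its k nearest neighbours (Euclidean distance)."""
    edges = set()
    for i, vi in enumerate(vectors):
        dists = sorted(
            (math.dist(vi, vj), j) for j, vj in enumerate(vectors) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def connected_components(n, edges):
    """Group nodes that are reachable from one another (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


def find_subgroups(patients, k=2):
    """Return lists of patient ids, one list per connected subgroup."""
    ids = [p[0] for p in patients]
    vectors = zscore_columns([p[1:] for p in patients])
    edges = similarity_graph(vectors, k)
    return [[ids[i] for i in group]
            for group in connected_components(len(ids), edges)]


if __name__ == "__main__":
    for number, members in enumerate(find_subgroups(PATIENTS), start=1):
        print(f"subgroup {number}: {members}")
```

On this toy data the nearest-neighbour graph splits into three disconnected groups. Real EHR data would involve far more features, missing values, and a more robust similarity and clustering method, so the sketch only conveys the overall shape of the analysis.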
III. What Is Population Health Research?
Further analysis of the three subgroups of patients with type 2 diabetes drew on physiological and pathological parameters from the electronic health records, such as sex, blood glucose levels, and white blood cell count. Generally speaking, as type 2 diabetes progresses, patients develop secondary clinical manifestations such as neuropathy, blindness, renal failure, and secondary cardiovascular disease. Among the three subgroups, researchers observed that patients in the first subgroup were more likely to present with microvascular complications, such as blindness or visual impairment; these patients were also younger and more often obese. In contrast, patients in the second subgroup had a higher risk of tuberculosis and cancer. The third subgroup had a higher prevalence of HIV positivity or AIDS and was prone to conditions such as hypertension and aortic thrombosis. Compared with the second and third subgroups, patients in the first subgroup had a higher risk of secondary heart disease. This is a clear-cut example of a population health study.
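A comparison of this kind, asking which complications are over-represented in which subgroup, can be sketched as a simple prevalence and risk-ratio calculation. This is only an illustration under stated assumptions: the records below are invented, the function names are hypothetical, and a plain risk ratio stands in for whatever statistical enrichment testing the original study applied.

```python
from collections import defaultdict

# Hypothetical, made-up records (not real patient data):
# (subgroup label, set of complications recorded for one patient)
RECORDS = [
    (1, {"retinopathy"}),
    (1, {"retinopathy", "obesity"}),
    (1, set()),
    (1, {"obesity"}),
    (2, {"cancer"}),
    (2, set()),
    (2, {"cancer"}),
    (2, {"retinopathy"}),
    (3, {"hypertension"}),
    (3, {"hypertension"}),
    (3, set()),
    (3, {"hypertension", "cancer"}),
]


def prevalence_by_subgroup(records, condition):
    """Return {subgroup: fraction of its patients with the condition}."""
    totals = defaultdict(int)
    cases = defaultdict(int)
    for subgroup, conditions in records:
        totals[subgroup] += 1
        if condition in conditions:
            cases[subgroup] += 1
    return {g: cases[g] / totals[g] for g in sorted(totals)}


def risk_ratio(records, condition, subgroup):
    """Prevalence inside the subgroup divided by prevalence in all others."""
    inside = [c for g, c in records if g == subgroup]
    outside = [c for g, c in records if g != subgroup]
    p_in = sum(condition in c for c in inside) / len(inside)
    p_out = sum(condition in c for c in outside) / len(outside)
    return p_in / p_out if p_out else float("inf")


if __name__ == "__main__":
    print(prevalence_by_subgroup(RECORDS, "retinopathy"))
    print(risk_ratio(RECORDS, "retinopathy", subgroup=1))
```

With real cohorts, such ratios would need confidence intervals and correction for multiple comparisons before any subgroup difference could be called meaningful.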
As researchers probe the underlying mechanisms behind these big data findings (seeking to understand not only what happens but also why), the genomic sequence and coding information of patients in each subgroup turns out to correspond closely to their differing clinical manifestations. All of this gives researchers the opportunity to examine the lifestyles, dietary habits, and living environments of the different subgroups of patients with type 2 diabetes, as well as other external factors that may induce diabetes or genetic mutations.
Case Analysis: At this point, we should have a clear understanding of the fundamental concepts and design principles of big data research. Leveraging big data for life sciences exploration extends beyond basic or clinical medical projects; it predominantly involves translational research in interdisciplinary fields. In the practice of clinical and translational medical research, greater emphasis is placed on the spirit of teamwork.
References:
Original paper in Science Translational Medicine: “Identification of type 2 diabetes subgroups through topological analysis of patient similarity.” Li L, Cheng WY, Glicksberg BS, Gottesman O, Tamler R, Chen R, Bottinger EP, Dudley JT. Sci Transl Med. 2015 Oct 28;7(311):311ra174.
This article is published on VCBeat with the authorization of the author, Dr. Shi Zhanxiang. Please obtain the author's permission before reposting it.