Medical AI Investment Insights: Healthcare Data as the Driving Force Behind AI Innovation

Mar 28, 2018 08:00 CST Updated 08:00

Author: Yang Chengkui, Tongdu Capital


February 2018 was a month of continuous surprises for the field of medical AI.

 

Professor Zhang Kang’s team published a landmark study on the application of AI in healthcare in the top-tier journal *Cell*: they developed an AI system based on deep learning capable of diagnosing two major categories of diseases—ocular diseases and pneumonia—with accuracy rivaling that of top physicians. The study utilized 100,000 accurately annotated fundus images and 5,000 chest X-ray images.

 

The FDA approved the first AI-based diagnostic product for autism. During development, Cognoa collected extensive real-world data by offering free screening to children and their parents, using it to build and continuously refine its machine learning model, which has been tested on a cumulative total of 250,000 children and family members.

 

Lepu Medical’s AI-ECG Platform, an artificial intelligence-based automated ECG analysis and diagnostic system, has had its medical device registration application accepted by the U.S. Food and Drug Administration (FDA) for regulatory review. The platform utilized approximately 25 million ECG data samples derived from 300,000 patient ECG examinations to train its deep convolutional neural network models.

 

What these advances have in common is their reliance on “substantial” datasets. Roche's announced acquisition of Flatiron Health, a cancer big data company, for $1.9 billion (bringing the total transaction value to $2.1 billion) further underscores the value of healthcare data. Data is the fuel that powers AI development. Any discussion of healthcare data comes down to three questions:


1. How is medical data generated?

 

From the perspective of healthcare stakeholders, this involves hospitals, patients, pharmaceutical and medical device companies, insurers, genetic testing firms, and other entities. The business activities of these diverse stakeholders, as well as their interactions, generate varied types of data.

 

Hospitals are the primary source of medical data, which is categorized into clinical and non-clinical data. Clinical data generally includes: medical imaging data (such as CT, X-ray, MRI, ultrasound, endoscopy, and PET/PET-CT); medical image-related data (such as physiological signals like ECG and EEG, and digitized microscopic fields of view, e.g., pathological images); and medical text data (such as medical records, operative reports, physician orders, and examination/test reports). With the development of precision medicine, an increasing volume of genomic data is being generated, although currently only in a subset of hospitals. Non-clinical data generally includes health insurance payment data and hospital operational data.

 

The aforementioned data types already exist within hospitals, generated through clinical IT systems, medical devices, and healthcare provider behaviors. However, from the perspectives of data completeness and continuity, in-hospital diagnostic and treatment data alone are limited. Greater value can be realized only by integrating continuous out-of-hospital data on rehabilitation, physiological status, and patient behaviors. In reality, however, out-of-hospital data are largely missing, with only a small amount of follow-up data collected for research purposes. The absence of out-of-hospital patient data is primarily attributable to two interrelated factors: limitations in technological capabilities and the lack of incentive mechanisms to engage all stakeholders.

 

The development of novel biosensors and new technology-driven models is creating opportunities to generate and apply patient data outside hospital settings, a trend that deserves attention. For example, SANO is a biosensor company that helps diabetic patients continuously monitor blood glucose levels painlessly, avoiding the discomfort associated with finger-prick testing. Meanwhile, DreaMed employs AI algorithms to collect and analyze patient data, including continuous glucose monitoring data, to adjust insulin pumps and optimize diabetes treatment regimens.


AiCure captures patient data via app-based photography, leveraging facial recognition algorithms to verify whether patients have prepared the correct medication and actually ingested it, thereby improving medication adherence in clinical trials and mitigating the 20–30% risk of trial failure attributable to poor patient adherence.

 

In addition to in-hospital data and out-of-hospital patient data, pharmaceutical and medical device companies, insurers, genetic testing firms, and other healthcare institutions continuously generate data during their business operations. These data hold significant value for advancing the healthcare sector and urgently need to be explored.


2. How is the data prepared for use?

 

Informatization is the foundation for data generation. After more than two decades of development, healthcare informatization in China has made real progress, and hospitals at all levels have established information systems to varying degrees. However, because overall planning and design were lacking in the early stages, hospitals now run many disparate systems with inconsistent standards, which creates barriers to interoperability. A further significant factor is that these healthcare information systems were designed to support operational workflows, with no consideration of how the accumulated data would later be used.


Although business systems have accumulated vast amounts of data, the reality is that “data silos” and “poor data quality” persist, rendering these healthcare data unusable in their raw form.

 

To unlock its value, medical data must rest on a foundation that is “multi-dimensional, large-scale, structured, and standardized.” Data generated by business systems must therefore undergo a process of “data integration and processing.” In layman’s terms, this means aggregating and fusing various types of data from disparate systems into an “analyzable” dataset. Data collection and governance must follow defined standards so that the data are cleaned and unstructured data are converted into structured formats, particularly medical text with poor inherent structure, such as medical records and operative reports. Because the baseline quality of medical data is weak and the healthcare system has historically been closed, the “data integration and processing” phase is fraught with challenges. Unless these hurdles are overcome, the full value of data applications is difficult to realize.
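The “data integration and processing” step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two source systems, the field names (`patient_id`, `pid`, `note`, `test`, `value`), the sample records, and the single regular expression standing in for text structuring. Real pipelines rely on far more robust methods and on agreed coding standards.

```python
import re
from collections import defaultdict

# Hypothetical exports from two hospital systems (illustrative data only).
emr_notes = [
    {"patient_id": "P001", "note": "Chest X-ray shows pneumonia. Temp 38.5 C."},
    {"patient_id": "P002", "note": "Routine follow-up. Temp 36.8 C."},
]
lab_results = [
    {"pid": "p001", "test": "WBC", "value": "12.4"},  # different key name and casing
    {"pid": "P002", "test": "WBC", "value": "6.1"},
]

# Toy rule for turning free text into one structured field.
TEMP_PATTERN = re.compile(r"Temp\s+(\d+(?:\.\d+)?)\s*C")


def structure_note(note):
    """Extract a temperature reading from a free-text note, if present."""
    match = TEMP_PATTERN.search(note)
    return {"temperature_c": float(match.group(1)) if match else None}


def integrate(notes, labs):
    """Merge both sources into one record per patient under a normalized ID."""
    merged = defaultdict(dict)
    for row in notes:
        pid = row["patient_id"].strip().upper()
        merged[pid].update(structure_note(row["note"]))
    for row in labs:
        pid = row["pid"].strip().upper()
        merged[pid][row["test"].lower()] = float(row["value"])
    return dict(merged)


dataset = integrate(emr_notes, lab_results)
```

Each entry in `dataset` is a single analyzable record per patient that combines a field extracted from text with a lab value. The ID normalization stands in for the standards alignment that disparate systems usually lack.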

 

As the national health and medical data strategy advances, healthcare data will break free from constraints and accelerate its aggregation and integration. AI technology acts as a catalyst, making this process “more efficient, automated, and cost-effective.” The “aggregation and integration” of medical data will unfold across all levels and scenarios, ranging from national data centers to regional hubs, then to hospitals and departments, or driven by specific application needs such as clinical trials, insurance design, and single-disease research. For instance, Yiming Data, a leading domestic medical big data company, can “aggregate and integrate” clinical data to form large-scale, analyzable single-disease datasets. These datasets can be mined and analyzed according to the needs of various healthcare stakeholders—including clinicians, pharmaceutical companies, and insurers—thereby unlocking the value of data.

 

Products or services that leverage advanced technologies to accelerate the “aggregation and integration” of medical data will become a critical link in the medical data value chain and warrant close attention. However, regardless of how advanced the technology may be, data security and privacy protection must not be overlooked in the sensitive field of healthcare, and it is expected that technology will also play a significant role in this regard.


3. How is the data used?

 

Data mining and application become feasible only when large-scale, accessible data is available. However, in reality, the aggregation and integration of data cannot achieve significant progress in the short term. AI applications in healthcare will continue to face foundational data challenges, which means that the pace of AI development across different healthcare subsectors will vary in the near term. Data infrastructure must be evaluated in the context of specific application scenarios. The complexity of healthcare gives rise to a wide range of potential use cases; however, the extent to which AI can address problems differs across these scenarios. In some cases, there may not be critical pain points, rendering AI improvements marginal or of limited practical value despite their technical feasibility. Furthermore, the underlying data infrastructure varies significantly depending on the application scenario.

 

When an application scenario is clearly defined, the problem to be solved is urgent, and a solid data foundation exists (large-scale, structured, and highly standardized available data), pairing that foundation with advanced algorithms can yield remarkable results. The rapid development of AI in medical imaging is attributable precisely to its robust data foundation. Medical imaging is a critical auxiliary tool for disease diagnosis, and its use in clinical practice faces numerous pain points where AI can significantly improve both efficiency and quality of care. Beyond clinical applications, AI also holds immense potential in other healthcare sectors, such as new drug development and genomic data mining.

 

The application of new technologies must deliver user value as a prerequisite, and only those capable of commercial translation can achieve sustainable development. Therefore, when considering AI applications in healthcare, it is essential to factor in their future commercial potential: Is there a clear payer? What are the willingness and ability to pay? How large is the potential market size?

