Yang Zimo of Shulian Yixin: Tackling Chaotic Medical Data Through Governance Before Application | Forum Speech

Sep 20, 2017 08:00 CST Updated 08:00

Healthcare has strong potential demand for artificial intelligence. A relatively complete industrial structure for “AI + Healthcare,” encompassing infrastructure, technology, and applications, has begun to take shape globally. For new technologies to truly drive industry transformation, coordinated efforts across policy, technology, talent, and other areas are essential, alongside corporate exploration and the accumulation of experience over time. To explore future development and practical implementation strategies for healthcare big data and artificial intelligence, the 2017 Changjiang Industry Forum (Autumn Session) and the Healthcare Big Data and Artificial Intelligence Conference were held at the Wuhan Conference Center on September 16–17, 2017.


At the conference, Yang Zimo, Chief Data Officer of Chengdu Shulian Yixin Technology Co., Ltd., delivered a presentation titled “Data Governance Starts with Medical Equipment,” in which she outlined the current state of medical data, the rationale and methodologies for data governance, and Shulian Yixin’s data management practices in the field of medical equipment. The following is a summary of the highlights from her speech, compiled by VCBeat:


Guest Introduction



Ms. Yang Zimo, Chief Data Officer of Chengdu Shulian Yixin Technology Co., Ltd.


Graduated from the University of Electronic Science and Technology of China in 2013. Since her undergraduate studies, she has been engaged in research on network science and data mining, publishing more than ten papers in international SCI journals. Among these, one paper was selected as an ESI Highly Cited Paper, and another was featured in a long-form special report by MIT Technology Review (see below for the list of publications). In 2012, she received the Best Student Paper Award at the 8th National Academic Conference on Complex Networks, as the only undergraduate recipient. In 2013, she was awarded the “Alibaba Star” distinction through Alibaba’s campus recruitment program. Only seven individuals nationwide received this honor that year, and she was the only undergraduate among them. At Alibaba Group, she successively led projects including precision email marketing (Taobao), real-time product recommendations (Taobao), offline Taobao search and recommendation services (Koubei.com), and the personalized Alipay homepage (Alipay). She ranked first (including ties) in individual annual performance within Alipay in both 2014 and 2015, and ranked first in individual annual performance within Alipay’s Wireless Business Division.


First, I would like to clarify that data governance permeates various data-intensive sectors and is not limited to medical devices; it encompasses data across the entire healthcare industry as well as many other industries.

 

What constitutes “good” data in the minds of most people? What kind of data do you find highly desirable? I believe everyone shares this perspective: Regardless of the format in which it is provided, I expect the data in the database to be accurate and clean. What I need is not merely the data output itself, but also a supporting framework for managing that data. This framework refers to data operational standards, encompassing comprehensive audit trails of all data-related operations and robust permission management for the data.
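The supporting framework described here, with audit trails and permission management around every data operation, can be illustrated with a minimal Python sketch. This is not Shulian Yixin's implementation: the class name `GovernedStore`, the permission table, and the log fields are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


class GovernedStore:
    """Toy key-value store that checks permissions and logs every operation.

    Hypothetical illustration only; names and fields are assumptions.
    """

    def __init__(self, permissions):
        self._data = {}
        # Maps a user name to the set of actions that user may perform.
        self._permissions = permissions
        # Every attempted operation is appended here, allowed or not.
        self.audit_log = []

    def _check(self, user, action, key):
        allowed = action in self._permissions.get(user, set())
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "key": key,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{user} may not {action} {key}")

    def write(self, user, key, value):
        self._check(user, "write", key)
        self._data[key] = value

    def read(self, user, key):
        self._check(user, "read", key)
        return self._data[key]
```

In this sketch, a denied operation is still written to the audit log before the error is raised, so the trail covers attempted as well as successful access.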


Current Status of Medical Data


Regarding data application, we expect data to be simple, easily retrievable, and conducive to analysis, modeling, and simulation. This is our expectation for high-quality data. However, what is the current state of medical data?


Currently, hospitals of all sizes across China operate thousands of disparate systems, each containing hundreds or even thousands of tables, resulting in highly disordered data structures. For instance, patient medical records are fragmented and unstructured, making search and screening difficult. Furthermore, within imaging systems, no parameter adjustments have been applied to the images themselves, leading to inherent biases in the imaging data. Additionally, the boundaries for medication use are ill-defined, and physicians’ clinical practices lack standardization.

 

In fact, much of the data in the healthcare industry is like this: because it is irregular and unstandardized, it is difficult to apply even though the volume is large. Healthcare data itself holds enormous value, but that value cannot be realized while the content remains chaotic. We have various information systems aiming to integrate all digitized resources, yet none of these numerous platforms can provide complete and accurate data.


Why Implement Data Governance?


How can data governance issues be resolved? Many people consider data governance to be a tedious and labor-intensive task, potentially requiring case-by-case preprocessing and incurring substantial labor costs.

 

Let us first examine the historical evolution of data. Initially, it began with data generation. Taking the healthcare industry as an example, with the advancement of informatization over the past few decades, traditional data has been managed through more effective approaches. As a result, we now have access to diverse types of information, marking a comprehensive transformation in medical data.

Many experts in the industry are implementing their visions for medical data applications; however, these efforts are predicated on the premise of having high-quality data. Without prior data governance and data sharing, realizing effective data application is exceedingly difficult.

 

Why is data governance necessary? The reason lies in the current chaotic state of healthcare data. We aim to integrate and standardize healthcare data to establish a common language for it, thereby facilitating more efficient retrieval and analysis. Why is there an additional layer of data sharing between governance and application? Data is a typical example where 1+1>2. Moreover, its value increases significantly with scale. The larger the volume of data provided to machine learning algorithms, the more precise their judgments become, thus better supporting various predictive tasks.


How to Implement Data Governance?


How should data governance be implemented? Is having a unified standard for data cleaning and imputation sufficient? I believe this is only the first step. Data asset inventory and data lineage governance are not merely about cleaning up existing “dirty data,” but more importantly, about addressing issues at the source. Establishing such a mechanism requires substantial expertise, which is essential to developing a rational and effective data governance framework.


I believe that comprehensive data governance comprises five layers. The core first layer is business data, which refers to the understanding of business operations. The second layer is metadata cleansing, the third is data standardization, the fourth is data integration and sharing, and the fifth is a big data convergence management platform.


Shulian Yixin: Big Data Management of Medical Equipment


Shulian Yixin is primarily a company specializing in medical equipment management. In the medical equipment industry, asset registries are currently quite disorganized, leading to inefficient management processes, limited after-sales service, inability to standardize inspections, and potential quality risks. By leveraging data analytics, we can effectively assign a unique “identity” to each piece of equipment. This requires standardizing equipment categories; at a minimum, we need to determine how many monitors of the same type a hospital has. Our current efforts in equipment standardization focus precisely on this aspect—implementing metadata governance for hospital-wide asset registries.
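As a rough illustration of what standardizing equipment categories in an asset registry can look like, the Python sketch below maps free-text equipment names to a standard category and then counts devices per category, which answers the "how many monitors" question above. The alias table, field names, and sample registry rows are hypothetical; the speech does not describe the method Shulian Yixin actually uses.

```python
from collections import Counter

# Hypothetical alias table: free-text registry names -> one standard category.
CATEGORY_ALIASES = {
    "patient monitor": "monitor",
    "bedside monitor": "monitor",
    "ecg monitor": "monitor",
    "b-mode ultrasound": "ultrasound scanner",
    "b ultrasound": "ultrasound scanner",
}


def standardize_category(raw_name: str) -> str:
    """Lowercase the name, collapse whitespace, and look up its category."""
    key = " ".join(raw_name.lower().split())
    # Names missing from the alias table are flagged for manual review.
    return CATEGORY_ALIASES.get(key, "unclassified")


def count_by_category(registry):
    """Count registry rows per standardized category."""
    return Counter(standardize_category(row["name"]) for row in registry)


# Made-up sample rows standing in for a hospital asset registry.
registry = [
    {"asset_id": "A001", "name": "Patient Monitor"},
    {"asset_id": "A002", "name": "bedside  monitor"},
    {"asset_id": "A003", "name": "B-mode Ultrasound"},
    {"asset_id": "A004", "name": "Infusion pump"},
]

counts = count_by_category(registry)
print(counts["monitor"])  # number of devices mapped to the "monitor" category
```

A lookup table like this only covers names someone has already seen; the "unclassified" bucket makes the remaining gaps visible instead of silently miscounting them.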

 

For newly introduced equipment, we primarily rely on brand imagery to identify the data category. We have successfully recognized various types of devices, including B-mode ultrasound scanners, achieving satisfactory average processing time and accuracy. By standardizing equipment categories, we are able to integrate extensive inspection and maintenance information into Augmented Reality (AR) systems, allowing users to visualize inspection items and fault details through AR interfaces.

 

Subsequently, by analyzing the device’s historical failure data—including its category and model—we can predict future failures. Based on current data, the accuracy of failure prediction exceeds 94%. These insights are derived from visualizations generated after comprehensive data processing. Without such data processing, it would be difficult to conduct large-scale statistical analysis and derive these findings.