Healthink Care Technology Advances Healthcare Data Standardization with Million-Level Medical Code Dictionary

Aug 03, 2022 08:00 CST Updated 08:00

Is China’s medical big data industry gradually transitioning from the “concept dividend phase” to the “value realization phase”?

 

From the perspective of the external policy environment, since the State Council’s 2015 “Notice on Issuing the Outline for Promoting Big Data Development” first explicitly called for developing big data in medical and health services and building a big data application system for medical and health management and services, the national government has successively issued multiple documents to promote the development of the medical big data industry. These policies have progressively evolved from broadly “establishing construction” to more granular aspects such as “how to construct” and “how to regulate.”

 

Vigorous policy promotion has spurred a large number of enterprises and investment institutions to enter the market, also fueling robust financing and investment activity in the primary market.

 

According to the “Q1 2022 Global Trends in the Value of Healthcare Big Data” report by VCBeat, Chinese healthcare big data companies experienced a surge in financing in 2018. Subsequently, as the development of medical artificial intelligence in China entered a more challenging phase, investment in the healthcare big data sector gradually cooled down, but it rebounded steadily in 2020 and 2021, reaching a peak.

 

After years of accumulation, a number of companies operating in niche sectors have advanced to Series D and beyond, gradually entering the harvest phase. However, financing data from 2016 to 2021 shows that 55% of invested companies remained at Series A or earlier stages, indicating that the medical big data sector still hosts a large number of startups and is still some distance away from the industry as a whole entering the harvest phase.

 

Meanwhile, the medical big data industry continues to face challenges such as data silos, lack of data standardization, and privacy and security concerns. The solutions to these underlying issues have become the key to corporate development—those who break through first will emerge from the pack.

 

Amidst the fierce competition, some companies have chosen to invest in building hospital information infrastructure, others have opted to establish data integration platforms for hospitals, while still others are bridging data among hospitals, insurance companies, and pharmaceutical firms to unlock the application value of medical data.

 

Kangding Technology falls into the last category of enterprises. It is undeniable that it faces numerous competitors; however, after seven years of navigating the industry, the company has carved out its own niche for sustainable growth.


A Serendipitous “Encounter” with Medical Data from Over Fifty Chinese and Western Medicine Hospitals


If one were to write a chronicle of China’s healthcare industry, 2015 would undoubtedly be “highlighted in bold and vivid colors.”

 

 

Just as industry practitioners were dismissing this new regulation as a mere formality, its rigorous enforcement dispelled such notions: the product withdrawal rate rose from 20% in the first month to 89.4% within a year.

 

The growing pains in the industry reflect the government’s determination to address data fabrication, non-compliance, and incompleteness in clinical trials. This has created a favorable policy environment for the orderly development of the pharmaceutical sector and, alongside a heightened industry-wide awareness of data standardization, has given rise to a vast blue-ocean market.

 

It is against this backdrop that the founder of Tianjin Kangding Technology Co., Ltd. chose to enter the field and, in 2016, participated in projects under the “National 13th Five-Year Plan Key R&D Program for New Drugs,” engaging in the in-depth mining of comprehensive medical data from more than 50 top-tier (Grade A tertiary) hospitals specializing in both traditional Chinese and Western medicine across China.

 

According to Xue Shaobo, General Manager of Tianjin Kangding Technology Co., Ltd., the team was thrilled when they first encountered such a large volume of medical data. However, after the initial excitement subsided, the immense workload made everyone realize the arduous nature of the task: for nearly three months, team members practically lived at the office, eating and sleeping there almost every day.

 

The work was full of obstacles. The most vexing challenge, however, was standardization, seemingly the most “unremarkable” yet most critical task, because at that time it could only be performed manually.

 

“We had only 187 staff members at the time. Faced with such a massive volume of data, we had no choice but to manually standardize each entry in spreadsheets—one by one. Even if we worked until retirement, we wouldn’t have been able to finish,” Xue Shaobo lamented.

 

It was precisely this experience that inspired Tianjin Kangding Technology Co., Ltd. to pursue automated, intelligent data standardization.


A Million-Entry Medical Code Dictionary Integrating Traditional Chinese and Western Medicine


Medical and health data hold significant importance for scientific research, drug evaluation, patient management, and other areas. This is an undisputed fact within the industry.

 

However, regardless of the application domain, raw and disorganized medical data cannot be directly utilized. In other words, all applications involving healthcare data must be built upon the foundation of standardization.

 

The first step toward standardization is to unify terminology and metrics. For example, in clinical laboratory tests, the same item may be recorded as “white blood cells” or “WBC.” Item names must therefore be standardized first, followed by the indicators themselves (indicators are reported in different units and with different normal value ranges, so unifying them requires sound algorithms). Only then does the resulting data have value for digital asset mining and application.
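The two steps described above (unify the item name, then unify the unit) can be sketched in a few lines. This is only an illustration under stated assumptions: the alias table, the canonical unit, and the function name `standardize_result` are hypothetical and are not taken from Kangding Technology's dictionary. The one fixed fact used is the unit identity that 1 × 10^9 cells per litre equals 1,000 cells per microlitre.

```python
# Hypothetical lookup tables for illustration only; a real medical code
# dictionary would be far larger and expert-reviewed.
ALIASES = {
    "white blood cells": "WBC",
    "white blood cell count": "WBC",
    "wbc": "WBC",
}

# Assumed canonical reporting unit for each standardized item.
CANONICAL_UNIT = {"WBC": "10^9/L"}

# Multiplicative factors that convert a reported unit into the canonical one.
# 1 x 10^9 cells/L == 1000 cells/uL, so cells/uL -> 10^9/L is a factor of 0.001.
UNIT_FACTORS = {
    ("WBC", "10^9/L"): 1.0,
    ("WBC", "cells/uL"): 0.001,
}


def standardize_result(name, value, unit):
    """Return (standard_name, value_in_canonical_unit, canonical_unit)."""
    key = name.strip().lower()
    if key not in ALIASES:
        raise KeyError(f"no dictionary entry for item name: {name!r}")
    item = ALIASES[key]
    if (item, unit) not in UNIT_FACTORS:
        raise KeyError(f"no conversion known for {item} in unit {unit!r}")
    return item, value * UNIT_FACTORS[(item, unit)], CANONICAL_UNIT[item]
```

With these tables, a result entered as “White Blood Cells”, 6500, “cells/uL” and one entered as “WBC”, 6.5, “10^9/L” both resolve to the same standardized item and unit, which is what makes records from different hospitals comparable.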

 

Unfortunately, there is currently no off-the-shelf “standard” for the industry to follow. After years of effort, Tianjin Kangding Technology Co., Ltd. has therefore independently developed a medical code dictionary that integrates both traditional Chinese and Western medicine, incorporating ICD-10 and WHO standards.



 

According to Xue Shaobo, this dictionary was not conceived out of thin air by Tianjin Kangding Technology Co., Ltd.; rather, it is the culmination of insights from over 100 expert review meetings, boasting social recognition and market applicability.

 

“To some extent, the dictionary library is equivalent to the standard for data matching; only by possessing such a standard can one dominate the market,” said Xue Shaobo.

 

However, the dictionary database is merely a reference framework; to truly target and capture the market, a compatible operational system is also required.

 

In this regard, Kangding Technology has also developed a data standardization platform based on the aforementioned dictionary database. The specific operational workflow is as follows: data entry → data cleaning → batch standardization of non-standard terms by standardization specialists → secondary review by quality control personnel.

 

Furthermore, the data standardization platform of Tianjin Kangding Technology Co., Ltd. enables automated matching of data, specifically by assessing the similarity between data entries and dictionary tables, which serves as the foundation for the platform’s data standardization capabilities.


Specifically, during the standardization process, a term whose similarity to an entry in the dictionary table is 100% (for example, after vocabulary normalization) is matched automatically. If the similarity between the term and the dictionary table is insufficient for automatic matching, standardization or quality control personnel perform secondary standardization. If the similarity is 0 and the dictionary table contains no matching field, reviewers may reset the entry.
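The three-way routing described above can be sketched as follows. The article does not say how the platform computes similarity, so this sketch substitutes Python's standard-library `difflib.SequenceMatcher` as a stand-in. The function name `route_term`, the status labels, and the choice to send every score strictly between 0 and 1 to manual review are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def route_term(term, dictionary_terms):
    """Route a raw term to automatic matching, manual review, or reset.

    Returns (status, best_dictionary_entry, similarity_score).
    """
    # Light vocabulary normalization before comparison (assumed step).
    cleaned = term.strip().lower()

    best_entry, best_score = None, 0.0
    for entry in dictionary_terms:
        score = SequenceMatcher(None, cleaned, entry.strip().lower()).ratio()
        if score > best_score:
            best_entry, best_score = entry, score

    if best_score == 1.0:
        # 100% similarity: safe to match without human involvement.
        return "auto_match", best_entry, best_score
    if best_score > 0.0:
        # Partial similarity: hand the best candidate to a specialist.
        return "manual_review", best_entry, best_score
    # No overlap with any dictionary entry: leave it for a reviewer to reset.
    return "reset", None, 0.0
```

For a dictionary containing “WBC” and “Hemoglobin”, the input “ wbc ” is matched automatically, the misspelling “hemoglobn” is queued for manual review with “Hemoglobin” as the suggested candidate, and “xyz” falls through to reset. A production system would likely use a stricter, domain-aware similarity measure and a tuned review threshold.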



 

The platform supports batch processing for both initial data standardization and secondary standardization performed by standardization and quality control personnel. Although, in theory, there is no upper limit to the volume of data that can be processed in a single operation (as data processing capacity depends primarily on server performance), Tianjin Kangding Technology Co., Ltd. recommends setting the batch size for single-time standardization to 50 records.
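Batch processing with the recommended size of 50 can be illustrated with a small helper. This is a generic chunking sketch, not the platform's code: the function name `batches` and the in-memory list of records are assumptions, and only the default of 50 comes from the recommendation above.

```python
def batches(records, size=50):
    """Yield consecutive slices of `records`, each holding at most `size` items."""
    if size <= 0:
        raise ValueError("batch size must be a positive integer")
    for start in range(0, len(records), size):
        yield records[start:start + size]
```

Splitting 120 records this way produces three batches of 50, 50, and 20 records, so the final batch is simply smaller rather than padded.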

 

Built on automation and batch processing, Kangding Technology’s data standardization platform is more efficient and cost-effective than traditional manual operations.

 

“Prior to the establishment of the dictionary repository and standardization system, data standardization was conducted in Excel spreadsheets. A single staff member could standardize no more than 500 records in an eight-hour workday, with a standardization cost of 8 yuan per record, resulting in high costs and low efficiency,” revealed Xue Shaobo. In the same eight hours, Kangding Technology’s data standardization system can process over 150,000 records, an efficiency 300 times that of manual operations, with auditors verifying a data accuracy rate exceeding 90%.
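The 300-fold figure follows directly from the numbers quoted above, as this short calculation shows. The article gives the inputs as bounds (“no more than 500” and “over 150,000”), so the results are approximate. The manual-cost line is a derived illustration of what the same volume would cost at 8 yuan per record; it is not a figure reported by the company, and the platform's own per-record cost is not stated.

```python
MANUAL_RECORDS_PER_DAY = 500        # upper bound for one person in eight hours
PLATFORM_RECORDS_PER_DAY = 150_000  # lower bound for the platform in eight hours
MANUAL_COST_PER_RECORD_YUAN = 8     # quoted manual standardization cost

# How many times more records the platform handles in the same eight hours.
speedup = PLATFORM_RECORDS_PER_DAY // MANUAL_RECORDS_PER_DAY

# Person-days needed to standardize one platform-day of records by hand.
manual_person_days = PLATFORM_RECORDS_PER_DAY // MANUAL_RECORDS_PER_DAY

# Derived illustration: cost of that same volume at the manual rate.
manual_cost_yuan = PLATFORM_RECORDS_PER_DAY * MANUAL_COST_PER_RECORD_YUAN

print(speedup, manual_person_days, manual_cost_yuan)
```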

 

However, whether it is a dictionary library or a data standardization platform, these solutions remain at the technical level. How to effectively implement them, deliver tangible value to the healthcare industry, and achieve genuine value conversion is another critical issue that enterprises must address. What path has Tianjin Kangding Technology Co., Ltd. chosen?


Building a Patient-Centric Full-Industry Ecosystem Chain


At the end of last year, VCBeat conducted an annual review of the medical big data sector, identifying five major profitable scenarios. Among them, leveraging data to empower application scenarios for pharmaceutical companies, hospitals, and insurance institutions stood out as one key area. This is also the commercial path chosen by Tianjin Kangding Technology Co., Ltd.

 

VCBeat has learned that, as of now, projects involving Kangding Technology’s digital R&D platform have yielded over 300 published articles, along with more than 50 collaborative initiatives in medical big data. Partners include internet hospitals, state-owned pharmaceutical enterprises, and companies undergoing initial public offerings (IPOs).

 

What Empowerment Can Kangding Technology Provide to Stakeholders Across the Industry?

 

For hospitals, the interoperability of patient data not only enables better patient management and the formulation and timely adjustment of treatment plans, but also helps implement tiered diagnosis and treatment and break down data barriers encountered by patients seeking care across different hospitals. For pharmaceutical companies, standardized data not only supports drug development and enables precision marketing, but also facilitates adverse drug reaction monitoring and aids in clinical trials with greater focus. For community pharmacies, patient data similarly enables precision marketing and allows them to introduce new services based on an understanding of patient pain points, thereby increasing patient stickiness. For insurance companies, patient data reveals not only market pain points but also provides the foundation for risk control models and key considerations for underwriting and claims assessment design.

 

The construction of this ecosystem actually revolves around a central point: patient management. To quote Xue Shaobo directly, “Only by truly managing patients can industry participants, such as pharmaceutical companies, hospitals, and insurance providers, gain a genuine understanding of patient needs, thereby making so-called R&D and precision marketing feasible. And only with standardized patient data can patient management become possible.”

 

Tianjin Kangding Technology Co., Ltd. does not act as a collector or provider of raw data; rather, it serves as an “intermediary” that delivers standardized data outputs in strict compliance with national laws and regulations.


If one sentence were to summarize what Tianjin Kangding Technology Co., Ltd. is currently doing and plans to do in the future, it would be: “Under the premise of legal and regulatory compliance, standardize and store patients’ medical and health data spanning twenty years into the past and continuing indefinitely into the future, then ‘circulate’ this data to various industry stakeholders to improve business quality and meet patient needs.”

 

To this end, Kangding Technology plans to launch pilot programs this year in Henan, Tianjin, Beijing, Jiangsu, and other regions, focusing on the integration of physical healthcare with internet hospitals, as well as pharmacies with operating enterprises. Over the next two to three years, Kangding Technology aims to cultivate one technology company in each province across China to empower local industries.