What's Holding Back China's Health Data Trading Market? Insights from Recent IPO Filings

Sep 30, 2025 07:59 CST Updated 08:00

Xuanwu Hospital of Capital Medical University completed Beijing’s first health data transaction in November 2024. In April 2025, Changle District Hospital in Fuzhou sold more than 100 cranial MRI imaging datasets to the Fuzhou Data Technology Research Institute. The proceeds were deposited into the state treasury, the first time in China that usage fees for medical data assets have been remitted to the treasury.

 

Following the entry of top-tier hospitals into the arena, secondary hospitals, private hospitals, and third-party testing and inspection centers have begun applying for data asset certificates to list their health data assets on local exchanges.

 

Figure: Selected Health Data Trading Assets Listed on Online Exchanges (as of July 10, 2025; incomplete statistics)

 

However, the number of health data suppliers has not yet grown enough to trigger a qualitative change, and there are still no detailed policies governing how buyers must secure the data they use. The result is a temporary mismatch between supply and demand, and the health data trading market as a whole is growing slowly.

 

To expand their client base, staff at some exchanges call every registered user, seizing every possible opportunity to facilitate data transactions.

 

Since the establishment of the National Data Administration in October 2023, China’s data element market has had nearly two years to develop, with coverage now extending to virtually all functional departments. So, what exactly is hindering the trading of medical and health data, and even the development of the entire data trading market?

 

Supply-Side Analysis: Exorbitant Costs, Scarce Targets


First, consider the supply side. To transform raw data into tradable healthcare data assets, suppliers typically need to complete five stages: data collection, data governance, legal assessment, asset establishment, and platform-based trading.

 

"This is a time-consuming process, and even more so, a costly one."

 

The costs in this process fall mainly into three parts. The first is data governance, where the main challenges are defining the dataset content, locating the raw data, and cleaning it. In this phase, imaging data is significantly more difficult to process than text data, but it is also far more valuable.

 

Yimai Yangguang, a publicly listed third-party medical imaging service provider in China, launched its first health data trading asset, a 20GB CT chest lesion dataset, in the first half of this year and completed its first data transaction in September. According to the company, the standard cost for a radiologist at a tertiary Grade A hospital to interpret a single chest CT scan is approximately RMB 50–60. Consequently, the data governance cost for a dataset comprising 1,000 patient cases amounts to RMB 50,000–60,000.

 

After completing data governance, the next step for the data provider is to engage a law firm to conduct a compliance assessment, ensuring that the sources of the data assets meet legal requirements. Yimai Yangguang told VCBeat: “At this stage, the fee structures of law firms are largely similar; charges are not tied to the content of the data but are based solely on volume and frequency. Generally, the cost for a single digital asset assessment ranges from RMB 50,000 to RMB 60,000.”

 

After completing these two steps, the data provider must work with a data exchange to establish asset ownership rights and obtain a data asset certificate for public disclosure and trading. This step is less expensive than the preceding ones; fees vary across exchanges, but all are capped at several thousand yuan.

 

Due to the absence of large-scale data trading, it is currently not feasible to discuss whether exchanges will charge commissions proportional to transaction amounts. As the market matures, enterprises will likely incur an additional layer of transaction costs.

 

Based on this calculation, institutions now need to pay a base fee of RMB 60,000–70,000 (the legal assessment plus the exchange fee) to list a dataset, plus data governance costs that vary with the size of the dataset. Consequently, the total cost for an imaging dataset can easily exceed RMB 100,000. Before a mature data trading market is established, such substantial costs may, to some extent, deter secondary hospitals and non-public medical institutions from entering the market.
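The arithmetic behind these figures can be written out as a short Python sketch. It uses only the ranges quoted above; the names `Range` and `listing_cost` are hypothetical, and the model assumes one interpreted scan per patient case, which the source implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A low/high cost estimate in RMB."""
    low: int
    high: int

    def __add__(self, other: "Range") -> "Range":
        return Range(self.low + other.low, self.high + other.high)

    def scale(self, n: int) -> "Range":
        return Range(self.low * n, self.high * n)


# Figures quoted in the article (RMB).
INTERPRETATION_PER_SCAN = Range(50, 60)    # radiologist reading one chest CT scan
LISTING_BASE_FEE = Range(60_000, 70_000)   # legal assessment plus exchange fee


def listing_cost(num_cases: int) -> Range:
    """Estimated cost of listing an imaging dataset.

    Assumption (not from the source): exactly one scan per patient case.
    """
    governance = INTERPRETATION_PER_SCAN.scale(num_cases)
    return LISTING_BASE_FEE + governance


total = listing_cost(1_000)
print(f"1,000 cases: RMB {total.low:,} to {total.high:,}")
```

Under these assumptions, a 1,000-case dataset already lands above the RMB 100,000 mark, and the governance share grows linearly with dataset size while the base fee stays flat.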

 

Compliance Discussion: The Battle over Ownership Rights Determines Success or Failure


Beyond cost considerations, data providers are more concerned with compliance issues related to health data. Many suppliers possess substantial data assets but refrain from participating in data trading; the core issue here is the determination of ownership rights for health data.

 

The autonomous driving sector is currently one of the most mature segments in the data trading market: OEMs can effortlessly collect users’ voice and text data, as well as various sensor and LiDAR data generated during driving. These data assets belong entirely to the manufacturers themselves, eliminating the need to address disputes over data ownership.

 

In the more specialized voice data sector, companies such as Haitian Ruisheng and iFlytek can assign staff to record datasets in various tones and languages based on market demand, thereby rapidly developing standardized products for scalable and repeatable data sales. In this process, suppliers do not need to consider whether the usage environment is trusted; delivery can be as simple as copying the data onto a USB drive and handing it to the buyer.

 

Clearly, health data differs significantly from the aforementioned categories. It not only requires de-identification during model training but also necessitates tracking of subsequent usage environments to ensure the trustworthiness of the entire data utilization process. The key factor underlying this distinction is that suppliers do not hold ownership rights to health data.

 

Yimai Yangguang told VCBeat, “The ownership of health data must belong to the patient. When we use health data for model research in our daily operations, we must first obtain the patient’s consent and use their data only with their informed knowledge.”

 

"As healthcare institutions, we hold the rights to manage and use data, which empower us to perform data cleaning or develop our own models. However, if health data assets are traded, their nature shifts from 'research' to 'commercial use.' Under existing agreements, the boundary between these two categories is ambiguous; therefore, healthcare institutions cannot assume this risk by monetizing health data assets before property rights are clearly defined."

 

Resolving this core challenge will take time, requiring either regulatory bodies to enact legislation redefining rights or healthcare institutions to revise informed consent agreements to seek patients’ authorization for commercial use. Some companies estimate that it will take another two to three years to establish a robust health data trading system.

 

Discussion on Demand: The Massive Demand for Data Always Exists


So, are buyers willing to pay for data integration at scale?

 

In terms of motivation, purchasing health data directly costs buyers more than training their own physicians to standardize data collection. However, purchasing allows them to avoid establishing collaborative partnerships and sharing AI-related intellectual property rights. According to a survey by VCBeat, many companies have expressed willingness to purchase health data assets, provided that the pricing is reasonable and data quality is guaranteed.

 

However, the primary challenge for buyers at this stage is the scarcity of tradable data asset categories, with virtually no competing suppliers offering similar data. Buyers planning multi-center studies can only purchase single-center data, which naturally constrains the growth of transaction volume.

 

Price is another factor for the demand side. Because the market is still in its early growth stage, no fair pricing system yet exists for judging whether a data asset is reasonably priced. This uncertainty has, to some extent, hindered the progress of transactions.

 

In theory, given the persistent and massive demand for health data among healthcare enterprises, if exchanges can ensure adequate supply by devising methods to list substantial volumes of high-quality health data assets, the entire trading market will become self-sustaining.

 

The Value of Health Data Trading Extends Beyond the Transaction Itself


When discussing the challenges facing the development of medical AI in China, Jiang Xiaojuan, Chair of the Annual Conference on China’s Digital Economy, lamented that the reuse rate of health data in the country is excessively low. Many high-quality datasets, created through substantial efforts, have failed to deliver their intended value.

 

The emergence of the healthcare data trading industry has undoubtedly provided a solution to the aforementioned challenges. Reuse facilitated through trading can leverage market forces to maximize the value of medical data.

 

For AI companies, application research and development based on clinical data has often been constrained by collaborations with a limited number of hospitals, which can lead to regional bias in algorithm performance and hinder broad adoption.

 

Once health data trading achieves scale, these companies are expected to reduce their reliance on hospitals by integrating data from multiple institutions at the outset of R&D, thereby developing more robust artificial intelligence. Furthermore, this new model will enable greater independence in the commercialization of AI products, helping to avoid potential intellectual property disputes.

 

For healthcare institutions, the governance value of medical big data has been a topic of discussion for decades, prompting the establishment of numerous big data centers to unlock this value.

 

With data trading now gaining momentum, they are finally poised to transform data governance from a cost center into a revenue stream, thereby unlocking new avenues for income growth.

 

Following the emergence of DeepSeek, the demand for healthcare data trading has become more urgent. After all, transitioning from general-purpose large models to specialized vertical large models requires standardized healthcare data on a scale far exceeding that of traditional AI training.

 

It is important to note that the value of health data transactions extends beyond the transactions themselves. Health data assets possess a highly elastic value potential; if these data can be leveraged to accelerate the research and development of pharmaceuticals and medical devices, as well as to improve hospital operational efficiency, the resulting benefits will far exceed the value generated by the transaction process alone.

 

As in any industry, validating the business model for healthcare data requires a period of heavy investment; only after crossing the trough can participants reap the benefits of falling marginal costs.

 

Although investment has yet to materialize, the various stakeholders involved in healthcare data transactions have reached a consensus and are preparing for the subsequent influx of capital.

 

It is reported that health commissions in multiple cities have completed project approvals and established partnerships with enterprises to identify potential challenges across the entire data lifecycle—including data collection, governance, and trading—thereby conducting pilot explorations. These efforts aim to establish a guiding framework that provides data suppliers with a standardized pathway for data assetization.


Within three years, we may witness the true rise of the healthcare data trading industry—a sector with a potential market value in the hundreds of billions—which will reshape the value creation pathway of digital-intelligent healthcare and drive the iterative upgrading of the entire digital health ecosystem.