On September 13, 2018, the National Health Commission issued the “Administrative Measures for Standards, Security, and Services of National Health and Medical Big Data (Trial),” which regulate the health and medical big data industry in terms of both standardized management and development and utilization. The Measures provide guidance in four areas: standards, security, services, and oversight of medical big data. Because they directly address current pain points in the field, the Measures matter for the coordinated standardization of data management, the assignment of security responsibilities, and the regulation of data services and administration.
Starting from the market for medical big data, VCBeat Institute collected information on 561 domestic enterprises involved in healthcare big data and analyzed the industry’s development stage and current status by categorizing the companies across dimensions such as sub-sector, big data value, and big data application.
Through analysis, we have derived the following key data and conclusions:
1. From an industrial perspective, companies in the global big data sector are experiencing rapid development. The China Academy of Information and Communications Technology (CAICT) estimated that the market size of China’s big data industry reached RMB 470 billion in 2017, with a growth rate of approximately 30%. Of this, the output value of big data software and hardware was approximately RMB 23.4 billion, representing a growth rate of around 39%.
2. Reports released by EMC and IDC show that the global volume of healthcare data was 153 EB in 2013, with an expected annual growth rate of 48%. This means that by 2020, this figure will reach 2,314 EB (2.26 ZB).
3. Since 2014, the Chinese government has intensively formulated and rolled out policies related to healthcare big data. With the top-level design largely completed and a wave of state-backed healthcare big data companies emerging, these developments signal that the healthcare big data industry is entering a phase of rapid growth.
4. The richness and accuracy of medical big data sources are key factors driving the development of the related industrial chain. Electronic medical records (EMRs) constitute the primary source of medical big data and represent a critical competitive battleground for industry growth; this sector is poised to produce unicorn companies most rapidly. Currently, the top three data sources are electronic medical records, laboratory test data, and medical imaging data.
5. The five major users of medical big data are physicians, healthcare institutions, individuals, pharmaceutical companies, and insurers, each corresponding to distinct segments of the industry chain and market opportunities.
6. Data processing providers, represented by Alibaba, will also generate significant market opportunities. Their rise signals that big data is entering a phase of integration and governance, in which siloed data is brought into the industrial chain for cleansing.
7. In the generation of new standardized medical big data, genomics has been the biggest beneficiary, and a vast medical big data market will arise around the genomics value chain.
8. We have mapped out the data-generation funnel for the first time and, based on this, charted a corresponding industry landscape. Feel free to use it.
Any human physiological sign or behavior can generate data; in the past, however, these signs and behaviors were not digitized, collected, or stored by appropriate devices. In the early 21st century, with the widespread adoption of information technology, people began to recognize the value that data could bring. At that time, data generation was confined to computer platforms and the internet, so only the portion of human-generated data that arose within the internet environment could be collected.
Since the 1980s, the world’s data storage capacity has doubled every 40 months (Hilbert & López, 2011). According to a 2014 report by IDC, the global datasphere reached 4.4 zettabytes (ZB) in 2013 and was projected to grow tenfold to 44 ZB by 2020. An increasing number of data-generating devices transmit information via the internet to data storage providers. The explosive growth in data volume has first created significant business opportunities for these providers, leading to substantial revenue increases. Furthermore, once big data is mined for value, it serves as a powerful driver for economic development.

Data source: IDC
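The two growth figures cited above (storage capacity doubling every 40 months, and the datasphere growing from 4.4 ZB to 44 ZB) can be put on a common annual basis. The sketch below is illustrative only: it assumes smooth compound growth, and the function names are our own rather than anything defined by IDC or by Hilbert & López.

```python
def annual_rate_from_doubling(months_to_double: float) -> float:
    """Annual growth rate implied by a fixed doubling period (compound growth assumed)."""
    return 2 ** (12 / months_to_double) - 1


def annual_rate_from_multiple(multiple: float, years: float) -> float:
    """Annual growth rate implied by growing `multiple`-fold over `years` years."""
    return multiple ** (1 / years) - 1


# Storage capacity doubling every 40 months (Hilbert & López, 2011)
storage_rate = annual_rate_from_doubling(40)

# Datasphere growing from 4.4 ZB (2013) to 44 ZB (2020), per IDC
datasphere_rate = annual_rate_from_multiple(44 / 4.4, 2020 - 2013)

print(f"Doubling every 40 months: about {storage_rate:.1%} per year")
print(f"4.4 ZB to 44 ZB in 7 years: about {datasphere_rate:.1%} per year")
```

Under these assumptions, a 40-month doubling period corresponds to roughly 23% growth per year, while the tenfold IDC projection corresponds to roughly 39% per year.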
From an industrial perspective, companies in the global big data sector are experiencing rapid development. The China Academy of Information and Communications Technology (CAICT) estimated that the market size of China’s big data industry reached RMB 470 billion in 2017, with a growth rate of approximately 30%. Of this, the output value of big data software and hardware was approximately RMB 23.4 billion, representing a growth rate of around 39%.
Nowadays, the field of medical big data has also begun to enter an era of value output, playing a significant role in diagnosis and treatment as well as hospital management.
The widespread adoption of electronic medical records (EMRs) at the current stage has driven rapid growth in valuable healthcare big data, significantly increasing the volume of data available to physicians, researchers, and patients. According to reports released by EMC and IDC, the global volume of healthcare data reached 153 exabytes (EB) in 2013, with a projected annual growth rate of 48%. This implies that by 2020 the figure would reach 2,314 EB (2.26 zettabytes [ZB]). Set against IDC’s projection of 44 ZB for all data in 2020, we estimate that healthcare data will account for approximately 5.1% of global data volume.
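The arithmetic behind these figures can be checked with a short sketch. It assumes compound annual growth from 2013 to 2020 and the binary convention 1 ZB = 1,024 EB (the convention under which 2,314 EB equals 2.26 ZB). The 44 ZB global total is the IDC projection cited earlier, and the variable names are illustrative.

```python
EB_PER_ZB = 1024  # binary convention; an assumption that matches 2,314 EB = 2.26 ZB

base_eb = 153             # global healthcare data in 2013 (EMC/IDC)
reported_2020_eb = 2314   # reported 2020 figure (EMC/IDC)
years = 2020 - 2013
global_2020_zb = 44       # IDC projection for all data in 2020

# Straight compounding at the quoted 48% annual rate
compounded_eb = base_eb * 1.48 ** years

# Annual rate implied by the two reported endpoints
implied_rate = (reported_2020_eb / base_eb) ** (1 / years) - 1

# Healthcare share of all data in 2020
healthcare_zb = reported_2020_eb / EB_PER_ZB
share = healthcare_zb / global_2020_zb

print(f"Compounded at 48%: {compounded_eb:,.0f} EB")
print(f"Implied annual rate: {implied_rate:.1%}")
print(f"2020 healthcare data: {healthcare_zb:.2f} ZB, {share:.1%} of {global_2020_zb} ZB")
```

Compounding 153 EB at exactly 48% for seven years gives a value somewhat above the reported 2,314 EB, so the 48% rate is best read as approximate. The 5.1% share follows from 2.26 ZB out of 44 ZB.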
Jin Xiaotao, former Deputy Director of the National Health and Family Planning Commission and President of the Chinese Society for Health Informatics and Medical Big Data, predicts that when China’s total population peaks at 1.5 billion, the country’s health and medical big data alone will exceed one zettabyte (ZB). He believes that, due to its immense volume, healthcare big data will give rise to a massive industry scale and yield substantial economic benefits.
Since 2014, the state has successively introduced policies to support the development of medical big data, largely completing the top-level design and outlining a blueprint for the industry’s advancement.
In 2014, the National Health and Family Planning Commission formulated the “46312” Project. The digits stand for: four tiers of health information platforms (national, provincial, prefectural city, and county); six business applications supported by electronic health records (EHRs) and electronic medical records (EMRs), namely public health, medical services, healthcare security, drug management, family planning, and comprehensive management; three databases (an electronic health record database, an electronic medical record database, and a case-based population registry database); one secure health network; and two systems, one for health standards and one for security.
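One way to read the project name is as a digit-by-digit count of its components. The sketch below encodes the components listed above in a plain Python dictionary and derives the code from the counts. The key names are our own shorthand, not official terminology.

```python
# Components of the "46312" Project as described in the text.
# Key names are illustrative shorthand, not official terms.
PROJECT_46312 = {
    "platform_tiers": ["national", "provincial", "prefectural city", "county"],
    "business_applications": [
        "public health",
        "medical services",
        "healthcare security",
        "drug management",
        "family planning",
        "comprehensive management",
    ],
    "databases": [
        "electronic health records",
        "electronic medical records",
        "case-based population registry",
    ],
    "networks": ["secure health network"],
    "systems": ["health standards", "security"],
}


def project_code(components: dict[str, list[str]]) -> str:
    """Concatenate the number of items in each component group, in order."""
    return "".join(str(len(items)) for items in components.values())


print(project_code(PROJECT_46312))
```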
Subsequently, the state has formulated dozens of additional policies related to medical big data to promote the development of the medical big data industry.
VCBeat outlines the developmental trajectory of healthcare big data across three dimensions—data acquisition, data governance, and data application—thereby constructing an hourglass model for healthcare big data. These three stages reflect the state transitions of big data, illustrating the process by which data is transformed into knowledge, and knowledge, in turn, guides action.
Graphic by VCBeat Eggshell Institute
More specifically, the big data field can be divided into five aspects: data acquisition, data storage, data processing, data analysis, and data application. The input end of medical big data consists of healthcare data generated by various information systems, sensors, and smart devices. After collection, massive amounts of medical big data are stored in data centers, where they undergo cleaning and processing to extract valuable insights. Finally, knowledge derived from big data analytics guides clinical practices, thereby creating value.
It is generally recognized that the sources of medical big data are increasingly diverse and that such data can yield valuable insights for healthcare services. Less often noted is that, although the volume is massive, most of it is low-quality or "garbage" data, and only a relatively small proportion is truly valuable. Cleaning and processing medical big data through intermediate steps significantly enhances its potential value. The three stages of input, processing, and application are therefore all indispensable. The medical big data hourglass model we have developed illustrates these three critical steps in transforming medical big data into knowledge and, subsequently, into actionable guidance.

Graphic by VCBeat Eggshell Research Institute
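The input, processing, and application stages of the hourglass model can be sketched as a toy pipeline. Everything here is hypothetical: the record fields, the plausibility range, and the summary statistic are invented for illustration and are not taken from the report or from any real system.

```python
from statistics import mean

# Stage 1, input: raw records as they might arrive from different systems.
# All fields and values are made up for illustration.
raw_records = [
    {"patient_id": "p1", "systolic_bp": 128},
    {"patient_id": "p2", "systolic_bp": None},   # missing measurement
    {"patient_id": "p3", "systolic_bp": 1280},   # implausible value (entry error)
    {"patient_id": "p1", "systolic_bp": 128},    # exact duplicate
    {"patient_id": "p4", "systolic_bp": 142},
]


def clean(records, low=50, high=300):
    """Stage 2, processing: drop missing, implausible, and duplicate records."""
    seen = set()
    cleaned = []
    for rec in records:
        value = rec["systolic_bp"]
        key = (rec["patient_id"], value)
        if value is None or not low <= value <= high or key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def summarize(records):
    """Stage 3, application: a simple statistic computed on cleaned data only."""
    return mean(rec["systolic_bp"] for rec in records)


cleaned = clean(raw_records)
print(f"kept {len(cleaned)} of {len(raw_records)} records")
print(f"mean systolic pressure: {summarize(cleaned)}")
```

Without the cleaning stage, the single implausible entry would pull the mean far from any realistic value, which is the point the model makes about its middle stage.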
The medical big data industry did not emerge in its current form from the outset; prior to the advent of big data solutions, the value that medical big data could deliver was limited. With advancements in technologies such as informatization, the Internet of Things (IoT), cloud computing, and artificial intelligence, the utility of big data has been steadily increasing. We are gradually transitioning from an era of data acquisition to one of information mining and value delivery. Accordingly, the value of data has evolved from merely summarizing medical practices to supporting clinical decision-making and providing comprehensive, AI-assisted healthcare decision support.
VCBeat Research Institute screened 561 companies related to medical big data from the VCBeat enterprise database, compiled statistics on dimensions such as financing stage, data type, and application direction, and derived insights into industry trends from those statistics. Of these 561 companies, 243 were sequencing companies in the genomics sector.
I. Market Layout of Medical Big Data Enterprises
Graphic by VCBeat Eggshell Institute
Data Source: VCBeat Eggshell Research Institute
Most medical big data companies have either not raised funds or not disclosed their funding status; fewer than 50% have completed a funding round. Among those that have secured funding, eight are already listed on the secondary market. These include traditional healthcare IT companies such as Winning Health Technology Group and Neusoft Healthcare, which collect and integrate medical big data through information systems, as well as gene sequencing firms like BGI Genomics and Berry Genomics.
Among companies that have secured financing, a relatively high proportion are at the angel and Series A stages, which is consistent with the current state of the primary market. Companies at the Series B stage are also more numerous than in other sub-sectors; the vast majority (32 companies) are genomics startups, and the rest are AI-driven medical imaging enterprises. In contrast, medical big data companies whose core business is clinical data integration and mining remain underrepresented in Series B and later rounds.
Data source: VCBeat Eggshell Research Institute
The bubble chart of medical big data enterprises by number shows that the largest group of companies focuses on genomic big data, a field that is currently developing at a rapid pace. However, clinical demand for genomic big data has not been strong to date, and only a small fraction of products have achieved widespread adoption. Enterprises specializing in medical big data derived from clinical sources rank second in number. This is because large-scale clinical data (such as patient vital signs, disease categories, treatment regimens, and therapeutic outcomes) can be applied directly to diagnosis and treatment when integrated with practical application scenarios, which creates substantial data demand among healthcare institutions.
Source: VCBeat Eggshell Research Institute
The above content is an excerpt from “Entering the Era of Value Output: Medical Big Data Industry Report.” The full report is approximately 25,000 words long and was originally priced at RMB 499. Members can obtain the report free of charge.