Since the second half of 2021, data security has received unprecedented attention. In just three months from June to the present, the Data Security Law, the Information Security Technology—Guidelines for Health and Medical Data Security, and the Personal Information Protection Law were successively enacted. According to statistics from VCBeat, since 2020, relevant authorities have sequentially issued 12 policies and standards concerning health and medical data security, a level of regulatory intensity that has drawn significant attention. This also indicates that national authorities have recognized the substantial risks associated with data security and are actively addressing these gaps through top-level legislation and standardization.
What positive impacts will top-level standards and legislative design have on the application and mining of healthcare data, and can they break down healthcare data silos? If these barriers are removed, can effective data sharing be achieved across various sectors such as clinical care, out-of-hospital health management, drug R&D, and health insurance payment? How can data security be ensured? Answers to these questions may be found at the “2021 China (Hangzhou) Digital Health Conference,” held on September 25 at the Academic Exchange Center in Hangzhou Future Sci-Tech City.
Healthcare data is not a topic that has only emerged in the past two years. The "Basic Specifications for Electronic Medical Records," which came into effect on April 1, 2010, states that electronic medical records refer to digital information—including text, symbols, charts, graphics, data, and images—generated by medical personnel using healthcare institution information systems during medical activities. These records can be stored, managed, transmitted, and reproduced, serving as a form of medical record documentation.
It is evident that electronic medical records (EMRs) essentially encompass the digital information across the entire patient care journey, with the involved systems covering nearly all of a hospital’s internal information technology systems. From a data-driven perspective, their importance is self-evident. Therefore, since 2008, the Chinese government has been progressively promoting hospital informatization centered on EMRs and has begun to establish EMR-related standards as well as the basic framework for the EMR information standard system.
In recent years, national authorities have used two key levers, performance assessment and health insurance payment, to push hospitals to improve their EMR-centric information systems and the underlying capabilities for data collection, governance, and application: EMR grading has been incorporated into performance evaluations, and Diagnosis-Related Group (DRG) payment reforms have been implemented.
The country is also actively promoting the adoption of Electronic Health Records (EHRs), which go beyond Electronic Medical Records (EMRs). The aim is to achieve information sharing among medical institutions regarding residents’ basic health information, examination and test results, and medication records through the construction of regional information platforms, thereby enabling real-time dynamic updates of residents’ electronic health records and electronic medical records within the region.
Currently, China is implementing Phase I of the National Health Informatization Project for Universal Health Coverage. This phase will achieve scheduled data synchronization between more than 90% of provincial population health information platforms and the National Population Health Data Center. As the project nears completion, the mining and application of healthcare data based on electronic health records have become the next key focus.
So, what value can these data generate? Experts from the Zhejiang Provincial Health Information Center believe that, from a data perspective, hospital data distribution and flow can be basically divided into the following five data domains: the production data domain centered on patient services, the data utilization domain centered on diagnostic and treatment improvement, the data utilization domain centered on operational management improvement, the data utilization domain centered on testing and development, and the data flow domain centered on exchange and sharing (interconnectivity).
By capitalizing on these data assets, hospitals have unlocked significant value in operations management, healthcare service coordination, intelligent patient services, and foundational medical support. This has enabled more precise diagnostic and therapeutic processes for physicians, fostered more scientific hospital management, and continuously improved operational efficiency.
The “Opinions of the Central Committee of the Communist Party of China and the State Council on Building a More Complete System and Mechanism for Market-Based Allocation of Factors of Production,” released in April 2020, was the first central government document on market-based allocation of factors of production. It explicitly recognized data as a new type of factor of production, placing it on par with traditional factors such as land, labor, capital, and technology.
The “Opinions” propose accelerating the cultivation of a data factor market, promoting the opening and sharing of government data, enhancing the value of social data resources, and strengthening data resource integration and security protection.
Thought leaders have begun to discuss the potential applications of health and medical data. Li Tiantian, Chairman of DXY.cn, stated that while China’s current health and medical data market focuses on data collection, it is data analysis that truly unlocks the value of big data. The application of medical big data analytics and mining technologies can, to a certain extent, help the healthcare industry enhance productivity, improve diagnostic, therapeutic, and nursing standards, and strengthen the competitiveness of healthcare institutions. Furthermore, these technologies can conserve medical resources and generate both social and economic value.
Currently, China’s performance in data application and mining is far from ideal: its data is “big” in volume only and does not yet amount to true “Big Data” capability. There remains a significant gap in data mining and analytics as well as in the development of analytical platforms. Platform-level analytical capabilities are relatively weak, with efforts largely concentrated on single domains and little integration across different analytical objectives.
Experts from the Zhejiang Provincial Health Information Center stated that China still has much work to do in the application of health and medical data, which can be summarized into three aspects.
First, resource planning for big data in health and healthcare needs to be refined. For example, China still lacks a comprehensive and effective inventory of health and healthcare big data resources, and has not yet obtained detailed foundational information such as “what data are available,” “what data are missing,” “where the data are located,” “who needs the data,” “who provides the data,” and “who serves as the authoritative source.”
Second, inter-departmental collaboration within the data sector needs to be strengthened. The application of health and medical big data requires deep integration between technical and operational functions, which in turn calls for coordination and governance in institutional design, organizational restructuring, and benefit distribution, all centered on institutional development and business needs. Currently, big data applications are largely confined to individual healthcare service domains, with limited cross-departmental implementation. It is essential to establish mechanisms for close cross-departmental cooperation to fully realize the economies of scale of big data.
Finally, the integration of data applications requires further deepening. The healthcare sector features numerous business scenarios that necessitate coordination between institutions at different administrative levels. Currently, there are few applications capable of achieving end-to-end connectivity across the “national–provincial–municipal–county–institution” hierarchy within any specific business domain. The recent COVID-19 pandemic has, to some extent, exposed deficiencies in this area.
Li Tiantian further stated that the industry has long expected healthcare data to become a publicly tradable market factor in the future, as guided by policy. However, he also emphasized that this requires adherence to the principles of “inclusive prudence and precise regulation.” “Inclusiveness” serves as the prerequisite, providing sufficient room for growth and trial-and-error for emerging innovations. “Prudence” constitutes the baseline, whereby regulators clearly define red lines and boundary scenarios to maximize support for the healthy development of big data. “Precise regulation” refers to a dynamic and flexible approach to process oversight, moving away from the previous “one-size-fits-all” crude model. Instead, it enables more adaptable and targeted regulatory services throughout the entire process, responsive to changes in business operations and technological advancements.
Experts from the Zhejiang Provincial Health Information Center also noted that the rapid global development of health and medical big data will lead to the public release of an increasing amount of personal data after de-identification, for use in various big data research applications such as precision medicine. However, the public disclosure of health and medical data may give rise to a series of privacy and security concerns. Therefore, data security is a prerequisite for the public supply of health and medical data as a factor of production.
Currently, health and medical data security issues in China are particularly prominent. The main challenges include a volatile internal and external environment with a complex cybersecurity landscape; diverse and complicated application scenarios that make the implementation of security measures difficult; compounded risks to data security and integrity, which increase management complexity; and a shortage of specialized technical talent, hindering the effective deployment of safeguarding measures.
Evidently, as a prerequisite for fostering the development of the data element market, it is necessary to improve data governance rules and legally define data property rights, while formulating differentiated governance rules for the disclosure, circulation, and trading of various types of data. In light of this consideration, the state has also accelerated the top-level legislative design concerning data. In 2021 alone, the Data Security Law, the Information Security Technology—Guidelines for Health and Medical Data Security, and the Personal Information Protection Law were intensively enacted.
In this regard, Li Tiantian believes that the Data Security Law and the Personal Information Protection Law have established provisions for regulating data processing activities, ensuring data security, and promoting the development and utilization of data, thereby charting the course for China to establish and improve its data security governance system. As a critical sector bearing on national welfare and people's livelihoods, the healthcare industry is facing challenges related to data security and compliance with data management requirements in the course of its digital transformation.
“Enterprises and medical institutions should attach great importance to these two laws, with top leaders taking personal charge, conducting thorough study and analysis, building consensus across all levels of the organization, and making proactive arrangements in accordance with the legal requirements,” he stated.
Experts from the Zhejiang Provincial Health Information Center stated that this will significantly promote the establishment and improvement of a regulatory and policy framework in China that meets the needs of health and medical big data development. Examples include proposing ethical guidelines and policy recommendations suitable for all stages of health and medical big data applications, including collection, storage, utilization, and sharing; developing medical ethics review standards and operational norms adapted to the big data era; and implementing comprehensive ethical governance for issues such as privacy protection.
Furthermore, this will drive the initiation of legislation to clarify the legal validity of electronic medical records and electronic health records, as well as the ownership, authorization, collection, openness, and usage rights of personal information, medical information, and privacy-sensitive information, thereby regulating the professional ethics of big data practitioners.

An expert from the Cybersecurity Division of the Cyberspace Administration of Zhejiang Provincial Committee of the Communist Party of China provided an interpretation of the Data Security Law and the Personal Information Protection Law, focusing on their detailed implementation.
He stated that the Data Security Law, now in force, sets out multiple penalties for data-related violations, with strict sanctions including fines of up to RMB 10 million. Depending on the circumstances, authorities may order the suspension of relevant business operations, order a suspension of business for rectification, or revoke relevant business permits or business licenses. Where such violations constitute a crime, criminal liability shall be pursued in accordance with the law.
As the application of big data in healthcare is poised to become a focal point in the future, relevant enterprises should implement measures in the following areas to avoid inadvertent violations of the Data Security Law.
First, ensure that data processing activities are lawful and compliant. The Data Security Law imposes binding requirements of “lawfulness and fairness” at the source of data. Enterprises must adhere to legal and regulatory compliance during data collection—whether data is collected through users’ voluntary provision or automated means, they must obtain valid authorization for the purpose, method, and scope of such collection. Meanwhile, the methods and means of data collection must be appropriate and satisfy the principle of necessity. In addition, all stages of data processing activities, including “storage, use, processing, transmission, provision, and disclosure,” must comply with the requirements of relevant laws, regulations, and regulatory provisions.
Second, establish a comprehensive data security management system covering the entire lifecycle and designate data security officers and governing bodies. Enterprises need to develop sound data security management systems that encompass the entire process—including data collection, transmission, storage, and sharing—and adopt corresponding technical measures to ensure lawful and compliant data processing. Meanwhile, enterprises shall fulfill their data security protection obligations based on the Multi-Level Protection Scheme (MLPS) for cybersecurity, clearly identify data security officers and governing bodies, and implement data security protection responsibilities.
Third, implement classified and graded data management. Establish identification and management systems for important or core data, impose differentiated processing requirements based on data types and security levels, formulate corresponding management policies, and comply with relevant approval procedures.
Fourth, fulfill obligations under the Multi-Level Protection Scheme (MLPS) for cybersecurity. In accordance with the requirements of the Cybersecurity Law regarding the MLPS and other related systems, enterprises must formulate specific security measures, including establishing internal security protocols, designating a cybersecurity officer, implementing technical safeguards against virus attacks, monitoring network operations, retaining network logs, and adopting encryption measures. Meanwhile, in compliance with the “MLPS 2.0” standards, enterprises shall complete the classification assessment and filing for the relevant systems.
Fifth, conduct regular risk assessments and fulfill the obligation to submit risk assessment reports. Enterprises should regularly carry out risk assessments, focusing on the types and volume of important data processed, the circumstances of data processing activities, data security risks, and corresponding mitigation measures. When processing data included in the catalog of important data, enterprises shall conduct regular risk assessments of data processing activities in accordance with relevant regulations. In the event of a data security incident, enterprises shall immediately take remedial actions, retain records of such actions, promptly notify users as required, and report to the competent authorities.
Sixth, cooperate with public security organs and state security organs in data retrieval. Enterprises are obligated to provide stored data to public security organs and state security organs when such data is retrieved for the purposes of safeguarding national security or investigating crimes in accordance with the law.
Seventh, conduct regular data security training. Enterprises should, on the one hand, provide regular data security training to personnel in data security roles, with content covering laws and regulations, management requirements, and security technologies; on the other hand, conduct regular data security awareness training for all employees to strengthen their data security awareness and capabilities.
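The third measure above, classified and graded data management, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the IMPORTANT and CORE levels echo the "important or core data" wording above, GENERAL is an added baseline, and the field catalogue, handling rules, and function names are hypothetical. None of it is taken from the Data Security Law or any official data catalog.

```python
from dataclasses import dataclass
from enum import IntEnum


class DataLevel(IntEnum):
    """Hypothetical security levels, ordered from least to most sensitive."""
    GENERAL = 1
    IMPORTANT = 2
    CORE = 3


@dataclass(frozen=True)
class HandlingPolicy:
    """Hypothetical differentiated processing requirements for one level."""
    encrypt_at_rest: bool
    approval_required: bool
    risk_assessment_required: bool


# Illustrative policy table; real requirements come from law and regulators.
POLICIES = {
    DataLevel.GENERAL: HandlingPolicy(False, False, False),
    DataLevel.IMPORTANT: HandlingPolicy(True, True, True),
    DataLevel.CORE: HandlingPolicy(True, True, True),
}

# Illustrative field catalogue; the assignments are assumptions, not rulings.
FIELD_LEVELS = {
    "department_name": DataLevel.GENERAL,
    "diagnosis_code": DataLevel.IMPORTANT,
    "genomic_sequence": DataLevel.CORE,
}


def dataset_level(fields):
    """Return the highest level among the fields of a dataset.

    Fields missing from the catalogue default to IMPORTANT so that
    unclassified data is never treated as low risk by accident.
    """
    levels = (FIELD_LEVELS.get(name, DataLevel.IMPORTANT) for name in fields)
    return max(levels, default=DataLevel.GENERAL)


def policy_for(fields):
    """Look up the handling policy that applies to a dataset."""
    return POLICIES[dataset_level(fields)]
```

The design choice worth noting is that a dataset inherits the strictest level of any field it contains, and that uncatalogued fields default to a stricter level rather than a looser one.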
In addition to the Data Security Law, the expert also emphasized that the Personal Information Protection Law draws on overseas personal information protection legislation, represented by the General Data Protection Regulation (GDPR). Through principle-based provisions, it clarifies that individuals have the right to know and the right to decide regarding the processing of their personal information, as well as the right to restrict or refuse such processing by others. It specifically stipulates the rights to access and copy, data portability, correction and supplementation, erasure, and the right to request an explanation. For deceased natural persons, it also clearly sets forth provisions for their close relatives to exercise these relevant rights.
Enterprises must make significant efforts to determine “how to obtain separate consent.” The existing “bundled notice” approach, which tells users that they must provide all requested information to the service provider in order to receive services, has been rejected by regulators and legislators. Under the new regulatory and legislative frameworks, this practice is considered, in certain circumstances, to constitute coercion of users into providing their information. “Separate notice,” by contrast, means informing users of each specific matter individually, in a manner that adequately draws their attention, and obtaining their consent to that matter. In the future, if an enterprise processes sensitive personal information but cannot obtain separate consent, it may be unable to continue its operations, resulting in substantial business impact.
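The difference between bundled and per-matter consent can be made concrete with a small Python sketch. It is only one possible way to model the idea: the `ConsentLedger` class, its method names, and the purpose strings are hypothetical, and the sketch makes no claim about what the Personal Information Protection Law requires in any specific case.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentLedger:
    """Hypothetical record of which purposes each user has consented to."""
    _granted: dict = field(default_factory=dict)  # user_id -> set of purposes

    def grant(self, user_id, purpose):
        """Record consent for exactly one purpose; nothing else is implied."""
        self._granted.setdefault(user_id, set()).add(purpose)

    def withdraw(self, user_id, purpose):
        """Remove consent for one purpose, leaving other purposes untouched."""
        self._granted.get(user_id, set()).discard(purpose)

    def may_process(self, user_id, purpose):
        """Allow processing only if this specific purpose was consented to."""
        return purpose in self._granted.get(user_id, set())
```

For example, after `ledger.grant("u1", "share_with_insurer")`, the call `ledger.may_process("u1", "clinical_research")` still returns `False`, because consent to one matter is never treated as consent to another.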
Improving legislation is a crucial condition for data to become a market factor, but it is only the beginning. There are still many urgent issues to be addressed in the mining and application of health and medical data. Li Tiantian believes that there are numerous challenges to overcome before health and medical data can fully function as a market factor. For instance, it remains unclear how to define the extent of compliant de-identification. According to relevant legislative requirements, compliant de-identification must exclude any personal privacy information. However, it is uncertain whether such de-identification might affect subsequent data analysis results, and what the boundaries are for de-identifying different datasets. Currently, there are no corresponding standards or guidelines in place.
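The tension described here, between removing identifying detail and preserving analytical value, can be seen in a minimal Python sketch. The list of direct identifiers, the field names, and the age-banding rule are all assumptions chosen for illustration; the sketch does not define what counts as compliant de-identification, which, as noted above, has no settled standard.

```python
# Hypothetical set of direct identifiers to drop outright.
DIRECT_IDENTIFIERS = {"name", "national_id", "phone"}


def deidentify(record, age_band=10):
    """Return a copy of a record with direct identifiers removed
    and the exact age generalized into a band of `age_band` years.

    A wider band hides more about the individual but also leaves
    less precision for any later age-based analysis.
    """
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in out:
        low = (out["age"] // age_band) * age_band
        out["age"] = f"{low}-{low + age_band - 1}"
    return out
```

With the default ten-year band, a patient aged 47 is reported as "40-49"; widening the band to 20 years gives "40-59". The second setting reveals less about the individual, but an analysis that depends on finer age differences can no longer be run on the output, which is the kind of boundary question the paragraph above raises.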
On the other hand, ensuring the quality of compliant de-identified data also poses challenges. Some datasets are of high quality, with diagnoses and treatments adhering to clinical guideline standards and exhibiting complete and clear structures; others are of poor quality, characterized by disorganized structures and even data contamination from non-standardized diagnostic and therapeutic practices. Even within the same hospital and department, data quality can vary across different time periods. Addressing these issues requires continuous improvement of relevant policies.
“Data quality control is a critical issue. Even compliantly de-identified data may be of poor quality. For example, the unique market conditions in the past led to instances of overprescribing and irrational drug use, which ‘contaminate’ the entire dataset—commonly referred to as ‘Garbage In, Garbage Out.’ This means that inputting low-quality, dirty data results in unreliable and inaccurate outputs, necessitating costly data cleaning efforts. Addressing these issues requires a multifaceted approach, including adherence to evidence-based medicine, compliance with industry guidelines and consensus, and curbing adverse market practices,” added Li Tiantian.
Guided by the Zhejiang Provincial Health Commission, the Zhejiang Provincial Development and Reform Commission, the Zhejiang Provincial Department of Economy and Information Technology, and the People’s Government of Yuhang District, and hosted by the Zhejiang Health Service Promotion Association and Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, the “2021 China (Hangzhou) Digital Health Conference” will bring together experts from across the entire industry chain—government, industry, academia, research, and application. Centered on the theme “Digital Reshaping of the Life and Health Ecosystem Chain,” the conference will focus on topics such as life sciences technology, digital healthcare, smart hospital management, and new models of health services, featuring in-depth discussions and cutting-edge dialogues on how digital technologies can empower the integrated development of the life and health ecosystem chain. Stay tuned.
