China's AI Development Urgently Requires Data Liberation and Policy Support as Foundational Technologies Mature: Excerpt from the 2017 Medical Big Data and Artificial Intelligence Industry Report

Sep 14, 2017 08:00 CST Updated 08:00

Since 2016, the global consensus has been that the inflection point for artificial intelligence has arrived. From world-class players like Google and IBM to fervent investors and entrepreneurs, all are racing to secure strategic positions, even engaging in an AI arms race. Artificial intelligence is experiencing a boom on a global scale.

How should we view and think about this surging wave of artificial intelligence? As a witness to this tide, VCBeat is compelled to leave its mark on it.

VCBeat’s 2017 flagship report, “2017 Medical Big Data and Artificial Intelligence Industry Report,” will be released on September 16 at the Forum on Practical Applications of Big Data and Artificial Intelligence in Healthcare.

Spanning 100,000 words, this comprehensive report was compiled by VBInsight over the course of one month, drawing on more than one million words of reference materials and interviews with senior executives at dozens of artificial intelligence (AI) companies. It represents VCBeat’s most systematic review to date of the AI in healthcare sector, providing a detailed account of the foundational technologies underpinning medical big data and AI enterprises, an analysis of nine subsectors within medical AI, and an overview of the current landscape of medical AI companies. The report also features case studies of more than 60 domestic and international enterprises.

Meanwhile, VCBeat’s Eggshell Research Institute has applied its proprietary methodology to provide an objective overview of the development status across various subsectors of AI in healthcare. We have systematically analyzed the financing and investment activities of 192 medical AI companies both domestically and internationally, and for the first time, mapped out the technology maturity curve for subfields of AI in healthcare to serve as a reference for industry professionals.

There are two ways to obtain the full report:

I. Register for the 2017 Yangtze River Industry Forum (Autumn) and the Healthcare Big Data & Artificial Intelligence Conference, held on September 16–17, to obtain a printed copy of the full report, review the list of AI healthcare companies, and understand the current state of development in each field.

II. Scan the QR code below to become an official member of VCBeat, and you will receive the complete electronic version of the "2017 Medical Big Data and Artificial Intelligence Industry Report" after its official release on September 16.

[Image: VCBeat membership QR code]

The following is a curated serial excerpt from the report; the full version contains far more comprehensive content.

2017 Medical Big Data and Artificial Intelligence Industry Report (Excerpt) I

Analysis of the Underlying Technologies of Artificial Intelligence

>>>>

I. The Relationship Between Artificial Intelligence, Machine Learning, and Deep Learning

Two concepts come up frequently in any discussion of artificial intelligence: machine learning and deep learning. Deep learning is a subset of machine learning, and it is a key technology driving the current development of artificial intelligence.

Machine learning is the most fundamental approach to achieving artificial intelligence. It employs algorithms that learn from historical data or experience, without relying on hardcoded instructions or predefined rules. Traditional computer programs are explicitly coded to solve specific tasks, whereas machine learning leverages large datasets for training, enabling algorithms to learn how to accomplish tasks directly from the data.
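The contrast between a hand-coded rule and a rule learned from data can be sketched in a few lines. This is only a toy illustration: the temperature samples, the 38.0 cut-off, and the function names are all hypothetical and do not come from the report, and the "learning" step is a deliberately simple threshold search rather than any production algorithm.

```python
# Hypothetical toy data: (body temperature in Celsius, label), 1 = flagged.
SAMPLES = [(36.5, 0), (36.8, 0), (37.0, 0), (37.9, 1), (38.4, 1), (39.1, 1)]


def rule_based(temp_c):
    """Traditional program: the threshold is hard-coded by a developer."""
    return 1 if temp_c >= 38.0 else 0


def learn_threshold(samples):
    """Tiny 'learning' routine: choose the cut-off that makes the fewest
    mistakes on the labelled training samples."""
    temps = sorted(t for t, _ in samples)
    # Candidate thresholds are midpoints between neighbouring temperatures.
    candidates = [(a + b) / 2 for a, b in zip(temps, temps[1:])]

    def errors(threshold):
        return sum((t >= threshold) != bool(y) for t, y in samples)

    return min(candidates, key=errors)


threshold = learn_threshold(SAMPLES)


def learned(temp_c):
    """Same decision shape as rule_based, but the cut-off came from data."""
    return 1 if temp_c >= threshold else 0


hand_coded_errors = sum(rule_based(t) != y for t, y in SAMPLES)
learned_errors = sum(learned(t) != y for t, y in SAMPLES)
print(f"hand-coded rule errors: {hand_coded_errors}")
print(f"learned rule errors:    {learned_errors}")
```

On this toy data the hard-coded rule misclassifies the 37.9 sample, while the learned cut-off falls between 37.0 and 37.9 and fits every sample. With more or different data the learned threshold would move accordingly, which is the point of the paragraph above.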

Machine learning dates back to the early stages of artificial intelligence. Traditional algorithms include decision tree learning, inductive logic programming, clustering, reinforcement learning, and Bayesian networks. In that early period, limited computing power and small sample sizes, among other factors, left these algorithms heavily constrained, low in intelligence, and of little practical use.

Deep learning is a subset of machine learning, and its development has become one of the driving forces behind the current advancement of artificial intelligence. The artificial neural network-based learning algorithms employed in deep learning are also a category of machine learning algorithms, although they previously received limited attention. At its core, deep learning focuses on feature learning, aiming to acquire hierarchical feature representations through layered networks, thereby addressing the significant challenge of manual feature engineering required in earlier approaches.

The Relationship Between Artificial Intelligence, Machine Learning, and Deep Learning

The concept of deep learning originated from research on artificial neural networks. While neural networks and deep learning share similarities, such as the adoption of similar layered architectures, they differ in that deep learning employs distinct training mechanisms and possesses powerful representational capacity.

Traditional neural networks were once a popular direction in the field of machine learning, but later faded from prominence due to challenges such as difficult parameter tuning and slow training speeds. Subsequently, deep neural network models became an important frontier in artificial intelligence, and deep learning algorithms underwent a period of rapid iteration. Various new algorithmic models, including Deep Belief Networks, Sparse Coding, Recursive Neural Networks, and Convolutional Neural Networks, were continuously proposed. Among these, Convolutional Neural Networks (CNNs) have become the most sought-after algorithmic model for image recognition.
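To give a feel for why convolutional networks suit image recognition, the core operation can be sketched in plain Python. This is a minimal illustration under stated assumptions: the 4x4 image and the edge-detecting kernel are hypothetical, the kernel is hand-picked rather than learned, and `conv2d_valid` is an illustrative helper, not a function from any framework.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    feature map. Like most deep learning libraries, this computes a
    cross-correlation: the kernel is not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 grayscale patch: dark left half, bright right half.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked vertical-edge filter; a CNN would learn such weights from data.
KERNEL = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(IMAGE, KERNEL)
for row in feature_map:
    print(row)
```

The feature map is nonzero only in the middle column, where the dark-to-bright edge sits. A real CNN stacks many such filters in layers and learns their weights during training instead of having them specified by hand.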

A Brief History of Deep Learning Development

In recent years, major gains in computing power and storage capacity have ushered in the big data era, with data mining as a leading application. Deep learning, now widely adopted to extend what machine learning can do, uses large volumes of data to make previously impractical, highly complex algorithms workable, and it yields more refined results.

>>>>

II. Three Key Conditions for the Development of Artificial Intelligence

Algorithms, computing power, and data are the three key drivers of rapid artificial intelligence (AI) development. First, breakthroughs in algorithms brought hope for the commercialization of AI. Second, greater computing power has made complex algorithms practical to run, shortened training times, and reduced costs. Finally, the big data era has provided abundant material for AI training and learning. Without any one of these elements, large-scale commercial application of AI would be unattainable.

Three Prerequisites for the Development of Artificial Intelligence

Algorithm

Algorithms form the foundation of artificial intelligence development; the widely used deep learning algorithms were introduced above. Most algorithm frameworks, such as Caffe, TensorFlow, and Torch, have been open-sourced. They have become the preferred choice for the majority of engineers and play a significant role in accelerating industry growth and cultivating talent.

The maturity of global open-source platforms has also enabled Chinese companies to rapidly replicate advanced algorithms developed in other regions. In terms of application, there is no significant gap between China’s algorithmic development and that of other countries. In fact, China has achieved breakthrough progress in artificial intelligence algorithms for speech recognition, leading the world.

Computing Power

Computing power is one of the foundational infrastructures of artificial intelligence, thus holding immense strategic significance. The powerful parallel computing capabilities of GPUs (Graphics Processing Units) have significantly enhanced computer performance while reducing costs. NVIDIA’s latest GTX 1080 gaming graphics card delivers 9 TFLOPS of floating-point performance at a price of just $700, resulting in a cost of only 8 cents per GFLOPS. According to Goldman Sachs data, providing 1 GFLOPS of computing power with the IBM 1620 in 1961 would have cost approximately $9 trillion (adjusted for inflation).
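The cost-per-GFLOPS figure above follows from simple division, which can be checked with a short script. The inputs are the numbers quoted in the text (price, throughput, and the Goldman Sachs estimate for the IBM 1620); the constant and function names are illustrative only.

```python
# Figures quoted in the text above.
GTX_1080_PRICE_USD = 700
GTX_1080_TFLOPS = 9
IBM_1620_USD_PER_GFLOPS = 9e12  # inflation-adjusted Goldman Sachs estimate


def usd_per_gflops(price_usd, tflops):
    """Price divided by throughput, with 1 TFLOPS = 1,000 GFLOPS."""
    return price_usd / (tflops * 1000)


gpu_cost = usd_per_gflops(GTX_1080_PRICE_USD, GTX_1080_TFLOPS)
ratio = IBM_1620_USD_PER_GFLOPS / gpu_cost

print(f"GTX 1080: ${gpu_cost:.3f} per GFLOPS")
print(f"Cost reduction since 1961: about {ratio:.1e}x")
```

$700 for 9,000 GFLOPS works out to a little under $0.08 per GFLOPS, which rounds to the 8 cents cited. Set against the 1961 estimate, that is a cost reduction of roughly fourteen orders of magnitude.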

In terms of computing power, the world’s three largest chip suppliers—NVIDIA, Intel, and AMD—are responsible for providing GPUs and CPUs. Silicon Valley is also strategically developing FPGAs (Field-Programmable Gate Arrays) and ASICs (Application-Specific Integrated Circuits) for artificial intelligence computing. Cloud computing and supercomputing are also providing services to support the development of artificial intelligence.

Data

Artificial intelligence systems must “train” themselves on vast amounts of data to continuously improve the quality of their outputs. The more high-quality data available, the more efficient neural networks become.

>>>>

III. How to Obtain Learning Data

To advance, artificial intelligence must overcome a significant real-world challenge: a severe shortage of training data. Currently, training data comes mainly from three sources.

First, proprietary enterprise data. Through extensive manual collection and subsequent structuring, this data forms the foundation for artificial intelligence training. Most AI companies, before entering this field, had already accumulated substantial industry-specific data within their respective domains, which led them to leverage these data resources to develop AI-driven business operations.

Second, public data from governments worldwide. The U.S. federal government has released 130,000 datasets across multiple sectors on its Data.gov platform, covering healthcare, business, agriculture, education, and other fields. China and other countries have also progressively opened up public data in selected sectors.

Third, industry collaboration data. AI startups acquire data by establishing partnerships with industry players and upstream data providers in the supply chain; for instance, in the healthcare sector, they collaborate with hospitals. IBM Watson initially obtained medical records, literature, and other data through its partnership with Memorial Sloan Kettering Cancer Center.

When data volume is insufficient and it is difficult to increase data supply through previously effective methods, the advantages of deep learning cannot be fully leveraged. More importantly, we also face the challenge of heterogeneous data types. In the physical world, data consists of real-time streams acquired by various sensors, whereas current deep learning applications in the information domain, such as image recognition, rely on static image-based data points rather than data streams. This discrepancy constitutes a fundamental barrier to extending the existing successes of deep learning to real-world physical applications.

Reducing the demand for data volume and achieving few-shot learning, or even one-shot learning, are key challenges in current deep learning research. Prominent deep learning experts such as Yann LeCun and Yoshua Bengio have repeatedly emphasized the importance of addressing the one-shot learning problem in their speeches. However, few-shot learning technologies are not expected to achieve breakthroughs within the next two to three years; we will still need to provide computers with large amounts of real-world data for training.

The Current State of Artificial Intelligence in China

>>>>

I. China’s Academic Research in Artificial Intelligence Leads the World

Although the United States has long remained at the forefront of basic research in artificial intelligence, China’s AI researchers have been catching up rapidly, and on some measures pulling ahead, over the past two years.

China's AI Research Has Surpassed That of the United States

According to the U.S.-released "National Artificial Intelligence Research and Development Strategic Plan," the number of AI-related papers indexed by SCI that involved “deep learning” increased approximately sixfold from 2013 to 2015. The number of papers published by Chinese scholars surpassed that of the United States starting in 2014 and has since substantially led all other countries.

Although the number of SCI-indexed artificial intelligence papers published by Chinese scholars has increased, their impact has not risen commensurately. According to McKinsey’s report “The Future Path of Artificial Intelligence in China,” Chinese scholars’ AI papers received 2,124 citations in 2015, far exceeding the 1,116 citations for U.S. scholars. However, once self-citations are excluded, U.S. scholars’ papers rank first in citation count. On the H-index, a metric that captures both the productivity and the citation impact of a body of publications, U.S. scholars ranked first in paper influence, while China ranked third.
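For readers unfamiliar with the H-index, the standard definition is easy to state in code: it is the largest number h such that h papers each have at least h citations. The sketch below uses hypothetical citation counts, not data from the McKinsey report, and it does not reproduce how the report aggregates the metric at the country level.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have
    h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for two bodies of work (illustrative only).
many_papers_low_impact = [3, 2, 2, 1, 1, 1, 0, 0, 0, 0]
fewer_papers_high_impact = [25, 18, 12, 9, 6]

print(h_index(many_papers_low_impact))
print(h_index(fewer_papers_high_impact))
```

The first set contains twice as many papers but scores lower than the second, which illustrates how a large publication count can coexist with a weaker H-index.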

The Influence of Chinese Artificial Intelligence Papers

>>>>

II. Gradual Opening of Data in China

In terms of data, China’s foundational data volume far surpasses that of Europe and the United States, particularly in medical and health data derived from its large population. However, this massive amount of data lacks a unified standard and an ecosystem for cross-platform sharing, resulting in prevalent data silos with low utilization rates and limited value. On the other hand, there is a growing global recognition that open government databases can foster innovation in artificial intelligence within related fields, and the Chinese government is gradually increasing the openness of its data.

According to a McKinsey report, China ranks 93rd globally in data openness. The assessment criteria are primarily based on ten key dimensions affecting public access to data, including whether the data is published, available free of charge, updated in a timely manner, and machine-readable.

Comparison of Data Openness Between China and the United States

Judging from the current state of AI development, algorithms and computing power no longer present significant technical barriers, while data has become the key determinant of a project’s success or failure. AI without data is akin to cooking without rice, particularly in the healthcare sector. In China’s medical field, medical data is not scarce, yet usable medical data remains “in short supply.” Specifically, annotating and structuring such data is very difficult, which in turn makes machine learning exceedingly hard.

>>>>

III. Evolution of China's Artificial Intelligence Policies

Although China still lags behind the United States in the foundational technologies of artificial intelligence, the Chinese government has systematically structured and comprehensively deployed a national development plan for AI. On July 20, 2017, the State Council issued the Development Plan for New-Generation Artificial Intelligence, marking the first time that a comprehensive, state-level strategic layout was formulated for a specific technology domain.

Compilation of China's AI-Related Policies

The Development Plan for New-Generation Artificial Intelligence, dated July 8, 2017 and made public on July 20, represents China’s first systematic deployment in the field of artificial intelligence and serves as a guideline for establishing the country’s early-mover advantage. It provides a comprehensive plan and strategic arrangement for the overall approach, strategic objectives, major tasks, and safeguard measures governing the development of new-generation AI in China through 2030.

The Plan outlines a grand blueprint for the development of artificial intelligence in China over the next decade and more, establishing a “three-step” objective: by 2020, overall AI technologies and applications should keep pace with advanced global levels; by 2025, major breakthroughs should be achieved in fundamental AI theories, with certain technologies and applications reaching world-leading standards; by 2030, AI theories, technologies, and applications should overall attain world-leading status, positioning China as a primary global innovation center for artificial intelligence.

>>>>

IV. Detailed Interpretation of the “New Generation Artificial Intelligence Development Plan”

In July 2016, a group of academicians, including Xu Kuangdi and Pan Yunhe, submitted the “Proposal to Launch China’s Major Science and Technology Program on Artificial Intelligence.” Subsequently, in accordance with the requirements and arrangements of the Central Committee of the Communist Party of China and the State Council, the New Generation Artificial Intelligence Development Plan was formulated and major science and technology projects on new-generation artificial intelligence were launched. The aims are to seize the significant strategic opportunity presented by AI development, establish China’s early-mover advantage in this field, and accelerate the building of an innovative nation and a world-leading science and technology power.

The Plan outlines six key priority areas:

1. Build an open and collaborative system for AI-driven technological innovation.

2. Cultivate a high-end, efficient intelligent economy.

3. Build a safe and convenient intelligent society.

4. Strengthen military-civilian integration in the field of artificial intelligence. Promote the bidirectional transfer of AI technologies between military and civilian sectors, and foster the joint development and sharing of innovation resources.

5. Build a ubiquitous, secure, efficient, and intelligent infrastructure system.

6. Proactively plan and deploy major science and technology projects on next-generation artificial intelligence.

In response to the development and evolution of artificial intelligence, the state will fully leverage existing resources such as funding and bases, coordinate the allocation of domestic and international innovation resources, give full play to the guiding role of fiscal investment and policy incentives as well as the decisive role of the market in resource allocation, and mobilize enterprises and society to increase investment, thereby forming a new pattern of multi-party support from fiscal funds, financial capital, and social capital.

>>>>

V. Blind Spots in Artificial Intelligence Policy

Beyond the national-level policy support needed to drive the artificial intelligence industry, the legal and regulatory issues that arise when AI is applied also require early planning and oversight. In the strictly regulated healthcare industry especially, many aspects of the commercial application of AI still need to be standardized through policy.

1. Norms for the Application of Artificial Intelligence. Medical issues pertain to human health and life, representing a complex and sensitive domain where every issue is closely tied to patient safety. Therefore, it is imperative to promptly establish regulatory measures at the national level to define, through legislation, the scope of AI applications in healthcare, the extent of regulatory oversight, and the determination of liability for risks, among other matters.

2. Rational and Lawful Application of Data. Artificial intelligence derives its intelligence and achieves improvement by learning from historical data; therefore, a large volume of high-quality medical data forms the foundation for AI’s decision-making capabilities. In the United States, commercial applications of medical information must strictly comply with the provisions of both HIPAA and HITECH. Currently, China’s policy stance in this area remains ambiguous. It is imperative that we promptly clarify how data should be utilized, specify which data are permissible and which are prohibited for use, and determine the appropriate legal framework for regulation.

3. Industrial Policy Support. Currently, more than half of China’s high-tech companies have not incorporated artificial intelligence into their strategic plans. Even those that have begun to engage with AI may still face obstacles in data, talent, and technology. To guide the digital healthcare industry through its AI-driven transformation, the government can leverage traditional economic tools to help enterprises overcome the challenges encountered in the early stages of AI development.

It is encouraging to witness the release of the “New Generation Artificial Intelligence Development Plan,” which promotes the development of artificial intelligence at the national level. In the future, China must transform AI-driven innovations across various sectors into sustainable productive forces, and only under a comprehensive framework of strategic planning and policy support can the foundation of artificial intelligence be firmly established.

References:

1. “The Differences and Connections Between Artificial Intelligence, Machine Learning, and Deep Learning,” Leiphone, Qu Xiaofeng, September 6, 2016.

2. “Reducing the Dimensionality of Data with Neural Networks,” Science, Geoffrey Hinton and Ruslan Salakhutdinov, July 28, 2006.

3. “Deep Learning: Advancing the Dream of Artificial Intelligence,” Programmer Magazine, Kai Yu, Lei Jia, and Yuqiang Chen, June 2013.

4. “How Can Deep Learning Break Through the Data Bottleneck?”, Synced, Yunfeng Zhao, September 8, 2016.

5. “Global AI Talent Report,” LinkedIn, July 2017.

6. “The Future Path of Artificial Intelligence in China,” McKinsey, March 2017.

7. “Artificial Intelligence Industry Research Report,” 36Kr Research Institute, June 2017.

8. Regular Briefing on the “Development Plan for New Generation Artificial Intelligence,” Ministry of Science and Technology, Li Meng, July 21, 2017.

The full table of contents for the “2017 Medical Big Data and Artificial Intelligence Industry Report” is as follows:

[Image: report table of contents, part 1 of 2]

[Image: report table of contents, part 2 of 2]