Ping An's Chief Medical Scientist Xie Guotong on AI as a Pendulum Swinging Between 'Knowledge' and 'Data'

Dec 02, 2020 08:00 CST Updated 08:00


Prior to serving as Chief Scientist at Ping An Group, Xie Guotong had already spent 15 years deeply engaged with IBM in the field of cognitive medicine, making him a witness to the development of cognitive medicine in China from “0” to “1.”

Cognitive medicine, with cognitive computing as its core technology and medical big data as the underlying data support, leverages AI to conduct in-depth mining, analysis, and utilization of patient data, thereby exploring novel solutions to medical problems. There are two key elements in this description: first, AI; second, data.

In theory, AI is accessible to everyone, but data is not necessarily so. When training Go-playing AI, the DeepMind team could easily leverage abundant game data to continuously refine and optimize the AI’s decision-making processes and capabilities through repeated simulations. In contrast, developing medical AI is significantly more challenging, a difficulty largely attributable to the inherent characteristics of “medical data” itself.

Medical data is characterized by non-standardization and ethical constraints. On one hand, due to inconsistencies in training and practices, medical record entries often vary among different physicians. On the other hand, while data ownership remains unsettled, it certainly does not belong to the enterprises seeking to develop AI.

This is one of the reasons why Xie Guotong joined Ping An. With a comprehensive medical ecosystem, Ping An has strong incentives to generate and standardize medical data, which gives it a mature, self-sustaining basis for AI development. Here, Xie Guotong can address the challenges related to “knowledge” and “data.”

What did Xie Guotong see at Ping An Group? What lies ahead for AI? Recently, VCBeat engaged in an in-depth conversation with Xie Guotong.

Ping An Group’s Chief Medical Scientist, Xie Guotong

“The development of AI is like a pendulum, swinging between two poles: knowledge and data.”

Q: In addition to the three core elements of algorithms, computing power, and data, there is now a particular emphasis on knowledge as a new key factor. In this new phase, how can we effectively manage both knowledge and data?

A: When the concept of artificial intelligence first emerged, it followed a reasoning-heavy approach centered on “knowledge.” Specifically, researchers attempted to transcribe human-accumulated knowledge into logical algorithms that machines could understand. The resulting programs are known as expert systems.

Attempts at expert systems ended in failure. Transforming expert knowledge into rules requires a highly robust rule representation language, as well as engineers with deep domain expertise to encode that knowledge. In practice, even when expert experience is successfully converted into knowledge, the system typically achieves only 50–60% of the required performance level, making it unviable for hospital deployment.

The failures in knowledge-driven approaches have pushed many toward the other extreme, with numerous researchers becoming obsessed with data and feeding massive volumes of it into algorithms. While this data-heavy learning paradigm may work in other fields, it is not suitable for healthcare. The medical domain is too vast; one can never claim that their big data is truly comprehensive and all-encompassing.

GPT-3 represents an attempt to scale up data volume, focusing on general-purpose NLP models and trained on 45 TB of data. However, in tests involving medical-related tasks, the performance of this “brute-force aesthetic of AI” remained unsatisfactory.

Therefore, whether emphasizing “reasoning” or “learning,” leaning too heavily toward either side fails to fully harness the power of AI. However, finding the right balance is no easy task. Many scholars have explored ways to make machine learning and logical reasoning work together more evenly and effectively within a unified framework.

The development of AI is like a pendulum, swinging between two poles: knowledge and data. So far, no one knows where the equilibrium point lies. This also means that there is no best approach to handling knowledge and data—only better ones. We are always on the journey.

Q: Have the methods for creating AI evolved with technological advancements? Has the healthcare sector benefited from these changes?

A: In the era of punch-card machines, data was stored by punching holes in cards; data transmission involved physically transporting these cards to different locations for reading... Sixty years have passed, and while the fundamental processes of data handling—collection, governance, storage, and application—remain unchanged, technological advancements have enhanced the value of data at every stage. For instance, whereas only textual information could be processed and stored in the past, we can now store diverse types of data, including images and audio. The inclusion of an increasing amount of unstructured information within the realm of processable data has made the creation of AI possible.

The gaming sector is particularly well-suited for developing AI algorithms, as it features clearly defined rules and an abundance of accessible data. Google’s AlphaStar, an AI developed for StarCraft, was trained on millions of game replays from StarCraft II, enabling it to compete at a professional level within just one year.

In contrast, medical data are often unstructured, and the distinctions and correlations among different data elements require professional expertise to discern. This means that understanding medical data and processing them with machines is not straightforward. Taking diabetes as an example, physicians follow a stepwise pharmacological approach involving first-line, second-line, third-line, and fourth-line medications; it is inappropriate to recommend fourth-line agents as an initial intervention. The efficacy of fourth-line medications is contingent upon specific prerequisites, which AI must not overlook.
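The stepwise constraint can be pictured as a knowledge rule layered on top of a data-driven score. The following is a minimal sketch, not a clinical tool: the drug labels, the scores, and the rule "a drug at line n is admissible only once lines 1 to n-1 have been tried" are all assumptions made for illustration, and real prescribing prerequisites are far more detailed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str     # hypothetical drug label, not a real medication
    line: int     # therapy line: 1 = first-line, ..., 4 = fourth-line
    score: float  # hypothetical score from a data-driven model


def allowed(candidate, lines_tried):
    """Assumed rule: line n is admissible only if lines 1..n-1 were tried."""
    return all(k in lines_tried for k in range(1, candidate.line))


def recommend(candidates, lines_tried):
    """Return the best-scoring admissible candidate, or None if none qualify."""
    admissible = [c for c in candidates if allowed(c, lines_tried)]
    return max(admissible, key=lambda c: c.score, default=None)


candidates = [
    Candidate("drug_A", line=1, score=0.61),
    Candidate("drug_B", line=2, score=0.70),
    Candidate("drug_D", line=4, score=0.93),
]

# A newly diagnosed patient has tried nothing yet.
new_patient = recommend(candidates, lines_tried=set())
# A patient who has already been through lines 1 to 3.
later_patient = recommend(candidates, lines_tried={1, 2, 3})

print(new_patient.name, later_patient.name)
```

Ranking by score alone would put the fourth-line drug first for every patient. With the rule applied, the new patient is offered only the first-line option, and the fourth-line drug becomes available once the earlier lines have been tried.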

Another issue stems from the uneven distribution of medical data. We previously attempted to develop an AI software covering the diagnosis of most eye diseases. However, after aggregating de-identified data from four top-tier ophthalmology hospitals, we found that common conditions such as glaucoma and cataracts accounted for the majority of the data, while cases of retinal tears and retinal artery occlusion were extremely scarce. Consequently, it is difficult to create mature AI tools that meet clinical needs for these less prevalent conditions.
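To make the imbalance concrete, the sketch below computes each condition's share of a dataset and an inverse-frequency class weight, the same heuristic that scikit-learn's `class_weight="balanced"` option uses. The case counts are invented for illustration; the interview gives no actual figures.

```python
# Hypothetical case counts; the interview gives no actual figures.
counts = {
    "glaucoma": 4000,
    "cataract": 5000,
    "retinal_tear": 60,
    "retinal_artery_occlusion": 40,
}

total = sum(counts.values())
n_classes = len(counts)

# Fraction of the dataset contributed by each condition.
share = {d: n / total for d, n in counts.items()}

# Inverse-frequency weight: rare classes count for more in the training loss.
weights = {d: total / (n_classes * n) for d, n in counts.items()}

for disease in counts:
    print(f"{disease:26s} share={share[disease]:.3f} weight={weights[disease]:.2f}")
```

Reweighting only rebalances how much each case counts during training. It cannot supply the clinical variety that a few dozen cases lack, which is why the rare conditions remain hard to cover.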

Overall, the fundamental approach to creating AI has remained unchanged, but the details have been constantly evolving. The healthcare sector has indeed benefited from this, yet at present, these benefits are very limited, and AI still requires a long period of development.

“AI that assists doctors should not aim to be a disruptor”

Q: Given this, the capabilities of medical AI at the current stage are still quite limited. How should we properly understand AI to avoid blind optimism or raising physicians’ expectations too high?

A: Point out the misconceptions, and what remains is the correct understanding.

First and foremost, it is essential to understand the healthcare industry. Many AI companies oversimplify physicians’ needs when communicating with them. They make commitments easily, only to confront reality during implementation and realize their inability to deliver. This issue is not unique to healthcare; whenever an algorithm is applied to a specific industry, a profound understanding of that industry is a prerequisite. This includes its applications, workflows, and relational ecosystems. Only with such insight can one determine how to effectively integrate their solution.

Secondly, it is crucial to have a clear understanding of one’s own role. In recent years, many AI companies have been preoccupied with “disruption” and “reconstruction,” aiming to replace physicians with AI. This approach is akin to throwing a stone into a calm lake; while it may disrupt the existing equilibrium within hospitals, no one welcomes such disruptors. Healthcare is a slow-moving industry, where internet-centric mindsets often fail to gain traction.

To this day, the integration between medical AI and physicians has spanned only a few years. The information a physician derives from a single image is intertwined with their cognitive framework and understanding of the patient’s condition. This is not merely a process of identifying suspicious regions within an image; rather, the physician’s reasoning draws upon both historical and current knowledge, interspersed with clinical imagination.

AI still has a long way to go in mastering these capabilities.

Q: In the current landscape of AI applications, which areas show promising prospects, and which require further transformation?

A: When discussing this issue, it is generally necessary to distinguish between in-hospital and out-of-hospital settings.

Let’s start with in-hospital care. As the saying goes, “30% of recovery depends on in-hospital treatment, while 70% relies on post-discharge care.” Currently, the number of patients with chronic diseases in China is surging, with annual clinical visits rising from 7 billion to 8 billion and then to 9 billion. After discharge, hospitals are keen to maintain oversight, aiming to retain patients and collect comprehensive health data, which benefits both hospital revenue and subsequent scientific research. Meanwhile, continuous follow-up and management by the same physician throughout the patient’s journey can yield better clinical outcomes.

However, relying solely on physicians to manage out-of-hospital disease care is unsustainable, as they cannot handle the workload; nor can it depend entirely on patient self-discipline, which is often insufficient. Therefore, machine-assisted tools can help physicians with data aggregation and monitoring in out-of-hospital management. This represents a viable application scenario, with demand from both pharmaceutical companies and hospitals.

Certainly, many enterprises aim to penetrate the core clinical workflows of Grade A tertiary hospitals. However, to date, I believe no AI solution has truly integrated into these core processes, neither in pulmonary nodule management nor in pathology. While there may be isolated pilot cases that have achieved this, widespread, scalable adoption has not yet occurred.

Can AI play a significant role in diagnosis and treatment? I believe it can. However, not in tertiary hospitals, but in primary healthcare.

China has one million medical institutions, of which only a little over 3,000 are tertiary hospitals. Primary healthcare represents a scenario with substantial demand for AI. Naturally, the needs of primary care differ from those of Grade A tertiary hospitals. Physicians in primary care settings generally have less expertise than their counterparts in large hospitals and are not as overburdened. Therefore, what they need most is not improved efficiency, but enhanced standardization. AI developers must seize on these characteristics.

The benefits of standardization are evident: first, it benefits patients by improving the accuracy of diagnosis and treatment; second, it reduces costs by avoiding various unreasonable medical practices.

AI application scenarios outside hospitals are relatively limited, with the most important being drug development.

Drug development is a field that requires extensive data-intensive computation, involving the screening of hundreds of millions of molecules and the analysis of countless clinical trials to ultimately identify which substances have the highest potential to become medicines.

Nowadays, the entire R&D process for an innovative drug often costs billions of dollars and spans a decade. However, every stage of this process offers significant opportunities for algorithmic optimization, representing a substantial market.

To date, many startups have raced into this space, but none have gone very far. Now that major companies such as Ping An, Tencent, and Baidu have joined the fray, catching up will not be difficult.

Q: With a clear direction, how can one excel in AI?

A: The first issue is motivation. Based on my personal experience, it can be briefly summarized into two points.

First, the problem must be sufficiently challenging—not something I can easily tackle without effort. I need to invest considerable effort and thought into it and execute it well; only then will I take on the task.

Second, addressing this issue must be meaningful; it should not be driven solely by profit. Healthcare itself holds immense value. When you witness doctors gradually pulling cancer patients back from the brink of death through medical interventions and helping them regain their lives, you recognize the profound significance of this work. We also aim to leverage technological means to assist physicians, thereby benefiting patients.

With motivation in place, the next challenge is execution. Why choose Ping An? Because medical technology cannot be developed just anywhere; it requires collaborative efforts across multiple sectors, and Ping An possesses such an ecosystem. Its thirty years of experience in the insurance business, along with a decade of healthcare operations and data accumulation, are critical for AI R&D and difficult to replicate.

Today, we are consolidating the vast amounts of user health examination data, insurance claims data, online consultation data, and medical imaging center data accumulated over many years. By structuring this information into a knowledge graph, we are developing a "Medical Brain" to serve more patients and provide comprehensive, lifecycle disease management.

AI solutions focused on a single workflow stage struggle to deliver significant value. Companies built around such narrow AI applications either fail to scale or are acquired after achieving initial success, as few possess the capability to tackle the industry’s most complex challenges. Ultimately, the survivors will be companies with core healthcare operations, rather than pure-play health-tech firms.

“What would we do without NVIDIA’s GPUs and the open-source algorithms from Google and Microsoft?”

Q: What ethical boundaries must medical AI adhere to?

A: Data issues have always been the most sensitive concern in medical AI, representing a baseline that companies must adhere to. In the past, privacy awareness was limited; however, as public sensitivity toward privacy concerns grows, companies operating in the medical big data sector will face increasing challenges, leading to a relative slowdown in the industry’s development.

Of course, this slowdown in growth is relative to the rampant expansion of AI in healthcare in recent years. For a company engaged in data processing, the primary task is to ensure data security and protect user privacy. If this baseline cannot be met, there is no point in discussing the pace of development.

Ensuring data security is not solely the responsibility of enterprises; we also need national guidance to promote proper data usage. Some overseas countries have clear definitions for data trading, allowing lawful transactions as long as relevant requirements are met. In this regard, we still lag behind and need to learn from international practices. Only with established regulations and adherence to baseline standards can any industry thrive.

Q: Can China reach the global forefront in AI?

A: There is no doubt that we will stand at the forefront of the world.

When I was still at IBM, I told my colleagues, whether in New York or Silicon Valley, that China is a promising choice for developing medical AI. Five years have passed, and seeing what has been achieved over this period, I believe my earlier statement holds even more weight now.

China has an inexhaustible drive to develop medical AI, with a large patient population and a shortage of doctors creating clear demand. Meanwhile, we possess an enthusiasm for new technologies that is unmatched by people in other countries, enabling us not only to succeed but also to achieve excellence.

What remains to be addressed are two critical gaps: first, the ability to define problems; second, core underlying technologies.

Why is the ability to define problems necessary? We have always possessed a strong spirit of pragmatism and have never lacked the ability to solve problems. When others take action, we can follow suit and even perform better. However, as we gradually move to the forefront, we find ourselves at a loss, because we lack pioneering ideas and have nothing left to learn from others.

Therefore, we need to create an environment that fosters innovative development for enterprises.

Next is the core underlying technology. Current AI is like a sandcastle—impressive, yet fragile. What would we do if NVIDIA stopped selling GPUs to us, or if Google and Microsoft ceased open-sourcing their algorithms?

Huawei serves as a compelling case study from which we should draw lessons.

Now is an opportune moment, as many Chinese scientists are returning home amid the U.S. crackdown on researchers of Chinese descent. They possess the expertise to drive groundbreaking innovations; the key question is whether we can provide a fertile environment for their growth. Ultimately, we must cultivate an ecosystem that fosters innovation and professional development for top-tier talent.