Tencent Youtu Lab Tackles Medical AI’s Small-Sample Challenge with Transfer Learning and Synthetic Image Generation

Jun 04, 2019 14:50 CST Updated 14:50

“Medical imaging is essentially an image recognition problem, with the greatest challenge being that of few-shot learning.”

VCBeat (WeChat ID: vcbeat) has learned that the "13th Annual Conference of Radiologists of the Chinese Medical Doctor Association," hosted by the Chinese Medical Doctor Association and its Branch of Radiologists and organized by Guangdong Provincial People's Hospital and the Guangdong Medical Doctor Association Branch of Radiologists, was held in Guangzhou from May 30 to June 2. The conference is one of the highest-level events in China's radiology community, and a key topic this year was the cross-disciplinary integration of radiomics and artificial intelligence in medical imaging.

Dr. Zheng Yefeng, Director of Medical AI at Tencent Youtu Lab, was invited to deliver a keynote presentation titled “Applications of Deep Learning in Medical Image Analysis.” He described how Youtu Lab uses two main approaches, transfer learning and computer-synthesized images, to cope with the shortage of medical AI data, which rules out the large-scale data feeding that conventional machine learning relies on.

Dr. Zheng Yefeng Delivers Keynote Address at the 13th Annual Conference of Radiologists of the Chinese Medical Doctor Association

Tencent Youtu Lab is one of Tencent’s premier artificial intelligence laboratories, dedicated to research in areas such as facial recognition, image and video analysis, and medical imaging. “Tencent Miying,” Tencent’s first product to apply artificial intelligence in the medical field, was led by the Tencent Medical Health Division, with algorithmic support from Youtu Lab.

Medical AI Faces a “Dual Challenge”

The rapid advancement of current artificial intelligence (AI) technology is closely tied to powerful computational capabilities, sophisticated optimization algorithms, and high-quality big data. To enable machines to think like humans and serve as capable assistants to physicians, it is essential to “feed” them vast amounts of data to help them identify underlying patterns. However, in the field of medical AI, this process is far from straightforward. Dr. Zheng Yefeng noted that while deep learning has made significant strides in recent years across areas such as image recognition, gaming, speech recognition, and natural language processing, the development of medical AI faces a “dual challenge.”

First, there is a scarcity of training samples. Dr. Zheng Yefeng stated, “The goal of deep learning is to achieve end-to-end processing as much as possible, with images going in and results coming out. Consequently, networks are becoming larger and deeper, requiring increasingly more training samples.” However, unlike the acquisition of natural images in everyday scenarios, obtaining medical imaging data is extremely challenging. On one hand, patients place greater emphasis on the privacy of their medical records, meaning medical images are rarely uploaded online or shared. On the other hand, the “high barrier” to image acquisition also restricts access to training samples. “Medical imaging requires specialized equipment, some of which is very expensive, such as CT and MRI scanners.” Meanwhile, the unique nature of certain diseases further hinders algorithm engineers in acquiring samples. Dr. Zheng Yefeng noted, “For some rare diseases, only a few hundred to around one thousand images can be found, as the annual incidence rate is inherently low.”

Second, there is a lack of annotated data. For natural images, annotation is relatively straightforward, and even laypersons can perform it directly. Medical imaging, however, is different; its annotation requires the involvement of top-tier specialist physicians. “The reality is that training a physician takes ten years or even longer. Coupled with heavy clinical and research responsibilities, physicians often find themselves ‘willing but unable’ to engage in data annotation,” explained Dr. Zheng Yefeng.

Two Major Approaches to Overcoming the Small-Sample Learning Problem in Medical AI

The two challenges of insufficient training samples and scarce annotations have left deep learning “under-resourced,” and the resulting “few-shot learning” problem has, to some extent, hindered the development of AI in medical imaging. Dr. Zheng Yefeng proposed two approaches to address this issue: one is transfer learning; the other is computer-generated images, such as those produced by generative adversarial networks.

When introducing the concept of transfer learning, Dr. Zheng Yefeng used a vivid analogy: “For instance, imagine someone going into a forest to find a tiger but having never seen one before and not knowing what it looks like. However, if this person can distinguish between cats, dogs, foxes, and other animals, we can first train them to identify cats—this represents the pre-training process. Subsequently, we inform them that a tiger is essentially a cat scaled up 100 times in size and colored yellow, thereby achieving the goal of ‘finding a tiger.’” He emphasized that transfer learning is particularly well-suited for addressing training challenges involving small sample sizes.
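
The pre-train-then-adapt idea in this analogy can be sketched in a few lines of Python. This is only a toy illustration, not Youtu Lab's method: the 2D points, the function names, and the choice of a "difference of class means" feature extractor are all assumptions made for the example. The projection direction is learned once on a source task with plenty of data, then kept frozen while only a threshold is fitted on four target samples.

```python
# Toy transfer learning: reuse a frozen "pre-trained" feature extractor and
# fit only a small new head on a handful of target samples.
# All data points and function names are hypothetical illustrations.

def mean_point(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def pretrain_direction(positives, negatives):
    """'Pre-training': learn a projection direction from abundant source data
    as the difference between the two class means."""
    mean_pos, mean_neg = mean_point(positives), mean_point(negatives)
    return tuple(a - b for a, b in zip(mean_pos, mean_neg))

def project(direction, point):
    """Frozen feature extractor: dot product with the learned direction."""
    return sum(d * x for d, x in zip(direction, point))

def fit_head(direction, positives, negatives):
    """'Adaptation': with the direction frozen, fit only a threshold
    (midpoint of the projected class means) on a few target samples."""
    pos = sum(project(direction, p) for p in positives) / len(positives)
    neg = sum(project(direction, p) for p in negatives) / len(negatives)
    return (pos + neg) / 2

def predict(direction, threshold, point):
    """Return 1 if the projected point lies above the threshold, else 0."""
    return 1 if project(direction, point) > threshold else 0

# Source task with plenty of labeled data (think "cat vs. not cat").
source_pos = [(2, 2), (3, 3), (2, 3), (3, 2)]
source_neg = [(0, 0), (1, 1), (0, 1), (1, 0)]
direction = pretrain_direction(source_pos, source_neg)

# Target task with only four labeled samples (think "tiger vs. not tiger").
target_pos = [(5, 5), (6, 6)]
target_neg = [(2, 3), (3, 3)]
threshold = fit_head(direction, target_pos, target_neg)

print(direction, threshold)                    # (2.0, 2.0) 16.5
print(predict(direction, threshold, (4, 5)))   # 1
print(predict(direction, threshold, (3, 4)))   # 0
```

In a real medical imaging system the frozen part would be a deep network rather than a single direction, but the division of labor is the same: most parameters come from the data-rich task, and only a small part is estimated from the scarce target data.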

Another approach is computer-synthesized imaging. Dr. Zheng Yefeng stated that cross-modal image translation allows computer-synthesized images to augment training datasets effectively, and that generative adversarial networks (GANs) are well suited to producing them: one network generates images, another judges whether each image is real or synthetic, and the two networks are trained jointly. By the end of training, the generative network can produce highly realistic images.
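
The tug-of-war between the two networks can be made concrete by looking at the losses each one minimizes. The sketch below assumes the standard binary cross-entropy formulation with the commonly used non-saturating generator loss; the article does not say which GAN variant Youtu Lab uses, and the function names are illustrative. The inputs are the discriminator's output probabilities that an image is real.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one prediction strictly between 0 and 1."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) close to 1 and D(fake) close to 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants D(fake) close to 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])
# A discriminator that easily spots fakes gives them a low probability,
# which makes the generator's loss large.
easy_to_spot = generator_loss([0.1])
hard_to_spot = generator_loss([0.5])

print(round(confused, 4))       # 1.3863, i.e. 2 * ln(2)
print(easy_to_spot > hard_to_spot)  # True
```

During joint training, each network takes gradient steps on its own loss in turn, so improving one loss tends to worsen the other until the generated images become hard to tell apart from real ones.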

Dr. Zheng Yefeng cited liver cancer as an example, stating, “Sometimes images generated across modalities can become distorted, creating artificial lesions or missing existing ones. To address this, we incorporate various constraints during the research process to minimize distortion in the generated images. Our algorithm perfectly preserves the morphology of organs and lesions, and using the resulting highly realistic images for training significantly improves accuracy.”
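
The article does not specify which constraints are used. As one hypothetical example of such a constraint, a shape-consistency penalty can compare the lesion mask in the source image with the lesion mask found in the translated image, using the Dice overlap. The masks, names, and the choice of Dice below are assumptions for illustration only.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat lists of 0/1."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a * b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total

def shape_consistency_penalty(mask_source, mask_translated):
    """0.0 when the lesion shape is fully preserved, up to 1.0 when it is lost."""
    return 1.0 - dice(mask_source, mask_translated)

lesion_before = [0, 1, 1, 1, 0, 0]     # lesion pixels in the source image
lesion_preserved = [0, 1, 1, 1, 0, 0]  # translation kept the lesion intact
lesion_shrunk = [0, 1, 0, 0, 0, 0]     # translation erased most of the lesion

print(shape_consistency_penalty(lesion_before, lesion_preserved))  # 0.0
print(shape_consistency_penalty(lesion_before, lesion_shrunk))     # 0.5
```

Adding a term like this to the generator's training objective would penalize translations that invent or erase lesions, which is the kind of distortion described in the quote.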

Medical AI Gradually Being Implemented to Improve Diagnostic Accuracy and Efficiency

Methods such as transfer learning and computer-synthesized images have helped deep learning make significant progress in medical imaging diagnosis. Taking pulmonary nodule detection as an example, Dr. Zheng Yefeng explained that lung CT is currently the primary method for examining pulmonary nodules. With thin-slice low-dose CT, image volume has doubled, more small nodules are detected, and nodules now need quantitative measurement, all of which make image interpretation significantly harder. Meanwhile, the heavy and monotonous workload of reading scans increases radiologists' fatigue, raising the risk of missed diagnoses and misdiagnoses.

Artificial intelligence is gradually helping to resolve these issues. After continuous iteration, the “Tencent Miying” AI system for early lung cancer screening uses Tencent Youtu Lab’s “end-to-end computer-aided diagnosis technology for lung cancer” to precisely locate tiny nodules and help physicians accurately assess patients’ risk of lung cancer. The system’s core algorithms are the preprocessing module and the detection and recognition module.

The former utilizes 3D segmentation and reconstruction algorithms for the lungs, enabling it to process heterogeneous source data generated by different CT imaging devices under varying imaging parameters. The latter employs “the best segmentation algorithm in the field of deep learning”—the fully convolutional neural network—to achieve early detection and segmentation of pulmonary nodules.

Dr. Zheng Yefeng stated that a fully convolutional neural network consists of two parts: an encoder, which continuously performs convolution and downsampling to compress the image into a low-dimensional space—a component shareable across different tasks; and a decoder, which continuously performs convolution and upsampling to ultimately output a segmentation result with the same dimensions as the input image—a component unique to each task. “Our pre-trained encoder has been exposed to images from all tasks, resulting in robust training performance.”
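
The encoder/decoder split can be illustrated with a deliberately tiny 1D example. This is a sketch under stated assumptions, not the network described in the talk: real fully convolutional networks operate on 2D or 3D images with learned weights, whereas the kernels, task names, and input below are fixed, hypothetical values chosen only to show the data flow. One shared encoder compresses the input, and each task has its own decoder that restores the original length.

```python
# Fixed, hypothetical kernels; a real network would learn these weights.
SHARED_KERNEL = [0.25, 0.5, 0.25]
TASK_KERNELS = {
    "lung_segmentation": [0.0, 1.0, 0.0],
    "nodule_segmentation": [-0.5, 1.0, -0.5],
}

def conv1d_same(signal, kernel):
    """1D convolution with zero padding so the output length equals the input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]

def downsample(signal):
    """Halve the length by averaging neighboring pairs."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]

def upsample(signal):
    """Double the length by repeating each value."""
    return [value for value in signal for _ in range(2)]

def encoder(signal):
    """Shared across tasks: two rounds of convolution followed by downsampling."""
    x = downsample(conv1d_same(signal, SHARED_KERNEL))
    return downsample(conv1d_same(x, SHARED_KERNEL))

def decoder(code, kernel):
    """Task-specific: two rounds of upsampling followed by convolution."""
    x = conv1d_same(upsample(code), kernel)
    return conv1d_same(upsample(x), kernel)

scan_row = [float(i % 4) for i in range(16)]  # stand-in for one row of an image
code = encoder(scan_row)                      # computed once, reused by every task
outputs = {task: decoder(code, kernel) for task, kernel in TASK_KERNELS.items()}

print(len(scan_row), len(code))                      # 16 4
print({task: len(out) for task, out in outputs.items()})
```

Because the encoder output is computed once and fed to every decoder, improving the encoder on one task's data can benefit the others, which is the property the quote about a pre-trained encoder relies on.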

“After the encoder is well-trained, it can be transferred to other tasks, such as lung segmentation and the classification of pulmonary nodules as benign or malignant. Using public datasets, we found that not only segmentation but also classification can achieve excellent performance.” Dr. Zheng Yefeng emphasized, “In medical AI, most technical approaches are similar; the ultimate competition lies in the details. For instance, for determining whether nodules are benign or malignant, Tencent proposed the Med3D pre-training model, which was trained on multiple public competition datasets. Pre-training a model on 3D medical images collected from segmentation tasks can significantly improve the accuracy of both segmentation and classification. This addresses the common issue where most nodules are not biopsied and their benign or malignant nature remains unknown.”

It is reported that “Tencent Miying” currently assists physicians in image interpretation through AI-powered medical image analysis, capable of precisely locating small pulmonary nodules larger than 3 mm with a detection rate of ≥95%. In addition to early-stage lung cancer, “Tencent Miying” leverages AI-based medical imaging analysis to aid clinicians in screening for early-stage esophageal cancer, fundus diseases, colorectal tumors, cervical cancer, and breast tumors.
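
For readers unfamiliar with how a detection rate of this kind is computed, the sketch below shows one plausible way to do it. The article does not describe the evaluation protocol, so the matching rule (a nodule counts as detected when a predicted center falls within its radius), the data layout, and all coordinates are assumptions for illustration.

```python
import math

def detection_rate(nodules, predictions, min_diameter_mm=3.0):
    """Fraction of ground-truth nodules larger than min_diameter_mm that have
    at least one predicted center inside the nodule's radius.
    Returns None when no nodule meets the size criterion."""
    eligible = [n for n in nodules if n["diameter_mm"] > min_diameter_mm]
    if not eligible:
        return None
    detected = 0
    for nodule in eligible:
        radius = nodule["diameter_mm"] / 2
        if any(math.dist(nodule["center_mm"], p) <= radius for p in predictions):
            detected += 1
    return detected / len(eligible)

# Hypothetical ground truth: centers and diameters in millimeters.
nodules = [
    {"center_mm": (10, 10, 10), "diameter_mm": 6.0},
    {"center_mm": (40, 40, 20), "diameter_mm": 4.0},
    {"center_mm": (70, 20, 30), "diameter_mm": 2.0},  # too small, not counted
    {"center_mm": (25, 60, 15), "diameter_mm": 8.0},  # missed by the detector
]
# Hypothetical detector output: two hits and one false positive.
predictions = [(11, 10, 10), (40, 41, 20), (90, 90, 90)]

rate = detection_rate(nodules, predictions)
print(round(rate, 3))  # 0.667
```

Note that this figure measures sensitivity only; the false positive at (90, 90, 90) does not lower it, so a detection rate is usually reported together with the number of false positives per scan.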