
Developer of Medical Artificial Intelligence Technology
Recently, VCBeat invited Dr. Yang Qiong, Co-founder of PaiYiPai, a startup team focused on the application of machine vision in mobile healthcare, to deliver a training lecture on the application of OCR technology in the medical field.
At the beginning of 2016, MEDP.AI announced the completion of its RMB 30 million Series A financing. Wu Shizhan, founder and CEO of MEDP.AI, stated that the company is built on a team with strong technical expertise. Following this round of financing, MEDP.AI will fully open up its core technologies and expand into three major B2B business areas: first, fully opening its medical document photo recognition technology to build a cloud-based photo recognition platform; second, exporting technical capabilities to healthcare and pharmaceutical institutions by providing “Internet Plus” technical services; and third, acting as a technical venture capital partner for mobile health startups, serving as the technical co-founder to help entrepreneurs with traditional pharmaceutical backgrounds overcome the challenge of “having ideas but lacking technical expertise,” thereby completing the product design and development process from 0 to 1 and delivering high-quality apps for startups in a short period of time.
The completion of MEDP.AI’s latest funding round also signals market recognition of OCR technology’s applications in the healthcare sector. Taking this opportunity, VCBeat outlines the application pathways of OCR in healthcare based on Dr. Yang Qiong’s lecture.
Text version:
Hands-Free OCR: How to Solve Data Entry, Storage, and Analysis with One Click
Faced with a myriad of complex paper-based medical documents, how to easily digitize and analyze the content on these documents has long been a challenge for many internet healthcare entrepreneurs. The application of OCR technology in the medical field provides just such a solution.
In the medical field, the most common application of OCR technology is the recognition of laboratory test reports. PaiYiPai, a startup team focused on machine vision for mobile healthcare applications, has played a significant role in this area. According to Dr. Yang Qiong, co-founder of the company, applying OCR technology to the recognition of laboratory test reports is not as straightforward as it may seem, owing to the inherent rigor of medical practice and other factors.
(Dr. Yang Qiong has previously held positions at Microsoft Research Asia, the European Microelectronics Center (IMEC), and Baidu Institute of Deep Learning. She earned her Ph.D. from Tsinghua University and is a senior expert in the fields of optical character recognition (OCR), facial recognition, deep learning, artificial intelligence, and big data analytics, holding 11 patents worldwide or in the United States. She has published more than 40 papers in top-tier international journals and conferences such as PAMI, IJCV, CVPR, ICCV, IJCAI, and ACM MM, with multiple articles receiving Best Paper Awards or nominations. She also won awards for best performance in several international evaluations, including FAT2004 and Middlebury. As a key leader and driver in Baidu’s OCR and AI initiatives, she was one of the earliest pioneers at Baidu to advance the application of deep learning in text recognition, image classification, and big data analytics.)
First, hospitals are the primary institutions that generate laboratory test reports. Since there is no uniform standard for the format and layout of the documents used by different hospitals, the layout of these reports is highly complex and unstandardized, posing significant challenges for machine recognition.
Second, photographic recognition of laboratory test reports is not merely an optical character recognition (OCR) task, but also a problem of semantic understanding. Dr. Yang Qiong stated that different terms may convey identical meanings, while the same term can carry distinct implications across different test panels. For instance, white blood cell (WBC) count appears in both complete blood count (CBC) and routine urinalysis; however, the reference ranges for normal values and their clinical interpretations differ significantly between these two tests.
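The point about the same term meaning different things in different panels can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the alias table, the panel names, the function name `interpret`, and especially the numeric ranges, which are placeholders and not clinical values (real reference ranges vary by laboratory). It does not describe how PaiYiPai or MEDP.AI actually implement this.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    low: float
    high: float
    unit: str


# Placeholder ranges for illustration only (assumed, not clinical guidance).
# The key is (panel, item): the same item has a different range per panel.
REFERENCE_RANGES = {
    ("CBC", "WBC"): Reference(4.0, 10.0, "10^9/L"),
    ("URINALYSIS", "WBC"): Reference(0.0, 5.0, "/HPF"),
}

# Different printed names that refer to the same item (assumed aliases).
ALIASES = {
    "WBC": "WBC",
    "WHITE BLOOD CELL": "WBC",
    "白细胞": "WBC",
    "白细胞计数": "WBC",
}


def interpret(panel: str, printed_name: str, value: float) -> str:
    """Classify a value as low/normal/high using the panel-specific range."""
    item = ALIASES.get(printed_name.strip().upper())
    if item is None:
        return "unknown item"
    ref = REFERENCE_RANGES.get((panel.strip().upper(), item))
    if ref is None:
        return "unknown panel"
    if value < ref.low:
        return "low"
    if value > ref.high:
        return "high"
    return "normal"


# The same printed value is read differently depending on the panel.
print(interpret("CBC", "wbc", 7.2))
print(interpret("urinalysis", "白细胞", 7.2))
```

With these placeholder ranges, the value 7.2 is classified as normal in the CBC panel and high in the urinalysis panel, which is why recognizing the characters alone is not enough.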
Third, since laboratory test reports are printed on paper, they are prone to creasing and deformation during storage, and such deformation makes recognition harder. By comparison, bank cards, which are rigid, are easier to recognize.
Fourth, laboratory test reports contain a variety of characters, including English letters, Chinese characters, numbers, and special symbols. Optical character recognition (OCR) in scenarios involving mixed types of characters is itself a global challenge.
Why, despite so many technical challenges, use OCR technology for photo recognition of laboratory test reports? Dr. Yang Qiong stated that there are two reasons.
On the one hand, there is significant user demand in this area. From the consumer (C-end) perspective, patients are unable to interpret their own laboratory test reports, and physicians lack the time to assist them. Optical character recognition (OCR) of lab reports via photo capture helps patients overcome the challenge of understanding these results. From the business (B-end) perspective, many healthcare companies require large volumes of laboratory data, and manual entry is time-consuming and labor-intensive. Photo-based OCR significantly reduces labor and time costs for these organizations.
On the other hand, although laboratory test reports feature complex layouts and diverse formats, they still follow recognizable patterns. For instance, tests such as complete blood count (CBC) and routine urinalysis have specific test items, results, and reference ranges that can be used for identification purposes. Of course, in-depth exploration and understanding of these issues are required to truly advance the technology to a practical level of application.
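As a rough illustration of how the item / result / reference range pattern can be exploited once the text of a row has been recognized, here is a minimal Python sketch. The row layout (name, value, optional flag, range, optional unit), the regular expression, and the function name `parse_row` are all assumptions for this example; real reports vary widely, which is exactly why template-style rules like this one only go so far.

```python
import re
from typing import Optional

# Assumed row layout: <name> <value> [flag] <low>-<high> [unit]
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s+"              # item name (may contain CJK, brackets)
    r"(?P<value>\d+(?:\.\d+)?)\s*"       # measured result
    r"(?P<flag>[↑↓HL])?\s+"              # optional abnormal-result marker
    r"(?P<low>\d+(?:\.\d+)?)\s*[-~～]\s*(?P<high>\d+(?:\.\d+)?)"  # range
    r"(?:\s+(?P<unit>\S+))?\s*$"         # optional unit
)


def parse_row(line: str) -> Optional[dict]:
    """Split one recognized text line into structured fields, or return None."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "value": float(m.group("value")),
        "flag": m.group("flag"),
        "low": float(m.group("low")),
        "high": float(m.group("high")),
        "unit": m.group("unit"),
    }


print(parse_row("白细胞计数(WBC) 11.2 ↑ 4.0-10.0 10^9/L"))
print(parse_row("姓名: 张三"))  # not a test-item row, so no match
```

Lines that do not fit the pattern, such as patient or hospital information, simply return None and would be handled by other rules.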
Unlike the template-based approaches adopted by traditional OCR companies, MEDP.AI has developed its own proprietary core technologies and recognition workflow to improve the identification and parsing of laboratory test reports, thereby significantly expanding support for a wide variety of formats and layouts.
Positioning Area: After capturing a clear photo of the laboratory test report, MEDP.AI's machine recognition technology will first accurately identify the borders of the report to determine its approximate location.
Coarse Classification: Subsequently, the information regions on the laboratory test report are coarsely classified into categories such as test items, patient information, and hospital information. Among these, the test item category is the most critical and content-rich.
Data Parsing + Medical Knowledge Base Matching: After delineating the regions, the system intelligently parses and interprets the content within the laboratory test data information region and matches it against the medical knowledge base.
Global Adjustment and Optimization: After data parsing, MEDP.AI performs overall optimization and readjustment of the parsed results, correcting any potential misidentifications.
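The article does not disclose how the last two stages are implemented. The Python sketch below is therefore only one plausible reading, under stated assumptions: knowledge base matching is approximated with fuzzy string matching from the standard library's `difflib`, and "global adjustment" is approximated by repairing digit look-alike characters and recomputing the abnormal flag from the value and range. The knowledge base contents, the confusion table, and the function names `match_item`, `fix_numeric`, and `adjust_row` are all hypothetical.

```python
import difflib
from typing import Optional

# Hypothetical knowledge base: canonical item name -> standard code.
KNOWLEDGE_BASE = {
    "white blood cell count": "WBC",
    "red blood cell count": "RBC",
    "hemoglobin": "HGB",
    "platelet count": "PLT",
}

# Characters that OCR may confuse with digits (assumed list).
DIGIT_CONFUSIONS = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def match_item(ocr_name: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a possibly misrecognized item name to a knowledge base code."""
    hits = difflib.get_close_matches(
        ocr_name.strip().lower(), list(KNOWLEDGE_BASE), n=1, cutoff=cutoff
    )
    return KNOWLEDGE_BASE[hits[0]] if hits else None


def fix_numeric(text: str) -> Optional[float]:
    """Repair digit look-alikes in a numeric field, then parse it."""
    try:
        return float(text.strip().translate(DIGIT_CONFUSIONS))
    except ValueError:
        return None


def adjust_row(name: str, value: str, low: str, high: str) -> Optional[dict]:
    """Cross-check one parsed row; return None if it cannot be made consistent."""
    code = match_item(name)
    nums = [fix_numeric(t) for t in (value, low, high)]
    if code is None or None in nums:
        return None
    v, lo, hi = nums
    if lo > hi:  # an inverted range suggests a recognition error
        return None
    flag = "low" if v < lo else "high" if v > hi else "normal"
    return {"code": code, "value": v, "low": lo, "high": hi, "flag": flag}


# "Hemoglobln", "l3O" and "15O" contain typical recognition errors.
print(adjust_row("Hemoglobln", "l3O", "115", "15O"))
```

Rows that cannot be matched or repaired return None here; a production system would more likely route them to manual review than discard them.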
To ensure the accuracy of data recognition, MEDP.AI prompts users to choose an appropriate distance, angle, and lighting when taking photos. In addition, the machine learning system in MEDP.AI's backend is adaptive and self-learning, enabling continuous improvement.
Finally, all data processing by MEDP.AI is conducted on its own SaaS platform, enabling users to obtain the results they need without large-scale data storage and computing infrastructure of their own.