Early-Stage Medical Large Models Enter Commercialization Phase

Mar 31, 2024 08:00 CST Updated 08:00

CHINC 2024, held recently in Qingdao, was likely the healthcare exhibition with the highest “concentration” of large language models this year.

Within healthcare, the health IT industry brings together massive volumes of data and strong demand for efficiency gains, which gives it greater potential for deployment at scale than other niche segments.

On the exhibition floor, both established information system vendors and emerging big data startups were working to showcase the capabilities of large language models. Some have integrated these models into their proprietary information systems to raise the technical barriers around their core products, while others use them as a foundation for new solutions aimed at opening up new markets in medical IT.

Although the core architectures of the large language models developed by these enterprises differ, all of them target hospitals as the primary customer segment. Behind the overt displays of technological prowess, competition over next-generation smart hospital solutions has long been intensifying.

"Hospital scenarios for quality improvement and efficiency enhancement have basically been re-engineered by large language models."

Medicine is a serious discipline; any decision made by physicians during diagnosis and treatment must adhere to the principle of “evidence-based” practice. Therefore, the development of large model applications in healthcare cannot be as arbitrary or unconstrained as in industries such as design, finance, and manufacturing.

Last September, the “Implementation Measures for the Supervision of Internet-based Diagnosis and Treatment in Beijing (Trial),” led by the Beijing Municipal Health Commission, established clear boundaries for the appropriate use of large language models, strictly prohibiting AI from conducting diagnosis and treatment or automatically generating prescriptions without physician supervision.

Therefore, most large language models in the healthcare IT sector currently remain distant from clinical diagnosis and treatment processes. Instead, they focus on the flow of medical data, either by alleviating physicians' high-frequency repetitive tasks to enhance quality and efficiency, or by uncovering latent correlations within data to support scientific research.

Compared with companies from a big data background, those with a traditional healthcare IT background lean toward the former. They typically have their own Hospital Information Systems (HIS) or big data analytics platforms already deployed at scale in hospitals. Their large language model (LLM) applications are often built around these proprietary systems, with the core objective of making the systems more competitive in the market.

Winning Health is a typical example of large models focused on “quality improvement and efficiency enhancement.” The company believes that medical applications derived from large models will become indispensable personal assistants for healthcare professionals. Such applications can not only schedule and organize data resources efficiently but also prompt independent thinking, improving both the work efficiency of healthcare professionals and the quality of care they provide while offering effective decision support.

Guided by this approach, Winning Health released the healthcare-specific large language model WiNGPT in October 2023. Both WiNGPT and the AI-powered clinical assistant WiNEX Copilot, built on WiNGPT, embody the concept of a “close-at-hand assistant.” They serve Winning Health’s core product, the next-generation information system WiNEX, by using AI to handle high-frequency, repetitive tasks such as medical record quality control and automatic population of surgical case documentation. According to staff at the Winning Health exhibition booth, WiNEX Copilot now covers more than 100 scenarios, addressing the vast majority of hospital use cases in which large language models can enhance efficiency.

FUSIONTECH and Huimei Medical follow a large language model strategy similar to Winning Health's: using the new technology to revisit and enhance scenarios that already called for improvements in quality and efficiency.

Wu Di, CEO of FUSIONTECH, told VCBeat: “Although large language models and NLP address similar scenarios—empowering areas such as pre-consultation triage, electronic medical records, and medical record quality control—the capabilities of the products they yield differ significantly.”

For specific application scenarios such as text decomposition, traditional natural language processing (NLP) may offer higher computational performance and more precise results. However, it struggles with natural human language interaction in open-ended scenarios. Large language models (LLMs) excel in these situations because they can proactively pose correct and relevant questions. Using techniques such as Retrieval-Augmented Generation (RAG), they can retrieve the relevant knowledge, reason over it, generate personalized follow-up recommendations, and deliver the result in a form people readily accept, such as a human-computer dialogue.
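The retrieve-then-generate pattern described above can be sketched roughly as follows. This is a minimal illustration, not any vendor's implementation: the knowledge snippets, the bag-of-words similarity, and the function names `retrieve` and `build_prompt` are all assumptions made for the example, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge snippets for illustration only; not clinical guidance.
KNOWLEDGE_BASE = [
    "Pre-consultation triage collects the chief complaint, symptom duration, and history.",
    "Medical record quality control checks that required fields are complete and consistent.",
    "Follow-up plans are adjusted when new examination results arrive.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble the augmented prompt that would be sent to an LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"


query = "Which fields does quality control check in a medical record?"
passages = retrieve(query, KNOWLEDGE_BASE, k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

In a real system the toy similarity would likely be replaced by an embedding index, and the assembled prompt would be passed to the model, which generates the answer grounded in the retrieved passage.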

"Furthermore, if the computational costs of hospital deployment are excluded, applications developed based on large language models (LLMs) are relatively less expensive than those relying on traditional natural language processing (NLP). After all, LLMs can learn from publicly available knowledge, eliminating the need for dedicated knowledge bases and regular manual maintenance of such knowledge, thereby significantly reducing development costs."

Regarding research support, Digital Health China Technologies Co., Ltd. and Deepwise Healthcare, both with strong AI foundations, have achieved rapid growth in this niche sector.

Digital Health China Technologies Co., Ltd. uses big data and AI as its core technological foundation. Drawing on multimodal big data, the company has developed a medical-domain multimodal large language model supported by four base models covering text, imaging, pathology, and precision medicine. It performs in-depth governance of clinical, imaging, pathological, and genomic data to support application scenarios such as clinical diagnosis and treatment, intelligent scientific research, and health management. The solution enables automatic information extraction from massive clinical datasets, SNOMED CT terminology standardization, efficient construction of knowledge systems, and support for clinical research applications. It also handles multimodal image processing for CT, MRI, and pathology: it can automatically identify the anatomical structures of multiple organs throughout the body, supporting automatic recognition, omics extraction, and quantitative analysis for over 150 types of CT and MRI structures and lesions. Additionally, it provides automatic identification, subtype classification, and tumor segmentation for more than 30 types of cancer.

Deepwise Healthcare originally emerged from the field of AI-powered medical imaging. While its large language model (LLM)-based applications resemble those of Digital Health China Technologies Co., Ltd., Deepwise maintains a sharper focus on imaging. According to staff at the Deepwise Healthcare booth, the company’s previously certified deep-learning AI products could precisely delineate only specific lesions in particular organs, such as the lungs, heart, and brain. Large models have broken through these limitations. Its products can now delineate and annotate any lesion in any medical image, significantly improving the efficiency of medical research and enabling physicians to rapidly conduct studies on less common diseases.

Beyond mainstream in-hospital scenarios, many companies are also targeting out-of-hospital settings, which are free of many of the constraints of hospitals and allow a wider array of innovative and imaginative applications.

Taking iFlytek Healthcare as an example, the company leverages its self-developed iFlytek Spark Medical Large Language Model to empower its end-to-end patient management platform. By deeply focusing on patient management scenarios and addressing individualized patient needs, it helps healthcare institutions manage patients efficiently, enhance patient satisfaction, and facilitate the construction of a digital and intelligent end-to-end patient management system.

The end-to-end patient management platform draws on the knowledge retrieval, content generation, and multi-turn contextual analysis capabilities of large language models. After acquiring a patient's diagnostic and treatment information, the platform automatically extracts and identifies key data to generate a patient profile. It then creates an individualized rehabilitation plan by integrating a disease-specific management knowledge base co-developed with industry experts. The AI executes the plan automatically and reaches out to the patient promptly through phone calls, SMS, WeChat notifications, and other channels.

During rehabilitation, the AI can also draw on patient profiles to answer patients' frequently asked questions. It updates these profiles promptly based on new medical records, follow-up results, and other information, and adjusts rehabilitation plans dynamically as a result.
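The profile, plan, and outreach loop described above might look something like the sketch below. Everything in it is hypothetical: the `PLAN_TEMPLATES` knowledge base, the `PatientProfile` fields, and the rule that a follow-up note mentioning pain triggers an extra phone call are placeholders, and the extraction step that a large model would perform is reduced to reading an already structured record.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical disease-management knowledge base:
# diagnosis -> [(days after discharge, task)].
PLAN_TEMPLATES = {
    "knee replacement": [
        (1, "wound check reminder"),
        (7, "exercise review"),
        (30, "outpatient follow-up"),
    ],
}


@dataclass
class PatientProfile:
    name: str
    diagnosis: str
    discharge_date: date
    channel: str = "SMS"
    notes: list = field(default_factory=list)


def extract_profile(record):
    """Stand-in for LLM-based extraction; the record here is already structured."""
    return PatientProfile(
        name=record["name"],
        diagnosis=record["diagnosis"].strip().lower(),
        discharge_date=record["discharge_date"],
        channel=record.get("preferred_channel", "SMS"),
    )


def build_plan(profile):
    """Turn the matching template into dated outreach tasks."""
    template = PLAN_TEMPLATES.get(profile.diagnosis, [])
    return [
        (profile.discharge_date + timedelta(days=offset), profile.channel, task)
        for offset, task in template
    ]


def update_plan(profile, plan, note, noted_on):
    """Record a follow-up note and, under a made-up rule, add an early phone call."""
    profile.notes.append(note)
    if "pain" in note.lower():
        plan = plan + [(noted_on + timedelta(days=2), "phone", "extra check-in call")]
    return sorted(plan)


record = {
    "name": "Patient A",
    "diagnosis": "Knee Replacement",
    "discharge_date": date(2024, 4, 1),
    "preferred_channel": "WeChat",
}
profile = extract_profile(record)
plan = build_plan(profile)
plan = update_plan(profile, plan, "Reports pain when climbing stairs", date(2024, 4, 3))
for due, channel, task in plan:
    print(due.isoformat(), channel, task)
```

Keeping the plan as plain dated tuples makes the dynamic adjustment step easy to see: a new note changes the profile, and the plan is simply rebuilt or extended and re-sorted.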

However, the shortcomings in the development process should not be overlooked. On the one hand, because policy has defined the scope of application for large models, enterprises inevitably overlap when selecting in-hospital application scenarios, leading to serious product homogenization. On the other hand, most existing application scenarios are essentially re-implementations of earlier AI solutions; while they deliver significant improvements in efficiency, they lack sufficient innovation.

Leveraging Medical IT Systems, Numerous Large Language Models Have Begun Commercial Deployment

Following the expansion of application scenarios, many large medical AI models have reached the stage of practical implementation.

Currently, mainstream deployment methods are divided into two categories: on-premises deployment and cloud-based SaaS. Among these, tertiary hospitals predominantly adopt on-premises deployment. In this model, clinical experience data remains within the hospital premises, and large language models (LLMs) can be packaged and deployed internally, thereby ensuring better privacy and more secure model operation. Furthermore, hospitals can independently iterate and update the models even when vendors have not yet released updates, thus guaranteeing the continuous evolution and improvement of the models.

Secondary and lower-tier hospitals show a stronger preference for the SaaS model. The vast majority of these institutions lack the financial capacity to deploy GPUs at scale, compelling them to access related services via the cloud. By adopting this model, they can ensure flexibility in large language model configurations while mitigating the potential financial risks associated with procuring new technologies.

In terms of sales strategy, enterprises whose information systems are already widely deployed tend to bundle their medical large models with those systems for joint sales. Compared with standalone sales, this model enables faster deployment but commands a lower premium.

Startups that entered the medical IT sector later often adopt a standalone product model, integrating their solutions as external modules with hospitals’ existing information systems.

For applications such as medical record quality control, the integration of large model technology can reduce development costs, resulting in a lower actual selling price compared to NLP-based quality control systems.

For research-oriented applications, the purpose of developing large models is to give researchers new tools, so they are priced higher than existing research systems. However, the premium the new technology currently commands is not substantial. According to estimates from the small number of companies that have already begun selling large-model-enabled solutions, a system previously valued at RMB 1 million could be sold for approximately RMB 1.2–1.3 million after integrating a large model. Compared with the very high R&D costs, this uplift looks modest.
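As a quick check on the size of that uplift, the quoted prices can be converted into a percentage premium. The helper name `premium_percent` is an assumption for this illustration; the prices are the ones cited above.

```python
def premium_percent(base_price, new_price):
    """Percentage uplift of new_price over base_price."""
    return (new_price - base_price) * 100 / base_price


# Figures quoted in the article, in RMB.
base = 1_000_000
low, high = 1_200_000, 1_300_000

# Prints: premium: 20% to 30%
print(f"premium: {premium_percent(base, low):.0f}% to {premium_percent(base, high):.0f}%")
```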

Chip Procurement, Computing Power Allocation... Large Medical Models Still Face Practical Challenges in Large-Scale Deployment

With deep learning and NLP paving the way, large models have skipped scenario validation during development and user education during deployment, thereby accelerating commercialization. However, in practice, some enterprises still encounter issues related to production and configuration.

Companies have reported to VCBeat that due to the U.S. government's restrictions on NVIDIA's GPU exports to certain Chinese enterprises, their original large language model R&D projects cannot proceed as planned. "The chip shortage issue cannot be resolved in the short term. At times, we have had to organize multiple departments to share a single chip, which has severely impacted the R&D progress."

Training costs are another issue that cannot be overlooked. Existing large medical models typically keep the number of parameters within the range of 10 billion to 100 billion, aiming to compress training costs while ensuring the model possesses sufficient knowledge. However, although the parameter count of these vertical models is only a fraction of that of general-purpose models, each training run represents a significant financial burden for startups. As a result, some companies are forced to reduce their update frequency, which may adversely affect user experience.

Finally, there is the issue of configuration costs. At present, the computing environment in most hospitals is built on CPUs designed for general-purpose computing, and few hospitals have GPU resources suited to graphics processing and parallel computing. As a result, they lack infrastructure suitable for deploying large models. To run large-model applications, a hospital therefore has to procure GPU infrastructure along with the applications themselves, while also ensuring sufficient storage capacity and high-speed network connectivity.

This represents a significant cost. Based on the estimate of one RTX 4090 GPU per typical department, equipping a hospital campus with sufficient computational power would require an investment in the range of millions of yuan for chip configuration. Given that the value of large language models remains unclear today, many hospitals are hesitant to incur this expense.
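To put the one-card-per-department estimate in context, a rough weights-only memory calculation helps. The sketch below assumes 24 GB of memory per RTX 4090 and decimal gigabytes, and it ignores activations, caches, and any serving overhead, so real requirements would be higher. The function names and the precision choices (2 bytes per parameter for 16-bit weights, 0.5 bytes for 4-bit quantization) are assumptions for illustration, not figures from the companies interviewed.

```python
BYTES_PER_GB = 1e9  # decimal gigabytes, good enough for a rough estimate


def weights_gb(num_params, bytes_per_param):
    """Memory needed to hold the model weights alone, in GB."""
    return num_params * bytes_per_param / BYTES_PER_GB


def fits_on_card(num_params, bytes_per_param, vram_gb=24.0):
    """True if the weights alone fit in one card's memory (assumed 24 GB)."""
    return weights_gb(num_params, bytes_per_param) <= vram_gb


# Parameter range quoted in the article: 10 billion to 100 billion.
for params in (10e9, 100e9):
    for label, width in (("16-bit", 2.0), ("4-bit", 0.5)):
        size = weights_gb(params, width)
        verdict = "fits" if fits_on_card(params, width) else "does not fit"
        print(f"{params / 1e9:.0f}B params, {label}: {size:.0f} GB, {verdict} on one card")
```

Under these assumptions, a model at the low end of the quoted range fits on a single card, while one at the high end needs multiple cards even with aggressive quantization.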

Therefore, if large medical model companies want to deploy the new technology at scale, they must develop a killer application compelling enough that hospitals willingly upgrade their deployment environments to adopt it.