Yidu Tech's Medical Large Language Model Tops Shanghai AI Lab's MedBench Evaluation with a Score of 61.3

Apr 23, 2024 10:37 CST Updated May 10, 14:30

On May 9, MedBench, the open evaluation platform for Chinese medical large language models, updated its leaderboard. Yidu Tech’s large language model (evaluation name: HH-YIDU-Med) achieved a comprehensive score of 61.3, taking the top spot and becoming the first medical large language model on the leaderboard with a comprehensive score exceeding 60.

Image 1: MedBench benchmark leaderboard excerpt

MedBench is an authoritative evaluation platform launched by the Shanghai AI Laboratory and the Shanghai Digital Medical Innovation Center. Drawing on the expertise and knowledge of top-tier medical institutions, it defines five major dimensions: medical language understanding, medical language generation, medical knowledge question answering, complex medical reasoning, and healthcare safety and ethics. This framework encompasses 15 tasks, 20 datasets, and 300,000 questions, providing an objective and scientific performance evaluation benchmark for Chinese large language models in healthcare.

Image 2: MedBench benchmark evaluation dimensions

The healthcare industry, because of its professionalism and rigor, places extremely high demands on the capabilities of large medical models. Although GPT-4 has achieved significant breakthroughs in general-purpose large models, the specificity of medical texts and knowledge means that even GPT-4 fails to perform strongly in real-world medical scenarios without specialized training. Yidu Tech’s large language model leads in three critical dimensions: medical knowledge Q&A, medical language understanding, and healthcare safety and ethics. These results demonstrate its medical capabilities in terms of expertise, comprehension, logical reasoning, and safety.

As a leading enterprise in China’s healthcare intelligence industry, Yidu Tech has been deeply engaged in the field for nearly a decade. Its “Healthcare Intelligence Brain,” YiduCore, has been authorized to process and analyze over 4 billion medical records, accumulating a vast amount of multi-dimensional, quantifiable knowledge graphs. The construction of Yidu Tech’s large language model is based not only on the curation and governance of extensive clinical practice guidelines and medical literature but also leverages its proprietary data generation technology. By incorporating knowledge graphs accumulated through years of practice into the training of the large model, Yidu Tech has significantly enhanced the model’s professional performance and accuracy in the medical domain, while improving the authenticity and interpretability of the content generated by the large model.

Yidu Tech’s large language model has demonstrated outstanding performance, benefiting from the company’s extensive accumulation of medical knowledge and knowledge graphs, as well as its continuous innovation in model architecture and algorithms. Yidu Tech possesses comprehensive technical capabilities spanning hardware networking, training, fine-tuning, and inference, with full-stack adaptation to mainstream domestic and international chip hardware and software. Furthermore, the company has developed proprietary patented technologies tailored to the characteristics of medical data, further enhancing the model’s professional competencies in medical language understanding and healthcare safety. In addition, Yidu Tech employs techniques such as data augmentation and adversarial training to improve the model’s robustness.

As the core algorithm of YiduCore, the “medical brain,” Yidu Tech’s large language model offers easy-to-use, high-quality, and customizable capabilities across application scenarios in the healthcare industry. For business clients, the model draws on its professional capabilities to improve quality and efficiency in key domains such as medical care, education, research, and management. As a result, the company’s existing solutions for data governance, hospital management, clinical research, and clinical diagnosis and treatment have all been upgraded with large model technologies. For consumers, Yidu Tech’s large model provides professional, medical-grade personalized services, including report interpretation, health-related Q&A, and triage consultation. Yidu Tech’s large model is currently evolving from a large language model into a multimodal large model to meet the demands of a broader range of scenarios.

By topping the MedBench leaderboard, Yidu Tech’s large language model has had its performance validated and recognized across its “foundational” capabilities in understanding and generation, its “advanced” capabilities in complex reasoning, and its “high-level” capabilities in ethical governance. Yidu Tech stated that the achievements made to date are merely a beginning, and that there is still a long road ahead for large medical models. The company will continue to advance steadily, pursuing breakthroughs and innovation to unlock greater potential in medical artificial intelligence, drive large-model technologies to higher levels of development, facilitate their application across multi-domain scenarios, and accelerate the intelligent transformation and upgrading of the healthcare industry.