
Endoscopic Image-Assisted Diagnosis Provider
In the field of medical AI, only five randomized controlled trials (RCTs) have been published worldwide to date. The first and largest is a randomized controlled study of Wision AI’s computer-aided colonoscopy for detecting polyps and adenomas, conducted by Sichuan Provincial People’s Hospital and Harvard Medical School and published in February 2019 in the leading international journal *Gut* [IF=17.06]【1】. As the first randomized controlled trial in the entire field of medical AI, the paper was awarded third place for Outstanding Paper by *Gut* and ranked in the top 1% of papers of all time, the top 1% of papers in the journal, and the top 1% of papers published in the same period.
One year later, Wision AI achieved another first in the field of medical AI. In January 2020, The Lancet Gastroenterology & Hepatology [IF=12.26] published a double-blind randomized controlled trial of EndoScreener, Wision AI’s product for detecting colorectal precancerous lesions, conducted by Sichuan Provincial People’s Hospital and Harvard Medical School【2】. It was the world’s first double-blind RCT in the medical AI community and drew a strong response across the industry.
Dr. Eric Topol, a member of the National Academy of Medicine and renowned “Clinician of the Century,” retweeted this *Lancet* paper on his personal Twitter account, describing it as the first double-blind randomized controlled trial in medical AI worldwide, and one that came not from radiology, pathology, dermatology, or ophthalmology but from gastrointestinal endoscopy. This also marks the first time an AI technology has withstood the test of a double-blind randomized controlled trial.
Dr. Eric Topol’s post on his personal Twitter account on February 5
Wision AI has successfully employed a double-blind randomized controlled trial (RCT) to evaluate its AI system, marking a significant advancement in clinical validation and providing a reference for double-blind study designs in the global field of AI-assisted diagnosis. Previously, when AI technology was first included as a preliminary recommendation in the European clinical guidelines for gastrointestinal endoscopy, the primary evidence was derived from clinical trials conducted by Wision AI.
Double blind: as the name suggests, a double-blind design requires that both study participants and researchers be “blinded,” meaning neither party knows the group assignments. The entire trial is arranged and controlled by the study designers. This design eliminates the subjective biases and personal preferences that may arise in the minds of experimenters and participants. Double-blind trials meet the highest scientific standards and are widely used in the clinical development of new drugs. However, the medical AI field has long lacked double-blind RCTs because of various challenges, including the difficulty of implementing a double-blind design.
In clinical trials of medical AI, most studies use metrics such as the AI’s disease recognition rate relative to physician diagnoses as clinical evidence of AI performance. Strictly speaking, however, such results only demonstrate that computer-aided diagnosis (CAD) systems can detect relevant diseases on their own; they do not rigorously validate how much these systems actually help clinicians. Consequently, whether physicians truly benefit from CAD remains debatable. This has led some media outlets to criticize the U.S. Food and Drug Administration (FDA) for not being cautious enough in approving AI-based products【3】.
The most obvious problem is that endoscopists who know AI is assisting a diagnosis may not behave as they normally would: they may concentrate harder out of a competitive spirit, or become less vigilant because they rely on the AI system. Because these variables may be present, non-blinded clinical trials are not the most rigorous method of validation.
How can subjective bias and personal preference among participants (physicians) be eliminated? Drawing on the placebo control groups used in double-blind clinical trials for new drug development, Wision AI, together with domestic and international experts, designed a “blinding” methodology to evaluate the efficacy of its AI-assisted diagnostic system.
To conduct a double-blind trial for medical AI, the key lies in successfully “blinding” physicians assisted by AI. This is Wision AI’s contribution to the medical AI community: the company has designed a double-blind trial methodology involving a disguised AI system, which can also be extended to computer-aided detection (CADe) or computer-aided diagnosis (CADx) software across the entire field of medicine.
Ensuring that physicians remain unaware of whether AI assistance is being used during clinical diagnosis is a core element of a double-blind trial. It is essential not only to prevent physicians from guessing which system is being used but also to ensure that their mindset is not influenced by the involvement of an AI system. With no global precedents to draw upon, several experts from Sichuan Provincial People’s Hospital and Harvard Medical School engaged in extensive discussions and ultimately established the preliminary framework for the double-blind testing of Wision AI’s EndoScreener, a product designed for detecting precancerous colorectal lesions.
VCBeat interviewed Liu Jingjia, founder of Wision AI, to reconstruct the full story behind the design of the double-blind trial from his perspective. “The core of a ‘blinded’ trial lies in two things.
“First, a disguised AI system is introduced and assigned to subjects at random alongside the real AI system.
“Second, an intermediary who is loyal to the system (the second observer) relays the real-time detection results of either the true AI or the sham AI to the operating physician via laser pointer or pre-designed phrasing, in accordance with trial principles,” Liu Jingjia explained to reporters.
Compared with non-double-blind RCTs, Wision AI’s double-blind trial incorporates two core elements: a “sham AI system” and a “second observer.” How do they work?
Prior to the commencement of clinical trials, Wision AI designed a sham AI system that does not flag true precancerous lesions while maintaining an ultra-low false-positive rate comparable to that of the genuine AI system, thereby preventing endoscopists from subjectively distinguishing between the two systems.
At the Endoscopy Center of the Caotang Branch of Sichuan Provincial People’s Hospital, Wision AI enrolled 1,046 patients aged 18 to 75 years for colonoscopic diagnosis and screening. After excluding invalid samples from patients with inflammatory bowel disease, colorectal cancer, a history of colorectal surgery, or contraindications to biopsy, the remaining valid samples were randomly assigned to two groups. Ultimately, 484 patients in the true AI system group and 478 patients in the sham AI system group were included in the analysis.
Patients in both the true AI system group and the sham AI system group were blinded to their group assignments. Four senior endoscopists performed routine white-light colonoscopy on these patients under the following rules: if the endoscopist declared a polyp detected, both the true AI and sham AI systems remained silent; if a polyp appeared within the endoscopic field of view and was about to move out of view without having been declared by the endoscopist, the true AI system issued an alert, while the sham AI system remained silent.
Notably, to prevent endoscopists from directly interacting with the true or sham AI systems and thereby discerning differences between them, Wision AI introduced the role of a second observer. The primary function of the second observer is to view each output from either the true or sham AI system on a dedicated monitor (invisible to the endoscopist) and communicate this information to the endoscopist. When the endoscopist’s field of view is about to move away from the area flagged by the system, the second observer uses a laser pointer to indicate the system-detected region, guiding the endoscopist’s observation.
Because the sham system’s low false-positive rate closely approximates that of the true system, physicians cannot tell genuine alerts from sham ones when they receive laser pointer cues from the second observer. Upon receiving an alert, the physician re-examines the indicated area: if no polyp is found on re-evaluation, colonoscope withdrawal continues; if a polyp is identified, it is recorded for statistical analysis.
A natural question is how to ensure that no output of the sham AI system corresponds to a true precancerous lesion. This is indeed a major challenge in system design. Liu Jingjia described an innovative dual-model approach developed by Wision AI, which uses a strong-weak model subtraction method to ensure that the sham AI system flags only polyp-like structures that are not polyps (such as air bubbles, fecal matter, undigested residue, and folded mucosa).
“Simply put, the output probability map of the sham AI system is derived by subtracting the probability map of a pre-developed weak AI system (with sensitivity and specificity far lower than those of the true AI system) from the probability map produced by the true AI system. Meanwhile, threshold adjustment brings the specificity of the sham AI system closer to that of the true AI system,” explained Liu Jingjia.
During colonoscopy, the detection and removal of adenomatous polyps is the most effective strategy for reducing the incidence and mortality of colorectal cancer, one of the leading causes of cancer-related deaths. According to a study published in The New England Journal of Medicine, every 1% increase in the adenoma detection rate (ADR) is associated with a 3% reduction in the risk of interval colorectal cancer【4】.
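To give a sense of scale for this association, the short sketch below compounds the 3% reduction over several percentage points of ADR. It assumes the effect is multiplicative (a risk ratio of 0.97 applied once per point), which is an illustrative simplification; the function name and the example input are hypothetical and do not come from the cited study.

```python
# Illustrative only: compounds a 3% relative risk reduction per
# 1-percentage-point increase in ADR (assumed to be multiplicative).
RISK_RATIO_PER_POINT = 0.97  # derived from the 3% figure quoted above


def relative_interval_cancer_risk(adr_gain_points: float) -> float:
    """Relative risk after raising ADR by `adr_gain_points` percentage points."""
    return RISK_RATIO_PER_POINT ** adr_gain_points


if __name__ == "__main__":
    gain = 5  # hypothetical ADR improvement, in percentage points
    rr = relative_interval_cancer_risk(gain)
    print(f"ADR +{gain} points -> relative risk {rr:.3f} "
          f"({(1 - rr) * 100:.1f}% lower)")
```

This is a population-level association, so the output should be read as a rough indication of magnitude rather than a prediction for any individual trial or patient.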
Improving the adenoma detection rate (ADR) can effectively prevent colorectal cancer, making it a key quality metric for colonoscopy. Endoscopists with high ADRs are better positioned to benefit patients, prompting efforts to enhance ADR in colonoscopy through improvements in endoscopic hardware technology, bowel preparation methods, and observation techniques.
However, for various reasons, up to 27% of adenomatous polyps are still missed in current clinical practice, even in developed countries such as the United States and Japan.
The double-blind randomized controlled trial showed that, with the assistance of EndoScreener, the adenoma detection rate (ADR) in the true AI system group was significantly higher than in the sham AI system group. Among the 484 patients in the CAD system experimental group (true AI system group), 165 (34%) had one or more adenomas detected, whereas among the 478 patients in the control group using the sham prompt system (sham AI system group), 132 (28%) had one or more adenomas detected.

Wision AI Double-Blind RCT Trial Results
In terms of polyp detection rate (PDR), the PDR in the CAD system experimental group was significantly higher than that in the sham-alert control group. Among the 478 patients in the sham-alert control group, polyps were detected in 176 (37%); among the 484 patients in the CAD system experimental group, polyps were detected in 252 (52%).
The sham-alert control group detected an average of 0.38 adenomas and 0.64 polyps per colonoscopy, whereas the CAD system experimental group detected an average of 0.58 adenomas and 1.04 polyps per colonoscopy. On both the adenoma detection rate (ADR) and the polyp detection rate (PDR), EndoScreener, Wision AI’s product for detecting colorectal precancerous lesions, significantly improved the detection rates achieved by endoscopists during colonoscopy.
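The reported percentages can be reproduced from the raw patient counts, and the gap between the two groups can be checked with a standard test. The sketch below uses a pooled two-proportion z-test, which is an assumption made here for illustration: the article does not say which statistical test the trial used, and the helper names are hypothetical.

```python
import math


def detection_rate(patients_with_finding: int, patients_total: int) -> float:
    """Share of patients with at least one finding (ADR or PDR)."""
    return patients_with_finding / patients_total


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Counts as reported in the article: (patients with finding, group size)
TRUE_AI = {"adenoma": (165, 484), "polyp": (252, 484)}
SHAM_AI = {"adenoma": (132, 478), "polyp": (176, 478)}

if __name__ == "__main__":
    for finding in ("adenoma", "polyp"):
        x1, n1 = TRUE_AI[finding]
        x2, n2 = SHAM_AI[finding]
        z, p = two_proportion_z_test(x1, n1, x2, n2)
        print(f"{finding}: true AI {detection_rate(x1, n1):.1%}, "
              f"sham AI {detection_rate(x2, n2):.1%}, z={z:.2f}, p={p:.4f}")
```

Under this simple test, both the adenoma and the polyp differences fall below the conventional 0.05 threshold, which is consistent with the article's description of the results as significant. The published analysis may have used a different method.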
Another noteworthy point in the trial data is that 159 cases were missed by endoscopists even with the assistance of the true AI system. When these cases were retrospectively reviewed by experienced endoscopists who did not participate in the clinical trial, their sensitivity and specificity remained suboptimal. This indicates that the problem of missing polyps cannot be simply resolved by adding extra human observers, thereby demonstrating that computer-aided detection (CAD) systems may play a more effective role in assisting endoscopists in real-world clinical settings.
Rigorous and authentic clinical trials are the first step in supporting the practical implementation of AI products. Wision AI has consistently adhered to the principles of evidence-based clinical medicine. The EndoScreener system used in this trial has been validated in several clinical studies, yet its training dataset comprised only slightly more than 5,000 endoscopic images, approximately half of which were negative samples【5】. In medical image recognition today, where data acquisition is costly and annotation is complex, the advantages of few-shot deep learning are becoming increasingly prominent.
References:
【1】Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019; 68:1813-1819
【2】Wang P, Liu X, Berzin TM, et al. Effect of a deep-learning computer-aided detection system on adenoma detection during colonoscopy (CADe-DB trial): a double-blind randomised study. Lancet Gastroenterol Hepatol. 2020 Jan 22. pii: S2468-1253(19)30411-X. doi: 10.1016/S2468-1253(19)30411-X. [Epub ahead of print]
【3】https://khn.org/news/a-reality-check-on-artificial-intelligence-are-health-care-claims-overblown/
【4】Corley DA, Jensen CD, Marks AR, et al. Adenoma detection rate and risk of colorectal cancer and death. N Engl J Med. 2014; 370:1298–1306
【5】Wang P, Xiao X, Glissen Brown JR, et al. Development and validation of a deep-learning algorithm for the detection of polyps during colonoscopy. Nat Biomed Eng. 2018; 2:741–748