On February 28, the paper co-authored by Dr. Wang Pu and Director Liu Xiaogang from Sichuan Provincial People’s Hospital, along with Professor Tyler Berzin from Harvard Medical School, titled “Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study,” was published in the journal Gut, which had an impact factor of 17.016 in 2018.

This article is the first to employ a randomized controlled trial design to evaluate the detection rates of polyps and adenomas by AI during colonoscopy, with the aim of exploring its impact on improving the Adenoma Detection Rate (ADR).
The adenoma detection rate (ADR) is regarded as the gold-standard quality indicator for colonoscopy. Studies have indicated that for every 1% increase in ADR, the risk of interval colorectal cancer decreases by 3%, and the risk of fatal interval colorectal cancer decreases by 5%. Relevant guidelines specify ADR benchmarks for colonoscopy screening in asymptomatic individuals aged 50 years and older, requiring a rate of no less than 30% for men and 20% for women. Currently, many studies focusing on imaging technologies and medical device design aim to reduce the rate of missed adenoma diagnoses by increasing ADR.
Artificial intelligence has recently been introduced for the detection and classification of polyps and adenomas, demonstrating promising results in preliminary studies. This paper further provides robust supporting evidence derived from real-world research.
This marks the first global publication of results from a randomized controlled clinical trial investigating the role of artificial intelligence in the auxiliary diagnosis of colorectal cancer; previously published AI-related studies on colorectal cancer have primarily been retrospective or observational in nature.
Unlike retrospective or observational studies, this study was a prospective randomized controlled trial. Patients were enrolled and randomized between September 2017 and February 2018, while numerous variables that could potentially affect the outcomes were controlled. This design allowed for a clear comparison of a single factor to investigate the impact of artificial intelligence technology on the adenoma detection rate (ADR).
Wang Pu, the first author of the paper, told VCBeat, “Due to the rigor and innovation of this study, it is the first prospective randomized controlled trial (RCT) recognized and published by an authoritative international medical journal to investigate whether AI-assisted diagnosis can improve core clinical indicators. Randomized controlled trials are among the most rigorous methods in medical research and are the primary means commonly used to evaluate the clinical efficacy of new drugs. The greatest significance of this study lies in its first demonstration that the use of AI-assisted diagnostic devices can indeed improve core clinical outcomes. Currently, most research employing AI technology remains at the stage of validating AI accuracy using retrospective data, which is far from sufficient. The AI technology truly anticipated by clinical medicine must be capable of significantly improving core clinical outcomes in large-scale prospective randomized controlled trials.”
This study aims to investigate whether a high-performance, real-time automated polyp detection system can improve the detection rates of polyps and adenomas in real-world clinical settings, specifically exploring the impact of such systems as endoscopist assistants on physicians’ adenoma detection rate (ADR).
The entire study was conducted at the Endoscopy Center of Sichuan Provincial People's Hospital in China. Both the study group and the control group underwent colon examination using high-definition endoscopes (Olympus CF-290 and CF-260) and high-definition monitors. During subject screening, patients with inflammatory bowel disease (IBD), hereditary colorectal cancer (CRC), a history of colorectal surgery, or contraindications to biopsy were excluded.
Prior to colonoscopy, 1,130 consecutive patients were assigned to two groups according to a pre-generated random sequence. The control group underwent conventional colonoscopy, while the study group (computer-aided detection [CADe] group) utilized a real-time automated polyp detection system to assist in intraluminal examination. The detection system was connected to the endoscopic processor to synchronously capture video streams.
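The article does not say how the random sequence was generated. As a minimal sketch, assuming simple (unblocked) randomization with a fixed seed, a sequence can be generated in advance like this. The function name and the seed value are hypothetical and are not taken from the paper.

```python
import random


def make_allocation_sequence(n_patients, seed):
    """Return one group label per consecutive patient.

    ASSUMPTION: simple randomization with equal probability per arm.
    The actual trial may have used a different scheme (e.g. blocks).
    """
    rng = random.Random(seed)  # fixed seed makes the sequence reproducible
    return [rng.choice(["control", "CADe"]) for _ in range(n_patients)]


# 1,130 consecutive patients, as reported in the article; the seed is made up.
sequence = make_allocation_sequence(1130, seed=2017)
counts = {arm: sequence.count(arm) for arm in ("control", "CADe")}
print(counts)
```

Because the sequence is fixed before the first patient arrives, the person enrolling patients cannot influence which arm the next patient falls into.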
In the study group, endoscopists primarily focused on the main monitor and were alerted to each polyp location detected by the examination system via auditory alarms; throughout the entire process, no nurses, trainees, or staff assistants were involved in decision-making support.
In the control group, staff assistants recorded the type of colonoscope used (CF-H290/CF-Q260), insertion time, withdrawal time, and Boston Bowel Preparation Scale (BBPS) score. When polyps were detected, nurses assisted with histological biopsies and documented the location, size, and morphological characteristics.
In the CADe group, personnel not involved in the experiment additionally recorded polyps that were missed or falsely identified by the system. Missed polyps are defined as those confirmed by endoscopists but not detected by the system; false positives are defined as lesions identified by the system but ruled out by endoscopists upon observation.

Real-Time Automated Polyp Detection System for Assisted Decision-Making
Final results showed that, compared with the control group, the study group detected substantially more lesions: the reported increases were 72% for adenomas and 89% for polyps, figures that appear to refer to the mean number detected per colonoscopy rather than to the detection rates themselves. Specifically, the CADe group demonstrated significant increases in ADR, PDR, and the mean number of polyps and adenomas detected per colonoscopy. Morphologically, the overall increase in adenoma detection was primarily attributable to an increase in small adenomas.
Most small adenomas detected by the CADe system are diminutive in size, which supports the conventional view that endoscopists are more likely to miss small polyps than larger or more prominent ones within the endoscopic field of view. Although small adenomas carry a lower risk of malignancy compared with larger adenomas, the overall increase in the adenoma detection rate may ultimately reduce the risk of missed colorectal cancer (CRC) diagnoses.
The results also showed a significant increase in the detection rate of small hyperplastic polyps, a type of polyp that often leads physicians to perform unnecessary polypectomies, thereby increasing their workload. In the future, CADe systems could be combined with CADx systems to support detection, diagnosis, and ignore strategies, thus avoiding excessive workload.

As shown by the above data, with p < 0.001 and 95% confidence intervals that exclude 1 for both odds ratios, the AI-assisted polyp detection rate (PDR) increased from 0.291 to 0.4502, a relative increase of about 55% (odds ratio 1.995); the adenoma detection rate (ADR) increased from 0.2034 to 0.2912, a relative increase of about 43% (odds ratio 1.61). Therefore, compared with manual lesion identification, the high performance, stability, and consistency of computer-aided detection (CADe) systems can significantly enhance clinical diagnostic capabilities. Furthermore, direct comparisons between automated polyp detection systems and assistance provided by healthcare professionals with varying levels of experience warrant further investigation.
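A relative increase and an odds ratio are different quantities, so it can help to compute both from the rates quoted above. The following is a minimal sketch that uses only those four rates. The function names are illustrative and do not come from the paper.

```python
def relative_increase(baseline, treated):
    """Relative change of a rate, e.g. 0.20 -> 0.30 gives 0.5 (a 50% increase)."""
    return (treated - baseline) / baseline


def odds_ratio(baseline, treated):
    """Ratio of the odds p / (1 - p) in the treated arm to the baseline arm."""
    def odds(p):
        return p / (1.0 - p)
    return odds(treated) / odds(baseline)


# (control, CADe) detection rates as quoted in the article.
rates = {"PDR": (0.291, 0.4502), "ADR": (0.2034, 0.2912)}

for name, (control, cade) in rates.items():
    print(
        f"{name}: relative increase {relative_increase(control, cade):.1%}, "
        f"odds ratio {odds_ratio(control, cade):.3f}"
    )
```

The odds ratios computed this way land close to the values reported in the paper, whereas the relative increases are noticeably smaller numbers, which is why the two should not be used interchangeably.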
The real-time automatic polyp detection system used in this study was developed by Wision AI (Shanghai Wuhe Medical Technology Co., Ltd.). In the research team’s previous study, published in the October 2018 issue of Nature Biomedical Engineering, the algorithm achieved a per-frame sensitivity of 94.38%, a per-frame specificity of 95.92%, and an area under the ROC curve of 0.984 on a retrospective database. By deploying a multi-threaded processing system, the system achieved a processing speed of 25 frames per second with a latency of 76.80 ± 5.60 ms during real-time video analysis. This latency is negligible for endoscopists. The system monitor was fixed adjacent to and parallel with the primary endoscopy monitor.
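To put the quoted figures in context, the sketch below converts the frame rate into a per-frame interval, expresses the latency in frame intervals, and shows the standard definitions of sensitivity and specificity. The frame counts passed to the last two functions are hypothetical and chosen only for illustration; the article does not report the underlying counts.

```python
def frame_interval_ms(fps):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def latency_in_frames(latency_ms, fps):
    """How many frame intervals the processing latency spans."""
    return latency_ms / frame_interval_ms(fps)


def sensitivity(true_pos, false_neg):
    """Share of polyp-containing frames that the detector flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of polyp-free frames that the detector leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Figures quoted in the article: 25 frames per second, 76.80 ms mean latency.
print(frame_interval_ms(25))
print(round(latency_in_frames(76.80, 25), 2))

# HYPOTHETICAL frame counts, for illustration only.
print(sensitivity(true_pos=90, false_neg=10))
print(specificity(true_neg=95, false_pos=5))
```

At 25 frames per second the mean latency corresponds to roughly two frame intervals, which is consistent with the article's remark that the delay is negligible for endoscopists.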

Wang Pu stated, “During the algorithm development process, we gave special consideration to the surface characteristics of polyps, rather than relying solely on their complete morphology. The algorithm presented in our paper exhibits distinct features compared to research conducted in this field over the past decade: it primarily relies on local features of the lesion. Therefore, even if a polyp appears only partially at the edge or corner of the endoscopic view, protrudes slightly from behind intestinal folds, or is partially obscured by intestinal fluid or feces, the algorithm can still provide effective early warnings. These are precisely the types of polyps most likely to be missed by physicians.”
Eight gastroenterologists participated in this study, including two senior endoscopists (each with more than 20,000 colonoscopies), two intermediate-level endoscopists (each with 3,000 to over 10,000 colonoscopies), and four junior endoscopists (each with 100 to over 500 colonoscopies).
A total of 1,130 patients were eligible for enrollment in this study. Among them, 72 patients who met the exclusion criteria were excluded (31 in the conventional group and 41 in the CADe group). Ultimately, 1,058 eligible patients participated in the study, with 536 randomly assigned to the control group and 522 randomly assigned to the CADe group.
Subsequent statistical analysis in the paper reported that a total of 767 polyps were detected throughout the experiment. There were 422 cases of adenomas (55.02%) and 31 cases of sessile serrated adenomas (4.04%). Overall, 269 polyps (35.07%) were found in the control group, and 498 polyps (64.93%) were found in the CADe group.
The mean number of polyps detected per colonoscopy was 0.51 in the control group and 0.97 in the CADe group (p < 0.001). The polyp detection rates (PDR) were 0.29 and 0.45 in the control and CADe groups, respectively (OR = 1.995; 95% CI, 1.532–2.544; p < 0.001). There were no statistically significant differences between the two groups in terms of baseline clinical and demographic variables; therefore, potential confounding effects were not considered.
A total of 422 adenomas were detected in this study. The mean number of adenomas detected per colonoscopy was 0.31 in the control group and 0.53 in the CADe group (p < 0.001). The adenoma detection rates (ADR) were 0.20 and 0.29 in the control and CADe groups, respectively (OR = 1.61, 95% CI 1.213 to 2.135, p < 0.001).
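For readers who want to see how an odds ratio and its confidence interval relate to the underlying counts, here is a minimal sketch using the Wald method on the log odds ratio. Two things are assumptions: the patient counts are reconstructed by multiplying the unrounded ADR values quoted elsewhere in the article (0.2034 and 0.2912) by the group sizes, and the Wald method itself, since the article does not state which method the paper used.

```python
import math


def odds_ratio_with_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Odds ratio with a Wald confidence interval (z=1.96 gives about 95%)."""
    a, b = events_treated, n_treated - events_treated  # treated: with / without event
    c, d = events_control, n_control - events_control  # control: with / without event
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    low = math.exp(math.log(estimate) - z * se_log)
    high = math.exp(math.log(estimate) + z * se_log)
    return estimate, low, high


# Group sizes as reported in the article.
cade_n, control_n = 522, 536

# ASSUMPTION: number of patients with at least one adenoma, reconstructed
# from the quoted rates; these counts are not stated in the article.
cade_events = round(0.2912 * cade_n)
control_events = round(0.2034 * control_n)

estimate, low, high = odds_ratio_with_ci(cade_events, cade_n, control_events, control_n)
print(f"OR = {estimate:.2f}, 95% CI {low:.3f} to {high:.3f}")
```

Under these assumptions the result comes out close to the odds ratio and interval quoted above for ADR, but this should be read as a plausibility check rather than a reproduction of the paper's analysis.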
Compared with the control group, the CADe group detected significantly more non-pedunculated polyps. The number of adenomas detected was also significantly higher in the CADe group for non-pedunculated polyps, for polyps smaller than 0.5 cm, and in every segment of the colon except the cecum and ascending colon.
Excellent bowel preparation outcomes (BBPS ≥ 7): In cases of excellent bowel preparation, the adenoma detection rate (ADR) in the CADe group showed a trend of being 6% higher than that in the conventional group. However, due to insufficient sample size in the subgroup analysis, this difference did not reach statistical significance. In the CADe group, other outcomes, including the mean number of detected adenomas, the mean number of detected polyps, and the polyp detection rate (PDR), were significantly increased.

False Positives in the Automatic Polyp Detection System: The CADe group had a total of 39 false positives, with an average of 0.075 false positives per colonoscopy. None of the polyps detected in the study group were missed by the CADe system.
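The definitions of false positives and missed polyps given earlier, together with the per-colonoscopy rate, can be sketched as follows. The `Finding` record and the function names are hypothetical; only the figures 39 and 522 come from the article.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One candidate lesion during a CADe-assisted colonoscopy (hypothetical record)."""
    flagged_by_system: bool
    confirmed_by_endoscopist: bool


def classify(finding):
    """Label a finding according to the definitions used in the study."""
    if finding.flagged_by_system and finding.confirmed_by_endoscopist:
        return "detected"
    if finding.flagged_by_system:
        return "false_positive"  # flagged by the system, ruled out by the endoscopist
    if finding.confirmed_by_endoscopist:
        return "missed_by_system"  # confirmed by the endoscopist, not flagged
    return "none"


def per_colonoscopy(count, n_colonoscopies):
    """Average number of events per colonoscopy."""
    return count / n_colonoscopies


# 39 false positives across 522 CADe-group colonoscopies, as reported.
print(round(per_colonoscopy(39, 522), 3))
```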
The paper discusses the limitations of this study in its concluding section. First, the exact contribution of the system may be difficult to assess because a double-blind design involving endoscopists and patients was not feasible. The physicians’ “competitive spirit” and “behavior when being observed” may have influenced the adenoma detection rate (ADR) outcomes in the experimental group. This is a potential confounder in the computer-aided detection (CADe) group: endoscopists might have been more attentive because they knew they were being observed.
In this study, researchers subtracted the time of the biopsy procedure from each corresponding detection time. The resulting times were similar, with no statistically significant difference between the groups (6.07 minutes vs. 6.18 minutes, p = 0.15), which to some extent reflects comparable levels of observational attention between the two groups.
In the future, researchers can design double-blind studies to explore the system’s exact contribution to the increase in the adenoma detection rate (ADR). Such studies can also help determine whether endoscopists and the system detect polyps simultaneously, or whether endoscopists initially missed the polyps, a question that the current study was not designed to address.
The second limitation is the lack of external validity. This study selected samples from a Chinese patient population, in whom baseline adenoma and polyp detection rates were relatively lower than those reported in Western countries. Differences in genetics, diet, lifestyle, and habits between Chinese and Western populations may largely account for this discrepancy. Therefore, the findings of this study may not be generalizable to global settings with higher baseline adenoma detection rates (ADR). Further research is needed to evaluate the adaptability and effectiveness of this system in such contexts.
Third, although the false positive rate is low, the system designers unexpectedly encountered some false positives caused by the detection of drug capsules, local bleeding sites, or undigested food residues, which may distract the endoscopist during the procedure. This can be corrected by adding sufficient training data to the current system.
Fourth, this study did not control for the fatigue levels of the endoscopists involved, which may be an independent factor affecting the adenoma detection rate (ADR). Further research is needed to investigate the effectiveness of this CADe system across different levels of physician fatigue.
Fifth, due to the insufficient sample size of colonoscopies performed by novice endoscopists, further research is needed to demonstrate the role and effectiveness of this CADe system across different levels of training.
Finally, this study was conducted exclusively using Olympus colonoscopy equipment. Therefore, the adaptability of this system to devices manufactured by other companies should also be explored.
The paper points out that over the past decade, high-performance and high-stability automated colon polyp detection has been an attractive research topic, with the aim of increasing the adenoma detection rate (ADR). However, current technologies have not yet achieved sufficient diagnostic performance to be considered for clinical use. To be considered for practical clinical application, an automated polyp detection system must exhibit very high sensitivity and specificity, meet real-time processing standards, and incorporate an on-screen alert system.
Insufficient specificity leads to numerous false positives. Conversely, inadequate sensitivity not only fails to increase the polyp detection rate (PDR) or adenoma detection rate (ADR), but also increases the burden on physicians. Furthermore, for real-time detection to be effective, analysis must be rapid; that is, AI-assisted diagnosis must avoid significant latency. Due to these prerequisites, most current studies on automated polyp detection are small-scale, non-clinical investigations. However, with rapidly growing interest in this field and the advent of deep learning, substantial progress is anticipated in the coming years.
Currently, the application of artificial intelligence in the field of digestive endoscopy is primarily divided into two major directions. The first is Computer-Aided Diagnosis (CADx), which leverages the optical capabilities of equipment (such as high-magnification endoscopes with hundreds-fold zoom, Narrow Band Imaging (NBI), and fluorescence technology) combined with deep learning to determine the nature of lesions, aiming to replace pathological diagnosis. However, this approach, which relies on subtle surface features of lesions to predict pathological structures, remains to be validated. Although some traditional deep learning models have achieved relatively high predictive performance in this area, they do not correspond 100% with pathological structures. Furthermore, given significant variations in current clinical guidelines across different countries, CADx has not yet been adopted on a large scale.
The second major direction is Computer-Aided Detection (CADe), where AI solely identifies the location of visible lesions within the field of view, leaving the specific diagnosis to the clinician’s immediate judgment. This type of application primarily addresses the limitations of the human eye, providing effective assistance to endoscopists in cases of fatigue, insufficient experience, or distracted attention. Since CADe does not fundamentally alter clinical guidelines and practices, it is likely to be more readily and widely accepted once it meets the relevant technical indicators.
This study falls into the latter category, and its clinical significance is substantial: In colonoscopy, a long-standing shared goal of clinicians and device manufacturers has been to improve the adenoma detection rate (ADR), defined as the proportion of screened patients in whom adenomas are detected. This study has demonstrated that computer systems can serve as a second observer, providing real-time lesion alerts to clinicians during colonoscopy. Previous clinical trials have shown that having non-specialists, such as nurses or trainees, act as a second observer during colonoscopy can increase ADR by 30%. Therefore, the use of AI with expert-level performance as a second observer holds considerable promise for further improving ADR.
Of course, the ideal scenario is the integration of CADe and CADx—i.e., “detection plus analysis”—to improve ADR/PDR and physicians’ diagnostic efficiency. Against the backdrop of scarce medical resources in China, artificial intelligence may be the only way to resolve the current contradictions.
This paper does not mark an end, but rather a new beginning. In the future, Wang Pu’s team will continue to conduct double-blind trials and multi-center studies, using data to demonstrate how collaboration between physicians and AI can maximize efficacy, while progressively driving sensitivity and specificity closer to 1, thereby achieving a breakthrough leap for the real-time automated polyp detection system from merely “effective” to truly transformative.
Finally, quoting Wang Pu’s remarks in the interview: “From the perspective of gastroenterology, the application of artificial intelligence (AI) technology can enhance clinical service quality while reducing healthcare costs and risks. Robust AI systems require validation in real-world clinical settings; current human–machine competitions are far from sufficient. I believe that rigorous prospective randomized controlled trials represent the optimal approach for validating AI technologies. In the field of digestive endoscopy, applications extend beyond automated polyp detection to include narrow-band imaging (NBI)-based pathological diagnosis of polyps, as well as detection and classification of precancerous lesions in esophagoscopy using NBI. Each of these technologies holds the potential to improve current clinical standards; the key lies in whether they can tangibly enhance core clinical metrics during routine practice. This is precisely the direction researchers need to pursue and strive toward.”