Illumina Launches SpliceAI, an Open-Source AI Tool for Interpreting Non-Coding Mutations in Rare and Undiagnosed Diseases

Jan 29, 2019 00:00 CST
Illumina

Diagnostic Product Developer



SpliceAI Can Interpret Non-Coding Mutations in Rare and Undiagnosed Diseases


Illumina, Inc. recently announced the launch of new open-source artificial intelligence (AI) software capable of identifying previously overlooked non-coding mutations in patients with rare genetic diseases. This novel genome interpretation AI software has been publicly released via Illumina’s BaseSpace Sequence Hub and GitHub. Additionally, these AI capabilities will be integrated into Illumina’s advanced BaseSpace Variant Interpreter software.


Illumina Chief Technology Officer Mostafa Ronaghi stated, “The open-source release of the software demonstrates Illumina’s commitment not only to becoming the world’s largest driver of DNA sequencing data, but also to vigorously promoting AI tools that enable clinicians and researchers to keep pace with the depth and breadth of genomic data.”



Illumina scientists, together with collaborators at the University of California, San Francisco, and Stanford University, developed SpliceAI, an advanced deep neural network designed to identify previously overlooked non-coding mutations in patients with autism spectrum disorder and intellectual disability. The team used RNA sequencing to detect aberrant splicing events in patient cells, experimentally validating 75% of the predictions. The findings were published in the journal Cell on January 17.



About the Study


Advances in machine learning and artificial intelligence can help uncover biological insights from next-generation sequencing (NGS) data that were previously out of reach. A study published in Cell, led by Dr. Kyle Farh of Illumina, used deep neural networks to predict the locations of splicing events in the genome with high accuracy. The researchers then used the model to predict which mutations, including those identified in NGS data, can alter splicing. In patients with neurodevelopmental disorders or other rare genetic diseases, approximately 10% of pathogenic mutations may be non-coding variants that disrupt splicing, leading to the loss of essential proteins.


“Protein-coding sequences account for less than 2% of human DNA, listing the components required to produce thousands of proteins,” said Dr. Stephan Sanders, BMBS, a leading autism researcher at the University of California, San Francisco and co-author of the paper. “DNA mutations in these regions can disrupt proteins, often causing human diseases. The remaining 98% is non-coding DNA, which contains key information on when, where, and how these proteins are expressed. In contrast to coding regions, we know little about the impact of non-coding mutations on human disease; we are only just beginning to understand how they affect human health and disease.”


This study presents a novel framework for applying AI to NGS data to derive actionable insights into biology and disease. It advances our understanding of the clinical impact of mutations, particularly those in the non-coding genome.