About the Study
Advances in machine learning and artificial intelligence can help uncover biological insights in next-generation sequencing (NGS) data that previously went untapped. A study published in Cell, led by Dr. Kyle Farh of Illumina, used deep neural networks to predict the locations of splicing events within the genome with high accuracy. The researchers then applied their model to predict mutations capable of altering splicing, including those identified in NGS data. In patients with neurodevelopmental disorders or other rare genetic diseases, approximately 10% of pathogenic mutations may be non-coding variants that disrupt splicing, leading to the loss of essential proteins.
“Protein-coding sequences account for less than 2% of human DNA, listing the components required to produce thousands of proteins,” said Dr. Stephan Sanders, BMBS, a leading autism researcher at the University of California, San Francisco and co-author of the paper. “DNA mutations in these regions can disrupt proteins, often causing human diseases. The remaining 98% is non-coding DNA, which contains key information on when, where, and how these proteins are expressed. In contrast to coding regions, we know little about the impact of non-coding mutations on human disease; we are only just beginning to understand how they affect human health and disease.”
This study uses NGS data to present a novel framework by which artificial intelligence can derive actionable insights into biology and disease. It advances our understanding of the clinical impact of mutations, particularly those in the non-coding genome.


