Dr. Jingwei Bai of Tsinghua University Highlights Evolution of Gene Sequencing Technologies and Emerging Applications in Pathogen Research

Mar 14, 2020 08:00 CST Updated 08:00

Compared with the SARS coronavirus, which took four months to identify after it emerged in 2002, the novel coronavirus of 2019 was confirmed much more rapidly. From the public disclosure of the first reported pneumonia case to a definitive diagnosis, it took less than ten days to identify the underlying culprit: the novel coronavirus (SARS-CoV-2) that causes COVID-19. This highly efficient identification was made possible by high-throughput gene sequencing technology.

 

High-throughput gene sequencing is a core technology of precision medicine. VCBeat (WeChat ID: vcbeat) and the Private Equity Research Institute of Tsinghua University jointly organized a series of events on “The Impact of the Pandemic on the Healthcare Industry.” Bai Jingwei, a researcher and doctoral supervisor at Tsinghua University, was invited to give an academic presentation on the prominent applications of gene sequencing technology during the pandemic and on the future products in this field that deserve attention.

 

Dr. Bai Jingwei earned his Ph.D. from the University of California, conducted postdoctoral research at IBM Watson Laboratory, and worked in R&D at Illumina, a U.S.-based gene sequencing technology company, from 2013 to 2016.

 

VCBeat has compiled the insightful content shared by Bai Jingwei during this VB Group interview, providing an interpretation from multiple dimensions, including an overview, development, and innovations in gene sequencing technology.

 

Genomic sequencing is at the tipping point of explosive growth, transitioning from scientific research to large-scale clinical application.

 

Genes are the most fundamental genetic material of living organisms on Earth, and gene sequencing is an important tool that helps us interpret this genetic information. From a scientific perspective, gene sequencing is one of the most basic tools for understanding life and conducting scientific research.

 

In recent years, with advances in gene sequencing technology, the cost of gene sequencing has continued to decline; meanwhile, gene sequencing has gradually entered the healthcare industry. Within this sector, there is a common misconception that a single gene sequencing test throughout one’s lifetime is sufficient. In reality, from the perspective of medical health monitoring, gene sequencing is a technology that will be utilized throughout an individual’s life.

 

For example, at the very beginning of life, during in vitro fertilization (IVF), several companies in China have already launched gene sequencing services to screen embryos for genetic disorders prior to transfer. Furthermore, the technique recommended for pregnant women aged 35 and older, non-invasive prenatal testing (NIPT), is also based on next-generation high-throughput gene sequencing. This method extracts cell-free DNA from the maternal peripheral blood and uses sequencing combined with statistical analysis to determine whether the fetus has a chromosomal aneuploidy.
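The statistical step in NIPT can be illustrated with a minimal sketch. It assumes one common approach, a z-score that compares the fraction of reads mapping to a chromosome against a set of reference pregnancies known to be euploid. The talk does not specify the method, and the function names, the threshold of 3, and all read counts below are hypothetical values chosen only for illustration.

```python
from statistics import mean, stdev


def chromosome_fraction(chrom_reads, total_reads):
    """Fraction of all mapped reads that fall on the chromosome of interest."""
    return chrom_reads / total_reads


def z_score(sample_fraction, reference_fractions):
    """Distance of the sample from the euploid reference mean, in standard deviations."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)
    return (sample_fraction - mu) / sigma


def flag_aneuploidy(sample_fraction, reference_fractions, threshold=3.0):
    """Flag a sample whose read fraction is unusually high (threshold is an assumption)."""
    return z_score(sample_fraction, reference_fractions) > threshold


# Made-up reference fractions for one chromosome in euploid pregnancies.
reference = [0.01300, 0.01302, 0.01298, 0.01301, 0.01299]

# Two made-up samples, each with 10 million mapped reads.
typical_sample = chromosome_fraction(130_050, 10_000_000)
elevated_sample = chromosome_fraction(131_000, 10_000_000)

print(flag_aneuploidy(typical_sample, reference))   # False
print(flag_aneuploidy(elevated_sample, reference))  # True
```

Because fetal DNA is only a small share of the cell-free DNA in maternal blood, the excess from an extra chromosome copy is small, which is why the comparison is made in units of standard deviations rather than on raw counts.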

 

In addition to non-invasive prenatal testing and genetic disease screening, another popular application of gene sequencing is early cancer screening, which detects circulating tumor DNA (ctDNA) in the blood to identify cancer and assess its stage. Beyond ctDNA detection, early cancer prediction can also be achieved by analyzing epigenetic methylation sites, a process that likewise relies on gene sequencing technology.

 

In cancer classification, the medical community also categorizes tumors based on types of gene mutations. These mutations are identified through gene sequencing, which, to a certain extent, guides targeted therapy in precision medicine.

 

Amid this epidemic, another clinical application of gene sequencing, namely pathogen research and microbiomics, has proven to be a booming market. High-throughput gene sequencing technology has greatly improved the efficiency of detecting the novel coronavirus, with results typically available within one day from sampling to final conclusion.

 

Evolution and Iteration of Gene Sequencing Technologies Across Generations

 

The development of gene sequencing technology can be broadly divided into three stages. The first stage is what we commonly refer to as first-generation sequencing, the Sanger sequencing method. This technique offers long read lengths, reaching up to 1 kbp, with an accuracy rate of 99%. However, it has drawbacks such as low throughput, lack of portability, and high cost: sequencing a single human genome could cost tens of millions of US dollars.

 

As gene sequencing technology advanced, second-generation sequencing, also known as next-generation sequencing (NGS), emerged. After nearly a decade of development, NGS has reduced the cost of sequencing a human genome to approximately $1,000.

 

Second-generation sequencing is a high-throughput, high-accuracy (99.5%) sequencing technology; however, it also has drawbacks such as short read lengths, slow reading speeds, high per-run costs, and poor portability.
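To put the accuracy figures quoted above on a common scale, the short sketch below converts a per-base accuracy into a Phred quality score, Q = -10 · log10(error probability), and into an expected number of errors per read. The accuracies (99% and 99.5%) and the 1 kbp read length come from the text; the function names are illustrative, and the Phred scale is used here only as a convenient yardstick, not as something the speaker presented.

```python
import math


def phred_quality(accuracy):
    """Phred score for a per-base accuracy: Q = -10 * log10(1 - accuracy)."""
    error_probability = 1.0 - accuracy
    return -10.0 * math.log10(error_probability)


def expected_errors(accuracy, read_length):
    """Average number of miscalled bases in a read of the given length."""
    return (1.0 - accuracy) * read_length


# Accuracy figures as quoted in the text.
print(round(phred_quality(0.99), 1))    # about Q20
print(round(phred_quality(0.995), 1))   # about Q23

# A 1 kbp read at 99% accuracy carries roughly ten errors on average.
print(round(expected_errors(0.99, 1000), 1))
```

The scale is logarithmic, so halving the error rate from 1% to 0.5% raises the quality score by only about three points.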

 

How Is Next-Generation Sequencing Achieved?

In simple terms, it is a massively parallel sequencing method. The DNA to be sequenced is fragmented into short segments, which are then “seeded” onto the surface of a detection chip via cluster generation or single-molecule multi-copy amplification. Subsequently, cyclic array synthesis sequencing is accomplished through biochemical cycles that drive sequential enzymatic reactions. The high-throughput capability of next-generation sequencing stems from its ability to simultaneously detect hundreds of millions of DNA fragments on a single chip.
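The massively parallel, cycle-by-cycle idea can be shown with a toy simulation. This is a deliberately simplified sketch, not a model of any real instrument: it assumes fixed-size fragmentation, error-free chemistry, and exactly one base incorporated per cycle, and it ignores strand direction. All names are hypothetical.

```python
# Watson-Crick pairing: the base synthesized opposite each template base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def fragment(dna, size):
    """Cut the DNA into consecutive short segments (real fragmentation is random)."""
    return [dna[i:i + size] for i in range(0, len(dna), size)]


def sequence_by_synthesis(templates, cycles):
    """In each cycle, every template on the 'chip' gains and reports one base."""
    reads = ["" for _ in templates]
    for cycle in range(cycles):
        # One biochemical cycle: all fragments are extended and read together.
        for index, template in enumerate(templates):
            if cycle < len(template):
                reads[index] += COMPLEMENT[template[cycle]]
    return reads


templates = fragment("ATGCGTACCTGA", 4)
print(templates)                            # ['ATGC', 'GTAC', 'CTGA']
print(sequence_by_synthesis(templates, 4))  # ['TACG', 'CATG', 'GACT']
```

The number of cycles grows with read length, while the number of fragments read per cycle is limited only by the chip, which is where the throughput comes from.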

 

Current development trends in next-generation sequencing (NGS) technology are as follows:


Automated Library Preparation: Sample to Answer
DNA Cluster Arrays with Higher Density/Stronger Signals
Faster and More Efficient Low-Cost Sequencing Biochemical Reagents
More Efficient Fluidic Systems and Cost-Effective Optical Systems

 

In recent years, a new gene sequencing technology has emerged, referred to as either third-generation or fourth-generation gene sequencing. This approach employs single-molecule real-time sequencing, with throughput now comparable to that of second-generation methods, although its accuracy still requires improvement.

 

Because it reads the signal from one molecule at a time, this technology is inherently subject to significant measurement uncertainty. However, it offers several advantages, such as long read lengths, high sequencing speed, low cost per run, and miniaturized, portable instrumentation.

 

Single-molecule continuous sequencing does not require cluster generation. The sequencing workflow is relatively simple, as single-stranded DNA is continuously passed through a sensor to generate time-varying signals, a process that can be likened to playing a cassette tape.

 

In the field of single-molecule gene sequencing, I will briefly introduce Single-Molecule Real-Time (SMRT) fluorescent sequencing and nanopore gene sequencing technology.

 

Single-molecule real-time fluorescent sequencing primarily refers to zero-mode waveguide (ZMW) + gamma-phosphate fluorescence technology. Zero-mode waveguide technology involves coating the surface of a sequencing chip with an aluminum film containing an array of nanoscale pores (tens of nanometers in diameter). When the half-wavelength of the excitation light is larger than the pore diameter, the light cannot penetrate the aluminum layer and instead forms an evanescent wave only at the bottom of the pores. This evanescent wave excites fluorescent molecules only near the pore bottom, resulting in very low fluorescent background.

 

Each of the four nucleotides (A, T, C, and G) carries a fluorophore of a different color attached to its gamma-phosphate. Individual complexes (polymerase + DNA template + primer) are immobilized near the sensor. When the DNA polymerase captures a single fluorescent nucleotide, a fluorescent signal is generated. After the extension reaction, the beta- and gamma-phosphates are enzymatically cleaved and released along with the fluorophore, so the label does not interfere with the incorporation of the next nucleotide.

 

The principle of nanopore gene sequencing is as follows. Electrodes are placed in two solution chambers separated by a thin membrane, in which a hole, known as the nanopore, has been punched. When an external voltage is applied, ions in the solution are driven by the electric field through the nanopore, generating an ionic current. The magnitude of this current depends on factors such as pore size and shape, internal charge, ion type and concentration in the solution, and ambient temperature. Charged biomolecules, also driven by the electric field, traverse the nanopore and obstruct the ionic current, producing a blockade current signal. The magnitude and duration of the blockade correlate with the biomolecule’s size, polarity, charge, and interactions with the nanopore.
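As a rough illustration of how blockade magnitude and duration might be read off a current trace, the sketch below scans a sampled ionic current for runs that drop below a fraction of the open-pore level. It is a toy under stated assumptions: the trace values, the 1 kHz sampling rate, the 80% threshold, and the function names are all hypothetical, and real instruments use far more elaborate signal processing.

```python
def summarize_event(current_pa, start, end, open_pore_pa, sample_rate_hz):
    """Describe one blockade: when it began, how long it lasted, how deep it was."""
    segment = current_pa[start:end]
    mean_level = sum(segment) / len(segment)
    return {
        "start_s": start / sample_rate_hz,
        "duration_s": (end - start) / sample_rate_hz,
        "blockade_pa": open_pore_pa - mean_level,
    }


def detect_blockades(current_pa, open_pore_pa, sample_rate_hz, threshold_fraction=0.8):
    """Find contiguous runs where the current falls below a fraction of the open-pore level."""
    events = []
    start = None
    for i, value in enumerate(current_pa):
        blocked = value < threshold_fraction * open_pore_pa
        if blocked and start is None:
            start = i  # a molecule has entered the pore
        elif not blocked and start is not None:
            events.append(summarize_event(current_pa, start, i, open_pore_pa, sample_rate_hz))
            start = None  # the pore is open again
    if start is not None:
        # The trace ended while the pore was still blocked.
        events.append(summarize_event(current_pa, start, len(current_pa), open_pore_pa, sample_rate_hz))
    return events


# Made-up trace in picoamperes, sampled at 1 kHz, with an open-pore level of 100 pA.
trace = [100, 100, 60, 60, 60, 100, 100, 40, 40, 100]
events = detect_blockades(trace, open_pore_pa=100, sample_rate_hz=1000)
for event in events:
    print(event)
```

In this made-up trace the first event is shallower and longer, the second deeper and shorter, which is the kind of difference the paragraph above attributes to the size, charge, and pore interactions of the passing molecule.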

 

Nanopore gene sequencing technology can be further categorized into strand sequencing, exonuclease sequencing, and tag sequencing, which will not be elaborated on here.

 

[Figure: Performance Comparison of High-Throughput Sequencers (excerpt from the guest speaker’s PPT)]

 

Looking ahead, I believe that third- and fourth-generation sequencing technologies still have a long way to go, with considerable technical potential yet to be tapped in terms of increasing throughput and improving accuracy. Further into the future, I predict that if gene sequencing development bypasses intermediate biological molecules for signal transduction or transmission and instead adopts solid-state nanopores for direct detection, it will better enable large-scale mass production of gene sequencing and reduce costs. Currently, many researchers are pursuing this direction, although their work remains at the stage of proof-of-concept. Nevertheless, the future of gene sequencing holds great promise.