VNET Subsidiary InterConnect Tech Files IPO Prospectus: Building Digital New Infrastructure for Genomic Sequencing

Jun 02, 2023 08:00 CST Updated 08:00

"You reap what you sow; this is the power of genes."

 

The advent of gene sequencing technology has made it possible for humans to decipher the genome. Today, gene sequencing is widely applied across various fields, ranging from scientific research to clinical practice, including personalized treatment of chronic diseases, susceptibility gene screening, personalized diagnosis and treatment of tumors, diagnosis of rare diseases, assessment of risk for tumor metastasis and recurrence, personalized medication in advanced-stage cancer, as well as drug and therapy development for diseases such as cancer and AIDS.

 

In practice, gene sequencing is a massive undertaking that involves multiple steps, including DNA extraction, DNA fragmentation, library construction, DNA amplification, instrument-based sequencing, and data analysis. This process converts the “invisible” DNA within cells into identifiable ATGC base sequences, which are then analyzed and interpreted using bioinformatics and other methods.

 

Once this biological information has been converted into readable text data, IT resources such as computing, storage, and networking are needed to analyze it with gene sequencing algorithms. It is this integration of biological science and computer science that ultimately makes genetic information interpretable.

 

Gene Sequencing Requires the Power of Computer Science, with Extremely High Technical and Cost Barriers

 

According to BCC Research, China’s gene sequencing market was worth USD 1.59 billion in 2021 and is projected to reach USD 4.235 billion by 2026, a compound annual growth rate (CAGR) of 21.6%, indicating that the sector is in a phase of rapid development. As the market expands, the scale effects of big data in gene sequencing are becoming increasingly prominent, posing new challenges for the construction, operation, and maintenance (O&M) of IT infrastructure.
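The growth rate quoted above can be checked against the two market-size figures. The short Python sketch below is only a sanity check on the arithmetic; the function name `cagr` is introduced here for illustration, and it assumes 2021 to 2026 counts as five annual periods.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in USD billions.
market_2021 = 1.59
market_2026 = 4.235

# Assumption: 2021 -> 2026 is treated as five annual compounding periods.
implied = cagr(market_2021, market_2026, years=5)
print(f"Implied CAGR: {implied:.1%}")
```

Under that assumption the implied rate comes out at about 21.6%, in line with the figure attributed to BCC Research.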

 

As a new entrant in the gene sequencing industry, InterConnect Tech (a sub-brand of VNET Group [VNET.US]) has entered the market as a “gene sequencing solutions provider,” offering its own insights into the sector’s current pain points. Deng Shiyou, Head of Cloud Solutions at InterConnect Tech, said that gene sequencing companies have relatively complex IT infrastructure requirements, a characteristic determined by the nature of the industry itself.

 

First, the volume of gene sequencing data is large. Public data indicates that a single human cell contains approximately 3.1 billion DNA base pairs, equivalent to about 3 GB of data. To ensure the integrity of genomic data, sequencing must be performed at 30x coverage, generating 30 × 3 GB = 90 GB, or roughly 100 GB, of data. After necessary processing steps such as grouping, conversion, assembly, and annotation, the data volume grows further to 600 GB. This means that a single gene sequencing company can generate hundreds of terabytes (TB) of data per month. With the widespread adoption of gene technology in healthcare, agriculture, food, and other sectors, data volumes are poised for explosive growth, imposing stringent requirements on data storage, computation, and transmission.
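As a rough check on the figures above, the sketch below reproduces the arithmetic in Python. The genome size, coverage depth, and 600 GB processed volume are taken from the text. The figure of 500 samples per month is a hypothetical workload chosen only for illustration, and 1 TB is taken as 1000 GB.

```python
GENOME_SIZE_GB = 3      # ~3.1 billion base pairs, about 3 GB (from the article)
COVERAGE = 30           # 30x sequencing coverage (from the article)
PROCESSED_GB = 600      # per-sample volume after processing (from the article)
GB_PER_TB = 1000        # assumption: decimal terabytes


def raw_volume_gb(genome_gb: float = GENOME_SIZE_GB, coverage: int = COVERAGE) -> float:
    """Raw data generated by sequencing one genome at the given coverage."""
    return genome_gb * coverage


def monthly_volume_tb(samples_per_month: int, per_sample_gb: float = PROCESSED_GB) -> float:
    """Total processed data per month, in TB, for a given number of samples."""
    return samples_per_month * per_sample_gb / GB_PER_TB


raw_gb = raw_volume_gb()
print(f"Raw data per genome: {raw_gb} GB")

# Hypothetical workload, not a figure from the article.
samples = 500
print(f"{samples} samples/month -> {monthly_volume_tb(samples)} TB/month")
```

With these inputs, a few hundred samples a month is already enough to reach the "hundreds of terabytes" range mentioned in the text.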

 

Second, the cost of gene sequencing is high. A single run of the common workflow of conversion, splicing, alignment, and annotation takes more than 30 hours. For sequencing companies, building their own HPC (High-Performance Computing) sequencing clusters is also very expensive. More importantly, the sequencing business has peak and off-peak seasons, and workloads are unpredictable. Fluctuating demand for computing power, IT O&M costs, hardware upgrades, scaling of software algorithms, data storage, and the associated labor, financial, and time costs are unavoidable challenges for enterprises and research institutions in day-to-day sequencing operations.

 

Third, gene sequencing technology has a high barrier to entry. The gene sequencing workflow is complex. After raw data is acquired from the sequencer, computational analysis involves multiple steps, including mapping, filtering, deduplication, sorting, indexing, and alignment, and these require a wide array of specialized software tools. In practice, deploying these tools and tuning them to the underlying computing infrastructure presents an exceptionally high technical barrier.

 

Elastic Computing Power + End-to-End One-Stop Service: Optimizing Sequencing Compute Costs

 

Where there are pain points, there are opportunities; the demands of the gene sequencing industry are being recognized by internet technology. Deng Shiyou said that InterConnect Tech’s gene sequencing solution is designed to relieve companies of concerns about building, operating, and maintaining IT infrastructure for gene sequencing, so that they can focus on genomics research itself.

 

On the one hand, drawing on its own infrastructure resources, InterConnect Tech has established elastic computing power resource pools in data centers across China, enabling it to provide on-demand computing services to sequencing enterprises from nearby locations. InterConnect Tech also collaborates with partners such as SenseTime and Alibaba Cloud to provide supplementary computing capacity.

 

On the other hand, to address the massive data transmission needs of gene sequencing companies, InterConnect Tech provides data synchronization services. By establishing network connections between its data centers and the production facilities of gene sequencing enterprises, as well as between its data centers and public clouds, it achieves low-latency and highly reliable data transmission.

 

For its part, beyond its data centers distributed across China and its robust interconnectivity capabilities, InterConnect Tech can also collaborate with major cloud service providers to deliver comprehensive, one-stop services.

 

Genomic sequencing is a typical multi-domain, multi-business scenario. For security reasons, users typically host part of their operations in their own data centers while deploying other components in public and elastic environments to deliver public-facing services. InterConnect Tech’s Full-Domain Managed Cloud Service covers users’ own domains, managed domains, elastic domains, and public domains, providing corresponding services for each.


For the user-owned domain, InterConnect Tech provides operation, maintenance, and monitoring services; in the managed domain, users can colocate their servers in InterConnect Tech’s data centers; in the elastic domain, InterConnect Tech draws on its internal computing resource pool to supplement capacity on demand; and in the public domain, InterConnect Tech partners with leading domestic cloud providers to deliver cloud services to users.

 

Deng Shiyou further stated that the main value of InterConnect Tech’s gene sequencing solution lies in its ability to address the challenges gene sequencing companies face, such as high hardware investment costs, low infrastructure utilization during off-peak seasons, and lengthy sequencing analysis times, ultimately reducing costs and improving efficiency.

 

Hyper-Connected New Computing Power: Exploring New Possibilities for Sequencing Computational Power

 

The genetic testing industry is in a period of rapid development, and computing power is the key factor determining how far it can go. In “Hyper-Connected New Computing Power,” another key focus area for InterConnect Tech, new possibilities are being explored to break through the computational bottlenecks in gene sequencing.

 

“Hyper-Connectivity” is city-centric, operating on the philosophy that “the city is a computer.” By developing new municipal infrastructure, it aims to provide ubiquitous connectivity. “New Computing Power,” meanwhile, establishes an economic incentive mechanism to loosely integrate idle computing resources from existing industries, institutions, regions, and even the internet, forming a robust supply of computing power. The project supports not only duration-based computing power supply, similar to cloud computing, but also task-level, fine-grained usage models, with the aim of realizing truly ubiquitous computing power services.

 

Gene sequencing is a typical compute-intensive task. This distributed computing power network is being built and operated in the era of large models on the principle that everyone participates in, builds, operates, and owns it, with the goal of benefiting the entire population. It can be expected to open up boundless possibilities for the gene sequencing industry.

 

From a broader perspective, the application of full-domain managed services and Hyper-Connected New Computing Power in the gene sequencing industry is only a microcosm. InterConnect Tech is empowering various sectors of the real economy, true to its mission: “To be a partner throughout the entire lifecycle of enterprise digital transformation.”