On October 25, 2018, VCBeat (WeChat ID: VCbeat) learned that AWS (Amazon Web Services) would join the NIH (National Institutes of Health) STRIDES (Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability) initiative, leveraging advanced technologies on the AWS cloud to foster innovative biomedical research.
The STRIDES Initiative was launched in July 2018 to provide commercial cloud computing services to NIH biomedical researchers. Initially, the NIH focused on helping researchers access high-value datasets and on exploring new approaches to optimize technology-intensive research. The STRIDES Initiative is a component of the NIH Common Fund’s New Models of Data Stewardship (NMDS) program. The other component of NMDS is the NIH Data Commons Pilot Phase, which aims to test the integration of high-value biomedical datasets into cloud computing systems and to establish and evaluate best practices for data use. AWS is the second cloud service provider to join the STRIDES Initiative, following Google Cloud.
AWS is a subsidiary of Amazon that provides on-demand cloud computing platforms to governments, companies, and individuals on a paid subscription basis. Subscribers get access to a virtual cluster of computers that is available at all times over the internet.
The agreement between the NIH and AWS will help researchers at the NIH and at more than 2,500 NIH-funded academic institutions across the United States access a range of AWS technologies. The leaders of the STRIDES Initiative hope that data made available through partnerships with commercial cloud service providers (CSPs) such as AWS will meet the Findable, Accessible, Interoperable, and Reusable (FAIR) principles recognized by the biomedical research community.
They also hope that AWS will collaborate directly with the NIH and its funded researchers to develop and test new methods, assemble additional datasets and related computational tools, and make them accessible to the broader research community. Researchers participating in the pilot phase of the NIH's data-sharing effort with CSPs will set up cloud storage for three pilot datasets while defining the accompanying guidelines, policies, and procedures. Once the pilots are complete and the associated policies and procedures have been revised, the service will be made available to NIH-funded research institutions.
The three test datasets funded by the NIH were selected based on their value to the biomedical research community, data diversity, and coverage of both basic and clinical research.
The three datasets are the Genotype-Tissue Expression (GTEx) dataset, which explores gene expression and regulation across human tissues, as well as the role of genomic variation in altering gene expression; the Alliance of Genome Resources (AGR), which brings together six Model Organism Databases (MODs) that provide in-depth biological data for advanced research on model organisms; and the Trans-Omics for Precision Medicine (TOPMed) program, whose dataset collects whole-genome sequencing (WGS) data and pairs it with other omics and clinical data.
About AWS
AWS is the Amazon subsidiary that provides cloud computing services at the IaaS (Infrastructure as a Service) and PaaS (Platform as a Service) levels. It offers users a comprehensive suite of cloud computing services, including elastic computing, storage, databases, and applications, helping businesses reduce IT investment and maintenance costs.
About NIH
The National Institutes of Health (NIH) is part of the U.S. Department of Health and Human Services and is one of the largest biomedical research agencies in the world. The NIH comprises 27 institutes and centers, 24 of which receive direct appropriations from the U.S. Congress to fund research projects. The NIH aims to conduct and support basic, clinical, and translational medical research, and is currently investigating the causes and treatments of both common and rare diseases.