Genome Meets Blockchain: Empowering Data Ownership and Privacy in the Era of Precision Medicine

Apr 17, 2018 08:00 CST Updated 08:00

Judging from the trends in recent years, genomics is emerging as a dominant force in the new era of medicine. Leveraging genomic data, healthcare institutions can provide patients with more precise and personalized treatments. Pioneers in genetic testing services, such as 23andMe, have significantly driven the market adoption of this technology; however, this has also sparked new ethical considerations within society. 23andMe is well known for its $99 testing kits, but its business model—acquiring users and data through low-priced offerings and then selling user data to pharmaceutical companies—comes at the cost of user data ownership.

 

Genetic data are highly sensitive, touching on traits such as lifespan, health, ethnicity, and intelligence. Genomics may reveal even more in the future, and if such information were leaked it could cause serious harm.

 

For example, if it is discovered that an individual carries genes associated with breast cancer, insurance companies may deny them coverage; similarly, if DNA analysis indicates that a person’s aptitudes do not meet the job performance expectations of their employer, their career development could be adversely affected.

 

On the other hand, de-identified genetic data hold significant importance for scientific advancement. Only through analysis based on big data can we derive conclusions that more closely reflect real-world conditions and better achieve personalized medicine. This is precisely what many genomics scientists hope to see: leveraging large-scale data research to identify optimal treatment regimens for specific populations and making breakthroughs in the study of genetic diseases and immunotherapy.

 

Undoubtedly, a conflict has emerged between personal privacy and scientific progress. Blockchain technology appears to offer a way to reconcile the two.

 

Resolving Data Ownership Issues


Ten years ago, numerous companies debated whether BRCA genes could be patented. Ultimately, the highest courts of both the United States and Australia declared such patents invalid.

 

But can such a conclusion be applied to the issue of data ownership? As providers of data, users do not enjoy copyright protection for their own data, nor do they benefit from data transactions. Unfortunately, no law explicitly stipulates who owns these data, and there are no legal provisions mandating that such data be protected by copyright.

 

It seems that the only way to prevent such data leaks is to refrain from sequencing altogether, keeping the data concealed within the body. However, this approach clearly runs counter to the advances of modern medicine and genomics.

 

Economist Hernando de Soto describes blockchain as an “invisible hand” whose system can encompass people worldwide. Scientists in genomics also appear to be exploring whether this technology can address current challenges in the field.

 

Blockchain is commonly used for issuing virtual currencies, such as Bitcoin. So what does it have to do with protecting genomic data?

 

In fact, the applications of blockchain have long extended beyond virtual currencies; it is simply that public attention has focused more on Bitcoin. Bitcoin is useful and valuable because blockchain creates an immutable, distributed ledger system that is virtually impervious to hacking. Owners of Bitcoin accounts have absolute control over their assets.

 

For storing highly sensitive data, this is a nearly perfect solution. For instance, DARPA is considering using blockchain technology to protect nuclear weapons data. Blockchain is also being applied in areas such as diamond tracking, intellectual property management, and real-world logistics.

 

Following this logic, efforts to bring blockchain into genomics aim primarily to build concrete use cases that maximize ethical safeguards.

 

Secure and Private: The Possibilities Unlocked by a Remarkable Combination


So, what are the possibilities of "genetics + blockchain"?

 

For individuals, this is a secure place to store genetic data. If you have undergone testing and wish to access this data at any time, storing it on the “Gene Chain” is an excellent choice. Moreover, unlike Google Genomics, you do not need to pay for this service.

 

Storing data here is significantly more secure than in most other locations. After all, a USB drive can be lost, and data uploaded to the cloud or other platforms can leak. On the “Gene Chain,” your encrypted data are extremely hard to break into. Blockchain uses a distributed, chained storage architecture; if one node is compromised, the remaining tens of thousands of nodes will reject the altered record.
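The tamper-resistance described above can be illustrated with a minimal sketch. It is not the design of any platform named in this article: the function names, the placeholder first hash, and the simple majority rule are all assumptions made for illustration. Each record is hashed together with the hash of the record before it, so editing one record breaks every later link, and a copy that differs from most peers is rejected even if it is internally consistent.

```python
import hashlib
import json


def block_hash(index, payload, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Build a list of blocks, each linked to its predecessor by hash."""
    chain = []
    prev_hash = "0" * 64  # hypothetical placeholder for the first block
    for index, payload in enumerate(payloads):
        digest = block_hash(index, payload, prev_hash)
        chain.append({"index": index, "payload": payload,
                      "prev": prev_hash, "hash": digest})
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash; an edited block no longer matches its hash."""
    prev_hash = "0" * 64
    for block in chain:
        expected = block_hash(block["index"], block["payload"], prev_hash)
        if block["prev"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


def accepted_by_majority(candidate, peer_copies):
    """Accept a copy only if it is valid and identical to most peers' copies."""
    if not is_valid(candidate):
        return False
    matches = sum(1 for peer in peer_copies if peer == candidate)
    return matches > len(peer_copies) / 2
```

In this sketch, an attacker who edits one record fails the hash check, and an attacker who rebuilds the whole chain around a forged record still fails the comparison with the other nodes. Real networks reach agreement through consensus protocols that are far more involved than this majority count.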

 

You can authorize your doctor by setting access restrictions, ensuring they only receive the information you wish to share.


Similarly, you can also trace who has misused your data through a unique signature.
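Selective sharing and traceable access could look roughly like the following sketch. The class `GenomeVault`, its methods, and the use of an HMAC as the "signature" are hypothetical choices for illustration; the article does not say how any real platform implements this, and a production system would more likely use public-key signatures.

```python
import hashlib
import hmac


class GenomeVault:
    """Toy record store with per-field grants and a signed access log."""

    def __init__(self, record, owner_key):
        self._record = record        # field name -> value
        self._owner_key = owner_key  # secret bytes held by the data owner
        self._grants = {}            # requester -> set of allowed fields
        self.access_log = []         # (requester, field, signature) tuples

    def grant(self, requester, fields):
        """Allow a requester to read only the listed fields."""
        self._grants[requester] = set(fields)

    def _sign(self, requester, field):
        """Compute a keyed signature over one (requester, field) access."""
        message = f"{requester}:{field}".encode("utf-8")
        return hmac.new(self._owner_key, message, hashlib.sha256).hexdigest()

    def read(self, requester, field):
        """Return a field if permitted, recording a signed log entry."""
        if field not in self._grants.get(requester, set()):
            raise PermissionError(f"{requester} may not read {field}")
        self.access_log.append((requester, field, self._sign(requester, field)))
        return self._record[field]

    def verify_log(self):
        """Check that no log entry was altered after it was written."""
        return all(
            hmac.compare_digest(sig, self._sign(requester, field))
            for requester, field, sig in self.access_log
        )
```

A doctor granted only one field can read that field and nothing else, and every successful read leaves an entry that the owner can later verify, which is the sense in which access becomes traceable.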

 

Scientists, for their part, can search the metadata by research topic to find the data they need. The search results disclose neither donors' personal information nor the genomic data itself. Researchers can, however, submit access requests to donors, who are paid for granting access.
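As a rough sketch of that workflow, the search below returns only an opaque handle and public tags, and access is requested through the handle. `DonorEntry`, `search`, `request_access`, and the token offer are illustrative assumptions, not the interface of any platform mentioned here.

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class DonorEntry:
    donor_name: str   # private: never returned by search
    genome_path: str  # private: never returned by search
    tags: frozenset   # public metadata, e.g. conditions or traits
    # Random opaque identifier, unrelated to the donor's identity.
    handle: str = field(default_factory=lambda: uuid.uuid4().hex)


def search(index, topic):
    """Return only opaque handles and public tags for matching donors."""
    return [
        {"handle": entry.handle, "tags": sorted(entry.tags)}
        for entry in index
        if topic in entry.tags
    ]


def request_access(handle, offer_tokens):
    """Build a paid access request for the donor to accept or decline."""
    return {"to": handle, "offer_tokens": offer_tokens, "status": "pending"}
```

The point of the design is that nothing private crosses the search boundary: the researcher learns that a matching donor exists and how to reach them, and the donor decides whether the offer is worth accepting.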

 

These developments could truly revolutionize genomics and provide stronger data protection for data providers.


However, this is only one of its functions. Beyond strong encryption, blockchain can also be used for data management. Research institutions and companies holding large volumes of genetic data can purchase licenses to store their data on the “Gene Chain” without having to worry about ethical issues, which leaves them more energy for scientific research.

 

Who Is Doing It


Storing and sharing genomic data is a technical challenge, and computing has become a focal point of research. A single raw genome dataset is approximately 5–6 gigabytes in size and covers 3 billion base pairs. The large volume of data that must be annotated during sequencing makes management harder still.
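A back-of-the-envelope calculation shows where figures of this order come from. The sketch assumes 1 GB = 10**9 bytes and counts only the bases themselves; the two encodings are illustrative, and the function name is hypothetical.

```python
def genome_size_gb(base_pairs, bits_per_base):
    """Storage needed for a sequence, in gigabytes (1 GB = 10**9 bytes)."""
    return base_pairs * bits_per_base / 8 / 10**9


BASE_PAIRS = 3_000_000_000  # figure quoted in the article

packed = genome_size_gb(BASE_PAIRS, 2)   # A/C/G/T packed into 2 bits each
as_text = genome_size_gb(BASE_PAIRS, 8)  # one ASCII character per base

print(f"2 bits per base: {packed} GB")
print(f"8 bits per base: {as_text} GB")
```

Raw sequencing files also carry quality scores, repeated reads, and annotations, so they are typically larger than the bare sequence; this sketch does not attempt to derive the 5–6 gigabyte figure itself.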

 

The “1000 Genomes Project,” an international research collaboration, has made all of its whole-genome sequencing data available online for free download. However, the management of these data remains relatively traditional: files are simply compressed before transmission and storage.

 

In 2016, Harvard genetics pioneer George Church, together with Cambridge University computer scientist Kamal Obbad and Harvard scientist Dennis Grishin, co-founded a startup called Nebula Genomics.

 

Once users have obtained their data, they can store it on Nebula Genomics’ blockchain platform. Other research institutions can pay to access de-identified data through this platform. The system is built on a purpose-made cryptographic token, and research institutions must first purchase tokens to pay for the data.


A purchase is not a one-time, lifetime license: each use requires a separate payment, and the same dataset can be sold to multiple institutions. Users can redeem the tokens they earn for testing services from Nebula Genomics’ partner organizations, at present chiefly Veritas Genetics (another company founded by Church).
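The pay-per-use model can be sketched as a simple ledger. This is an illustration only: `TokenLedger`, its methods, and the account names are hypothetical, and nothing here reflects how Nebula Genomics actually implements its tokens.

```python
class TokenLedger:
    """Toy ledger: researchers pay the data owner tokens for each use."""

    def __init__(self):
        self.balances = {}  # account name -> token balance
        self.history = []   # (buyer, owner, dataset, price) per use

    def deposit(self, account, amount):
        """Credit tokens that an account has purchased."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balances[account] = self.balances.get(account, 0) + amount

    def pay_for_use(self, buyer, owner, dataset, price):
        """Charge one use of a dataset; every use is a separate payment."""
        if self.balances.get(buyer, 0) < price:
            raise ValueError(f"{buyer} has too few tokens")
        self.balances[buyer] -= price
        self.balances[owner] = self.balances.get(owner, 0) + price
        self.history.append((buyer, owner, dataset, price))
```

Because a payment buys a single use, the same institution pays again for a second analysis, and a second institution can buy access to the same dataset independently. The owner's balance is what would later be redeemed for testing services.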

 

Nebula Genomics aims to transform users’ genetic data into copyright-protected assets akin to patents, granting users ownership and copyright over their data so they can benefit from it. Initially, users must pay for the testing service; the current price of whole-genome sequencing is $1,000 per test. As sequencing costs decline, service prices will fall accordingly. The company plans to begin operations and services within the next six months, working with security experts to create a more secure, protected, and anonymous environment.

 

Nebula Genomics is a company with its own platform and data sources, whereas blockchain-enabled companies such as EncrypGen, Luna DNA, and Zenome do not provide sequencing services to users and typically need to obtain data through third parties.

 

Luna DNA represents one of the early attempts to apply blockchain technology in healthcare, aligning with Nebula Genomics’ original vision of transforming personal data into data assets through blockchain. However, Luna DNA does not provide sequencing services, thereby avoiding direct competition with 23andMe and Ancestry at the sequencing level. The company believes that incentivizing data sharing through Luna Coin can, in turn, boost sales for sequencing service providers.

 

“Data at the individual level is of limited significance; statistically meaningful data requires the participation of hundreds of thousands to millions of individuals,” said Bob Kain, Co-founder and CEO of Luna DNA. “Unless community-level data is aggregated, it is difficult to address issues related to genomics and health.”

 

Although the blockchain industry is still in its nascent, wild-west phase, some investors have recognized the opportunities it presents. Luna DNA secured $2 million in seed funding from former executives of Illumina.

 

EncrypGen has adopted a strategy similar to that of Luna DNA and is currently conducting its Initial Coin Offering (ICO). Its tokens are not intended as investment vehicles, and the issuance will conclude in July.

 

With data ownership addressed at the level of individual users, how can research institutions' demand for large-scale data be met? Shivom may hold the answer.

Founded in 2017, this German company aims to transform global healthcare through technologies including blockchain, cloud computing, gene sequencing, artificial intelligence, and big data analytics. The company believes these technologies will usher medical research into a new era.

In addition to individual-oriented services, Shivom builds large-scale databases by partnering with healthcare institutions and governments worldwide through sponsored public projects, such as the collaboration reached with the Andhra Pradesh government in March 2018. It also works with regions that have a high prevalence of rare diseases, so that its platform gathers more distinctive large-scale data.

Shivom has also launched personal services. Rather than providing services directly, it acts as an ecosystem builder, bringing service providers and users together on a blockchain platform. Users can exchange their data, typically sourced from third-party testing agencies, for services. The service providers on the platform are not limited to sequencing companies; they also include insurance providers, health check-up institutions, and others.

Shivom aims to build an ecosystem based on blockchain technology, providing an open environment for service providers and users. In addition to genomics and personalized medical services, participating institutions can also integrate other applications and services.

 

The aforementioned companies primarily address data ownership and trading. Beyond transactions, blockchain also offers storage along with robust privacy and security features, and Zenome aims to build applications on these attributes.

 

Their Phase I plan is to establish a decentralized genomic data storage system and create a secure environment for free data exchange. Zenome does not provide sequencing services; instead, the platform’s data are primarily sourced from network participants. Subsequently, they will ensure data authenticity through questionnaires and an evaluation system. Once the data reach a certain scale, Zenome will attract large corporations and research centers to purchase the data.

 

But their ultimate goal is not transactional; rather, they aim to encourage these companies to store data on their platform, thereby building a community similar to Google Genomics.

 

Regulatory Ambiguity and Technological Limitations: Perhaps There Is No Perfect Solution

 

But can the integration of blockchain technology truly resolve all issues? This is difficult to answer, as no technology is perfect.

 

Blockchain technology also has its limitations. First, storage: the complete Bitcoin data file, from the genesis block to the present, has already reached 105 GB and continues to grow. As a blockchain develops, the data each node must store grows ever larger.

 

Second, on a public blockchain every participant can obtain a complete copy of the data, and all transaction data are public and transparent. In virtual currency transactions the parties are anonymous, but the transactions themselves are visible to everyone, which sits uneasily with the confidentiality that genetic data demand.

 

Finally, the application of blockchain in genomics is still in its exploratory stage, regulatory frameworks for this technology remain unclear, and its so-called security is not absolute.


These issues introduce uncertainty into the commercialization of the technology. We can only say that blockchain has inspired applications in genomics, but whether it truly resolves current contradictions remains to be seen through continuous experimentation and adjustment.