In an era of heated debate over data ownership, privacy and monetization, one kind of data could be considered the most personal of all: the human genome.
While we are 99.9 percent identical in our makeup as a species, the remaining 0.1 percent contains unique variations in the code that are thought to influence our predisposition towards certain diseases and even our temperamental biases – a blueprint for how susceptible we are to everything from heart disease and Alzheimer's to jealousy, recklessness and anxiety.
2018 provided ample examples of how bad actors can wreak havoc through the misuse of even trivial data. For those interested in protecting this most critical form of identity, blockchain has aroused considerable interest as a powerful alternative to the closed architectures and proprietary exploits of the existing genomic data market, promising in their place a secure and open protocol for the code of life.
Encrypted Chains
The sequencing of the human genome, down to the molecular level of the four "letters" that bind into the double-stranded helices of our DNA, was completed for the first time in 2003. The project cost $3.7 billion and took 13 years of computing power. Today, sequencing a single genome costs $1,000 and takes a few days. Estimates are that it will soon cost only $100.
Who owns your genome? Resurrection of the Woolly Mammoth … and Blockchain
For Professor George Church, Harvard's famous nonconformist geneticist, the boundaries between technologies inside and outside the laboratory are porous. He co-pioneered direct genome sequencing in 1984, and a brief summary of his recent ambitions includes attempts to resurrect the long-extinct woolly mammoth, create virus-resistant cells and even reverse aging.
He has now placed another bleeding-edge technology at the heart of the genomic revolution: blockchain.
Last year, Church – together with Harvard colleagues Dennis Grishin and Kamal Obbad – co-founded the blockchain startup Nebula Genomics. Church had been trying for years to accelerate and steer the generation of genomic data at scale. He had appealed to volunteers to contribute to his nonprofit Personal Genome Project (PGP) – a "Wikipedia" of open-access human genomic data that had aggregated around 10,000 samples by then.
PGP relied on people who were willing to forgo privacy and ownership in the pursuit of advancing science. As Church said in a recent interview, they were mostly either "particularly altruistic" or people interested in accelerating research into a particular disease because of family experience.
In other cases, as Dr Ben Sam Brama, cybersecurity expert at DNABits, told Cointelegraph, the patients who generate the data are themselves "sick enough to throw away any privacy and property concerns":
"[They come] to the health system and say: 'We will give you whatever you want, take it, we will sign any document, we will consent. Just heal us, find a cure.'"
The challenge is to get everyone else on board. While no one knows exactly how many people have had their genomes sequenced to date, some estimates put the figure at about a million.
Startups like Nebula and DNABits propose that a tokenized, blockchain-enabled ecosystem could be the technological turning point for onboarding the masses.
By allowing people to monetize their genomes and sell access directly to data buyers, Nebula thinks that its platform could help reduce sequencing costs "to zero or even offer [people] a net profit."
While Nebula will not directly subsidize whole genome sequencing, a blockchain model would allow interested buyers – say, two drug companies – to put up the money for someone's sequencing in exchange for access to that person's data.
Tokenization offers the flexibility and granular consent needed to enable different scenarios. As Brama suggested, a data owner could be entitled to a share in any drug developed from the research they enabled, or could be reimbursed for their medical prescriptions in crypto tokens.
Driving and accelerating data generation is only part of the equation.
Nebula conducted a survey which found that privacy and ethical concerns, rather than convenience, eclipsed all other factors when people were asked whether they would consider having their genome sequenced. In another study of 13,000 people, 86 percent said they were worried about misuse of their genetic data, and over half of them cited privacy concerns.
These concerns are not simply grounded in dystopian 1990s sci-fi – think of Gattaca's biopunk vision of a future society in the grip of a neo-eugenics fever.
As Ofer Lidsky – co-founder, CEO and CTO of the blockchain genomics startup DNAtix – put it:
"Once your DNA has been compromised, you cannot change it. It is not like a credit card that you can cancel and replace with a new one. Your genetic code is with you for a lifetime […] Once it has been compromised, there is no way back."
Data is increasingly being intercepted, commercialized and even weaponized. Sequencing your genome, not to mention sharing it, is perhaps a step further than many are willing to take, given its uniqueness, irrevocability and longevity.
Brama of DNABits offered his cybersecurity perspective, saying that:
"The consequences are very difficult to imagine, but in a world [in which] people are building vectors, like viruses, that will spread to the cells of the body and change them – it's scary. But in reality, all the building blocks are already in place: genome sequencing, data breaches, gene editing. People are now working to treat major health conditions using in vivo gene editing. But we must assume that every tool out there will end up in the wrong hands."
He added: "We're not talking about taking advantage of someone just for one night with GHB or some other drug" – this would affect the rest of an individual's life.
This April, in the wake of the Cambridge Analytica scandal, news broke that police investigators had mined a hobbyist genealogy database for DNA fragments from individuals who they hoped could help solve a murder case that had been cold for over thirty years.
Law enforcement met no resistance in accessing a centralized store of genetic material that had been uploaded by an unwitting public. And while many welcomed the arrest of the Golden State Killer by way of a tangle of DNA, others expressed considerable discomfort.
This opacity of access has implications beyond the scientific. While Brama's dystopia may be some way off, there are concerns today about genetic discrimination by employers and insurance companies – in the latter case currently only partially prohibited by law. Grishin echoed this, pointing out that in the United States "life insurance can be denied because of your DNA."
This May, the US Federal Trade Commission opened an investigation into popular consumer genetic testing companies, including 23andMe and Ancestry.com, over their policies for handling personal and genetic information and how they share this data with third parties.
23andMe and Ancestry.com are part of the recent phenomenon of so-called direct-to-consumer genetic testing, whose popularity is estimated to have more than doubled in the last year.
These companies use a narrower technique called genotyping, which identifies 600,000 positions spaced at regular intervals across the roughly 6.4 billion letters of an entire genome. Although limited, it still reveals intrinsically personal genetic information.
The famous 23andMe home genotyping kit – packaged as "Welcome to You" – promises to tell people everything from their ancestral makeup to how likely they are to spend their nights in the irritable clutches of insomnia. The kit starts at $99.
This July, the sixth largest pharmaceutical company in the world, GlaxoSmithKline (GSK), invested $300 million in a four-year deal to access 23andMe's database. The testing company is estimated to have earned $130 million from selling access to about one million human genotypes, which works out to an average price of about $130 per genotype. By way of comparison, Facebook could generate about $82 of gross revenue from the data of a single active user.
Anonymized blockchain systems for the embattled genomic revolution
In this increasingly opaque landscape of genomic data, private companies monetize the genotypic data generated by their consumers, and sequence data is fragmented across proprietary, centralized silos, both in the cumbersome legacy systems of research and health care institutions and in the private data stores of biotech companies.
Bringing genomics to the blockchain would allow the circulation needed to accelerate research while protecting this uniquely personal information by keeping personal identities separate from cryptographic identifiers. Users keep control of their data and decide exactly who they share it with and for what purpose. This access, in turn, would be recorded on an auditable and immutable ledger.
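The idea of tracing every data access to an immutable record can be illustrated with a small hash-chained log. This is only a sketch under stated assumptions: the class `AccessLedger`, its field names and the pseudonym strings are hypothetical, and it shows tamper-evidence on a single machine rather than any startup's actual blockchain design.

```python
import hashlib
import json


class AccessLedger:
    """Append-only log in which every entry commits to the previous one."""

    GENESIS = "0" * 64  # placeholder hash that precedes the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(record: dict) -> str:
        # Serialize deterministically so the same record always hashes the same.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def record_access(self, owner_pseudonym: str, buyer: str, purpose: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        record = {
            "owner": owner_pseudonym,  # cryptographic identifier, not a real name
            "buyer": buyer,            # buyers are expected to be identifiable
            "purpose": purpose,
            "prev_hash": prev_hash,
        }
        entry = dict(record, hash=self._digest(record))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            record = {k: v for k, v in entry.items() if k != "hash"}
            if record["prev_hash"] != prev_hash or self._digest(record) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

Because each entry includes the hash of the one before it, quietly rewriting who accessed a genome, or why, would invalidate every later entry. A real network would additionally replicate the log across independent nodes.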
Grishin outlined Nebula's version, which would place asymmetric requirements on different members of the ecosystem. Users would be able to remain anonymous, but a permissioned blockchain system with verified nodes and validators would require data buyers using the network to be fully transparent about their identity:
"If someone contacts you, it should not just be a cryptographic network ID, but this should be John Smith of Johnson & Johnson, who works, for example, in oncology."
Grishin added that Nebula had experimented with both Blockstack and the Ethereum (ETH) blockchain, but has since decided to move to an in-house prototype, considering Ethereum's capacity of 15 transactions per second insufficient for its ecosystem.
Brama of DNABits, who is also committed to using a permissioned system, proposed using "the simplest and most robust form of blockchain – that is, a Bitcoin-type network."
"The more powerful and the more capable [the technology] you use, the greater the attack surface."
Lie-proofing the blockchain
23andMe is said to store about five million genotyped customer profiles, and rival company Ancestry.com about 10 million. For each profile, they collect about 300 phenotypic data points, using surveys that aim to find out how many cigarettes (you think) you have smoked during your life, or whether yoga or Prozac has been more effective in managing depression.
A phenotype is the set of observable characteristics of an individual that results from the interaction of their genotype with their environment. Generating and sharing access to this data is essential to decoding the genome by correlating variants with traits. But as Grishin notes, because it is largely self-reported, the quality of much of the existing data is uncertain, and tokenized genomics faces a hurdle in this respect:
"If people are able to monetize their personal genomic data, then you can imagine that some people might think, 'If I claim to have a rare condition, many pharmaceutical companies will be interested in buying access to my genome,' which is not necessarily true. The value of a genome is rather difficult to [predict]. It is not correct to say that if you have something rare, your genome will be more valuable. Indeed, many studies require many control samples that are quite normal."
Education can help make people aware that they will not make more money by lying, and that an average genome could be just as interesting to a buyer as an unusual one. But Grishin also noted that a blockchain system can offer unique mechanisms that deter deception, even where education fails:
"Blockchain can help create phenotypic surveys that detect wrong answers or identify where a single participant has tried to lie, and this can be combined with blockchain deposit systems, where, for example, before participating in a survey, you must deposit a small amount of cryptocurrency into an escrow wallet." If conflicting answers indicate that someone has tried to lie about their medical condition, then their deposit could be withheld, in a way that is much easier to implement within a blockchain system than in one that uses fiat currencies.
2018: Viruses and chromosomes hit the blockchain
Even with only a fraction of the population on board, and given how data-intensive the body's code is, a sequencing tsunami is already flooding existing centralized stores.
The raw dataset of a single genome runs to 200 gigabytes. In June 2017, GenBank, run by the US National Institutes of Health, contained over two trillion bases of sequence. One of the world's largest biotechnology companies, China's BGI Genomics, announced that same month that it planned to produce five petabases of new DNA in 2017, increasing each year to reach 100 petabases by 2020.
In his interview with Cointelegraph, Lidsky suggested that the 200-gigabyte raw dataset is not necessary for analysis, pointing out that during initial sequencing the genome is read multiple times, "say 30 or 100 times", to mitigate inaccuracies. Once the reads are combined, he explained, "the sequence size is reduced to 1.5 gigabytes". This still requires significant compression to bring it onto the blockchain. For reference, the average size of a transaction on the Bitcoin (BTC) blockchain was 423 bytes as of mid-June 2018.
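A rough sense of why a combined sequence is so much smaller than the raw reads comes from simple bit-packing. The sketch below is an illustration only, assuming an alphabet of exactly four letters stored at two bits each; the function names are hypothetical and this is not presented as the method Lidsky or DNAtix use. Under that assumption, the 6.4 billion letters mentioned earlier pack into about 1.6 gigabytes, the same order of magnitude as the quoted 1.5 gigabytes and still vastly larger than a typical Bitcoin transaction.

```python
# Two bits are enough to distinguish the four DNA letters.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = "ACGT"


def pack_bases(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        out[i // 4] |= BASE_TO_BITS[base] << (2 * (i % 4))
    return bytes(out)


def unpack_bases(packed: bytes, length: int) -> str:
    """Inverse of pack_bases; length is needed because the last byte may be padded."""
    return "".join(
        BITS_TO_BASE[(packed[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(length)
    )


def packed_size_bytes(num_bases: int) -> int:
    """Bytes needed to store num_bases at two bits per base."""
    return (num_bases + 3) // 4
```

Real sequence files also carry quality scores and ambiguous bases, which this sketch ignores, and further gains typically come from storing only the differences from a reference genome.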
Average Bitcoin transaction size, 2014-18. Source: TradeBlock.com
In June, DNAtix announced the first transfer of a complete chromosome using blockchain technology, specifically IBM's Hyperledger Fabric. Lidsky told Cointelegraph that the company had managed to achieve a 99 percent compression rate for DNA in August.
Nebula, for its part, envisions that even on a blockchain, transferring data is unnecessary and inadvisable given the particular sensitivity of genomic data; it proposes sharing data access instead. The solution would combine blockchain with advanced encryption techniques and distributed computing methods. As Grishin explained:
"Your data can be analyzed locally on your computer, by you running an app yourself on your data […] with additional security measures in place – for example, using homomorphic encryption to share data in an encrypted form."
Grishin explained that homomorphic techniques encrypt data but ensure that it is not rendered "nonsensical" – creating "transformations that transform the data without disrupting it":
"The data buyer does not obtain the underlying data, but computes on its encrypted form to get the results out of it. The code is thus moved to the data, rather than the data being transferred to the researchers."
The encrypted data can be made available to developers of so-called genomic apps, which Nebula, DNAtix and many other emerging startups in the field all offer as a means of providing users with an interpretation of their data. These apps could also provide an additional source of monetization for researchers and other third-party developers.
But is "outsourcing" genomic interpretation to an app so simple? The long-standing model of health care has directed patients to genetic counselors to talk through risks and expectations, helping to translate disconcerting and often frightening results.
Consumer genetic testing companies have already been accused of leaving their customers with a lot of data and few answers. In addition to satisfying genealogical curiosity and interpreting a range of "wellness" genes, 23andMe can reveal whether you carry a genetic variant that could affect your child's future health and, since 2017, has even been authorized to report genetic health risks, including for breast cancer and Parkinson's disease.
Blockchain may not fare much better when it comes to leaving individuals in the dark, in front of the blue glow of their computer screens. Nebula and DNAtix are both evaluating how to integrate genetic counselors into their ecosystems, and Grishin has also proposed that users be able to "choose" whether they really want to "know everything" or just want "actionable" insights – that is, things that modern medicine can address.
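To make "computing on encrypted data" concrete, here is a toy version of the Paillier cryptosystem, which is additively homomorphic: multiplying ciphertexts adds the hidden values. It is a sketch under loud assumptions. Paillier is swapped in purely for illustration and is not claimed to be the scheme Nebula uses, the primes are far too small to be secure, and the genotype values and weights are invented. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets


def generate_keypair(p: int = 1789, q: int = 1931):
    """Toy Paillier keys: returns (public modulus n, private (lam, mu)). NOT secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid shortcut because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt an integer 0 <= m < n under public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, private_key, c: int) -> int:
    lam, mu = private_key
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n * mu) % n


def encrypted_weighted_sum(n: int, ciphertexts, weights) -> int:
    """Buyer-side step: combine ciphertexts without ever decrypting them."""
    n2 = n * n
    acc = 1  # 1 is a valid encryption of 0
    for c, w in zip(ciphertexts, weights):
        acc = (acc * pow(c, w, n2)) % n2  # raising to w multiplies the hidden value by w
    return acc
```

In this sketch a data owner would encrypt, say, risk-allele counts such as `[0, 1, 2, 1]`, a buyer would apply integer weights such as `[3, 1, 2, 5]` with `encrypted_weighted_sum`, and only the owner's private key could decrypt the resulting score of 10. The buyer never sees the individual genotype values.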
Blockchain and big pharma
Worldwide sales of prescription drugs are expected to reach $1.2 trillion by 2024. But closing the feedback loop between drugs and the millions of people who take their pills every day still faces significant obstacles.
Drug development relies on correlating and tracking the life cycle of medical studies, genetic tests, prescription side effects and long-term lifestyle effects; tokenization can incentivize individuals and businesses to share data generated across multiple streams. As Brama stressed:
"Lifestyle data comes from wearables, smartphones, smart homes, smart cities, shopping, business interactions, social media, etc. Another kind is carried by everyone, and that is our genome. The third is the clinical and health data generated in the health system."
Brama used the analogy of a deck of cards to explain how blockchain could be the key to starting to bring this data together, all the while protecting the anonymity of data owners.
An individual can hold an unlimited number of unique addresses in their digital wallet. Going into a pharmacy to buy a particular drug – for example, vitamin C, stamped with a QR code – would generate a transaction to one of these addresses. A visit to a family doctor could generate a further hash for a diagnosis in the electronic medical record (EMR) – for example, a runny nose. This transaction is between the caregiver and another address in the wallet.
A user might choose to put the correlation between transactions across different wallet addresses on the blockchain and make it public for people who bid on the underlying data. Or they could keep the correlation off-chain and send proof only when, for example, an insurance company or research institute advertises for users who have a particular set of transactions:
"You hold the deck of cards. You decide whether you tell or do not tell, and you can put them on the table and let everyone see, or you can show in private that you actually hold them … [we] really leave the choice and the implementation to you."
Professor Church has drawn an analogy that will probably ring a bell for anyone immersed in the crypto and blockchain space, saying that "right now, genome sequencing is like the internet in the late 1980s. It was there, but nobody was using it."
Blockchain and the vanguard of genomics research have perhaps never been closer to each other. Now that the DNA in our cells is understood as a lifeline of information, a new and revolutionary technology is needed to safely and flexibly manage the interdependent networks of the body's code.
The advent of genomics raises questions that cannot be solved by science alone. For all our interviewees, blockchain could just be the key to creating fair and transparent means of ownership and circulation that will ensure these raw helices of biomaterial do not get out of control.