DeepMind is poised to transform life sciences by solving the problem of protein folding




Google’s DeepMind AI division recently made significant progress toward solving one of the oldest challenges in biology by calculating the shape of a protein from a sequence of amino acids. According to Nature, the breakthrough has the potential to transform the fields of biology and chemistry, allowing scientists to determine the function of many currently mysterious proteins.

The shape of a protein defines its function, and most biological functions depend on proteins. “Protein folding” is the name given to the process that converts amino acid chains into the three-dimensional structures proteins need to perform their functions. If scientists can determine the relationship between amino acid sequences and the shapes of the proteins they generate, they can work out which proteins influence which biological processes.

Scientists estimate that there are at least 80,000 proteins within the human proteome, but only a small fraction of these proteins have known structures. Traditional methods of determining a protein’s shape can take years of laboratory experiments, even with the help of computer algorithms and models. DeepMind’s work could greatly accelerate the discovery of protein structures by reliably determining them in a fraction of the usual time.

DeepMind researchers trained their algorithms on a database of approximately 170,000 protein sequences and the shapes corresponding to those sequences. Training used between 100 and 200 GPUs and took a few weeks to complete. The resulting model was dubbed “AlphaFold”.

AlphaFold works through an “attention algorithm”, starting by connecting small pieces of the protein together and then scaling up to connect larger and larger sections. Small clusters of amino acids are connected first, and the algorithm then looks for ways to join these clusters together.

The AlphaFold researchers initially tried using conventional deep learning algorithms on genetic and structural data to predict the relationship between amino acids and proteins, and then generated consensus models of the protein’s shape. When this technique proved to have too many limitations, the team tried a new strategy: they trained models on multiple features and, this time, had the model return predictions for the final structure of the protein sequences.

The engineering team tested AlphaFold by entering it in a competition in which computer algorithms compete to predict the structure of a protein from its amino acid sequence. The competition is the “Critical Assessment of Protein Structure Prediction”, or CASP. Contest participants are given about 100 amino acid sequences, and their models must work out the structure of the proteins. AlphaFold not only beat other computer models in terms of accuracy, but it also performed comparably to traditional laboratory-based techniques. The final median AlphaFold score was approximately 92 out of 100, where a score of around 90 is considered comparable to experimental laboratory methods. On the more difficult proteins, the median AlphaFold score dropped to 87.
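
The article does not name the metric behind these scores. CASP's main accuracy measure is the Global Distance Test (GDT_TS), which runs from 0 to 100. As a rough illustration of how such a score can be computed, here is a minimal Python sketch. It assumes the predicted and experimental structures have already been superimposed and that we have the per-residue distances between them in ångströms. The real GDT_TS also searches over many superpositions, which this sketch omits. The function names and sample numbers are hypothetical, not from DeepMind or CASP.

```python
from statistics import median

# Distance cutoffs (in angstroms) used by GDT_TS.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(distances):
    """Simplified GDT_TS for one protein.

    `distances` holds, for each residue, the distance in angstroms
    between its predicted and experimental position after a single
    fixed superposition (an assumption of this sketch).
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues that fall within each cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in CUTOFFS]
    # The score is the mean of those fractions, scaled to 0-100.
    return 100.0 * sum(fractions) / len(CUTOFFS)


def median_score(targets):
    """Median simplified GDT_TS over several proteins."""
    return median(gdt_ts(distances) for distances in targets)


if __name__ == "__main__":
    # Made-up distances for three small hypothetical targets.
    targets = [
        [0.4, 0.7, 0.9, 1.8],
        [0.5, 1.5, 3.0, 6.0, 10.0],
        [0.3, 0.6, 0.8, 0.9],
    ]
    print(round(median_score(targets), 1))
```

On this kind of scale, a perfect prediction scores 100, and a prediction with every residue more than 8 Å off scores 0.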

According to DeepMind CEO and co-founder Demis Hassabis, the company is already planning to give researchers access to AlphaFold, and scientists from the Max Planck Institute for Developmental Biology are already using the model to work out protein structures that they have worked on for over a decade.

Janet Thornton, director emeritus of the European Bioinformatics Institute, was quoted in Science as saying that DeepMind’s findings “will change the future of structural biology and protein research.” Meanwhile, John Moult, a biologist at the University of Maryland, Shady Grove, says he never thought the protein folding problem would be solved in his lifetime.

While it is highly unlikely that AlphaFold will completely replace traditional experimental methods for discovering protein structures, it could greatly increase the rate at which protein structures are discovered. Researchers may need only lower-quality experimental data to determine a protein structure, and they already have access to a large volume of genomic data that could be translated into structures using AlphaFold’s predictions.
