DeepMind accurately predicts the structure of proteins, advancing a decades-old challenge




Two examples of protein targets in the free-modeling category, comparing AlphaFold’s predictions with the protein shapes determined experimentally. AlphaFold’s predictions are in blue and the experimental results are in green. Screenshot from DeepMind.

DeepMind, the Google subsidiary that beat chess and Go players with artificial intelligence, set out to solve a decades-old problem: predicting the structures of proteins.

In a biennial challenge in which participants must blindly predict the structures of 100 proteins from their amino acid sequences, a system developed by DeepMind caught researchers' attention by predicting the proteins' shapes with a high level of accuracy.

Called AlphaFold, the system determined the shape of about two-thirds of the proteins with an accuracy comparable to that of lengthy laboratory experiments. Its accuracy on most of the other proteins was also high, according to findings shared Monday by CASP (the Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction). The predictions were compared with the protein shapes determined in the laboratory and were evaluated by independent scientists.

The breakthrough matters because a protein's shape is closely tied to its function, yet that shape is hard to predict from the amino acid sequence alone. Proteins can theoretically fold into a multitude of shapes before assuming their final structure. It can take years of research and expensive equipment to work out their shape.

“Proteins are extremely complicated molecules and their precise three-dimensional structure is critical for the many roles they play, such as insulin which regulates our blood sugar levels and antibodies which help us fight infections. Even small rearrangements of these vital molecules can have catastrophic effects on our health, so one of the most effective ways to understand the disease and find new treatments is to study the proteins involved,” John Moult, a computational biologist at the University of Maryland in College Park who co-founded CASP, said in a press release.

London-based DeepMind has been working on AlphaFold for four years. The system also beat the other teams in the previous CASP challenge in 2018, but it did so by a much greater margin this year.

The accuracy of the model is measured using the Global Distance Test (GDT), which roughly measures the percentage of amino acid residues that lie within a certain distance of their correct location. On a scale of 0 to 100, DeepMind’s latest AlphaFold system achieved a median score of 92.4 across all targets.
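To make the metric concrete, here is a minimal Python sketch of a GDT-style score. It is an illustration under stated assumptions, not the code CASP uses: it assumes the predicted and experimental structures are already superimposed (the real test searches over many superpositions), it uses one coordinate per residue, and it averages over the conventional 1, 2, 4 and 8 angstrom cutoffs. The function name `gdt_ts` and the sample coordinates are hypothetical.

```python
import math

def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-style score on a 0-100 scale.

    Assumes both coordinate lists are already superimposed and hold one
    (x, y, z) point per residue, in the same residue order.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Distance between each predicted residue and its experimental position.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues that fall within each distance cutoff (angstroms).
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    # Average the fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)

# Hypothetical four-residue example: deviations of 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
score = gdt_ts(predicted, experimental)
```

A perfect prediction scores 100 in this sketch, and a single badly misplaced residue lowers the score only in proportion to its share of the chain.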

For the latest iteration of AlphaFold, DeepMind designed a neural network that interprets the structure of a protein as a “spatial graph”. The company trained the system on about 170,000 protein structures from the Protein Data Bank, as well as on databases of protein sequences whose structures are unknown.

This allowed the system to determine structures in a matter of days, the team that developed it wrote in a blog post. An internal confidence measure also indicates which parts of each predicted protein structure are reliable.
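As a rough illustration of the “spatial graph” idea mentioned above, the Python sketch below treats each residue as a node and links two residues when they sit close together in space. This is only a toy data structure, not AlphaFold's network: the function name `spatial_graph`, the 8 angstrom cutoff, and the sample coordinates are all assumptions made for the example.

```python
import math

def spatial_graph(coords, cutoff=8.0):
    """Build an adjacency map linking residues that lie within `cutoff` angstroms.

    `coords` holds one (x, y, z) point per residue; the cutoff is an
    illustrative choice, not a value taken from DeepMind.
    """
    graph = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Edges are undirected, so record both directions.
                graph[i].add(j)
                graph[j].add(i)
    return graph

# Hypothetical chain: three residues close together and one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
graph = spatial_graph(coords)
```

In this example the first three residues are all connected to one another, while the distant fourth residue has no neighbors.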

What does this all mean? It could have broad implications for drug discovery and for understanding specific diseases. Andrei Lupas, director of the Max Planck Institute for Developmental Biology and a CASP evaluator, said the system helped his team solve a protein structure they had been stuck on for nearly a decade.

Andriy Kryshtafovych, a researcher at UC Davis and one of the judges, described the result as a “triumph for team science”, crediting years of collaborative work by researchers for the achievement.

“Being able to investigate the shape of proteins quickly and accurately has the potential to revolutionize the life sciences,” he said in a news release. “Now that the problem has been largely solved for individual proteins, the way is open for the development of new methods for determining the shape of protein complexes – collections of proteins that work together to form much of the mechanism of life and other applications.”

