An artificial intelligence (AI) network developed by Google’s DeepMind has taken a leap forward in solving one of biology’s greatest challenges: determining the 3D shape of a protein from its amino acid sequence.
DeepMind’s program, called AlphaFold, outperformed nearly 100 other teams in a biennial protein-structure prediction challenge called CASP (Critical Assessment of Structure Prediction). The results were announced on November 30, at the start of the conference, held online this year, that takes stock of the exercise, according to Nature.
“This was very important,” said John Moult, a computational biologist at the University of Maryland at College Park, who co-founded CASP in 1994 to improve computational methods for accurately predicting protein structures. “In a way, the problem is solved.”
The ability to accurately predict protein structures from their amino acid sequence would be of great benefit to life sciences and medicine. This would greatly accelerate efforts to understand the “building blocks” that make up cells and enable faster and more advanced drug discovery.
AlphaFold took the top spot at the last CASP, in 2018, the first year that London-based DeepMind participated. But this year, the company’s deep-learning network far outperformed those of other teams and, scientists say, did so well that it could herald a revolution in biology.
“It’s a game changer,” says Andrei Lupas, an evolutionary biologist at the Max Planck Institute for Developmental Biology in Tübingen, Germany, who assessed the performance of the different teams at CASP. AlphaFold has already helped him find the structure of a protein that has vexed his lab for a decade, and he expects it to change how he works and the questions he tackles. “This will change medicine. This will change research. This will change bioengineering. Everything will change,” adds Lupas.
In some cases, AlphaFold’s structure predictions were indistinguishable from those determined using “gold standard” experimental methods, such as X-ray crystallography and, in recent years, cryo-electron microscopy. AlphaFold might not remove the need for these laborious and expensive methods yet, scientists say, but the AI will make it possible to study living things in new ways.
The structure problem
Proteins are the “building blocks” of life, responsible for most of what happens inside cells. How a protein works and what it does are determined by its 3D shape; “structure is function” is an axiom of molecular biology. Proteins tend to adopt their shape unaided, guided only by the laws of physics.
For decades, laboratory experiments have been the primary way to obtain good protein structures. The first complete protein structures were determined, starting in the 1950s, using a technique in which X-ray beams are fired at crystallized proteins and the diffracted light is translated into the protein’s atomic coordinates. X-ray crystallography has produced most known protein structures. But over the past decade, cryo-electron microscopy has become the preferred tool of many structural biology labs.
Scientists have long wondered how the constituent parts of a protein, a chain of amino acids, map out the many twists and folds of its final shape. Early attempts to use computers to predict protein structures in the 1980s and 1990s failed, researchers say.
Moult started CASP to bring more rigor to these efforts. The event challenges teams to predict the structures of proteins that have been solved using experimental methods, but whose structures have not yet been made public. Moult credits the experiment (he does not call it a competition) with greatly improving the field.
DeepMind’s 2018 performance at CASP13 surprised many scientists in the field, which has long been the stronghold of small academic groups. But its approach was broadly similar to that of other teams applying AI, says Jinbo Xu, a computational biologist at the University of Chicago, Illinois.
The first iteration of AlphaFold applied the AI method known as deep learning to structural and genetic data to predict the distances between pairs of amino acids in a protein. In a second step that does not use AI, AlphaFold used this information to arrive at a “consensus” model of what the protein should look like, says John Jumper of DeepMind, who leads the project.
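As a rough illustration of the quantity that first version was trained to predict, the sketch below computes the distance between every pair of residues from a set of known 3D coordinates. It is not DeepMind’s code: the function name `pairwise_distances` and the toy coordinates are hypothetical, and a real predictor would estimate these distances from sequence data rather than read them off a solved structure.

```python
import math


def pairwise_distances(coords):
    """Return a symmetric matrix of Euclidean distances between residues.

    `coords` is a list of (x, y, z) tuples, one per residue, in angstroms.
    """
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


# Hypothetical toy coordinates for a four-residue chain (not real protein data).
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
]

distances = pairwise_distances(toy_coords)
for row in distances:
    print(" ".join(f"{d:5.2f}" for d in row))
```

A matrix like this is the kind of intermediate result from which a 3D model can then be assembled in a separate step.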
The team tried to build on this approach, but eventually it led nowhere. So it changed course, Jumper says, and developed an AI network that incorporated additional information about the physical and geometric constraints that determine how a protein folds. It also set the network a more difficult task: instead of predicting the relationships between amino acids, the network predicts the final structure of a target protein sequence. “It’s a little bit more complex,” Jumper says.
Incredible accuracy
CASP lasts for several months. Target proteins or portions of proteins called domains – about 100 in total – are released regularly, and teams have several weeks to submit their predictions about the structure. A team of independent scientists then evaluates the predictions using metrics that measure how similar a predicted protein is to the experimentally determined structure. The evaluators don’t know who is making a prediction.
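One widely used CASP measure of this kind is the Global Distance Test score (GDT_TS), which runs from 0 to 100. The sketch below is a simplified version: it assumes the predicted and experimental structures have already been superimposed, whereas the full metric also searches for the best superposition. The function name `gdt_ts_like` and the coordinates are hypothetical.

```python
import math


def gdt_ts_like(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average percentage of residues lying within each distance cutoff.

    Both arguments are equal-length lists of (x, y, z) tuples in angstroms.
    Assumes the structures are already superimposed (a simplification).
    """
    if len(predicted) != len(experimental):
        raise ValueError("structures must have the same number of residues")
    if not predicted:
        raise ValueError("structures must contain at least one residue")

    # Per-residue deviation between the predicted and experimental positions.
    deviations = [math.dist(p, e) for p, e in zip(predicted, experimental)]

    # Fraction of residues within each cutoff, then the mean across cutoffs.
    fractions = [
        sum(d <= cutoff for d in deviations) / len(deviations)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical toy structures: deviations of 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]

print(gdt_ts_like(predicted, experimental))
```

A perfect prediction scores 100 under this scheme, and a single badly placed residue lowers the score at every cutoff it misses.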
AlphaFold’s predictions were submitted under the name “group 427,” but the surprising accuracy of many of its entries made them stand out, Lupas says. “I assumed it was AlphaFold. Most people had,” he says.
Some predictions were better than others, but nearly two-thirds were comparable in quality to experimental structures. In some cases, Moult says, it wasn’t clear whether the discrepancy between AlphaFold’s predictions and the experimental result was a prediction error or an artifact (an “error”) of the experiment.
AlphaFold’s predictions were poor matches for experimental structures determined by a technique called nuclear magnetic resonance spectroscopy, but this could be due to the way raw data is converted into a model, Moult says. The network also struggles to model individual structures in protein complexes, or groups, in which interactions with other proteins distort their shapes.
Overall, teams predicted structures more accurately this year than at the last CASP, but much of the progress can be attributed to AlphaFold, Moult says. On protein targets considered moderately difficult, the best performances from other teams typically scored 75 on a 100-point prediction accuracy scale, while AlphaFold scored about 90 on the same targets, says Moult.
About half of the teams mentioned “deep learning” in the abstract summarizing their approach, Moult says, suggesting that AI is having a broad impact on the field. Most of these were academic teams, but Microsoft and the Chinese technology company Tencent also entered CASP14.
Mohammed AlQuraishi, a computational biologist at Columbia University in New York and a CASP participant, is eager to dig into the details of AlphaFold’s performance in the competition and learn more about how the system works when the DeepMind team presents its approach on December 1. It’s possible, though unlikely, he says, that an easier-than-usual set of protein targets contributed to the performance. AlQuraishi’s strong hunch is that AlphaFold will be transformative.
“I think it’s fair to say that this will be very disruptive to the field of protein structure prediction. I suspect many will leave the field because the core problem has arguably been solved,” he says. “It’s a first-rate advance, certainly one of the most significant scientific achievements of my lifetime.”
Faster structures
An AlphaFold prediction helped determine the structure of a bacterial protein that Lupas’s lab has been trying to crack for years. Lupas’s team had previously collected raw X-ray diffraction data, but turning these patterns into a structure requires some information about the shape of the protein. Tricks for obtaining this information, as well as other prediction tools, had failed. “The model from group 427 gave us our structure in half an hour, after we had spent a decade trying everything,” says Lupas.
Demis Hassabis, co-founder and CEO of DeepMind, says the company intends to make AlphaFold useful for other scientists. (It had previously published enough details on the first version of AlphaFold to allow other scientists to replicate the approach.) AlphaFold can take days to arrive at a predicted structure, which includes estimates of the reliability of different regions of the protein. “We’re just starting to understand what biologists want,” adds Hassabis, who sees drug discovery and protein design as potential applications.
In early 2020, the company published predictions of the structures of a handful of SARS-CoV-2 proteins that had not yet been determined experimentally. DeepMind’s prediction for a protein called Orf3a turned out to be very similar to the structure later determined using cryo-electron microscopy, says Stephen Brohawn, a molecular neurobiologist at the University of California, Berkeley, whose team released the structure in June. “What they have managed to do is really impressive,” he adds.
Real-world impact
AlphaFold is unlikely to shut down laboratories, such as Brohawn’s, that use experimental methods to solve protein structures. But it could mean that lower-quality, easier-to-collect experimental data is all that is needed to get a good structure. Some applications, such as the evolutionary analysis of proteins, are set to thrive because the tsunami of available genomic data can now be reliably translated into structures. “This will allow a new generation of molecular biologists to ask more advanced questions,” says Lupas. “It will require more thinking and less pipetting.”
“This is a problem that I was starting to think would not be solved in my lifetime,” says Janet Thornton, a structural biologist at the European Molecular Biology Laboratory-European Bioinformatics Institute in Hinxton, UK, and a former CASP consultant. She hopes the approach will help reveal the function of the thousands of unsolved proteins in the human genome and make sense of the disease-causing gene variations that differ between people.
AlphaFold’s performance also marks a turning point for DeepMind. The company is best known for using AI to master games like Go and chess, but its long-term goal is to develop programs capable of broad, human-like intelligence. Tackling major scientific challenges, such as protein structure prediction, is one of the most important applications its AI can have, says Hassabis. “I think it’s the most significant thing we’ve done, in terms of its impact in the real world.” [Nature]