Google DeepMind Says Its New Artificial Intelligence Tool Can Predict Which Genetic Variants Are Likely to Cause Disease
Genetic engineers at the lab used the new tool to generate a catalog of 71 million possible missense variants, classifying 89% as either benign or pathogenic
Genetic engineers continue to use artificial intelligence (AI) and deep learning to develop research tools that have implications for clinical laboratories. The latest development involves Google’s DeepMind artificial intelligence lab which has created an AI tool that, they say, can predict whether a single-letter substitution in DNA—known as a missense variant (aka, missense mutation)—is likely to cause disease.
The Google engineers used their new model—dubbed AlphaMissense—to generate a catalog of 71 million possible missense variants. They were able to classify 89% as likely to be either benign or pathogenic mutations. That compares with just 0.1% that have been classified using conventional methods, according to the DeepMind engineers.
This is yet another example of how Google is investing to develop solutions for healthcare and medical care. In this case, DeepMind might find genetic sequences that are associated with disease or health conditions. In turn, these genetic sequences could eventually become biomarkers that clinical laboratories could use to help physicians make earlier, more accurate diagnoses and allow faster interventions that improve patient care.
The Google engineers published their findings in the journal Science titled, “Accurate Proteome-wide Missense Variant Effect Prediction with AlphaMissense.” They also released the catalog of predictions online for use by other researchers.
“AI tools that can accurately predict the effect of variants have the power to accelerate research across fields from molecular biology to clinical and statistical genetics,” wrote Google DeepMind engineers Jun Cheng, PhD (left), and Žiga Avsec, PhD (right), in a blog post describing the new tool. Clinical laboratories benefit from the diagnostic biomarkers generated by this type of research. (Photo copyrights: LinkedIn.)
AI’s Effect on Genetic Research
Genetic experiments to identify which mutations cause disease are both costly and time-consuming, Google DeepMind engineers Jun Cheng, PhD, and Žiga Avsec, PhD, wrote in a blog post. However, artificial intelligence sped up that process considerably.
“By using AI predictions, researchers can get a preview of results for thousands of proteins at a time, which can help to prioritize resources and accelerate more complex studies,” they noted.
Of all possible 71 million variants, approximately 6%, or four million, have already been seen in humans, they wrote, noting that the average person carries more than 9,000. Most are benign, “but others are pathogenic and can severely disrupt protein function,” causing diseases such as cystic fibrosis, sickle-cell anemia, and cancer.
“A missense variant is a single letter substitution in DNA that results in a different amino acid within a protein,” Cheng and Avsec wrote in the blog post. “If you think of DNA as a language, switching one letter can change a word and alter the meaning of a sentence altogether. In this case, a substitution changes which amino acid is translated, which can affect the function of a protein.”
In the Google DeepMind study, AlphaMissense predicted that 57% of the 71 million variants are “likely benign,” 32% are “likely pathogenic,” and 11% are “uncertain.”
The AlphaMissense model is adapted from an earlier model called AlphaFold which uses amino acid genetic sequences to predict the structure of proteins.
“AlphaMissense was fed data on DNA from humans and closely related primates to learn which missense mutations are common, and therefore probably benign, and which are rare and potentially harmful,” The Guardian reported. “At the same time, the program familiarized itself with the ‘language’ of proteins by studying millions of protein sequences and learning what a ‘healthy’ protein looks like.”
The model assigned each variant a score between 0 and 1 to rate the likelihood of pathogenicity [the potential for a pathogen to cause disease]. “The continuous score allows users to choose a threshold for classifying variants as pathogenic or benign that matches their accuracy requirements,” Avsec and Cheng wrote in their blog post.
However, they also acknowledged that it doesn’t indicate exactly how the variation causes disease.
The engineers cautioned that the predictions in the catalog are not intended for clinical use. Instead, they “should be interpreted with other sources of evidence.” However, “this work has the potential to improve the diagnosis of rare genetic disorders, and help discover new disease-causing genes,” they noted.
Genomics England Sees a Helpful Tool
BBC noted that AlphaMissense has been tested by Genomics England, which works with the UK’s National Health Service. “The new tool is really bringing a new perspective to the data,” Ellen Thomas, PhD, Genomics England’s Deputy Chief Medical Officer, told the BBC. “It will help clinical scientists make sense of genetic data so that it is useful for patients and for their clinical teams.”
AlphaMissense is “a big step forward,” Ewan Birney, PhD, Deputy Director General of the European Molecular Biology Laboratory (EMBL) told the BBC. “It will help clinical researchers prioritize where to look to find areas that could cause disease.”
Other experts, however, who spoke with MIT Technology Review were less enthusiastic.
“DeepMind is being DeepMind,” Insilico Medicine founder/CEO Alex Zhavoronkov, PhD, told the MIT publication. “Amazing on PR and good work on AI.”
Heidi Rehm, PhD, co-director of the Program in Medical and Population Genetics at the Broad Institute, suggested that the DeepMind engineers overstated the certainty of the model’s predictions. She told the publication that she was “disappointed” that they labeled the variants as benign or pathogenic.
“The models are improving, but none are perfect, and they still don’t get you to pathogenic or not,” she said.
“Typically, experts don’t declare a mutation pathogenic until they have real-world data from patients, evidence of inheritance patterns in families, and lab tests—information that’s shared through public websites of variants such as ClinVar,” the MIT article noted.
Is AlphaMissense a Biosecurity Risk?
Although DeepMind has released its catalog of variations, MIT Technology Review notes that the lab isn’t releasing the entire AI model due to what it describes as a “biosecurity risk.”
The concern is that “bad actors” could try using it on non-human species, DeepMind said. But one anonymous expert described the restrictions “as a transparent effort to stop others from quickly deploying the model for their own uses,” the MIT article noted.
And so, genetics research takes a huge step forward thanks to Google DeepMind, artificial intelligence, and deep learning. Clinical laboratories and pathologists may soon have useful new tools that help healthcare provider diagnose diseases. Time will tell. But the developments are certain worth watching.
—Stephen Beale
Related Information:
AlphaFold Is Accelerating Research in Nearly Every Field of Biology
A Catalogue of Genetic Mutations to Help Pinpoint the Cause of Diseases
Accurate Proteome-wide Missense Variant Effect Prediction with AlphaMissense
Google DeepMind AI Speeds Up Search for Disease Genes
DeepMind Is Using AI to Pinpoint the Causes of Genetic Disease