News, Analysis, Trends, Management Innovations for
Clinical Laboratories and Pathology Groups

Hosted by Robert Michel


Google DeepMind Says Its New Artificial Intelligence Tool Can Predict Which Genetic Variants Are Likely to Cause Disease

Genetic engineers at the lab used the new tool to generate a catalog of 71 million possible missense variants, classifying 89% as either benign or pathogenic

Genetic engineers continue to use artificial intelligence (AI) and deep learning to develop research tools that have implications for clinical laboratories. The latest development involves Google’s DeepMind artificial intelligence lab, which has created an AI tool that, its engineers say, can predict whether a single-letter substitution in DNA, known as a missense variant (or missense mutation), is likely to cause disease.

The Google engineers used their new model—dubbed AlphaMissense—to generate a catalog of 71 million possible missense variants. They were able to classify 89% as likely to be either benign or pathogenic mutations. That compares with just 0.1% that have been classified using conventional methods, according to the DeepMind engineers.

This is yet another example of how Google is investing to develop solutions for healthcare and medical care. In this case, DeepMind might find genetic sequences that are associated with disease or health conditions. In turn, these genetic sequences could eventually become biomarkers that clinical laboratories could use to help physicians make earlier, more accurate diagnoses and allow faster interventions that improve patient care.

The Google engineers published their findings in the journal Science in a paper titled, “Accurate Proteome-wide Missense Variant Effect Prediction with AlphaMissense.” They also released the catalog of predictions online for use by other researchers.


“AI tools that can accurately predict the effect of variants have the power to accelerate research across fields from molecular biology to clinical and statistical genetics,” wrote Google DeepMind engineers Jun Cheng, PhD (left), and Žiga Avsec, PhD (right), in a blog post describing the new tool. Clinical laboratories benefit from the diagnostic biomarkers generated by this type of research. (Photo copyrights: LinkedIn.)

AI’s Effect on Genetic Research

Genetic experiments to identify which mutations cause disease are both costly and time-consuming, Google DeepMind engineers Jun Cheng, PhD, and Žiga Avsec, PhD, wrote in a blog post. However, artificial intelligence sped up that process considerably.

“By using AI predictions, researchers can get a preview of results for thousands of proteins at a time, which can help to prioritize resources and accelerate more complex studies,” they noted.

Of all possible 71 million variants, approximately 6%, or four million, have already been seen in humans, they wrote, noting that the average person carries more than 9,000. Most are benign, “but others are pathogenic and can severely disrupt protein function,” causing diseases such as cystic fibrosis, sickle-cell anemia, and cancer.

“A missense variant is a single letter substitution in DNA that results in a different amino acid within a protein,” Cheng and Avsec wrote in the blog post. “If you think of DNA as a language, switching one letter can change a word and alter the meaning of a sentence altogether. In this case, a substitution changes which amino acid is translated, which can affect the function of a protein.”
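The substitution Cheng and Avsec describe can be sketched in a few lines of Python. This is an illustration only, not part of AlphaMissense: the function name `classify_substitution` and the three-entry codon table are assumptions made for the example, and the codons shown (GAG changing to GTG, glutamic acid to valine) correspond to the well-known sickle-cell substitution in the beta-globin gene.

```python
# Partial codon table (assumption for illustration): only the codons used below.
CODON_TO_AMINO_ACID = {
    "GAG": "Glu",  # glutamic acid
    "GAA": "Glu",  # glutamic acid, a synonymous codon
    "GTG": "Val",  # valine
}


def classify_substitution(codon, position, new_base):
    """Swap one DNA letter in a codon and report the effect on the amino acid.

    Returns (mutated_codon, amino_acid_before, amino_acid_after, kind), where
    kind is "missense" if the amino acid changes and "synonymous" otherwise.
    Stop codons are not covered by this toy table.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    before = CODON_TO_AMINO_ACID[codon]
    after = CODON_TO_AMINO_ACID[mutated]
    kind = "missense" if before != after else "synonymous"
    return mutated, before, after, kind


# One letter changes the amino acid: a missense variant.
print(classify_substitution("GAG", 1, "T"))
# One letter changes, but the amino acid stays the same: not a missense variant.
print(classify_substitution("GAG", 2, "A"))
```

The second call shows why not every single-letter change is a missense variant: some substitutions still encode the same amino acid.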

In the Google DeepMind study, AlphaMissense predicted that 57% of the 71 million variants are “likely benign,” 32% are “likely pathogenic,” and 11% are “uncertain.”
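As a quick check on the arithmetic, the percentages above can be converted into approximate variant counts. The sketch below uses only the rounded figures reported in this article, so the counts are estimates rather than the exact totals in the DeepMind catalog; the names `approximate_counts` and `PERCENT_BY_LABEL` are assumptions made for the example.

```python
TOTAL_VARIANTS = 71_000_000  # rounded total reported in the article

# Rounded percentages reported in the article.
PERCENT_BY_LABEL = {
    "likely benign": 57,
    "likely pathogenic": 32,
    "uncertain": 11,
}


def approximate_counts(total, percent_by_label):
    """Convert whole-number percentages into approximate variant counts."""
    return {label: total * pct // 100 for label, pct in percent_by_label.items()}


counts = approximate_counts(TOTAL_VARIANTS, PERCENT_BY_LABEL)

# Variants given a definite call: benign plus pathogenic.
classified_percent = (
    PERCENT_BY_LABEL["likely benign"] + PERCENT_BY_LABEL["likely pathogenic"]
)

for label, count in counts.items():
    print(f"{label}: about {count:,}")
print(f"classified as benign or pathogenic: {classified_percent}%")
```

The benign and pathogenic shares sum to 89%, which matches the classified share the engineers reported.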

The AlphaMissense model is adapted from an earlier model called AlphaFold, which uses amino acid sequences to predict the structure of proteins.

“AlphaMissense was fed data on DNA from humans and closely related primates to learn which missense mutations are common, and therefore probably benign, and which are rare and potentially harmful,” The Guardian reported. “At the same time, the program familiarized itself with the ‘language’ of proteins by studying millions of protein sequences and learning what a ‘healthy’ protein looks like.”

The model assigned each variant a score between 0 and 1 to rate the likelihood of pathogenicity [the potential for a variant to cause disease]. “The continuous score allows users to choose a threshold for classifying variants as pathogenic or benign that matches their accuracy requirements,” Avsec and Cheng wrote in their blog post.
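To make the idea of a user-chosen threshold concrete, here is a minimal sketch of how a continuous score might be turned into a label. The cutoff values of 0.3 and 0.7 are hypothetical placeholders chosen for the example, not the thresholds DeepMind used, and the function name `classify_score` is likewise an assumption.

```python
def classify_score(score, benign_below=0.3, pathogenic_above=0.7):
    """Map a pathogenicity score in [0, 1] to a label.

    The default cutoffs are hypothetical placeholders, not DeepMind's values.
    Scores between the two cutoffs (inclusive) are labeled "uncertain".
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if benign_below > pathogenic_above:
        raise ValueError("benign_below must not exceed pathogenic_above")
    if score < benign_below:
        return "likely benign"
    if score > pathogenic_above:
        return "likely pathogenic"
    return "uncertain"


print(classify_score(0.10))  # below the benign cutoff
print(classify_score(0.50))  # between the two cutoffs
print(classify_score(0.75))  # above the default pathogenic cutoff
# A user who needs higher confidence can raise the pathogenic cutoff.
print(classify_score(0.75, pathogenic_above=0.9))
```

Raising the pathogenic cutoff moves borderline variants into the uncertain group, which is the trade-off between coverage and accuracy that the engineers describe.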

However, they also acknowledged that the score doesn’t indicate exactly how a variant causes disease.

The engineers cautioned that the predictions in the catalog are not intended for clinical use. Instead, they “should be interpreted with other sources of evidence.” However, “this work has the potential to improve the diagnosis of rare genetic disorders, and help discover new disease-causing genes,” they noted.

Genomics England Sees a Helpful Tool

BBC noted that AlphaMissense has been tested by Genomics England, which works with the UK’s National Health Service. “The new tool is really bringing a new perspective to the data,” Ellen Thomas, PhD, Genomics England’s Deputy Chief Medical Officer, told the BBC. “It will help clinical scientists make sense of genetic data so that it is useful for patients and for their clinical teams.”

AlphaMissense is “a big step forward,” Ewan Birney, PhD, Deputy Director General of the European Molecular Biology Laboratory (EMBL) told the BBC. “It will help clinical researchers prioritize where to look to find areas that could cause disease.”

However, other experts who spoke with MIT Technology Review were less enthusiastic.

“DeepMind is being DeepMind,” Insilico Medicine founder/CEO Alex Zhavoronkov, PhD, told the MIT publication. “Amazing on PR and good work on AI.”

Heidi Rehm, PhD, co-director of the Program in Medical and Population Genetics at the Broad Institute, suggested that the DeepMind engineers overstated the certainty of the model’s predictions. She told the publication that she was “disappointed” that they labeled the variants as benign or pathogenic.

“The models are improving, but none are perfect, and they still don’t get you to pathogenic or not,” she said.

“Typically, experts don’t declare a mutation pathogenic until they have real-world data from patients, evidence of inheritance patterns in families, and lab tests—information that’s shared through public websites of variants such as ClinVar,” the MIT article noted.

Is AlphaMissense a Biosecurity Risk?

Although DeepMind has released its catalog of variations, MIT Technology Review notes that the lab isn’t releasing the entire AI model due to what it describes as a “biosecurity risk.”

The concern is that “bad actors” could try using it on non-human species, DeepMind said. But one anonymous expert described the restrictions “as a transparent effort to stop others from quickly deploying the model for their own uses,” the MIT article noted.

And so, genetics research takes a huge step forward thanks to Google DeepMind, artificial intelligence, and deep learning. Clinical laboratories and pathologists may soon have useful new tools that help healthcare providers diagnose diseases. Time will tell. But the developments are certainly worth watching.

—Stephen Beale

Related Information:

AlphaFold Is Accelerating Research in Nearly Every Field of Biology

A Catalogue of Genetic Mutations to Help Pinpoint the Cause of Diseases

Accurate Proteome-wide Missense Variant Effect Prediction with AlphaMissense

Google DeepMind AI Speeds Up Search for Disease Genes

DeepMind Is Using AI to Pinpoint the Causes of Genetic Disease

DeepMind’s New AI Can Predict Genetic Diseases

Proteomics-based Clinical Laboratory Testing May Get a Major Boost as Google’s DeepMind Research Lab Is Making Public Its Entire AI Database of Human Protein Predictions

DeepMind hopes its unrivaled collection of data, enabled by artificial intelligence, may advance development of precision medicines, new medical laboratory tests, and therapeutic treatments

’Tis the season for giving, and one United Kingdom-based artificial intelligence (AI) research laboratory is making a sizeable gift. After using AI and machine learning to create “the most comprehensive map of human proteins” in existence, DeepMind, a subsidiary of Alphabet Inc. (NASDAQ:GOOGL), parent company of Google, plans to give away for free its database of millions of protein structure predictions to the global scientific community and to all of humanity, The Verge reported.

Pathologists and clinical laboratory scientists developing proteomic assays understand the significance of this gesture. They know how difficult and expensive it is to determine a protein’s structure from its sequence of amino acids. That’s because interactions among the various types of amino acids cause the [amino acid] string to “fold.” Thus, the availability of this data may accelerate the development of more diagnostic tests based on proteomics.

“For decades, scientists have been trying to find a method to reliably determine a protein’s structure just from its sequence of amino acids. Attraction and repulsion between the 20 different types of amino acids cause the string to fold in a feat of ‘spontaneous origami,’ forming the intricate curls, loops, and pleats of a protein’s 3D structure. This grand scientific challenge is known as the protein-folding problem,” a DeepMind statement noted.

Enter DeepMind’s AlphaFold AI platform to help iron things out. “Experimental techniques for determining structures are painstakingly laborious and time consuming (sometimes taking years and millions of dollars). Our latest version [of AlphaFold] can now predict the shape of a protein, at scale and in minutes, down to atomic accuracy. This is a significant breakthrough and highlights the impact AI can have on science,” DeepMind stated.

Release of Data Will Be ‘Transformative’

In July, DeepMind announced it would begin releasing data from its AlphaFold Protein Structure Database, which contains “predictions for the structure of some 350,000 proteins across 20 different organisms,” The Verge reported, adding, “Most significantly, the release includes predictions for 98% of all human proteins, around 20,000 different structures, which are collectively known as the human proteome. By the end of the year, DeepMind hopes to release predictions for 100 million protein structures.”

According to Edith Heard, PhD, Director General of the European Molecular Biology Laboratory (EMBL), the open release of such a dataset will be “transformative for our understanding of how life works,” The Verge reported.  


“I see this as the culmination of the entire 10-year-plus lifetime of DeepMind,” company CEO and co-founder Demis Hassabis (above), told The Verge. “From the beginning, this is what we set out to do: to make breakthroughs in AI, test that on games like Go and Atari, [and] apply that to real-world problems, to see if we can accelerate scientific breakthroughs and use those to benefit humanity.” The release of DeepMind’s entire protein prediction database will certainly do that. Clinical laboratory scientists worldwide will have free access to use it in developing new precision medicine treatments based on proteomics. (Photo copyright: BBC.)

Free Data about Proteins Will Accelerate Research on Diseases, Treatments

Research into how proteins fold and, thereby, function could have implications for fighting diseases and developing new medicines, according to DeepMind.

“This will be one of the most important datasets since the mapping of the human genome,” said Ewan Birney, PhD, Deputy Director General of the EMBL, in the DeepMind statement. EMBL worked with DeepMind on the dataset.

DeepMind protein prediction data are already being used by scientists in medical research. “Anyone can use it for anything. They just need to credit the people involved in the citation,” Demis Hassabis, DeepMind CEO and co-founder, told The Verge.

In a blog article, Hassabis listed several projects and organizations already using AlphaFold. They include:

“As researchers seek cures for diseases and pursue solutions to other big problems facing humankind—including antibiotic resistance, microplastic pollution, and climate change—they will benefit from fresh insights in the structure of proteins,” Hassabis wrote.

Because of the deep financial backing that Alphabet/Google can offer, it is reasonable to predict that DeepMind will continue to improve the capabilities and accuracy of its AI technology, allowing AlphaFold to be effective for many uses.

This will be particularly true for the development of new diagnostic assays that will give clinical laboratories better tools for diagnosing disease earlier and more accurately.

—Donna Marie Pocius

Related Information:

DeepMind Creates ‘Transformative’ Map of Human Proteins Drawn by Artificial Intelligence

AlphaFold Can Accurately Predict 3D Models of Protein Structures and Has the Potential to Accelerate Research in Every Field of Biology

Putting the Power of AlphaFold into the World’s Hands

Highly Accurate Protein Structure Prediction with AlphaFold
