News, Analysis, Trends, Management Innovations for
Clinical Laboratories and Pathology Groups

Hosted by Robert Michel

Google DeepMind Says Its New Artificial Intelligence Tool Can Predict Which Genetic Variants Are Likely to Cause Disease

Genetic engineers at the lab used the new tool to generate a catalog of 71 million possible missense variants, classifying 89% as either benign or pathogenic

Genetic engineers continue to use artificial intelligence (AI) and deep learning to develop research tools that have implications for clinical laboratories. The latest development involves Google’s DeepMind artificial intelligence lab, which has created an AI tool that, its developers say, can predict whether a single-letter substitution in DNA, known as a missense variant (or missense mutation), is likely to cause disease.

The Google engineers used their new model—dubbed AlphaMissense—to generate a catalog of 71 million possible missense variants. They were able to classify 89% as likely to be either benign or pathogenic mutations. That compares with just 0.1% that have been classified using conventional methods, according to the DeepMind engineers.

This is yet another example of how Google is investing to develop solutions for healthcare and medical care. In this case, DeepMind might find genetic sequences that are associated with disease or health conditions. In turn, these genetic sequences could eventually become biomarkers that clinical laboratories could use to help physicians make earlier, more accurate diagnoses and allow faster interventions that improve patient care.

The Google engineers published their findings in the journal Science in a paper titled “Accurate Proteome-wide Missense Variant Effect Prediction with AlphaMissense.” They also released the catalog of predictions online for use by other researchers.

“AI tools that can accurately predict the effect of variants have the power to accelerate research across fields from molecular biology to clinical and statistical genetics,” wrote Google DeepMind engineers Jun Cheng, PhD (left), and Žiga Avsec, PhD (right), in a blog post describing the new tool. Clinical laboratories benefit from the diagnostic biomarkers generated by this type of research. (Photo copyrights: LinkedIn.)

AI’s Effect on Genetic Research

Genetic experiments to identify which mutations cause disease are both costly and time-consuming, Google DeepMind engineers Jun Cheng, PhD, and Žiga Avsec, PhD, wrote in a blog post. However, artificial intelligence sped up that process considerably.

“By using AI predictions, researchers can get a preview of results for thousands of proteins at a time, which can help to prioritize resources and accelerate more complex studies,” they noted.

Of the 71 million possible variants, approximately 6%, or four million, have already been seen in humans, they wrote, noting that the average person carries more than 9,000. Most are benign, “but others are pathogenic and can severely disrupt protein function,” causing diseases such as cystic fibrosis, sickle-cell anemia, and cancer.

“A missense variant is a single letter substitution in DNA that results in a different amino acid within a protein,” Cheng and Avsec wrote in the blog post. “If you think of DNA as a language, switching one letter can change a word and alter the meaning of a sentence altogether. In this case, a substitution changes which amino acid is translated, which can affect the function of a protein.”
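
The substitution described in the quote above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and uses the well-known sickle-cell change (codon GAG to GTG, glutamic acid to valine) as the example. The function name `classify_substitution` is hypothetical and is not part of AlphaMissense or any DeepMind tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Swap one base in a DNA codon and describe the effect on the protein.

    Hypothetical helper for illustration only. Returns a tuple of
    (mutated codon, original amino acid, new amino acid, kind).
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        kind = "synonymous"   # same amino acid, protein unchanged
    elif after == "*":
        kind = "nonsense"     # premature stop codon
    elif before == "*":
        kind = "stop-loss"    # stop codon replaced by an amino acid
    else:
        kind = "missense"     # a different amino acid is incorporated
    return mutated, before, after, kind


# Sickle-cell example: GAG (glutamic acid, E) becomes GTG (valine, V).
print(classify_substitution("GAG", 1, "T"))
```

The sketch only shows that an amino acid changes. Whether that change is benign or harmful is the separate question that a predictive model such as AlphaMissense tries to answer.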

In the Google DeepMind study, AlphaMissense predicted that 57% of the 71 million variants are “likely benign,” 32% are “likely pathogenic,” and 11% are “uncertain.”

The AlphaMissense model is adapted from an earlier model called AlphaFold, which uses amino acid sequences to predict the structure of proteins.

“AlphaMissense was fed data on DNA from humans and closely related primates to learn which missense mutations are common, and therefore probably benign, and which are rare and potentially harmful,” The Guardian reported. “At the same time, the program familiarized itself with the ‘language’ of proteins by studying millions of protein sequences and learning what a ‘healthy’ protein looks like.”

The model assigned each variant a score between 0 and 1 to rate the likelihood of pathogenicity [here, the likelihood that a variant causes disease]. “The continuous score allows users to choose a threshold for classifying variants as pathogenic or benign that matches their accuracy requirements,” Avsec and Cheng wrote in their blog post.

However, they also acknowledged that the score does not indicate exactly how a variant causes disease.
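
The idea of choosing a threshold on a continuous score can be sketched in a few lines of Python. This is a minimal illustration, not DeepMind’s implementation: the function name, the cutoffs of 0.3 and 0.7, and the sample scores are all hypothetical placeholders, not the published AlphaMissense cutoffs or real predictions.

```python
from collections import Counter


def classify_variant(score, benign_max=0.3, pathogenic_min=0.7):
    """Map a pathogenicity score in [0, 1] to a three-way label.

    The default cutoffs are illustrative assumptions only. A user who needs
    higher confidence would lower benign_max and raise pathogenic_min,
    which leaves more variants labeled "uncertain".
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if benign_max >= pathogenic_min:
        raise ValueError("benign_max must be below pathogenic_min")
    if score <= benign_max:
        return "likely benign"
    if score >= pathogenic_min:
        return "likely pathogenic"
    return "uncertain"


# Made-up scores for made-up variants, used only to show the mechanics.
example_scores = {"variant_a": 0.05, "variant_b": 0.48, "variant_c": 0.92}

default_labels = Counter(classify_variant(s) for s in example_scores.values())
strict_labels = Counter(
    classify_variant(s, benign_max=0.02, pathogenic_min=0.95)
    for s in example_scores.values()
)
print(default_labels)
print(strict_labels)
```

With the stricter cutoffs every sample variant falls into the “uncertain” band, which is the trade-off the quote describes: tighter accuracy requirements mean fewer variants receive a confident label.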

The engineers cautioned that the predictions in the catalog are not intended for clinical use. Instead, they “should be interpreted with other sources of evidence.” However, “this work has the potential to improve the diagnosis of rare genetic disorders, and help discover new disease-causing genes,” they noted.

Genomics England Sees a Helpful Tool

BBC noted that AlphaMissense has been tested by Genomics England, which works with the UK’s National Health Service. “The new tool is really bringing a new perspective to the data,” Ellen Thomas, PhD, Genomics England’s Deputy Chief Medical Officer, told the BBC. “It will help clinical scientists make sense of genetic data so that it is useful for patients and for their clinical teams.”

AlphaMissense is “a big step forward,” Ewan Birney, PhD, Deputy Director General of the European Molecular Biology Laboratory (EMBL), told the BBC. “It will help clinical researchers prioritize where to look to find areas that could cause disease.”

However, other experts who spoke with MIT Technology Review were less enthusiastic.

“DeepMind is being DeepMind,” Insilico Medicine founder/CEO Alex Zhavoronkov, PhD, told the MIT publication. “Amazing on PR and good work on AI.”

Heidi Rehm, PhD, co-director of the Program in Medical and Population Genetics at the Broad Institute, suggested that the DeepMind engineers overstated the certainty of the model’s predictions. She told the publication that she was “disappointed” that they labeled the variants as benign or pathogenic.

“The models are improving, but none are perfect, and they still don’t get you to pathogenic or not,” she said.

“Typically, experts don’t declare a mutation pathogenic until they have real-world data from patients, evidence of inheritance patterns in families, and lab tests—information that’s shared through public websites of variants such as ClinVar,” the MIT article noted.

Is AlphaMissense a Biosecurity Risk?

Although DeepMind has released its catalog of variations, MIT Technology Review notes that the lab isn’t releasing the entire AI model due to what it describes as a “biosecurity risk.”

The concern is that “bad actors” could try using it on non-human species, DeepMind said. But one anonymous expert described the restrictions “as a transparent effort to stop others from quickly deploying the model for their own uses,” the MIT article noted.

And so, genetics research takes a huge step forward thanks to Google DeepMind, artificial intelligence, and deep learning. Clinical laboratories and pathologists may soon have useful new tools that help healthcare providers diagnose diseases. Time will tell. But the developments are certainly worth watching.

—Stephen Beale

Related Information:

AlphaFold Is Accelerating Research in Nearly Every Field of Biology

A Catalogue of Genetic Mutations to Help Pinpoint the Cause of Diseases

Accurate Proteome-wide Missense Variant Effect Prediction with AlphaMissense

Google DeepMind AI Speeds Up Search for Disease Genes

DeepMind Is Using AI to Pinpoint the Causes of Genetic Disease

DeepMind’s New AI Can Predict Genetic Diseases

Genomics England Increases Goal of Whole Genome Sequencing Project from 100,000 to 500,000 Sequences in Five Years

Genomic sequencing continues to benefit patients through precision medicine clinical laboratory treatments and pharmacogenomic therapies

EDITOR’S UPDATE—Jan. 26, 2022: Since publication of this news briefing, officials from Genomics England contacted us to explain the following:

  • The “five million genome sequences” was an aspirational goal mentioned by then Secretary of State for Health and Social Care Matt Hancock, MP, in an October 2, 2018, press release issued by Genomics England.
  • As of this date a spokesman for Genomics England confirmed to Dark Daily that, with the initial goal of 100,000 genomes now attained, the immediate goal is to sequence 500,000 genomes.
  • This goal was confirmed in a tweet posted by Chris Wigley, CEO at Genomics England.

In accordance with this updated input, we have revised the original headline and information in this news briefing that follows.

What better proof of progress in whole human genome sequencing than the announcement that the United Kingdom’s 100,000 Genomes Project has not only achieved that milestone, but will now increase the goal to 500,000 whole human genomes? This should be welcome news to clinical laboratory managers, as it means their labs will be positioned as the first-line provider of genetic data in support of clinical care.

Many clinical pathologists here in the United States are aware of the 100,000 Genomes Project, established by the National Health Service (NHS) in England (UK) in 2012. Genomics England’s new goal to sequence 500,000 whole human genomes is to pioneer a “lasting legacy for patients by introducing genomic sequencing into the wider healthcare system,” according to Technology Networks.

The importance of personalized medicine and of the power of precise, accurate diagnoses cannot be overstated. This announcement by Genomics England will be of interest to diagnosticians worldwide, especially doctors who diagnose and treat patients with chronic and life-threatening diseases.

Building a Vast Genomics Infrastructure

Genetic sequencing launched the era of precision medicine in healthcare. Through genomics, drug therapies and personalized treatments were developed that improved outcomes for all patients, especially those suffering with cancer and other chronic diseases. And so far, the role of genomics in healthcare has only been expanding, as Dark Daily covered in numerous ebriefings.

In the US, the National Institutes of Health’s (NIH’s) Human Genome Project sequenced the first whole genome in 2003. That achievement opened the door to a new era of precision medicine.

Genomics England, which is wholly owned by the Department of Health and Social Care in the United Kingdom, was formed in 2012 with the goal of sequencing 100,000 whole genomes of patients enrolled in the UK National Health Service. That goal was met in 2018, and now the NHS aspires to sequence 500,000 genomes.

“The last 10 years have been really exciting, as we have seen genetic data transition from being something that is useful in a small number of contexts with highly targeted tests, towards being a central part of mainstream healthcare settings,” Richard Scott, MD, PhD (above), Chief Medical Officer at Genomics England told Technology Networks. Much of the progress has found its way into clinical laboratory testing and precision medicine diagnostics. (Photo copyright: Genomics England.)

Genomics England’s initial goals included:

  • To create an ethical program based on consent,
  • To set up a genomic medicine service within the NHS to benefit patients,
  • To make new discoveries and gain insights into the use of genomics, and
  • To begin the development of a UK genomics industry.

To gain the greatest benefit from whole genome sequencing (WGS), a substantial amount of data infrastructure must exist. “The amount of data generated by WGS is quite large and you really need a system that can process the data well to achieve that vision,” said Richard Scott, MD, PhD, Chief Medical Officer at Genomics England.

In early 2020, Weka, developer of WekaFS, a fully parallel and distributed file system, announced that it would be working with Genomics England on managing the enormous amount of genomic data. When Genomics England reached 100,000 sequenced genomes, it had already gathered 21 petabytes of data. The organization expects to have 140 petabytes by 2023, notes a Weka case study.

Putting Genomics England’s WGS Project into Action

WGS has significantly impacted the diagnosis of rare diseases. Its reach also extends to infectious disease: Genomics England has contributed to projects that look at tuberculosis genomes to understand why the disease is sometimes resistant to certain medications, and genomic sequencing played an enormous role in fighting the COVID-19 pandemic.

Scott notes that COVID-19 provides an example of how sequencing can be used to deliver care. “We can see genomic influences on the risk of needing critical care in COVID-19 patients and in how their immune system is behaving. Looking at this data alongside other omics information, such as the expression of different protein levels, helps us to understand the disease process better,” he said.

What’s Next for Genomic Sequencing?

As the research continues and scientists begin to better understand the information revealed by sequencing, other areas of scientific study like proteomics and metabolomics are becoming more important.

“There is real potential for using multiple strands of data alongside each other, both for discovery—helping us to understand new things about diseases and how [they] affect the body—but also in terms of live healthcare,” Scott said.

Along with expanding the target of Genomics England to 500,000 genomes sequenced, the UK has published a National Genomic Strategy named Genome UK. This plan describes how the research into genomics will be used to benefit patients. “Our vision is to create the most advanced genomic healthcare ecosystem in the world, where government, the NHS, research and technology communities work together to embed the latest advances in patient care,” according to the Genome UK website.

Clinical laboratory professionals with an understanding of diagnostics will recognize the impact of WGS on the healthcare industry. By following genomic sequencing initiatives, such as those coming from Genomics England, pathologists can keep their labs ready to take advantage of new discoveries and insights that will improve outcomes for patients.

Dava Stewart

Related Information:

The 100,000 Genomes Project

Genome Sequencing in Modern Medicine: An Interview with Genomics England

WekaIO Accelerates Five Million Genomes Project at Genomics England

Genomics England Improved Scale and Performance for On-Premises Cluster

Whole Genome Sequencing Increases Rare Disorder Diagnosis by 31%

Genome UK: The Future of Healthcare
