Expanded genomic dataset includes greater racial and ethnic diversity, which may lead to improved diagnostics and clinical laboratory tests
Human genomic research has taken another important step forward. The National Institutes of Health’s All of Us research program has reached a milestone of 250,000 collected whole genome sequences. This accomplishment could accelerate research and development of new diagnostics and therapeutic biomarkers for clinical laboratory tests and prescription drugs.
The NIH’s All of Us program “has significantly expanded its data to now include nearly a quarter million whole genome sequences for broad research use. About 45% of the data was donated by people who self-identify with a racial or ethnic group that has been historically underrepresented in medical research,” the news release noted.
“For years, the lack of diversity in genomic datasets has limited our understanding of human health,” said Andrea Ramirez, MD, Chief Data Officer, All of Us Research Program, in the news release. Clinical laboratories performing genetic testing may look forward to new biomarkers and diagnostics due to the NIH’s newly expanded gene sequencing data set. (Photo copyright: Vanderbilt University.)
Diverse Genomic Data is NIH’s Goal
NIH launched the All of Us genomic sequencing program in 2018. Its aim is to involve more than one million people from across the country and reflect national diversity in its database.
So far, the program has grown to include 413,450 individuals, with 45% of participants self-identifying “with a racial or ethnic group that has been historically under-represented in medical research,” NIH said.
“By engaging participants from diverse backgrounds and sharing a more complete picture of their lives—through genomic, lifestyle, clinical, and social environmental data—All of Us enables researchers to begin to better pinpoint the drivers of disease,” said Andrea Ramirez, MD, Chief Data Officer of the All of Us research program, in the news release.
More than 5,000 researchers are currently registered to use NIH’s All of Us genomic database. The vast resource contains the following data:
245,350 whole genome sequences, which include “variation at more than one billion locations, about one-third of the entire human genome.”
1,000 long-read genome sequences to enable “a more complete understanding of the human genome.”
“Through a partnership with participants, researchers, and diverse communities across the country, we are seeing incredible progress towards powering scientific discoveries that can lead to a healthier future for all of us,” said Josh Denny, MD, Chief Executive Officer, All of Us Research Program, in the news release.
“[Researchers] can get access to the tools and the data they need to conduct a project with our resources in as little as two hours once their institutional data use agreement is signed,” said Fornessa Randal, Executive Director, Center for Asian Health Equity, University of Chicago, in a YouTube video about Researcher Workbench.
Given the large number of mutations found in the SARS-CoV-2 Omicron variant, experts in South Africa speculate it likely evolved in someone with a compromised immune system
As the SARS-CoV-2 Omicron variant spreads around the United States and the rest of the world, infectious disease experts in South Africa have been investigating how the variant developed so many mutations. One hypothesis is that it evolved over time in the body of an immunosuppressed person, such as a cancer patient, transplant recipient, or someone with uncontrolled human immunodeficiency virus infection (HIV).
One interesting facet of how the Omicron variant was tracked as it emerged in South Africa is the role played by several medical laboratories in the country that reported genetic sequences associated with Omicron. Their reports allowed researchers in South Africa to more quickly identify the growing range of mutations found in different samples of the Omicron virus.
“In someone where immunity is suppressed, then we see virus persisting,” she added. “And it doesn’t just sit around, it replicates. And as it replicates it undergoes potential mutations. And in somebody where immunity is suppressed that virus may be able to continue for many months—mutating as it goes.”
Multiple factors can suppress the immune system, experts say, but some are pointing to HIV as a possible culprit given the likelihood that the variant emerged in sub-Saharan Africa, which has a high population of people living with HIV.
Li “was among the first to detail extensive coronavirus mutations in an immunosuppressed patient,” the LA Times reported. “Under attack by HIV, their T cells are not providing vital support that the immune system’s B cells need to clear an infection.”
Omicron Spreads Rapidly in the US
Genomic surveillance data from the CDC’s SARS-CoV-2 tracking system indicates that on Dec. 11, 2021, Omicron accounted for about 7% of the SARS-CoV-2 variants in circulation, the agency reported. By Dec. 25, that share had jumped to nearly 60%. The data is based on sequencing of SARS-CoV-2 by the agency as well as by commercial clinical laboratories and academic laboratories.
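The jump from about 7% to nearly 60% in two weeks can be turned into a rough growth rate. The sketch below assumes, purely for illustration, that Omicron's share followed a logistic curve between the two dates. That is a simplifying assumption, not a method attributed to the CDC, and the function names are hypothetical. It works in log-odds, where logistic growth is a straight line, and reports how long the odds of Omicron versus all other variants took to double.

```python
import math

def log_odds(share):
    """Convert a proportion (between 0 and 1) to log-odds."""
    return math.log(share / (1.0 - share))

def odds_doubling_time(share_start, share_end, days):
    """Days for the variant's odds (variant : all others) to double,
    assuming constant logistic growth between the two observations."""
    rate_per_day = (log_odds(share_end) - log_odds(share_start)) / days
    return math.log(2) / rate_per_day

# Figures quoted above: about 7% on Dec. 11 and about 60% on Dec. 25, 2021.
doubling = odds_doubling_time(0.07, 0.60, 14)
print(f"Odds doubling time: about {doubling:.1f} days")
```

Under these assumptions the odds double roughly every three days. The estimate rests on only two rounded data points, so it is best read as an order-of-magnitude figure.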
Experts have pointed to several likely factors behind the variant’s high rate of transmission. The biggest factor, NPR reported, appears to be the large number of mutations on the spike protein, which the virus uses to attach to human cells. This gives the virus an advantage in evading the body’s immune system, even in people who have been vaccinated.
One study from Norway cited by NPR suggests that Omicron has a shorter incubation period than other variants, which would increase the transmission rate. And researchers have found that it multiplies more rapidly than the Delta variant in the upper respiratory tract, which could facilitate spread when people exhale.
Using Genomics Testing to Determine How Omicron Evolved
But how did the Omicron variant accumulate so many mutations? In a story for The Atlantic, virologist Jesse Bloom, PhD, Professor, Basic Sciences Division, at the Fred Hutchinson Cancer Research Center in Seattle, described Omicron as “a huge jump in evolution,” one that researchers expected to happen “over the span of four or five years.”
Hence the speculation that it evolved in an immunosuppressed person, perhaps due to HIV, though that’s not the only theory. Another is “that the virus infected animals of some kind, acquired lots of mutations as it spread among them, and then jumped back to people—a phenomenon known as reverse zoonosis,” New Scientist reported.
The Network for Genomic Surveillance, he told The New Yorker, consists of multiple facilities around the country. Team members noticed what he described as a “small uptick” in COVID cases in Gauteng, so on Nov. 19 they decided to step up genomic surveillance in the province. One private clinical laboratory in the network submitted “six genomes of a very mutated virus,” he said. “And, when we looked at the genomes, we got quite worried because they discovered a failure of one of the probes in the PCR testing.”
Looking at national data, the scientists saw that the same failure was on the rise in PCR (polymerase chain reaction) tests, prompting a request for samples from other medical laboratories. “We got over a hundred samples from over thirty clinics in Gauteng, and we started genotyping, and we analyzed the mutation of the virus,” he told The New Yorker. “We linked all the data with the PCR dropout, the increase of cases in South Africa and of the positivity rate, and then we began to see it might be a very suddenly emerging variant.”
Oliveira’s team first reported the emergence of the new variant to the World Health Organization, on Nov. 24. Two days later, the WHO issued a statement that named the newly classified Omicron variant (B.1.1.529) a “SARS-CoV-2 Variant of Concern.”
Microbiologists and clinical laboratory specialists in the US should keep close watch on Omicron research coming out of South Africa. Fortunately, scientists today have tools for understanding the genetic makeup of viruses, tools that did not exist during the outbreaks of SARS (2003), swine flu (2009), and MERS (2012).
“This represents another important landmark for both the program and for Personalis,” said John West, Chief Executive Officer, Personalis, in a news release. “We congratulate the VA MVP for reaching this important milestone.
“We strongly believe that the research projects being performed today will enable precision medicine in healthcare systems in the future across a wide range of disease areas,” he added. This is a positive development for clinical laboratories, as personalized medicine services require a lab to sequence and interpret the patient’s DNA.
Personalis first contracted with the US federal government to perform genetic research in 2012 and has delivered 50,000 genomes to the VA MVP during the past twelve months.
The Personalis and VA MVP researchers seek to gain a better understanding of how genetic variants affect health. Before the COVID-19 pandemic hit the US, the VA was enrolling veterans in the Million Veteran Program at 63 VA medical centers across the country. There are currently about 830,000 veterans enrolled in the venture, and the VA expects two million veterans to eventually sign up for the sequencing project.
“As a global leader in genomic sequencing and comprehensive analytics services, Personalis is uniquely suited to lead these population-scale efforts and we are currently in the process of expanding our business operations internationally,” West added.
According to the press release, “the VA MVP provides researchers with a rich resource of genetic, health, lifestyle, and military-exposure data collected from questionnaires, medical records, and genetic analyses. By combining this information into a single database, the VA MVP promises to advance knowledge about the complex links between genes and health.”
NIH All of Us Research Program Supports Precision Medicine Goals
Another genetic research project being conducted by the US National Institutes of Health (NIH) is the All of Us Research Program. Using donated personal health information from thousands of participants, the NIH researchers seek to “learn how our biology, lifestyle, and environment affect health,” according to the program’s website.
The All of Us Research Program intends to have at least one million US participants take part in the research. The researchers hope to help scientists discover new knowledge regarding how biological, environmental, and behavioral factors influence health, and to learn to tailor healthcare to patients’ specific medical needs, a key component of precision medicine.
Participants in the project share personal information via a variety of methods, including surveys, electronic health records, and biological samples.
A Better Sampling of Under-Represented Communities
Since enrollment opened in 2018, more than 270,000 people have contributed blood, urine, and saliva samples to the All of Us Research Program. More than 80% of the participants come from communities that are traditionally under-represented in biomedical research.
“We need programs like All of Us to build diverse datasets so that research findings ultimately benefit everyone,” said Brad Ozenberger, PhD, Genomics Program Director, All of Us, in the NIH news release. “Too many groups have been left out of research in the past, so much of what we know about genomics is based mainly on people of European ancestry. And often, genomic data are explored without critical context like environment, economics, and other social determinants of health. We’re trying to help change that, enabling the entire research community to help fill in these knowledge gaps.”
The All of Us Research Program’s analysis of the collected data includes both whole-genome sequencing (WGS) and genotyping, and the program is taking a phased approach to returning genetic data to participants.
Participants initially receive data about their genetic ancestry and traits. That is followed later by health-related results, such as how their genetic variants may increase the risk of certain diseases and how their DNA may affect their reaction to drug therapies.
Genetic researchers hope programs like these will lead to improved in vitro diagnostics and drug therapies. Genetic sequencing also may lead to new diagnostic and therapeutic biomarkers for clinical laboratories.
Results of the UK study confirm for clinical laboratory professionals the importance of fully understanding the design and function of SNP chips they may be using in their labs
Here is another example of a long-established clinical laboratory test that, in light of new evidence, turns out to be less accurate than once thought. According to research conducted at the University of Exeter in Devon, UK, single-nucleotide polymorphism (SNP) chips (also known as SNP microarrays), a technology commonly used in commercial genetic testing, are inadequate at detecting rare gene variants that can increase breast cancer risk.
A news release announcing the results of the large-scale study states, “A technology that is widely used by commercial genetic testing companies is ‘extremely unreliable’ in detecting very rare variants, meaning results suggesting individuals carry rare disease-causing genetic variants are usually wrong.”
Why is this a significant finding for clinical laboratories? Because medical laboratories performing genetic tests that use SNP chips should be aware that rare genetic variants—which are clinically relevant to a patient’s case—may not be detected and/or reported by the tests they are running.
UK Researchers Find ‘Shockingly High False Positives’
The conclusion reached by the Exeter researchers, the BMJ study states, is that “SNP chips are extremely unreliable for genotyping very rare pathogenic variants and should not be used to guide health decisions without validation.”
Leigh Jackson, PhD, Lecturer in Genomic Medicine at University of Exeter and co-author of the BMJ study, said in the news release, “The number of false positives on rare genetic variants produced by SNP chips was shockingly high. To be clear: a very rare, disease-causing variant detected using [an] SNP chip is more likely to be wrong than right.”
Large-Scale Study Taps UK Biobank Data
The Exeter researchers were concerned about cases of women scheduling unnecessary invasive medical procedures after tests reported rare genetic variations in BRCA1 (breast cancer type 1) and BRCA2 (breast cancer type 2).
“The inherent technical limitation of SNP chips for correctly detecting rare genetic variants is further exacerbated when the variants themselves are linked to very rare diseases. As with any diagnostic test, the positive predictive value for low prevalence conditions will necessarily be low in most individuals. For pathogenic BRCA variants in the UK Biobank, the SNP chips had an extremely low positive predictive value (1-17%) when compared with sequencing. Were these results to be fed back to individuals, the clinical implications would be profound. Women with a positive BRCA result face a lifetime of additional screening and potentially prophylactic surgery that is unwarranted in the case of a false positive result,” they wrote.
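The point about low prevalence can be worked through with Bayes' rule. The sketch below is illustrative only: the sensitivity and specificity values are hypothetical and are not taken from the BMJ study, and the function name is an assumption. It shows how the positive predictive value of one and the same test falls as the variant being tested for becomes rarer.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' rule: probability a person truly carries the variant given a positive result."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical assay performance, chosen only for illustration.
SENSITIVITY = 0.99
SPECIFICITY = 0.999

for carriers_per in (100, 10_000, 100_000):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, 1.0 / carriers_per)
    print(f"1 in {carriers_per:>7,}: PPV = {ppv:.1%}")
```

Even with a false positive rate of only one in a thousand, false positives outnumber true positives once true carriers are much rarer than one in a thousand.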
Using UK Biobank data from 49,908 participants (55% were female), the researchers compared next-generation sequencing (NGS) to SNP chip genotyping. They found that SNP chips, which test genetic variation at hundreds of thousands of specific locations across the genome, performed well when compared to NGS for common variants, such as those related to type 2 diabetes and ancestry assessment, the study noted.
“Because SNP chips are such a widely used and high-performing assay for common genetic variants, we were also surprised that the differing performance of SNP chips for detecting rare variants was not well appreciated in the wider research or medical communities. Luckily, we had recently received both SNP chip and genome-wide DNA sequencing data on 50,000 individuals through the UK Biobank—a population cohort of adult volunteers from across the UK. This large dataset allowed us to systematically investigate the performance of SNP chips across millions of genetic variants with a wide range of frequencies, down to those present in fewer than 1 in 50,000 individuals,” wrote Wright and Associate Professor of Bioinformatics and Human Genetics at Exeter, Michael Weedon, PhD, in a BMJ blog post.
The Exeter researchers also analyzed data from a small group of people in the Personal Genome Project who had both SNP genotyping and sequencing information available. They focused their analysis on rare pathogenic variants in BRCA1 and BRCA2 genes.
The researchers found:
The rarer the variant, the less reliable the test result. For example, for “very rare variants” found in fewer than one in 100,000 people, 84% of those detected by SNP chips were false positives.
Low positive predictive values of about 16% for very rare variants in the UK Biobank.
Nearly all (20 of 21) customers of commercial genetic testing had at least one false positive rare disease-causing variant incorrectly genotyped.
SNP chips detect common genetic variants “extremely well.”
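The first two findings above are two views of the same quantity: if 84% of chip-positive calls are false, then the share confirmed by sequencing, the positive predictive value, is about 16%. A minimal sketch, using made-up counts that merely match those percentages (the function name is hypothetical):

```python
def ppv_from_calls(true_positive_calls, false_positive_calls):
    """Share of positive chip calls that are confirmed by sequencing."""
    total_calls = true_positive_calls + false_positive_calls
    if total_calls == 0:
        raise ValueError("no positive calls to evaluate")
    return true_positive_calls / total_calls

# Illustrative counts only: 100 chip-positive calls, 84 not confirmed by sequencing.
ppv = ppv_from_calls(true_positive_calls=16, false_positive_calls=84)
false_positive_share = 1.0 - ppv
print(f"PPV = {ppv:.0%}, false positives among positive calls = {false_positive_share:.0%}")
```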
Advantages and Capabilities of SNP Chips
Compared to next-gen genetic sequencing, SNP chips are less costly. The chips use “grids of hundreds of thousands of beads that react to specific gene variants by glowing in different colors,” New Scientist explained.
Common variants of BRCA1 and BRCA2 can be found using SNP chips with 99% accuracy, New Scientist reported based on study data.
However, when the task is to find thousands of rare variants in BRCA1 and BRCA2 genes, SNP chips do not fare so well.
“It is just not the right technology for the job when it comes to rare variants. They’re excellent for the common variants that are present in lots of people. But the rarer the variant is, the less likely they are to be able to correctly detect it,” Wright told CNN.
SNP chips can’t detect all variants because they struggle to cluster needed data, the Exeter researchers explained.
“SNP chips perform poorly for genotyping rare genetic variants owing to their reliance on data clustering. Clustering data from multiple individuals with similar genotypes works very well when variants are common,” the researchers wrote. “Clustering becomes more difficult as the number of people with a particular genotype decreases.”
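One way to see why clustering breaks down is to estimate how many people would fall into each genotype cluster. The sketch below assumes Hardy-Weinberg proportions, which is a textbook simplification and not something the researchers state, and uses a cohort of 50,000 to mirror the UK Biobank subset mentioned above. The function name and the allele frequencies are illustrative.

```python
def expected_genotype_counts(cohort_size, allele_frequency):
    """Expected individuals per genotype cluster under Hardy-Weinberg proportions."""
    p = allele_frequency
    q = 1.0 - p
    return {
        "homozygous_reference": cohort_size * q * q,
        "heterozygous": cohort_size * 2 * p * q,
        "homozygous_variant": cohort_size * p * p,
    }

COHORT_SIZE = 50_000  # roughly the UK Biobank subset described above

for allele_frequency in (0.2, 0.001, 0.00001):
    counts = expected_genotype_counts(COHORT_SIZE, allele_frequency)
    print(allele_frequency, {name: round(n, 2) for name, n in counts.items()})
```

For a common variant each cluster holds thousands of people, so the clusters are easy to separate. For a variant with an allele frequency of 1 in 100,000, the expected heterozygous "cluster" is about one person, which leaves a clustering algorithm almost nothing to work with.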
Clinical Laboratories Using SNP Chips
The researchers at Exeter uncovered important information that pathologists and medical laboratory professionals will want to understand and monitor. Cancer patients with rare genetic variants may not be diagnosed accurately because SNP chips are unreliable at identifying rare genetic variants. Those patients may need additional testing to validate diagnoses and prevent harm.
In December, cancer genomics company Personalis, Inc. (NASDAQ:PSNL) of Menlo Park, Calif., achieved a milestone and delivered its 100,000th whole human genome sequence to the US Department of Veterans Affairs’ (VA) Million Veteran Program (MVP), according to a news release, which also states that Personalis is the sole sequencing provider to the MVP.
The VA’s MVP, which started in 2011, has 850,000 enrolled veterans and is expected to eventually involve two million people. The VA’s aim is to explore the role genes, lifestyle, and military experience play in health and human illness, notes the VA’s MVP website.
Health conditions affecting veterans the MVP is researching include:
The VA has contracted with Personalis through September 2021 and has invested $175 million, Clinical OMICS reported. Personalis has earned approximately $14 million from the VA. That’s about 76% of the company’s revenue, according to second-quarter data, Clinical OMICS noted.
Database of Veterans’ Genomes Used in Current Research
What has the VA gained from its investment so far? An MVP fact sheet states researchers are tapping MVP data for these and other veteran health-related studies:
Differentiating between prostate cancer tumors that require treatment and others that are slow-growing and not life-threatening.
How genetics drives obesity, diabetes, and heart disease.
How data in DNA translates into actual physiological changes within the body.
Gene variations and patients’ response to warfarin.
NIH Research Program Studies Effects of Genetics on Health
Another research program, the National Institutes of Health’s All of Us study, recently began returning results to its participants who provided blood, urine, and/or saliva samples. The NIH aims to aid research into health outcomes influenced by genetics, environment, and lifestyle, explained a news release. The program, launched in 2018, has biological samples from more than 270,000 people with a goal of one million participants.
The news release notes that more than 80% of biological samples in the All of Us database come from people in communities that have been under-represented in biomedical research.
“We need programs like All of Us to build diverse datasets so that research findings ultimately benefit everyone,” said Brad Ozenberger, PhD, All of Us Genomics Program Director, in the news release.
Precision medicine designed for specific healthcare populations is a goal of the All of Us program.
“[All of Us is] beneficial to all Americans, but actually beneficial to the African American race because a lot of research and a lot of medicines that we are taking advantage of today, [African Americans] were not part of the research,” Chris Crawford, All of Us Research Study Navigator, told the Birmingham Times. “As [the All of Us study] goes forward and we get a big diverse group of people, it will help as far as making medicine and treatment that will be more precise for us,” he added.
Large Databases Could Advance Care
Genome sequencing technology continues to improve. It is faster, less complicated, and cheaper to sequence a whole human genome than ever before. And the resulting sequence is more accurate.
Thus, as human genome sequencing databases grow, researchers are deriving useful scientific insights from the data. This is relevant for clinical laboratories because the new insights from studying bigger databases of genomic information will produce new diagnostic and therapeutic biomarkers that can be the basis for new clinical laboratory tests as well as useful diagnostic assays for anatomic pathologists.