Genomic Scientists Are Working to Make Human Reference Genome More Inclusive by Expanding the Pangenome
Project aims to create a new pangenome for genetic testing that will ensure better clinical laboratory testing and healthcare outcomes
Recent advances in genetics are motivating some scientists to proclaim the need to update the existing “master human genome”—currently based on a single individual’s genetic sequence—to make it more inclusive. This international research effort will have implications for personalized clinical laboratory testing and precision medicine.
Genetic scientists at the Human Pangenome Reference Consortium (HPRC), a project funded by the National Human Genome Research Institute (NHGRI), are working “to sequence and assemble genomes from individuals from diverse populations in order to better represent [the] genomic landscape of diverse human populations,” according to the organization’s website.
The project plans to evaluate a wide variety of reference genomes and develop a more diverse human pangenome (a multi-genome reference sequence) that represents a larger cross-section of the human population. The HPRC scientists will be looking at genomes from specific countries, including Denmark, Japan, South Korea, Sweden, and the United Arab Emirates, The Guardian reported.
The increased diversity of reference genetic data will enable genomic researchers to increase the accuracy of precision medicine diagnostics and clinical laboratory testing.
“One person is not representative of the world,” Pui-Yan Kwok, MD, PhD, Henry Bachrach Distinguished Professor, Cardiovascular Research Institute at the University of California, San Francisco, told The Guardian. “As a result, most genome sequencing is fundamentally biased.” And that bias, the researchers claim, affects the accuracy of clinical laboratory diagnostics and treatment decisions.
Reference Genome for Genetic Sequencing is Based on One Person
Launched in 1990, the Human Genome Project studied all DNA in a select set of organisms. The project completed its first sequence of the human genome in 2003, and that sequence has served as the reference genome for thousands of genomic discoveries since then.
But there’s a problem.
Although a revolutionary breakthrough in genetic sequencing, that reference genome came from just one person. This means a significant portion of the human population is not represented in genetic research. That bias “limits the kind of genetic variation that can be detected, leaving some patients without diagnoses and potentially without proper treatment,” according to The Guardian.
“Getting the right medicine to the right patient at the right time is the tagline,” Neil Hanchard, MD, DPhil, physician scientist and senior investigator for precision health research at the NHGRI in Bethesda, Maryland, told The Guardian.
The HPRC’s goal is to help mitigate reference biases that could hamper disease diagnoses, and to ensure all populations receive the best treatments for illness.
According to its website, the organization’s main goals include:
- Sequencing genomes from a diverse set of samples with the newest technologies.
- Fostering an ecosystem of assembly and pangenome tools.
- Creating and releasing high-quality assemblies and pangenomes.
- Embedding a team of scholars to address ethical, legal, and social implications of their work.
- Forming international partnerships for the research.
HPRC Scientists Find Never-Sequenced Genetic Variants in Africa
Standard gene sequencing works by dividing DNA into tiny portions known as short reads, then sequencing the reads and assembling them into a genome using an existing reference as a guide. However, this process makes larger blocks of variation, called structural variants (SVs), harder to read, and some go undetected entirely. The result can be a sequence that does not fully capture an individual’s variations.
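The effect of the reference on short-read mapping can be sketched with a toy example. This is a simplified illustration, not how production aligners work: exact substring matching stands in for alignment (real aligners tolerate mismatches and small gaps), and the sequences, function names, and read length are all invented for the demonstration. A sample that carries an insertion absent from the single reference yields reads that cannot be placed, while a panel that also contains a genome carrying that insertion places every read.

```python
# Toy model of reference bias: exact substring matching stands in for alignment.
# All sequences and names below are hypothetical, chosen only for illustration.

def make_reads(genome: str, read_len: int = 8) -> list[str]:
    """Cut a genome into every overlapping window of read_len bases."""
    return [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]

def map_reads(reads: list[str], references: list[str]) -> tuple[list[str], list[str]]:
    """Return (mapped, unmapped): a read maps if it occurs in any reference."""
    mapped, unmapped = [], []
    for read in reads:
        if any(read in ref for ref in references):
            mapped.append(read)
        else:
            unmapped.append(read)
    return mapped, unmapped

# A single "reference" genome and a sample that carries an extra inserted block.
reference = "ACGTTGCAAGGCTTAACCGGATCGTACGATTCAGG"
insertion = "TTTTCCCCGGGGAAAA"
sample = reference[:18] + insertion + reference[18:]

reads = make_reads(sample)

# Mapping against one reference: reads covering the insertion have nowhere to go.
mapped_linear, unmapped_linear = map_reads(reads, [reference])

# Mapping against a two-genome panel that includes a genome with the insertion.
mapped_pan, unmapped_pan = map_reads(reads, [reference, sample])

print(f"single reference: {len(unmapped_linear)} of {len(reads)} reads unmapped")
print(f"two-genome panel: {len(unmapped_pan)} of {len(reads)} reads unmapped")
```

In this sketch the unmapped reads are exactly the ones that overlap the inserted block, which mirrors how variation missing from a reference can go unseen in an individual’s sequence.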
In 2019, the HPRC team of scientists analyzed genetic samples from 154 people from various parts of the world and discovered SV content that was missing from the reference sequence. A further study of genetic samples from 338 individuals, which examined only extra inserted DNA, detected almost 130,000 new sequences.
More recently, the HPRC researchers sampled 426 individuals from 50 ethnolinguistic groups from Africa and discovered a few million new single nucleotide variants (SNVs). Most of these distinct SNVs derived from populations that had not been previously sampled.
“We haven’t even touched SVs,” Hanchard told The Guardian. “But our preliminary data suggests it’s going to be more of the same.”
If an individual “is from a population quite different from the person from which the genome referenced is derived, there will be more misalignment when their short reads are mapped to the reference,” Kwok told The Guardian.
“We may miss risk variants in those regions not represented in the reference,” he added.
HPRC Receives Clearance from NHGRI to Continue Research
Hanchard recognizes the benefits of regional references in genomic sequencing and is optimistic about the future of genomics and the ability to sequence more diverse populations.
“I would love to get to a point where everyone feels represented and that this is for them, as much as it is for any particular group,” he told The Guardian. “We are from one humanity, that’s the important part.”
On February 13, the HPRC received concept clearance for renewal of the program from the NHGRI, which plans to commit up to $10 million in total costs per year for the program over the next five years.
Genetic sequencing continues to emerge as a vital tool in the diagnosis and treatment of diseases. Ensuring that as many diverse populations as possible are included in genomic research is an important element of precision medicine and optimal healthcare.
Clinical laboratory managers and pathologists will want to stay updated on these developments, because much of this new knowledge about the pangenome will need to be incorporated when interpreting genetic sequences and developing diagnoses in support of personalized medicine.
—JP Schlingman
Related Information:
The Human Genome Needs Updating. But How Do We Make It Fair?
The Human Pangenome Project: A Global Resource to Map Genomic Diversity
Human Genome Project Fact Sheet
National Advisory Council for Human Genome Research (NACHGR)