Oxford University Creates Largest Ever Human Evolutionary Family Tree with 231 Million Ancestral Lineages
Researchers say their method can trace ancestry back 100,000 years and could lay groundwork for identifying new genetic markers for diseases that could be used in clinical laboratory tests
Cheaper, faster, and more accurate genomic sequencing technologies are deepening scientific knowledge of the human genome. Now, UK researchers at the University of Oxford have used this genomic data to create the largest-ever human family tree, enabling individuals to trace their ancestry back 100,000 years. And, they say, it could lead to new methods for predicting disease.
This new database also will enable genealogists and medical laboratory scientists to track when, where, and in what populations specific genetic mutations emerged that may be involved in different diseases and health conditions.
New Genetic Markers That Could Be Used for Clinical Laboratory Testing
As researchers trace when and where those mutations emerged, it may be possible to identify new diagnostic biomarkers and genetic indicators associated with specific health conditions that could be incorporated into clinical laboratory tests and precision medicine treatments for chronic diseases.
“We have basically built a huge family tree—a genealogy for all of humanity—that models as exactly as we can the history that generated all the genetic variation we find in humans today,” said Yan Wong, DPhil, an evolutionary geneticist at the Big Data Institute (BDI) at the University of Oxford, in a news release. “This genealogy allows us to see how every person’s genetic sequence relates to every other, along all the points of the genome.”
Researchers from the University of Oxford’s BDI, in collaboration with scientists from the Broad Institute of MIT and Harvard, Harvard University, and the University of Vienna, Austria, developed algorithms for combining different databases and scaling to accommodate millions of ancient and modern genome sequences.
The researchers published their findings in the journal Science in a paper titled “A Unified Genealogy of Modern and Ancient Genomes.”
Tracking Genetic Markers of Disease
The BDI team overcame the major obstacle to tracing the origins of human genetic diversity when they developed algorithms to handle the massive amount of data created when combining genome sequences from many different databases. In total, they compiled the genomic sequences of 3,601 modern and eight high-coverage ancient people from 215 populations in eight datasets.
The University of Oxford researchers noted in their news release that their method could be scaled to “accommodate millions of genome sequences.”
“This structure is a lossless and compact representation of 27 million ancestral haplotype fragments and 231 million ancestral lineages linking genomes from these datasets back in time. The tree sequence also benefits from the use of an additional 3,589 ancient samples compiled from more than 100 publications to constrain and date relationships,” the researchers wrote in their published study.
Wong believes his research team has laid the groundwork for the next generation of DNA sequencing.
“As the quality of genome sequences from modern and ancient DNA samples improves, the tree will become even more accurate and we will eventually be able to generate a single, unified map that explains the descent of all the human genetic variation we see today,” he said in the news release.
Developing New Clinical Laboratory Biomarkers for Modern Diagnostics
In a video illustrating the study’s findings, Wong said, “If you wanted to know why some people have some sort of medical conditions, or are more predisposed to heart attacks or, for example, are more susceptible to coronavirus, then there’s a huge amount of that described by their ancestry because they’ve inherited their DNA from other people.”
Lead author Anthony Wilder Wohns, PhD, agrees that the significance of the team’s tree-recording methods extends beyond simply a better understanding of human evolution.
“[This study] could be particularly beneficial in medical genetics, in separating out true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history,” he said.
The underlying methods developed by Wohns’ team could have widespread applications in medical research and lay the groundwork for identifying genetic predictors of disease risk, including future pandemics.
Clinical laboratory scientists will also note that those genetic indicators may become new biomarkers for clinical laboratory diagnostics for all sorts of diseases currently plaguing mankind.
—Andrea Downing Peck