Achievement at the University of Chicago may help clinical laboratories analyze large quantities of genomic data much faster than ever before, thus shortening the time required to produce a diagnostic result
It’s a breakthrough in the time required to analyze data from whole human genome sequencing. Researchers at the University of Chicago have demonstrated that genome analysis can be radically accelerated.
This could be a big deal for pathologists and clinical laboratory scientists. That’s because a faster time-to-answer from gene sequencing would increase its diagnostic and therapeutic value to clinicians.
Faster and more accurate analysis of genomic data holds the promise of advances in patient management and greater understanding of the genetic causes of risk and disease. This could mean expanded opportunities for pathologists to engage with clinicians in the use of genomic data to inform diagnosis, choice of treatment, and disease management.
Faster, Cheaper Sequencing Has Created an Analysis Bottleneck
Faster and cheaper gene sequencing is generating enormous amounts of genetic data. However, to be useful, researchers must analyze that data into manageable descriptions of disease risks and other genetic predispositions, a story in Technology Review pointed out. That process requires enormous computational power and can take months to accomplish.
“Whole genome analysis requires the alignment and comparison of raw sequence data,” explained study author Elizabeth McNally, M.D., Ph.D. “[This] results in a computational bottleneck because of limited ability to analyze multiple genomes simultaneously.”
McNally is a Professor of Medicine and Human Genetics and Director of the Cardiovascular Genetics Clinic at University of Chicago Medicine. She was writing in the abstract of her team’s study that was published in the journal Bioinformatics.
Massively Parallel Supercomputer Could Accelerate Genomic Medicine
The findings of this new study may contribute to reducing the lengthy time required to analyze gene-sequencing data. Using a supercomputer, the Chicago team was able to achieve sufficient parallelization to analyze 240 whole genomes simultaneously, according to the abstract.
For the study, they adapted Argonne National Laboratory’s “Beagle,” a Cray XE6 supercomputer, and used publicly available software. They analyzed raw sequencing data from 61 human genomes.
Beagle is massively parallel, according to a story published online at cnet.com. The analysis required only about 25% of the machine’s capacity and was completed in less than 50 hours. By comparison, a single 2.1 GHz CPU would take roughly 47 years to analyze the same data.
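The scale of that speedup can be sanity-checked with back-of-the-envelope arithmetic. This is only a rough illustration using the figures quoted above (the 47-year and 50-hour numbers are the article’s, and the hour counts are approximate):

```python
# Rough speedup estimate from the figures quoted above (approximate values).
serial_hours = 47 * 365 * 24      # ~47 years on a single 2.1 GHz CPU
parallel_hours = 50               # under 50 hours on ~25% of Beagle

speedup = serial_hours / parallel_hours
print(f"Approximate speedup: {speedup:,.0f}x")  # on the order of 8,000x
```

Even using only a quarter of the supercomputer, the quoted figures imply a roughly four-orders-of-magnitude reduction in wall-clock time.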
“The supercomputer can process many genomes simultaneously rather than one at a time,” stated first author Megan Puckelwartz, a graduate student in McNally’s laboratory, in an online Genetic Engineering and Biotechnology News story. “It converts whole-genome sequencing, which has primarily been used as a research tool, into something that is immediately valuable for patient care.”
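The “many genomes simultaneously rather than one at a time” idea can be sketched in miniature. This is not the study’s actual pipeline (which ran standard alignment and variant-calling software across Beagle’s nodes); it is a minimal, hypothetical illustration of dispatching independent per-genome jobs in parallel, with `analyze_genome` as a stand-in name:

```python
from concurrent.futures import ProcessPoolExecutor

def analyze_genome(sample_id: str) -> str:
    # Stand-in for a real per-genome pipeline step (alignment, variant calling).
    # Each genome's analysis is independent of the others, which is what
    # makes the workload embarrassingly parallel.
    return f"{sample_id}: analyzed"

# A handful of samples here; the Beagle study reports capacity for 240 at once.
samples = [f"genome_{i:03d}" for i in range(8)]

if __name__ == "__main__":
    # Each worker process handles a different genome; no coordination is
    # needed between workers, so throughput scales with available cores.
    with ProcessPoolExecutor(max_workers=4) as pool:
        for result in pool.map(analyze_genome, samples):
            print(result)
```

The design point is simply that because genomes can be analyzed independently, adding compute nodes shortens total turnaround almost linearly, which is what turns a research-scale workload into a clinically useful one.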
“[The University of Chicago study] vividly demonstrates the benefits of dedicating a powerful supercomputer resource to biomedical research,” declared co-author Ian Foster, Ph.D., Director of the Computation Institute and Arthur Holly Compton Distinguished Service Professor of Computer Science, in a story published online at news-medical.net (News-Medical). “The methods developed here will be instrumental in relieving the data analysis bottleneck that researchers face as genetic sequencing grows cheaper and faster.”
The researchers published their study, titled “Supercomputing for the parallelization of whole genome analysis,” online in the February 12, 2014 edition of the journal Bioinformatics.
Possible Immediate Medical Applications
Whole genome sequencing would be the method of choice for researchers, News-Medical reported. However, the data processing and computational challenges of analysis often make this approach impractical. Consequently, clinical genetics researchers have shifted focus to exome sequencing.
The exome comprises the roughly 2% of the genome that codes for proteins, and it is the source of an estimated 85% of disease-causing mutations, according to News-Medical. Scientists once dismissed the remaining non-coding DNA as “junk,” but it is now known to perform important functions. (See Dark Daily, “New Research Findings Determine that ‘Dark Matter’ DNA Does Useful Work and Opens Door to Develop More Sophisticated Clinical Pathology Laboratory Tests”, February 26, 2014.)
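Those two figures explain why exome sequencing became the pragmatic choice: disease-causing mutations are heavily concentrated in a small slice of the genome. A quick calculation, using only the article’s 2% and 85% figures, makes the concentration explicit:

```python
# Enrichment of known disease-causing mutations in the exome,
# using the approximate figures cited from News-Medical.
exome_fraction = 0.02       # exome is ~2% of the genome
mutation_fraction = 0.85    # ~85% of known disease-causing mutations

# Mutation density in the exome relative to the genome-wide average.
enrichment = mutation_fraction / exome_fraction
print(f"Exome mutation density: roughly {enrichment:.1f}x the genome-wide average")
```

By these figures the exome carries on the order of 40 times the genome-wide average density of known disease-causing mutations, so sequencing 2% of the genome captures most of the clinically interpretable signal at a fraction of the analysis cost.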
Overcoming the enormous challenges of data processing and analysis would enable researchers to focus on the entire genome.
“[Beagle] is a resource that can change patient management and, over time, add depth to our understanding of the genetic causes of risk and disease,” observed McNally in the News-Medical piece.
Supercomputer Genome Analysis Could Help Solve ‘Big Data’ Problem
Inherited diseases underscore the importance of finding a solution to the genomic data bottleneck. McNally’s lab focuses on diagnosing and treating patients with inherited forms of disease. Faster, more accurate data from gene sequencing analysis could have significant implications for such patients and their families.
“By paying close attention to family members with genes that place the[m] at increased risk, but who do not yet show signs of disease, we can investigate early phases of a disorder. In this setting, each patient is a big-data problem,” stated McNally in a story published at redorbit.com.
Pathologists and clinical laboratory managers can expect life sciences supercomputing to accelerate clinical applications of genomic medicine going forward. This means growing opportunities to engage with clinicians for more cost-effective patient management and better patient outcomes.
—by Pamela Scherer McLeod