With 100% of the human genome mapped, new genetic diagnostic and disease screening tests may soon be available for clinical laboratories and pathology groups
Using technology developed by two different biotechnology/genetic sequencing companies, an international consortium of genetic scientists claims to have sequenced 100% of the human genome, “including the missing parts,” STAT reported. This will give clinical laboratories access to the complete 3.055 billion base pair (bp) sequence of the human genome.
If validated, this achievement could greatly impact future genetic research and genetic diagnostics development. That also will be true for precision medicine and disease-screening testing.
Completing the First “End-to-End” Genetic Sequencing
In June of 2000, the Human Genome Project (HGP) announced it had successfully created the first “working draft” of the human genome. But according to the National Human Genome Research Institute (NHGRI), the draft did not include 100% of the human genome. It “consists of overlapping fragments covering 97% of the human genome, of which sequence has already been assembled for approximately 85% of the genome,” an NHGRI press release noted.
“The original genome papers were carefully worded because they did not sequence every DNA molecule from one end to the other,” Ewan Birney, PhD, Deputy Director General of the European Molecular Biology Laboratory (EMBL) and Director of EMBL’s European Bioinformatics Institute (EMBL-EBI), told STAT. “What this group has done is show that they can do it end-to-end. That’s important for future research because it shows what is possible,” he added.
In their published paper, the T2T scientists wrote, “Addressing this remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium has finished the first truly complete 3.055 billion base pair (bp) sequence of a human genome, representing the largest improvement to the human reference genome since its initial release.”
Tale of Two Genetic Sequencing Technologies
Humans have a total of 46 chromosomes in 23 pairs, which carry tens of thousands of individual genes. Each gene consists of many base pairs, and there are billions of these base pairs within the human genome. In 2000, scientists estimated that humans have only 30,000 to 35,000 genes, but that number has since been reduced to just above 20,000 genes.
According to STAT, “The work was possible because the Oxford Nanopore and PacBio technologies do not cut the DNA up into tiny puzzle pieces.”
PacBio used HiFi sequencing, which is only a few years old and provides the benefits of both short and long reads. STAT noted that PacBio’s technology “uses lasers to examine the same sequence of DNA again and again, creating a readout that can be highly accurate.” According to the company’s website, “HiFi reads are produced by calling consensus from subreads generated by multiple passes of the enzyme around a circularized template. This results in a HiFi read that is both long and accurate.”
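The idea of calling a consensus from repeated passes over the same molecule can be illustrated with a toy sketch. It assumes the subreads are already aligned and of equal length and uses a simple per-position majority vote; PacBio's actual consensus calling is more sophisticated, and the function name and sample sequences here are made up for illustration.

```python
from collections import Counter


def consensus(subreads):
    """Majority-vote consensus over equal-length, pre-aligned subreads (toy model)."""
    if not subreads:
        raise ValueError("need at least one subread")
    length = len(subreads[0])
    if any(len(read) != length for read in subreads):
        raise ValueError("this toy example assumes subreads of equal length")
    # For each position, keep the base observed most often across passes.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*subreads))


# Five hypothetical passes over the same template, three with one error each.
passes = [
    "ACGTTAGC",
    "ACGTAAGC",  # error at index 4
    "ACCTTAGC",  # error at index 2
    "ACGTTAGC",
    "ACGTTAGG",  # error at index 7
]
hifi_like_read = consensus(passes)
print(hifi_like_read)
```

Because each error appears in only one pass, the vote at every position recovers the underlying base, which is the intuition behind a read that is both long and accurate.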
Oxford Nanopore uses electrical current in its sequencing devices. In this technology, strands of DNA are threaded through a microscopic nanopore one molecule at a time. As each strand passes through the pore, it disrupts an electrical current, and those disruptions enable scientists to determine which bases are present and, in turn, identify the full strand.
The T2T Consortium acknowledges in its paper that it had trouble with approximately 0.3% of the genome, but that, though there may be a few errors, there are no gaps.
Might New Precision Medicine Therapies Come from T2T Consortium’s Research?
The researchers claim in their paper that the number of known base pairs has grown from 2.92 billion to 3.05 billion and that the number of known genes has increased by 0.4%. Through their research, they also discovered 115 new genes that code for proteins.
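For a sense of scale, the base pair figures quoted above can be worked through in a few lines. This is a back-of-the-envelope sketch using only the two rounded totals cited in this article; the variable names are illustrative.

```python
# Rounded totals quoted in the article, in base pairs.
old_bp = 2_920_000_000
new_bp = 3_050_000_000

added_bp = new_bp - old_bp                # newly known base pairs
pct_increase = added_bp / old_bp * 100    # growth relative to the earlier total

print(f"Added base pairs: {added_bp:,}")
print(f"Increase: {pct_increase:.2f}%")
```

On these rounded numbers, the gain is about 130 million base pairs, or roughly 4.5% more than the earlier total.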
The T2T Consortium scientists also noted that the genome they sequenced for their research did not come from a person but rather from a hydatidiform mole, a rare growth that occasionally forms on the inside of a woman’s uterus. A hydatidiform mole occurs when a sperm fertilizes an egg that has no nucleus. As a result, the cells examined for the T2T study contained only 23 chromosomes instead of the full 46 found in most humans.
Although the T2T Consortium’s work is a huge leap forward in the study of the human genome, more research is needed. The consortium plans to publish its findings in a peer-reviewed medical journal. In addition, both PacBio and Oxford Nanopore plan to develop a way to sequence the entire 46-chromosome human genome in the future.
The future of genetic research and gene sequencing is to create technologies that will allow researchers to identify single nucleotide polymorphisms (SNPs) that contain longer strings of DNA. Because these SNPs in the human genome correlate with medical conditions and response to specific genetic therapies, advancing knowledge of the genome can ultimately provide beneficial insights that may lead to new genetic tests for medical diagnoses and help medical professionals determine the best, personalized therapies for individual patients.
Medical laboratories are already using gene sequencing as part of a global effort to identify new variants of the coronavirus and their genetic ancestors
Thanks to advances in genetic sequencing technology that enable medical laboratories to sequence organisms faster, more accurately, and at lower cost than ever before, clinical pathology laboratories worldwide are using that capability to analyze the SARS-CoV-2 coronavirus and identify variants as they emerge in different parts of the world.
The US Centers for Disease Control and Prevention (CDC) now plans to harness the power of gene sequencing through a new consortium called SPHERES (SARS-CoV-2 Sequencing for Public Health Emergency Response, Epidemiology, and Surveillance) to “coordinate SARS-CoV-2 sequencing across the United States,” states a CDC news release. The consortium is led by the CDC’s Advanced Molecular Detection (AMD) program and “aims to generate information about the virus that will strengthen COVID-19 mitigation strategies.”
The consortium comprises 11 federal agencies, 20 academic institutions, state public health laboratories in 21 states, nine non-profit research organizations, and 14 lab and IVD companies, including:
Abbott Diagnostics
bioMérieux
Color Genomics
Ginkgo Bioworks
IDbyDNA
Illumina
In-Q-Tel
LabCorp
One Codex
Oxford Nanopore Technologies
Pacific Biosciences
Qiagen
Quest Diagnostics
Verily Life Sciences
‘Fundamentally Changing How Public Health Responds’
Gene sequencing and related technologies have “fundamentally changed how public health responds in terms of surveillance and outbreak response,” said Duncan MacCannell, PhD, Chief Science Officer for the CDC’s Office of Advanced Molecular Detection (OAMD), in an April 30 New York Times (NYT) article, which stated that the CDC SPHERES program “will help trace patterns of transmission, investigate outbreaks, and map how the virus is evolving, which can affect a cure.”
The CDC says that rapid genomic sequencing of SARS-CoV-2, an RNA virus, will help monitor significant changes in the virus, support contact tracing efforts, provide information for developers of diagnostics and therapies, and “advance public health research in the areas of transmission dynamics, host response, and evolution of the virus.”
The sequencing laboratories in the consortium have agreed to “release their information into the public domain quickly and in a standard way,” the NYT reported, adding that the project includes standards for what types of information medical laboratories should submit, including, “where and when a sample was taken,” and other critical details.
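To make the notion of submitting information “in a standard way” concrete, here is a minimal sketch of what a standardized sample record might look like. Every field name and value below is hypothetical and chosen only for illustration; none is taken from the SPHERES standards, which are not described in detail here.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class SampleRecord:
    # Hypothetical fields; not drawn from any actual SPHERES specification.
    sample_id: str
    collected_on: date   # when the sample was taken
    collected_in: str    # where the sample was taken

    def to_json(self) -> str:
        payload = asdict(self)
        # Dates are written in ISO 8601 form so every lab reports them identically.
        payload["collected_on"] = self.collected_on.isoformat()
        return json.dumps(payload, sort_keys=True)


# A made-up record, purely for demonstration.
record = SampleRecord("example-0001", date(2020, 4, 30), "USA/WA")
print(record.to_json())
```

Agreeing on field names and formats up front is what lets sequences from many laboratories be pooled and compared without manual cleanup.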
Sharing Data Between Sequencing Laboratories and Biotech Companies
The CDC announced the SPHERES initiative on April 30, although it launched in early April, the NYT reported.
According to the CDC, SPHERES’ objectives include:
To bring together a network of sequencing laboratories, bioinformatics capacity and subject matter expertise under the umbrella of a massive and coordinated public health sequencing effort.
To identify and prioritize capabilities and resource needs across the network and to align sources of federal, non-governmental, and private sector funding and support with areas of greatest impact and need.
To improve coordination of genomic sequencing between institutions and jurisdictions and to enable more resilience across the network.
To champion concepts of openness, standards-based analysis, and rapid data sharing throughout the United States and worldwide during the COVID-19 pandemic response.
To provide a common forum for US public, private, and academic institutions to share protocols, methods, bioinformatics tools, standards, and best practices.
To establish consistent data and metadata standards, including streamlined repository submission processes, sample prioritization criteria, and a framework for shared, privacy-compliant unique case identifiers.
To align with other national sequencing and bioinformatics networks, and to support global efforts to advance the use of standards and open data in public health.
Implications for Developing a Vaccine
As the virus continues to mutate and evolve, one question is whether a vaccine developed for one variant will work on others. However, several experts told The Washington Post that the SARS-CoV-2 coronavirus is relatively stable compared to viruses that cause seasonal flu (influenza).
“At this point, the mutation rate of the virus would suggest that the vaccine developed for SARS-CoV-2 would be a single vaccine, rather than a new vaccine every year like the flu vaccine,” Peter Thielen, a molecular biologist at the Johns Hopkins University Applied Physics Laboratory, told the Washington Post.
Nor, he said, is one variant likely to cause worse clinical outcomes than others. “So far, we don’t have any evidence linking a specific virus [strain] to any disease severity score. Right now, disease severity is much more likely to be driven by other factors.”
Rapid improvements in gene sequencing technology have made sequencing faster, more accurate, and cheaper. Thus, when the COVID-19 outbreak happened, there were many clinical laboratories around the world with the equipment, the staff, and the expertise to sequence the novel coronavirus and watch it mutate from generation to generation and from region to region around the globe. This capability has never been available in outbreaks prior to the current SARS-CoV-2 outbreak.
This and similar research initiatives are expected to increase the number of genetic markers that would be useful for creating clinical pathology laboratory tests and therapeutic drugs
Whole human genome sequencing continues to become faster, easier, cheaper, and more accurate. Because of these advances, the sheer number of human genomes being sequenced is skyrocketing. This huge increase in data is helping researchers unlock many new insights that, in turn, are fueling efforts to develop useful new medical laboratory tests and therapeutic drugs.
This is happening at the University of Washington (UW), where researchers using new genome sequencing technology are uncovering thousands of never-before-seen genetic variants. The application of “long read” gene sequencing technologies is allowing these researchers to identify previously unknown genetic variants that are made up of between 50 and 5,000 base pairs.
The discovery is important for two reasons. First, it could close existing gaps in the genome map. Second, it could help scientists identify new genomic variations that are closely associated with difficult-to-diagnose diseases. Of interest to pathologists and clinical laboratory professionals, such discoveries could point to expanded use of genetic testing for diagnosis and treatment of disease.
One impressive example of the fast pace of technology improvements is the Ion Torrent, which is a semiconductor-based DNA sequencer now capable of sequencing 100 million base pairs. That is ten times the sequencer’s capacity when it was launched just last December!
It was August of last year when Life Technologies (NASDAQ: LIFE) in Carlsbad, California, paid $375 million to acquire Ion Torrent Systems, a start-up with operations in Guilford, Connecticut, and South San Francisco. If Ion Torrent achieves certain technical milestones through 2012, it will earn another $350 million.
Disruptive technology drops the cost of DNA methylation sequencing by 100-fold
As sequencing of individual human genomes becomes more affordable and useful, the next big hurdle in genetic science will be to map the human epigenome. While DNA provides the blueprint for building a human being, the epigenome determines the details of how that blueprint is expressed in an individual. Pathologists and clinical laboratory administrators will want to track efforts to map and understand the human epigenome.
The epigenome is a set of chemical modifications to the genome that are not encoded in the DNA but which modulate how and when genes are expressed. Methylation is only one marker in the complex epigenetic map, but it is an important one. Methylation suppresses gene activity, and is thought to be responsible for suppressing some genes that prevent cancer. Though researchers are a long way from using this knowledge to cure cancer or other diseases, faster, more affordable DNA methylation sequencing will help move that research forward.