Researchers at Penn State identified 160,000 ‘transcription initiation machines’ throughout the human genome
DNA “dark matter” may have something in common with comedian Rodney Dangerfield, who liked to say, “I don’t get no respect!” As many pathologists know, for years the human exome has been the focus of most research. This is the 1% of the human genome that contains the genes that produce proteins and perform other useful functions.
Meanwhile, the remaining 99% of the human genome, sometimes called “junk DNA” and generally known as dark matter, got relatively little attention from researchers. But that is changing. At Pennsylvania State University, a research team has discovered that both coding RNA and noncoding RNA (the so-called genomic dark matter) originate at the same types of locations along the human genome.
Finding Where Disease Traits Reside
The research team was headed by B. Franklin Pugh, Ph.D., who holds Penn State’s Willaman Chair in Molecular Biology, and postdoctoral graduate Bryan Venters, Ph.D., who is now on the faculty of Vanderbilt University Medical Center. The researchers believe their findings may eventually pinpoint where complex-disease traits reside, since the genetic origins of many diseases lie outside the coding region of the genome, noted a report published by Genetic Engineering News (GEN).
Only Small Portion of the Genome Is Involved in Coding
“The human genome is pervasively transcribed, yet only a small fraction is coding,” wrote the study authors. “Widespread coding and noncoding transcription across the human genome arises from discrete transcription initiation complexes assembled at four core promoter elements [BREu, TATA, BREd, and INR, in constrained positions].”
Discovery of Genetic Transcription’s Initiation Machine
Attempting to understand the role of dark matter, scientists previously had looked directly at RNA to find where transcription begins. Pugh and Venters decided, instead, to determine the location of the human chromosomal proteins that initiate transcription of noncoding RNA, GEN reported.
“We took this approach because so many RNAs are rapidly destroyed soon after they are made, and this makes them hard to detect,” explained Pugh. “So rather than look for the RNA product of transcription, we looked for the ‘initiation machine’ that makes the RNA. This machine assembles RNA polymerase, which goes on to make RNA, which goes on to make a protein,” he added.
Pugh and Venters were amazed to find 160,000 transcription initiation machines across the human genome. By contrast, humans have about 30,000 genes. Noting that only about 5% of these machines are associated with messenger RNA (mRNA) genes, Pugh said in the GEN article, “This finding is even more remarkable, given that fewer than 10,000 of these machines actually were found right at the site of genes. Since most genes are turned off in cells, it is understandable why they are typically devoid of the initiation machinery.”
Mystery Remains about the Function of 150,000 Initiation Machines
The other 150,000 initiation machines remain shrouded in mystery. “In the early days, these fragments of RNA were generally dismissed as irrelevant since they did not code protein,” Pugh continued, noting that it was easy to disregard these fragments as useless because they lacked a feature called polyadenylation, which protects RNA from being destroyed. But, he observed, “These initiation machines that are not associated with genes were clearly active, since they were making RNA and aligned with fragments of RNA discovered by other scientists.”
The Answer to ‘Missing Heritability’ May Lie in Dark Matter
Pugh believes this research may help solve the problem of “missing heritability,” a concept that describes how most human traits, including propensity for diseases, cannot be accounted for by individual genes, but seem to originate in regions of the genome that do not code for proteins.
The researchers validated their astonishing findings by determining that these noncoding initiation machines recognized the same DNA sequences as the ones at coding genes. This suggests that they have a specific origin and, similar to coding genes, their production is regulated, pointed out the GEN report.
“These noncoding RNAs have been called the ‘dark matter’ of the genome because, just like the dark matter of the universe, they are massive in terms of coverage [of the human genome],” Pugh pointed out. “However, they are difficult to detect and no one knows exactly what they are doing or why they are there. Now at least we know that they are real, and not just ‘noise’ or ‘junk.’ Of course, the next step is to answer the question, ‘what, in fact, do they do?’” he said.
Dark Matter Discoveries Could Upend Ideas on How DNA and RNA Work
Most scientists realize that Mother Nature’s creations are both simple and elegant, which raises the question: “Why create a human genome with 3 billion base pairs, if only 1% of it is useful?” Now that the scientific community is beginning to focus on the other 99% of the genome, researchers’ discoveries about noncoding RNA are likely to reveal new insights about the function of dark matter.
This groundbreaking research is likely to have a significant impact on how clinical laboratories perform genetic testing. At the moment, first-mover and early-adopter institutions are beginning to acquire next-generation gene sequencing systems and use them for clinical purposes. But, as noted above, most of this activity is focused on sequencing the human exome. As new research discoveries emerge, pathologists and clinical chemists performing this type of testing will increasingly need to consider how dark matter regions are involved in disease processes.
—by Patricia Kirk