News, Analysis, Trends, Management Innovations for
Clinical Laboratories and Pathology Groups

Hosted by Robert Michel

The Problems with Ancestry DNA Analyses

Diagnostic medical laboratories may sequence DNA genetic tests correctly, but there are issues with how companies analyze the information

In 2017, some 12 million people paid to spit in a tube and have their genetic data analyzed, according to Technology Review. Many companies offer this type of DNA testing, and each of them works with one or more clinical laboratories to get the actual sequencing performed. For example, Ancestry.com, one of the largest direct-to-consumer genetic data testing companies, works with both Quest Diagnostics and Illumina.

In the case of Quest Diagnostics, the clinical laboratory company does the actual sequencing for Ancestry. But the analysis of the genetic data for an individual and its interpretation is performed by Ancestry’s team.

There are critics of the booming direct-to-consumer genetic testing business, but it’s not due to the quality of the sequencing. Rather, critics cite other issues, such as:

  • Privacy concerns;
  • How the physical samples are stored and used;
  • Who owns the data; and,
  • That this branch of genetics is an area of emerging study and not clearly understood.

What Does All That Genetic Data Mean?

The consumer DNA testing market was worth $359 million in 2017 and is projected to grow to $928 million by 2023, according to a report from Research and Markets. Those numbers represent a lot of spit, and an enormous amount of personal health information. As of now, roughly one in every 25 adults in the US has access to their genetic data. But what does all that data mean?

The answer depends, in large part, on whom you ask. Many reporters, scientists, and others have taken multiple DNA tests from different companies and received entirely different results. In some cases, the sequencing from one sample submitted to different companies for analysis has rendered dramatically different results.

“There is a wild-west aspect to all of this,” Erin Murphy, a New York University law professor and genetics specialist who focuses on privacy implications, told McClatchy. “It just takes one person in a family to reveal the genetic information of everyone in the family,” she notes. (Photo copyright: New York University.)

It’s All About the Database

Although some people purchase kits from multiple companies, the majority of people take just one test. Each person who buys genetic analysis from Ancestry, for example, consents to having his/her data become part of Ancestry’s enormous database, which is used to perform the analyses that people pay for. There are some interesting implications to how these databases are built.

First, they are primarily made up of paying customers, which means that the vast majority of genetic datasets in Ancestry’s database come from people who have enough disposable income to purchase the kit and analysis. It may not seem like an important detail, but it shows that the comparison population is not the same as the general population.

Second, because the analyses compare the sample DNA to DNA already in the database, it matters how many people from any given area have taken the test and are in the database. An article in Gizmodo describes one family’s experience with DNA testing and some of the pitfalls. The author quotes a representative from the company 23andMe as saying, “Different companies have different reference data sets and different algorithms, hence the variance in results. Middle Eastern reference populations [for example] are not as well represented as European, an industry-wide challenge.”

The same is true for any population where not many members have taken the test for a particular company. In an interview with NPR about trying to find information about her ancestry, journalist Alex Wagner described a similar problem, saying, “There are not a lot of Burmese people taking DNA tests … and so, the results that were returned were kind of nebulous.”

Wagner’s mother and grandmother both immigrated to the US from Burma in 1965, and when Wagner began investigating her ancestry, she, both of her parents, and her grandmother, all took tests from three different direct-to-consumer DNA testing companies. To Wagner’s surprise, her mother and grandmother both had results that showed they were Mongolian, but none of the results indicated Burmese heritage. In the interview she says that one of the biggest things she learned through doing all these tests was that “a lot of these DNA test companies [are] commercial enterprises. So, they basically purchase or acquire DNA samples on market-demand.”

As it turns out, there aren’t many Burmese people taking DNA tests, so there’s not much reason for the testing companies to pursue having a robust Burmese or even Southeast Asian database of DNA.

Who Owns Your Genetic Data?

As is often the case when it comes to technological advances, existing law hasn’t quite caught up with the market for ancestry DNA testing. There are some important unanswered questions, such as who owns the data that results from a DNA analysis?

An investigation conducted by the news organization McClatchy found that Ancestry does allow customers to request their DNA information be deleted from the company’s database, and that they can request their physical sample be destroyed as well. The author writes, “But it is a two-step process, and customers must read deep into the company’s privacy statement to learn how to do it. Requests for DNA data elimination can be made online, but the company asks customers to call its support center to request destruction of their biological sample.”

Another concern is hacking or theft. Ancestry and similar companies take steps to protect customers’ information, such as using barcodes rather than names and encryption when samples are sent to labs. Nevertheless, there was an incident in 2017 in which hackers infiltrated a website owned by Ancestry called RootsWeb. “The RootsWeb situation was certainly unfortunate,” Eric Heath, Ancestry’s Chief Privacy Officer, told McClatchy. He added that RootsWeb was a “completely separate system” from the Ancestry database that includes DNA information.

What We Don’t Know

The biggest pitfall for consumers may be that geneticists don’t know very much about DNA analysis. Adam Rutherford, PhD, is a British geneticist who was interviewed for the Gizmodo story. He said that the real problem with companies like Ancestry is that people have a basic, fundamental misunderstanding of what can be learned from a DNA test.

“They’re not telling you where your DNA comes from in the past. They’re telling you where on Earth your DNA is from today,” Rutherford told Gizmodo.

Science evolves, of course, and genetic testing has much evolving to do. The author of the Gizmodo piece writes, “It’s not that the science is bad. It’s that it’s inherently imperfect.” There aren’t any best-practices for analyzing DNA data yet, and companies like Ancestry aren’t doing much to make sure their customers understand that fact.

Nevertheless, issues surrounding genetic testing, the resulting data, and its storage, interpretation, and protection, continue to impact clinical laboratories and anatomic pathology groups.

—Dava Stewart

Related Information:

2017 Was the Year Consumer DNA Testing Blew Up

Quest Diagnostics and Ancestry DNA Collaborate to Expand Consumer DNA Testing

Illumina, Secret Giant of DNA Sequencing, Is Bringing Its Tech to the Masses

Global $928 Million Consumer DNA (Genetic) Testing Market 2018-2023 with 23andMe, Ancestry, Color Genomics and Gene by Gene Dominating

How DNA Testing Botched My Family’s Heritage, and Probably Yours, Too

A Journalist Seeks Out Her Roots but Finds Few Answers in the Soil

Ancestry Wants Your Spit, Your DNA and Your Trust. Should You Give Them All Three?

Could Clinical Laboratories and Pathologists Have a New Use for DNA as a Data Storage Technology?

Researchers in Boston are working to develop DNA as a low-cost, effective way to store data; could lead to new molecular technology industries outside of healthcare

Even as new insights about the role of DNA in various human diseases and health conditions continue to tumble out of research labs, a potential new use for DNA is emerging. A research team in Boston is exploring how to use DNA as a low-cost, reliable way to store and retrieve data.

This has implications for the nation’s clinical laboratories and anatomic pathology groups, because they are gaining experience in sequencing DNA, then storing that data for analysis and use in clinical care settings. If using DNA as a data storage methodology were to become reality, medical laboratories could be expected to have the skillsets, experience, and information technology infrastructure already in place to offer a DNA-based data storage service. This would be particularly true for patient data and healthcare data.

Finding a way to reduce the cost of data storage is a primary reason why scientists are looking at ways that DNA could be used as a data storage technology. These scientists and technology developers seek ways to alleviate the world’s over-crowded hard drives, cloud servers, and databases. They hope this can be done by developing technologies that store digital information in artificially-made versions of DNA molecules.

The research so far suggests DNA data storage could be used to store data more effectively than existing data storage solutions. If this proves true, DNA-based data storage technologies could play a key role in industries outside of healthcare.

If so, practical knowledge of DNA handling and storage would be critical to these companies’ success. In turn, this could present unique opportunities for medical laboratory professionals.

DNA Data Storage: Durable but Costly

Besides enormous capacity, DNA-based data storage technology offers durability and long shelf life in a compact footprint, compared to other data storage mediums.

“DNA has an information-storage density several orders of magnitude higher than any other known storage technology,” Victor Zhirnov, PhD, Chief Scientist and Director, Semiconductor Research Corporation, told Wired.

However, projected costs are quite high, due to the cost of writing the information into the DNA. Catalog Technologies Inc. of Boston thinks it has a solution.

Rather than producing billions of unique bits of DNA, as Microsoft did while developing its own DNA data storage solution, Catalog’s approach is to “cheaply generate large quantities of just a few different DNA molecules, none longer than 30 base pairs. Then [use] billions of enzymatic reactions to encode information into the recombination patterns of those prefab bits of DNA. Instead of mapping one bit to one base pair, bits are arranged in multidimensional matrices, and sets of molecules represent their locations in each matrix.”

The Boston-based company plans to launch an industrial-scale DNA data storage service using a machine that can write a terabyte of data per day by leveraging 500 trillion DNA molecules, according to Wired. Potential customers include the entertainment industry, federal government, and information technology developers.

Catalog is supported by $9 million from investors. However, it is not the only company working on this. Microsoft and other companies are reportedly working on DNA storage projects as well.

“It’s a new generation of information storage technology that’s got a million times the information density, compared to flash storage. You can shrink down entire data centers into shoeboxes of DNA,” Catalog’s CEO, Hyunjun Park, PhD (above center, between Chief Science Officer Devin Leake on left and Milena Lazova, scientist, on right), told the Boston Globe. (Photo copyright: Catalog.)

Microsoft, University of Washington’s Synthetic DNA Data Storage

Microsoft and researchers at the University of Washington (UW) made progress on their development of a DNA-based storage system for digital data, according to a news release. What makes their work unique, they say, is the large-scale storage of data in synthetic DNA (200 megabytes) along with the ability to retrieve that data as needed.

“Synthetic DNA is durable and can encode digital data with high density, making it an attractive medium for data storage. However, recovering stored data on a large-scale currently requires all the DNA in a pool to be sequenced, even if only a subset of the information needs to be extracted,” the researchers wrote in their paper published in Nature Biotechnology.

“Here, we encode and store 35 distinct files (over 200 megabytes of data) in more than 13-million DNA oligonucleotides and show that we can recover each file individually and with no errors, using a random access approach,” the researchers explained.

“Our work reduces the effort, both in sequencing capacity and in processing, to completely recover information stored in DNA,” Sergey Yekhanin, PhD, Microsoft Senior Researcher, told Digital Trends.

Successful research by Catalog, Microsoft, and others may soon lead to the launch of marketable DNA data storage services. And medical laboratory professionals who already know the code—the life code that is—will likely find themselves more marketable as well!

—Donna Marie Pocius

Related Information:

The Rise of DNA Data Storage

The Next Big Thing in Data Storage is Actually Microscopic

Catalog Hauls in $9 Million to Make DNA-Based Data Storage Commercially Viable

UW and Microsoft Researchers Achieve Random Access in Large-Scale DNA Data Storage

Random Access in Large-Scale DNA Data Storage

Microsoft and University of Washington Show DNA Can Store Data in Practical Way

Scientists Encode Malware with Synthesized DNA That Targets DNA Analysis Software Commonly Found in Gene Sequencers Used by Clinical Laboratories

Researchers demonstrated it was feasible to encode digital malware onto a strand of synthesized DNA and infect the gene sequencers and computer networks used by medical laboratories

As if anatomic pathology groups and clinical laboratory leaders don’t already have enough to think about, here comes a security vulnerability right out of a sci-fi thriller. Researchers at the University of Washington (UW) have used synthesized DNA to encode digital malware into a physical strand of DNA capable of establishing a remote connection to the computer network on which the sequenced DNA is read!

Stated differently, researchers have now demonstrated that it is possible for bad guys to hack into a medical laboratory’s instrument systems and computer network using a physical strand of synthesized DNA that is encoded with digital malware.

Another Threat to Clinical Laboratories, Pathology Groups?

Does this translate into an immediate security issue for medical laboratories? For now, the threat is only theoretical. While researchers did succeed, their study findings should provide some comfort to pathology groups or medical laboratories worried about the implications of DNA-based malware. The UW researchers published their findings at the 2017 USENIX Security Symposium.

Synthetic DNA Malware Exploit is More Proof-of-Concept than Immediate Threat

At its core, computer code is similar to DNA in that it is composed of a set number of states: zeroes and ones in binary code, and four bases in DNA. This led UW researchers to question whether they could translate the AGCT elements (adenine, guanine, cytosine, and thymine) of DNA into binary code capable of hacking DNA sequencers and accessing the information they contain.
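To make the idea of translating bases into binary concrete, here is a minimal Python sketch. It assumes a simple two-bits-per-base mapping (A=00, C=01, G=10, T=11); the article does not say which mapping the researchers used, and the function names are hypothetical, chosen only for illustration.

```python
# Assumed 2-bit mapping for illustration only; the article does not
# specify the mapping the UW researchers actually used.
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Decode a string of bases back into bytes (four bases per byte)."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    bits = "".join(BITS_FOR_BASE[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    print(bytes_to_dna(b"A"))    # CAAC
    print(dna_to_bytes("CAAC"))  # b'A'
```

Under this mapping every byte becomes four bases, so even a short program turns into a strand hundreds of bases long.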

In an article in The Atlantic, Tadayoshi Kohno, PhD, Short-Dooley Professor in the Department of Computer Science and Engineering at UW, who led the research team, noted that, “The present-day threat is very small, and people don’t need to lose sleep immediately. But we wanted to know what was possible and what the issues are down the line.”

Complexity of Engineering a DNA-Powered Computer Virus

To begin the process, researchers needed to create a specific DNA strand encoded with the exact sequence of bases that would later convert into their exploit. An article in ArsTechnica suggests this would be a challenge due to the physical properties of DNA’s double-helix design.

In the article, John Timmer, PhD, wrote, “DNA with Gs and Cs forms a stronger double-helix. Too many of them, and the strand won’t open up easily for sequencing. Too few, and it’ll pop open when you don’t want it to.”
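The G and C balance Timmer describes can be checked with a few lines of Python. This is a rough sketch: the 40% to 60% window below is an assumed, illustrative range, not a threshold taken from the study, and the function names are hypothetical.

```python
def gc_fraction(strand: str) -> float:
    """Return the fraction of bases in the strand that are G or C."""
    if not strand:
        raise ValueError("strand must not be empty")
    strand = strand.upper()
    return sum(base in "GC" for base in strand) / len(strand)


def gc_is_balanced(strand: str, low: float = 0.4, high: float = 0.6) -> bool:
    """Check GC content against an assumed, illustrative window."""
    return low <= gc_fraction(strand) <= high


if __name__ == "__main__":
    print(gc_fraction("ATGC"))     # 0.5
    print(gc_is_balanced("AATT"))  # False
```

A payload that fails a check like this would have to be re-encoded or padded, which is one reason the researchers could not simply translate arbitrary code into DNA.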

The study shows it took multiple attempts to find a DNA sequence that would both carry the malware code and withstand the synthesizing and sequencing processes. Even then, researchers needed an exploit for the software used on sequencers in clinical laboratories and other diagnostics providers to prove their theory. Study authors used their own modified version of an open-source sequencing software, adding an exploit they could target, instead of a version of the software already publicly in use.

Lee Organick (above left), Karl Koscher (center), and Peter Ney (right) worked with Luis Ceze and Tadayoshi Kohno, PhD, at the University of Washington to develop the DNA sequence containing the malware code. The researchers determined that it was feasible for the gene instruments used by clinical laboratories to be infected with the malware, which could then move to infect a clinical lab’s computer network. (Photo copyright: University of Washington.)

With their DNA strand synthesized and customized software in place, researchers still faced challenges getting the code to trigger. “With reads randomly appearing in an FASTQ file,” the researchers noted, “we would expect the modified program to be exploited 37.4% of the time.”

As with genetic code, the binary code of a program is highly sensitive to errors. Any misread bases or splitting of the code resulted in failure. When sequencers only read a few hundred bases at a time, ensuring the code doesn’t hit one of these splits is a challenge.

One unique difference between binary and genetic code also caused trouble—genetic sequences aren’t direction dependent, while binary sequences are. If the code is read in reverse, it won’t execute properly.
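The direction problem can be illustrated with a short Python sketch. A sequencer may report a strand as written or as its reverse complement, and the two strings differ, so a payload read in the opposite orientation no longer matches the intended code. The function name is hypothetical.

```python
# Translation table pairing each base with its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the strand as it would be read from the opposite direction."""
    return strand.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    read = "GATTACA"
    print(reverse_complement(read))  # TGTAATC
    # Applying the operation twice recovers the original read.
    print(reverse_complement(reverse_complement(read)) == read)  # True
```

For ordinary genetic analysis both orientations carry the same information, but a byte-for-byte exploit only works in one of them.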

Future Concerns for Clinical Laboratories and Genetic Researchers

Today, the threat to medical laboratories and the sensitive data generated by sequencing is minor. However, tomorrow that threat could be more common.

In a WIRED article on the subject, Jason Callahan, Chief Information Security Officer for Illumina, stated, “This is interesting research about potential long-term risks. We agree with the premise of the study—that this does not pose an imminent threat and is not a typical cyber security capability.”

Don Rule, founder of Translational Software, agrees. When asked about the threat posed to clinical laboratories, he said, “… if you have to pre-introduce the hack in the analytics program, this is a pretty circuitous way to take over a computer. I can see how it is feasible and right now Norton Antivirus is not looking for viruses encoded in the AGCT code set, but we are right not to lose a lot of sleep over it.”

However, as genetic sequencing becomes a common part of medicine, attackers might have increased reason to disrupt services or intercept data. The UW researchers cite “important domains like forensics, medicine, and agriculture” as potential targets.

While their successful attack was highly engineered, their research into open-source sequencing software revealed a range of common security weaknesses. Many clinical laboratories and anatomic pathology groups also run proprietary analysis software or use hardware with embedded software.

They recommend that medical laboratories work to centralize software updates and create ways to verify data and patches through digital signatures or other secure measures.

Already, genetic researchers take care to avoid synthesizing potentially dangerous sequences, and to contain tests and data. But this study shows that not all threats come from within the research or clinical laboratory environment. Both engineers of sequencing technology and hardware—and the medical laboratories using them—will need to optimize operations and monitor trends closely to see how security issues evolve alongside sequencing capabilities.

—Jon Stone

 

Related Information:

These Scientists Took Over a Computer by Encoding Malware in DNA

Computer Security and Privacy in DNA Sequencing

Computer Security, Privacy, and DNA Sequencing: Compromising Computers with Synthesized DNA, Privacy Leaks, and More

This Speck of DNA Contains a Movie, a Computer Virus, and an Amazon Gift Card

Researchers Encode Malware in DNA, Compromise DNA Sequencing Software

Biohackers Encoded Malware in a Strand of DNA

The Ultimate Virus: How Malware Encoded in Synthesized DNA Can Compromise a Computer System

Researchers Hacked into DNA and Encoded It with Malware

Biomarker Trends Are Auspicious for Pathologists and Clinical Laboratories

Few anatomical tools hold more potential to revolutionize the science of diagnostics than biomarkers, and pathologists and medical laboratories will be first in line to put these powerful tools to use helping patients with chronic diseases

There’s good news for both anatomic pathology laboratories and medical laboratories worldwide. Large numbers of clinically-useful new biomarkers continue to be validated and are in development for use in diagnostic tests and therapeutic drugs.

Clinical laboratories rely on biomarkers for pathology tests and procedures that track and identify infections and disease during the diagnostic process. Thus, trends that highlight the critical role biomarkers play in medical research are particularly relevant to pathology groups and medical laboratories.

Here’s an overview of critical trends in biomarker research and development that promise to improve diagnosis and treatment of chronic disease.

Emerging Use of Predictive Biomarkers in Precision Medicine

Recent advances in whole genome sequencing are aiding the development of highly accurate diagnostics and treatment plans built on predictive biomarkers, which in turn improve precision medicine (PM).

PM involves an approach to healthcare that is fine-tuned to each patient’s unique condition and physiology, as opposed to the conventional one-size-fits-all approach, which looks at the best options for the average person without examining variations in individual patients.

Predictive biomarkers identify individuals who will most likely respond either favorably or unfavorably to a drug or course of treatment. This improves a patient’s chance to receive benefit or avoid harm and goes to the root of Precision Medicine. (Image copyright: Pennside Partners.)

The National Institutes of Health (NIH) defines PM as “an emerging approach for disease treatment and prevention that considers individual variability in genes, environment, and lifestyle for each person.” It gives physicians and researchers the ability to more accurately forecast which prevention tactics and treatments will be optimal for certain patients.

Combining Drugs for Specific Outcomes

Cancer treatment will be complemented by the utilization of combination drugs that include two or more active pharmaceutical ingredients. Many drug trials are currently being performed to determine which combination of drugs will be the most favorable for specific cancers.

Combination drugs should become crucial in different cancer treatments, such as immunotherapy, which involves treating disease by inducing, enhancing, or suppressing an immune response.

Biomarkers associated with certain cancers may enable physicians and researchers to determine which combination drugs will work best for each individual patient.

Developing More Effective Diagnostics

In vitro diagnostics (IVDs) are poised for massive growth in market share. A report by Allied Market Research states the worldwide IVD market will reach $81.3 billion by 2022. It noted that IVD techniques test bodily fluids, such as blood, urine, stool, and sputum, to detect disease, conditions, and infections.

Allied Market Research expects growth of the IVD market to result from these factors:

  • Increases in chronic and infectious diseases;
  • An aging population;
  • Growing knowledge of rare diseases; and
  • Increasing use of personalized medicines.

The capability to sequence the human genome is further adding to improvements in diagnostic development. Pharmaceutical companies can generate diagnostic counterparts alongside related drugs.

Biopsies from Fluid Sources

Millions of dollars have been spent on developing liquid biopsies that detect cancer from simple blood draws. The National Cancer Institute Dictionary of Cancer Terms defines a liquid biopsy as “a test done on a sample of blood to look for cancer cells from a tumor that are circulating in the blood or for pieces of DNA from tumor cells that are in the blood.”

At present, liquid biopsies are typically used only in the treatment and monitoring of cancers already diagnosed. Companies such as Grail, a spinoff of Illumina, and Guardant Health are striving to develop ways to make liquid biopsies a crucial part of cancer detection in the early stages, increasing long-term survival rates.

“The holy grail in oncology has been the search for biomarkers that could reliably signal the presence of cancer at an early stage,” said Dr. Richard Klausner, Senior Vice President and Chief Medical Officer at Grail.

Grail hopes to market a pan-cancer screening test that will measure circulating nucleic acids in the blood to detect the presence of cancer in patients who are experiencing no symptoms of the disease.

Clinical Trials and Precision Medicine

The Precision Medicine Initiative (PMI), launched by the federal government in 2015, investigates ways to create tailor-made treatments and prevention strategies for patients based on their distinctive attributes.

Two ongoing studies involved in PMI research are MATCH and TAPUR:

  1. MATCH (Molecular Analysis for Therapy Choice) is a clinical trial run by The National Cancer Institute. The researchers are studying tumors to learn if they possess gene abnormalities that are treatable by known drugs.
  2. TAPUR (Targeted Agent and Profiling Utilization Registry) is a non-randomized clinical trial being conducted by the American Society of Clinical Oncology (ASCO). The researchers are chronicling the safety and efficacy of cancer drugs currently on the market.

New Tools for Pathologists and Clinical Laboratories

The attention and funds given to these types of projects expand the possibilities of being able to develop targeted therapies and treatments for patients. Such technological advancements could someday enable physicians to view and treat cancer as a product of specific gene mutations and not just a disease.

These trends will be crucial and favorable for clinical laboratories in the future. As tests and treatments become unique to individual patients, pathologists and clinical laboratories will be on the frontlines of providing advanced services to healthcare professionals.

—JP Schlingman

Related Information:

5 Trends Being Impacted by Biomarkers

Immuno-Oncology Stories of 2016

Bristol-Myers Leads Immune-Oncology Race but Merck, Astrazeneca and Roche Still Have Contenders

Five Companies to Watch in the Liquid Biopsy Field

Illumina Spinoff GRAIL to Trial Liquid Biopsies for Early Detection of Cancer

Illumina Forms New Company to Enable Early Cancer Detection via Blood-Based Screening

A to Z List of Cancer Drugs

Personalized Medicine and the Role of Predictive vs. Prognostic Markers

Understanding Prognostic versus Predictive Biomarkers

NCI-MATCH Trial (Molecular Analysis for Therapy Choice)

Six Months of Progress on the Precision Medicine Initiative

Strata Oncology, in Tandem with Thermo Fisher, Offers 100,000 Free Genetic Cancer Tests to Patients as Part of New Clinical Laboratory Business Model

Startup medical company proposes to offer free genetic testing to 100,000 advanced cancer patients to increase their chances for optimum therapeutic results

Strata Oncology (Strata), a precision medicine company based in Ann Arbor, Mich., plans to provide free genetic testing to advanced cancer patients beginning in 2017. The company raised $12 million and teamed up with Thermo Fisher Scientific to complete the large-scale tumor sequencing project.

Using tumor tissue, Strata’s gene test sequences DNA and RNA to identify patients with certain gene mutations. This information is used to determine which cancer medications would be best for each patient. Patients are then referred to the appropriate pharmaceutical company for drug therapy and, potentially, for customized clinical trials.

Strata states on their website that their goal is to “dramatically expand late-stage cancer patients’ access to tumor sequencing and precision medicine trials, and to accelerate the approval and availability of breakthrough cancer medicines.”
