Could Clinical Laboratories and Pathologists Have a New Use for DNA as a Data Storage Technology?

Researchers in Boston are working to develop DNA as a low-cost, effective way to store data; could lead to new molecular technology industries outside of healthcare

Even as new insights about the role of DNA in various human diseases and health conditions continue to tumble out of research labs, a potential new use for DNA is emerging. A research team in Boston is exploring how to use DNA as a low-cost, reliable way to store and retrieve data.

This has implications for the nation’s clinical laboratories and anatomic pathology groups, because they are gaining experience in sequencing DNA, then storing that data for analysis and use in clinical care settings. If using DNA as a data storage medium were to become a reality, medical laboratories could be expected to have the skillsets, experience, and information technology infrastructure already in place to offer a DNA-based data storage service. This would be particularly true for patient data and healthcare data.

Reducing the cost of data storage is a primary reason why scientists are looking at ways that DNA could be used as a data storage technology. These scientists and technology developers seek ways to relieve the world’s overcrowded hard drives, cloud servers, and databases. They hope this can be done by developing technologies that store digital information in artificially made versions of DNA molecules.

The research so far suggests DNA data storage could be used to store data more effectively than existing data storage solutions. If this proves true, DNA-based data storage technologies could play a key role in industries outside of healthcare.

If so, practical knowledge of DNA handling and storage would be critical to the success of companies in those industries. In turn, this could present unique opportunities for medical laboratory professionals.

DNA Data Storage: Durable but Costly

Besides enormous capacity, DNA-based data storage technology offers durability and long shelf life in a compact footprint, compared to other data storage media.

“DNA has an information-storage density several orders of magnitude higher than any other known storage technology,” Victor Zhirnov, PhD, Chief Scientist and Director, Semiconductor Research Corporation, told Wired.

Projected costs are quite high, however, because writing information into DNA is expensive. Catalog Technologies Inc. of Boston thinks it has a solution.

Rather than producing billions of unique bits of DNA, as Microsoft did while developing its own DNA data storage solution, Catalog’s approach is to “cheaply generate large quantities of just a few different DNA molecules, none longer than 30 base pairs. Then [use] billions of enzymatic reactions to encode information into the recombination patterns of those prefab bits of DNA. Instead of mapping one bit to one base pair, bits are arranged in multidimensional matrices, and sets of molecules represent their locations in each matrix.”
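To make the idea in that quote more concrete, here is a toy Python sketch of combinatorial addressing. It is not Catalog's actual method, whose details are not given here. It assumes a hypothetical scheme in which each bit position in a multidimensional matrix is named by one prefabricated component per dimension, and a bit is read as 1 when the identifier for its position is present in the pool. The function names and the `layer_sizes` parameter are illustrative assumptions.

```python
from itertools import islice, product
from math import prod


def addresses(layer_sizes):
    """Yield every matrix position as a tuple with one component index per dimension."""
    return product(*(range(size) for size in layer_sizes))


def encode(bits, layer_sizes):
    """Return the set of identifiers (positions) whose bit is 1.

    Each identifier stands in for one assembled molecule built from
    prefabricated components (hypothetical model, for illustration only).
    """
    if len(bits) > prod(layer_sizes):
        raise ValueError("not enough matrix positions for this many bits")
    return {addr for addr, bit in zip(addresses(layer_sizes), bits) if bit}


def decode(identifiers, layer_sizes, n_bits):
    """Rebuild the bit list: a position reads as 1 if its identifier is present."""
    return [int(addr in identifiers) for addr in islice(addresses(layer_sizes), n_bits)]
```

In this toy model, three dimensions of two components each (six prefabricated pieces in total) can address eight bit positions, and three dimensions of ten components each (30 pieces) can address 1,000. That is the appeal of reusing a small set of molecules rather than synthesizing a unique strand for every piece of data.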

The Boston-based company plans to launch an industrial-scale DNA data storage service using a machine that can write a terabyte of data per day by leveraging 500 trillion DNA molecules, according to Wired. Potential customers include the entertainment industry, the federal government, and information technology developers.
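For a rough sense of scale, the figures reported above can be turned into a write rate and a molecule-to-bit ratio. The short Python sketch below assumes a decimal terabyte (10^12 bytes) and a 24-hour day. The ratio is simple division and says nothing about how the molecules are actually allocated inside the machine.

```python
TERABYTE_BYTES = 10**12          # assumption: decimal terabyte
MOLECULES = 500 * 10**12         # 500 trillion molecules, as reported
SECONDS_PER_DAY = 24 * 60 * 60


def write_rate_mb_per_s(bytes_per_day=TERABYTE_BYTES):
    """Average write rate in megabytes per second for one day's output."""
    return bytes_per_day / SECONDS_PER_DAY / 10**6


def molecules_per_bit(molecules=MOLECULES, bytes_written=TERABYTE_BYTES):
    """Plain ratio of molecules used to bits written."""
    return molecules / (bytes_written * 8)
```

Under these assumptions, a terabyte per day works out to roughly 11.6 megabytes per second, and 500 trillion molecules per terabyte comes to 62.5 molecules per bit.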

Catalog is supported by $9 million from investors. However, it is not the only company working on this. Microsoft and other companies are reportedly working on DNA storage projects as well.

“It’s a new generation of information storage technology that’s got a million times the information density, compared to flash storage. You can shrink down entire data centers into shoeboxes of DNA,” Catalog’s CEO, Hyunjun Park, PhD (above center, between Chief Science Officer Devin Leake on left and Milena Lazova, scientist, on right), told the Boston Globe. (Photo copyright: Catalog.)

Microsoft, University of Washington’s Synthetic DNA Data Storage

Microsoft and researchers at the University of Washington (UW) have made progress on their development of a DNA-based storage system for digital data, according to a news release. What makes their work unique, they say, is the large-scale storage of data in synthetic DNA (200 megabytes) along with the ability to retrieve the data as needed.

“Synthetic DNA is durable and can encode digital data with high density, making it an attractive medium for data storage. However, recovering stored data on a large-scale currently requires all the DNA in a pool to be sequenced, even if only a subset of the information needs to be extracted,” the researchers wrote in their paper published in Nature Biotechnology.

“Here, we encode and store 35 distinct files (over 200 megabytes of data) in more than 13 million DNA oligonucleotides and show that we can recover each file individually and with no errors, using a random access approach,” the researchers explained.

“Our work reduces the effort, both in sequencing capacity and in processing, to completely recover information stored in DNA,” Sergey Yekhanin, PhD, Microsoft Senior Researcher, told Digital Trends.
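The random access idea described above can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not the researchers' actual pipeline: it assumes each file is given its own short address tag at the start of every strand, that data is mapped at two bits per nucleotide, and that each strand carries a small index so the pieces can be put back in order. The tag sequences, chunk size, and function names are hypothetical.

```python
BASES = "ACGT"      # two bits per nucleotide: A=00, C=01, G=10, T=11
CHUNK = 8           # payload bytes per strand (hypothetical)
INDEX_BYTES = 2     # bytes used for the strand's position within its file


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four nucleotides, most significant bits first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_dna; expects a length that is a multiple of four."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def store(pool: list, tag: str, data: bytes) -> None:
    """Split data into strands of the form tag + index + payload and add them to the pool."""
    for n, start in enumerate(range(0, len(data), CHUNK)):
        index = bytes_to_dna(n.to_bytes(INDEX_BYTES, "big"))
        pool.append(tag + index + bytes_to_dna(data[start:start + CHUNK]))


def retrieve(pool: list, tag: str) -> bytes:
    """Select only strands carrying the tag, reorder them by index, and decode."""
    index_len = INDEX_BYTES * 4
    strands = [s[len(tag):] for s in pool if s.startswith(tag)]
    strands.sort(key=lambda s: int.from_bytes(dna_to_bytes(s[:index_len]), "big"))
    return b"".join(dna_to_bytes(s[index_len:]) for s in strands)
```

In this simulation, two files stored under different tags can be mixed and shuffled in one pool, and calling `retrieve` with a single tag returns only that file's bytes. The string filter stands in for whatever laboratory step selects the wanted strands, which is the point of random access: the rest of the pool does not need to be read.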

Successful research by Catalog, Microsoft, and others may soon lead to the launch of marketable DNA data storage services. And medical laboratory professionals who already know the code (the code of life, that is) will likely find themselves more marketable as well!

—Donna Marie Pocius

Related Information:

The Rise of DNA Data Storage

The Next Big Thing in Data Storage is Actually Microscopic

Catalog Hauls in $9 Million to Make DNA-Based Data Storage Commercially Viable

UW and Microsoft Researchers Achieve Random Access in Large-Scale DNA Data Storage

Random Access in Large-Scale DNA Data Storage

Microsoft and University of Washington Show DNA Can Store Data in Practical Way