Cloud-based genetic research networks that facilitate collaboration by stakeholders worldwide may solve the most difficult disease challenges, including a cure for cancer

Coming soon to a clinical laboratory near you: cloud-based “big data” genome analysis! A new industry is emerging dedicated to accepting, storing, and analyzing vast quantities of data generated by next-generation gene sequencing and whole human-genome sequencing.

There are already examples of academic departments of pathology and laboratory medicine that have outsourced the storage and annotation of whole human genomes sequenced from tissue specimens collected from cancer patients. The annotated genomes are returned to the referring pathologists for analysis.

In response to this fast-growing market opportunity, new companies are popping up regularly. The new cloud-based computing industry offers customized computing power and storage for genomic research. These companies provide the resources needed by researchers engaged in analyzing genomic sequences of large population cohorts for specific disease variants, noted a Singularity HUB blog.

CHARGE Demonstrates Efficiency of Big-Data Sequencing in the Cloud

Take the example of DNAnexus, a platform-as-a-service (PaaS) company. It provides the massive computing and storage resources needed by the Human Genome Sequencing Center at Baylor College of Medicine to analyze the volume of genomic data produced by a project known as the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE).

Baylor is one of five members of the international CHARGE consortium, a sequencing initiative aimed at linking the risk of certain diseases, particularly heart disease, to specific genetic variants. This big data project is supported by the DNAnexus platform and housed in the cloud by Amazon Web Services.

CHARGE researchers have sequenced the genomes of more than 14,000 individuals, including 3,751 whole genomes and 10,771 exomes (the 1% of the genome that is believed to contain mutations involved in genetic diseases), noted the Singularity HUB blog.

Jeffrey Reid is a former researcher at Baylor’s Human Genome Sequencing Center who is now Director and Head of Genome Informatics for the Regeneron Genetics Center at Regeneron Pharmaceuticals in Tarrytown, New York. “Having access to this much data was unique. Many institutions do not have the local compute resources and infrastructure to support large-scale analysis projects like this one,” Reid told an audience at the American Society of Human Genetics conference in October 2013.

He noted that the volume of data recently generated by CHARGE forced researchers to move to the cloud, where they now work on the DNAnexus platform.

Cloud Computing is the Future for Large-Scale Population Studies

Compared to other cloud uses, big data projects like CHARGE take fuller advantage of the cloud’s potential, suggested Alan Louie, an Analyst at IDC Health Insights. “It’s largely an infrastructure and agility issue for genomics,” he told Singularity HUB. “I think it will be the status quo both now and in the future.”

DNAnexus launched its cloud-based service in 2010, with the firepower to efficiently handle the volume of gene-sequencing data generated by projects like CHARGE. “Many large-scale population studies to date have been limited by a lack of necessary computing power,” Richard Daly, DNAnexus CEO, told Singularity HUB. “This is a real hindrance in realizing the full promise of genomic medicine.”

The DNAnexus cloud platform processes CHARGE’s gene-sequencing data 12 times faster than Baylor’s internal computing infrastructure could have, even if this work had dominated campus servers for an entire month. The DNAnexus platform, which is compliant with HIPAA (Health Insurance Portability and Accountability Act), also facilitated collaboration on the sequencing project by 300 CHARGE researchers at five institutions.

CEO Richard Daly (pictured) launched DNAnexus two years ago based on the knowledge that most academic and medical institutions lack the in-house computing power needed by researchers to sequence and analyze large volumes of genetic information. (Photo copyright DNAnexus.)

Can the Cloud Help Researchers Find a Cure for Cancer?

Similar companies include SolveBio and GeneStack, which also provide cloud-based computing power that is lacking at most research institutions. SolveBio founder Mark Kaganovich, an entrepreneur and doctoral candidate in Genetics at Stanford University, believes cloud technology will facilitate solving medical science’s biggest challenges, including a cure for cancer.


In an article published by TechCrunch, Kaganovich pointed out that the rapidly improving scale and accuracy of DNA sequencing are fueling big leaps in the understanding of genetics. Other contributing factors are the progress in imaging and the identification of proteins, metabolites, and other small human molecules. Meanwhile, as the cost of data acquisition drops, biomedical companies and academics are expected to begin using unbiased observational correlations to generate meaningful hypotheses about the genetic causes of disease.

Looking for Useful Patterns in Healthcare Big Data

“The result is the opportunity to create pools of comprehensive data for patients and healthy people where researchers can integrate data and find patterns,” said Kaganovich. “We simply haven’t had anything like this before.”

As new types of data appear, the cloud will make data integration possible, useful, and fast, Kaganovich said, adding that: “Data and algorithms can be distributed to people who specialize in different fields.” He further noted that the cloud can create a value network where researchers, doctors, and entrepreneurs specializing in certain kinds of data gathering and interpretation can interface effectively and meaningfully.

Entrepreneur and SolveBio founder Mark Kaganovich (pictured) is a doctoral candidate in Genetics at Stanford University. He suggests that the power of cloud computing and its ability to facilitate collaboration among genetic research interests will lead to solutions for medicine’s biggest challenges, including a cure for cancer. (Photo copyright SolveBio.)

“The true value of the data will begin to be unlocked as it is analyzed in the context of all the other available data, whether in public clouds or private, secure silos,” continued Kaganovich. “This massively integrated analysis will speed the transition from bleeding edge experimentation to standards-as-solutions and data interpretations move from early-adopter stage to the good-enough stage where they will compete on ease-of-use, speed, and cost.”

The Cloud May Expand Pathologists’ Diagnostic Resources 

In the future, such cloud-based research networks would provide pathologists worldwide immediate access to genetic information of practical diagnostic use in patient care. It also could open up opportunities for physicians of every stripe to share information and collaborate with researchers in finding cures for some of the most dreaded diseases.

—by Patricia Kirk

Related Information:

DNA Sequencing Is Moving to the Cloud

The Cloud Will Cure Cancer

New York’s Mount Sinai Medical Center Using Big Data to Improve Clinical Care