Proof-of-concept study ‘highlights that using AI to integrate different types of clinically informed data to predict disease outcomes is feasible,’ researchers say
Artificial intelligence (AI) and machine learning are—in stepwise fashion—making progress in demonstrating value in the world of pathology diagnostics. But human anatomic pathologists are generally required for a prognosis. Now, in a proof-of-concept study, researchers at Brigham and Women’s Hospital in Boston have developed a method that uses AI models to integrate multiple types of data from disparate sources to accurately predict patient outcomes for 14 different types of cancer.
The process also uncovered “the predictive bases of features used to predict patient risk—a property that could be used to uncover new biomarkers,” according to Genetic Engineering and Biotechnology News (GEN).
Should these research findings become clinically viable, anatomic pathologists may gain powerful new AI tools specifically designed to help them predict what type of outcome a cancer patient can expect.
“Experts analyze many pieces of evidence to predict how well a patient may do. These early examinations become the basis of making decisions about enrolling in a clinical trial or specific treatment regimens,” said Faisal Mahmood, PhD, in a Brigham press release. “But that means that this multimodal prediction happens at the level of the expert. We’re trying to address the problem computationally,” he added. Should they be proven clinically viable through additional studies, these findings could lead to useful tools that help anatomic pathologists and clinical laboratory scientists more accurately predict what type of outcomes cancer patients may experience.
AI-based Prognostics in Pathology and Clinical Laboratory Medicine
The team at Brigham constructed their AI model using The Cancer Genome Atlas (TCGA), a publicly available resource which contains data on many types of cancer. They then created a deep learning-based algorithm that examines information from different data sources.
Pathologists traditionally depend on several distinct sources of data, such as pathology images, genomic sequencing, and patient history, to diagnose various cancers and help develop prognoses.
For their research, Mahmood and his colleagues trained and validated their AI algorithm on 6,592 H&E (hematoxylin and eosin) whole slide images (WSIs) from 5,720 cancer patients. Molecular profile features, which included mutation status, copy-number variation, and RNA sequencing expression, were also input into the model to measure and explain relative risk of cancer death.
The scientists “evaluated the model’s efficacy by feeding it data sets from 14 cancer types as well as patient histology and genomic data. Results demonstrated that the models yielded more accurate patient outcome predictions than those incorporating only single sources of information,” states a Brigham press release.
“This work sets the stage for larger healthcare AI studies that combine data from multiple sources,” said Faisal Mahmood, PhD, Associate Professor, Division of Computational Pathology, Brigham and Women’s Hospital; and Associate Member, Cancer Program, Broad Institute of MIT and Harvard, in the press release. “In a broader sense, our findings emphasize a need for building computational pathology prognostic models with much larger datasets and downstream clinical trials to establish utility.”
Future Prognostics Based on Multiple Data Sources
The Brigham researchers also generated a research tool they dubbed the Pathology-omics Research Platform for Integrative Survival Estimation (PORPOISE). This tool serves as an interactive platform that can yield prognostic markers detected by the algorithm for thousands of patients across various cancer types.
The researchers believe their algorithm reveals another role for AI technology in medical care, but that more research is needed before their model can be implemented clinically. Larger data sets will have to be examined, and the researchers plan to use more types of patient information, such as radiology scans, family histories, and electronic medical records, in future tests of their AI technology.
“Future work will focus on developing more focused prognostic models by curating larger multimodal datasets for individual disease models, adapting models to large independent multimodal test cohorts, and using multimodal deep learning for predicting response and resistance to treatment,” the Cancer Cell paper states.
“As research advances in sequencing technologies, such as single-cell RNA-seq, mass cytometry, and spatial transcriptomics, these technologies continue to mature and gain clinical penetrance, in combination with whole-slide imaging, and our approach to understanding molecular biology will become increasingly spatially resolved and multimodal,” the researchers concluded.
Anatomic pathologists may find the Brigham and Women’s Hospital research team’s findings intriguing. An AI tool that integrates data from disparate sources, analyzes that information, and provides useful insights, could one day help them provide more accurate cancer prognoses and improve the care of their patients.
Researchers say their method can trace ancestry back 100,000 years and could lay groundwork for identifying new genetic markers for diseases that could be used in clinical laboratory tests
Cheaper, faster, and more accurate genomic sequencing technologies are deepening scientific knowledge of the human genome. Now, UK researchers at the University of Oxford have used this genomic data to create the largest-ever human family tree, enabling individuals to trace their ancestry back 100,000 years. And, they say, it could lead to new methods for predicting disease.
This new database also will enable genealogists and medical laboratory scientists to track when, where, and in what populations specific genetic mutations emerged that may be involved in different diseases and health conditions.
New Genetic Markers That Could Be Used for Clinical Laboratory Testing
As this happens, it may be possible to identify new diagnostic biomarkers and genetic indicators associated with specific health conditions that could be incorporated into clinical laboratory tests and precision medicine treatments for chronic diseases.
“We have basically built a huge family tree—a genealogy for all of humanity—that models as exactly as we can the history that generated all the genetic variation we find in humans today,” said Yan Wong, DPhil, an evolutionary geneticist at the Big Data Institute (BDI) at the University of Oxford, in a news release. “This genealogy allows us to see how every person’s genetic sequence relates to every other, along all the points of the genome.”
The BDI team overcame the major obstacle to tracing the origins of human genetic diversity when they developed algorithms to handle the massive amount of data created when combining genome sequences from many different databases. In total, they compiled the genomic sequences of 3,601 modern and eight high-coverage ancient people from 215 populations in eight datasets.
The ancient genomes included three Neanderthal genomes, a Denisovan genome, and a family of four people who lived in Siberia around 4,600 years ago.
The University of Oxford researchers noted in their news release that their method could be scaled to “accommodate millions of genome sequences.”
“This structure is a lossless and compact representation of 27 million ancestral haplotype fragments and 231 million ancestral lineages linking genomes from these datasets back in time. The tree sequence also benefits from the use of an additional 3,589 ancient samples compiled from more than 100 publications to constrain and date relationships,” the researchers wrote in their published study.
Wong believes his research team has laid the groundwork for the next generation of DNA sequencing.
“As the quality of genome sequences from modern and ancient DNA samples improves, the tree will become even more accurate and we will eventually be able to generate a single, unified map that explains the descent of all the human genetic variation we see today,” he said in the news release.
Developing New Clinical Laboratory Biomarkers for Modern Diagnostics
In a video illustrating the study’s findings, evolutionary geneticist Yan Wong, DPhil, a member of the BDI team, said, “If you wanted to know why some people have some sort of medical conditions, or are more predisposed to heart attacks or, for example, are more susceptible to coronavirus, then there’s a huge amount of that described by their ancestry because they’ve inherited their DNA from other people.”
Wohns agrees that the significance of their tree-recording methods extends beyond simply a better understanding of human evolution.
“[This study] could be particularly beneficial in medical genetics, in separating out true associations between genetic regions and diseases from spurious connections arising from our shared ancestral history,” he said.
The underlying methods developed by Wohns’ team could have widespread applications in medical research and lay the groundwork for identifying genetic predictors of disease risk, including future pandemics.
Clinical laboratory scientists will also note that those genetic indicators may become new biomarkers for clinical laboratory diagnostics for all sorts of diseases currently plaguing mankind.
Gene sequencing is enabling disease tracking in new ways that include retesting laboratory specimens from before the SARS-CoV-2 outbreak to determine when it arrived in the US
On February 26 of this year, nearly 200 executives and employees of neuroscience-biotechnology company Biogen gathered at the Boston Marriott Long Wharf hotel for their annual leadership conference. Unbeknownst to the attendees, by the end of the following day, dozens of them had been exposed to and become infected by SARS-CoV-2, the coronavirus that causes the COVID-19 illness.
Researchers now have hard evidence that attendees at this meeting returned to their communities and spread the infection. The findings of this study will be relevant to pathologists and clinical laboratory managers who are cooperating with health authorities in their communities to identify infected individuals and track the spread of the novel coronavirus.
This “superspreader” event has been closely investigated and has led to intriguing conclusions concerning the use of genetic sequencing to reveal vital information about the COVID-19 pandemic. Recent improvements in gene sequencing technology are giving scientists new ways to trace the spread of COVID-19 and other diseases, as well as a method for monitoring mutations and speeding research into various treatments and vaccines.
Genetic Sequencing Traces an Outbreak
“With genetic data, a record of our poor decisions is being captured in a whole new way,” Bronwyn MacInnis, PhD, Director of Pathogen Genomic Surveillance at the Broad Institute of MIT and Harvard, told The Washington Post (WaPo) during its analysis of the COVID-19 superspreading event. MacInnis is one of many Broad Institute, Harvard, MIT, and state of Massachusetts scientists who co-authored a study that detailed the coronavirus’ spread across Boston, including from the Biogen conference.
What they discovered is both surprising and enlightening. According to WaPo’s report, at least 35 new cases of the virus were linked directly to the Biogen conference, and the same strain was discovered in outbreaks in two homeless shelters in Boston, where 122 people were infected. The variant tracked by the Boston researchers was found in roughly 30% of the cases that have been sequenced in the state, as well as in Alaska, Senegal, and Luxembourg.
“The data reveal over 80 introductions into the Boston area, predominantly from elsewhere in the United States and Europe. We studied two superspreading events covered by the data, events that led to very different outcomes because of the timing and populations involved. One produced rapid spread in a vulnerable population but little onward transmission, while the other was a major contributor to sustained community transmission,” the researchers noted in their study abstract.
“The same two events differed significantly in the number of new mutations seen, raising the possibility that SARS-CoV-2 superspreading might encompass disparate transmission dynamics. Our results highlight the failure of measures to prevent importation into [Massachusetts] early in the outbreak, underscore the role of superspreading in amplifying an outbreak in a major urban area, and lay a foundation for contact tracing informed by genetic data,” they concluded.
Genetic Sequencing and Mutation Tracking
The use of genetic sequencing to trace the virus could inform measures to control the spread in new ways, but currently only about 0.33% of cases in the United States are being sequenced, MacInnis told WaPo, adding that not sequencing samples is “throwing away the crown jewels of what you really want to know.”
Another role that genetic sequencing is playing in this pandemic is in tracking viral mutations. One of the ways that pandemics worsen is when viruses mutate to become deadlier or more easily spread. Scientists are using genetic sequencing to monitor SARS-CoV-2 for such mutations.
Korber’s findings are important because the mutation the scientists identified appears to have a fitness advantage. “Our data show that, over the course of one month, the variant carrying the D614G Spike mutation became the globally dominant form of SARS-CoV-2,” they wrote. Additionally, the study noted, people infected with the mutated variant appear to have a higher viral load in their upper respiratory tracts.
Genetic Sequencing, the Race for Treatments, Vaccines, and Managing Future Pandemics
If, as Fauci and Morens predict, future pandemics are likely, improvements in gene sequencing and analysis will become even more important for tracing, monitoring, and suppressing outbreaks. Clinical laboratory managers will want to watch this closely, as medical labs that process genetic sequencing will, no doubt, be part of that operation.
Media reports in the United Kingdom cite bad timing and centralization of public health laboratories as reasons the UK is struggling to meet testing goals
Clinical pathologists and medical laboratories in the UK and the US function within radically different healthcare systems. However, both countries faced similar problems deploying widespread diagnostic testing for SARS-CoV-2, the novel coronavirus that causes COVID-19. And the differences between America’s private healthcare system and the UK’s government-run, single-payer system are exacerbating the UK’s difficulties expanding coronavirus testing to its citizens.
The Dark Daily reported in March that a manufacturing snafu had delayed distribution of a CDC-developed diagnostic test to public health laboratories. This meant virtually all testing had to be performed at the CDC, which further slowed testing. Only later that month was the US able to significantly ramp up its testing capacity, according to data from the COVID Tracking Project.
However, the UK has fared even worse, trailing Germany, the US, and other countries, according to reports in Buzzfeed and other media outlets. On March 11, the UK government established a goal of administering 10,000 COVID-19 tests per day by late March, but fell far short of that mark, The Guardian reported. The UK government now aims to increase this to 25,000 tests per day by late April.
This compares with about 70,000 COVID-19 tests per day in Germany, The Guardian reported, and about 130,000 per day in the US (between March 26 and April 14), according to the COVID Tracking Project.
What’s Behind the UK’s Lackluster COVID-19 Testing
In January, when the outbreak first hit, Public Health England (PHE) “began a strict program of contact tracing and testing potential cases,” Buzzfeed reported. But due to limited medical laboratory capacity and low supplies of COVID-19 test kits, the government changed course and de-emphasized testing, instead focusing on increased ICU and ventilator capacity. (Scotland, Wales, and Northern Ireland each have separate public health agencies and national health services.)
Another factor that has limited widespread COVID-19 testing is the country’s highly centralized system of public health laboratories, Buzzfeed reported. “This has limited its ability to scale and process results at the same speed as other countries, despite its efforts to ramp up capacity,” the report stated. Public Health England, which initially performed COVID-19 testing at one lab, has expanded to 12 labs. NHS laboratories also are testing for the SARS-CoV-2 coronavirus, PHE stated in “COVID-19: How to Arrange Laboratory Testing” guidance.
“Laboratories in this country have largely been merged, so we have a smaller number of larger [medical] laboratories,” she said. “The alternative is to have a single large testing site. From my perspective, it is more efficient to have a bigger testing site than dissipating our efforts into a lot of laboratories around the country.”
Writing in The Guardian, Paul Hunter, MB ChB MD, a microbiologist and Professor of Medicine at University of East Anglia, cites historic factors behind the testing issue. The public health labs, he explained, were established in 1946 as part of the National Health Service. At the time, they were part of the country’s defense against bacteriological warfare. They became part of the UK’s Health Protection Agency (now PHE) in 2003. “Many of the laboratories in the old network were shut down, taken over by local hospitals or merged into a smaller number of regional laboratories,” he wrote.
US Facing Different Clinical Laboratory Testing Problems
Meanwhile, a few medical laboratories in the US are now contending with a different problem: Unused testing capacity, Nature reported. For example, the Broad Institute of MIT and Harvard in Cambridge, Mass., can run up to 2,000 tests per day, “but we aren’t doing that many,” Stacey Gabriel, PhD, a human geneticist and Senior Director of the Genomics Platform at the Broad Institute, told Nature. Factors include supply shortages and incompatibility between electronic health record (EHR) systems at hospitals and academic labs, Nature reported.
cited the CDC’s narrow testing criteria, and a lack of supplies for collecting and analyzing patient samples—such as swabs and personal protective equipment—as reasons for the slowdown in testing at some clinical laboratories in the US.
Challenges Deploying Antibody Tests in UK
The UK has also had problems deploying serology tests designed to detect whether people have developed antibodies against the virus. In late March, Peacock told members of Parliament that at-home test kits for COVID-19 would be available to the public through Amazon and retail pharmacy chains, the Independent reported. Politico also reported that the government had ordered 3.5 million at-home test kits for COVID-19.
However, researchers at the University of Oxford who had been charged with validating the accuracy of the kits, reported on April 5 that the tests had not performed well and did not meet criteria established by the UK Medicines and Healthcare products Regulatory Agency (MHRA). “We see many false negatives (tests where no antibody is detected despite the fact we know it is there), and we also see false positives,” wrote Professor Sir John Bell, GBE, FRS, Professor of Medicine at the university, in a blog post. No test [for COVID-19], he wrote, “has been acclaimed by health authorities as having the necessary characteristics for screening people accurately for protective immunity.”
He added that it would be “at least a month” before suppliers could develop an acceptable COVID-19 test.
In the United States, the Cellex COVID-19 test is intended for use by medical laboratories. In addition, many research sites, academic medical centers, clinical laboratories, and in vitro diagnostics (IVD) companies in the US are working to develop and validate serological tests for COVID-19.
Within weeks, it is expected that a growing number of such tests will qualify for a Food and Drug Administration (FDA) Emergency Use Authorization (EUA) and become available for use in patient care.
Known as prime editing, the technique was developed by the scientists as a more accurate way to edit deoxyribonucleic acid (DNA). In a paper published in Nature, the authors claim prime editing has the potential to correct up to 89% of disease-causing genetic variations. They also claim prime editing is more powerful, precise, and flexible than CRISPR.
The research paper describes prime editing as a “versatile and precise genome editing method that directly writes new genetic information into a specified DNA site using a catalytically impaired Cas9 endonuclease fused to an engineered reverse transcriptase, programmed with a prime editing guide RNA (pegRNA) that both specifies the target site and encodes the desired edit.”
And a Harvard Gazette article states, “Prime editing differs from previous genome-editing systems in that it uses RNA to direct the insertion of new DNA sequences in human cells.”
Assuming further research and clinical studies confirm the viability of this technology, clinical laboratories would have a new diagnostic service line that could become a significant proportion of a lab’s specimen volume and test mix.
In that e-briefing we wrote that Liu “has led a team of scientists in the development of a gene-editing protein delivery system that uses cationic lipids and works on animal and human cells. The new delivery method is as effective as protein delivery via DNA and has significantly higher specificity. If developed, this technology could open the door to routine use of genome analysis, worked up by the clinical laboratory, as one element in therapeutic decision-making.”
Now, Liu has taken that development even further.
Cell Division Not Necessary
CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. It is considered the most advanced gene editing technology available. However, it has one drawback not found in prime editing: CRISPR relies on a cell’s ability to divide to generate desired alterations in DNA, whereas prime editing does not.
This means prime editing could be used to repair genetic mutations in cells that do not always divide, such as cells in the human nervous system. Another advantage of prime editing is that it does not cut both strands of the DNA double helix. This lowers the risk of making unintended, potentially dangerous changes to a patient’s DNA.
The researchers claim prime editing can eradicate long lengths of disease-causing DNA and insert curative DNA to repair dangerous mutations. These feats, they say, can be accomplished without triggering genome responses introduced by other forms of CRISPR that may be potentially harmful.
“Prime editors are more like word processors capable of searching for targeted DNA sequences and precisely replacing them with edited DNA strands,” Liu told NPR.
The scientists involved in the study have used prime editing to perform over 175 edits in human cells. In the test lab, they have succeeded in repairing genetic mutations that cause both sickle cell anemia (SCA) and Tay-Sachs disease, NPR reported.
“Prime editing is really a step—and potentially a significant step—towards this long-term aspiration of the field in which we are trying to be able to make just about any kind of DNA change that anyone wants at just about any site in the human genome,” Liu told News Medical.
Additional Research Required, but Results are Promising
Prime editing is very new and warrants further investigation. The researchers plan to continue their work on the technology by performing additional testing and exploring delivery mechanisms that could lead to human therapeutic applications.
“Prime editing should be tested and optimized in as many cell types as researchers are interested in editing. Our initial study showed prime editing in four human cancer cell lines, as well as in post-mitotic primary mouse cortical neurons,” Liu told STAT. “The efficiency of prime editing varied quite a bit across these cell types, so illuminating the cell-type and cell-state determinants of prime editing outcomes is one focus of our current efforts.”
Although further research and clinical studies are needed to confirm the viability of prime editing, clinical laboratories could benefit from this technology. It’s worth watching.