New artificial intelligence model agrees with interpretations of human medical technologists and microbiologists with extraordinary accuracy
Microbiology laboratories will be interested in news from the University of Brescia in Italy, where researchers reportedly have developed a deep learning model that can visually identify and analyze bacterial species on culture plates, in close agreement with the interpretations of medical technologists.
They initially trained and tested the system to digitally identify pathogens associated with urinary tract infections (UTIs). UTIs account for a large volume of clinical laboratory microbiology testing.
The system, known as DeepColony, uses hierarchical artificial intelligence technology. The researchers say hierarchical AI is better suited to complex decision-making than other approaches, such as generative AI.
In their Nature paper, the researchers explained that microbiologists use conventional methods to visually examine culture plates that contain bacterial colonies. The scientists hypothesize which species of bacteria are present, then confirm their hypotheses “by regrowing samples from each colony separately and then employing mass spectroscopy techniques.”
However, DeepColony—which was designed for use with clinical laboratory automation systems—looks at high-resolution digital scans of cultured plates and attempts to identify the bacterial strains and analyze them in much the same way a microbiologist would. For example, it can identify species based on their appearance and determine which colonies are suitable for analysis, the researchers explained.
“Working on a large stream of clinical data, and a complete set of 32 pathogens, the proposed system is capable of effectively assisting plate interpretation with a surprising degree of accuracy in the widespread and demanding framework of urinary tract infections,” the study authors wrote. “Moreover, thanks to the rich species-related generated information, DeepColony can be used for developing trustworthy clinical decision support services in laboratory automation ecosystems from local to global scale.”
“Compared to the most common solutions based on single convolutional neural networks (CNN), multi-network architectures are attractive in our case because of their ability to fit into contexts where decision-making processes are stratified into a complex structure,” wrote the study’s lead author Alberto Signoroni, PhD, Associate Professor of Computer Science, University of Brescia, and his research team in their Nature paper. “The system must be designed to generate useful and easily interpretable information and to support expert decisions according to safety-by-design and human-in-the-loop policies, aiming at achieving cost-effectiveness and skill-empowerment respectively.” Microbiologists and clinical laboratory managers will want to follow the further development of this technology.
How Hierarchical AI Works
Writing on LinkedIn, patent attorney and self-described technology expert David Cain, JD, of Hauptman Ham, LLP, explained that hierarchical AI systems “are structured in layers, each with its own distinct role yet interconnected in a way that forms a cohesive whole. These systems are significant because they mirror the complexity of human decision-making processes, incorporating multiple levels of analysis and action. This multi-tiered approach allows for nuanced problem-solving and decision-making, akin to a seasoned explorer deftly navigating through a multifaceted terrain.”
DeepColony, the researchers wrote, consists of multiple convolutional neural networks (CNNs) that exchange information and cooperate with one another. The system is structured into five levels—labeled 0 through 4—each handling a different part of the analysis:
At level 0, the system determines the number of bacterial colonies and their locations on the plate.
At level 1, the system identifies “good colonies,” meaning those suitable for further identification and analysis.
At level 2, the system assigns each good colony to a bacterial species “based on visual appearance and growth characteristics,” the researchers wrote, referring to the determination as being “pathogen aware, similarity agnostic.”
The CNN used at this stage was trained by using images of 26,213 isolated colonies comprising 32 bacterial species, the researchers wrote in their paper. Most came from clinical laboratories, but some were obtained from the American Type Culture Collection (ATCC), a repository of biological materials and information resources available to researchers.
At level 3, the system attempts to improve accuracy by looking at the larger context of the plate. The goal here is to “determine if observed colonies are similar (pure culture) or different (mixed cultures),” the researchers wrote, describing this step as “similarity aware, pathogen agnostic.” This enables the system to recognize variants of the same strain, the researchers noted, and has the effect of reducing the number of strains identified by the system.
At this level, the system uses two “Siamese CNNs,” which were trained with a dataset of 200,000 image pairs.
Then, at level 4, the system “assesses the clinical significance of the entire plate,” the researchers added. Each plate is labeled as:
“Positive” (significant bacterial growth),
“No significant growth” (negative), or
“Contaminated,” meaning it has three or more “different colony morphologies without a particular pathogen that is prevalent over the others,” the researchers wrote.
If a plate is labeled as “positive,” it can be “further evaluated for possible downstream steps,” using MALDI-TOF mass spectrometry or antimicrobial susceptibility testing, the researchers stated.
“This decision-making process takes into account not only the identification results but also adheres to the specific laboratory guidelines to ensure a proper supportive interpretation in the context of use,” the researchers wrote.
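The level 4 labeling described above can be illustrated with a short Python sketch. It is not the researchers' implementation. The function name, the colony-count threshold for significant growth, and the prevalence fraction are all hypothetical values chosen for illustration, since the article says only that a real system follows each laboratory's own guidelines.

```python
from collections import Counter
from typing import Sequence

# Hypothetical thresholds, chosen only for illustration. A real laboratory
# would take these from its own guidelines.
MIN_COLONIES_FOR_GROWTH = 10   # below this, treat growth as not significant
PREVALENCE_FRACTION = 0.5      # share of colonies one group needs to be "prevalent"


def label_plate(colony_groups: Sequence[str]) -> str:
    """Assign a plate-level label from per-colony morphology group names.

    Each entry in colony_groups stands for one "good colony" and names the
    morphology group it was assigned to by the earlier levels.
    """
    if len(colony_groups) < MIN_COLONIES_FOR_GROWTH:
        return "no significant growth"

    counts = Counter(colony_groups)
    top_count = counts.most_common(1)[0][1]
    has_prevalent = top_count / len(colony_groups) > PREVALENCE_FRACTION

    # Three or more different morphologies with no prevalent one
    # matches the article's description of a contaminated plate.
    if len(counts) >= 3 and not has_prevalent:
        return "contaminated"
    return "positive"
```

Under these assumed thresholds, a plate with 20 colonies in a single group is labeled positive, while a plate with three equally sized groups is labeled contaminated.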
Nearly 100% Agreement with Medical Technologists
To gauge DeepColony’s accuracy, the researchers tested it on a dataset of more than 5,000 urine cultures from a US laboratory. They then compared its analyses with those of human medical technologists who had analyzed the same samples.
Agreement was 99.2% for no-growth cultures, 95.6% for positive cultures, and 77.1% for contaminated or mixed growth cultures, the researchers wrote.
The lower agreement for contaminated cultures was due to “a deliberately precautionary behavior, which is related to ‘safety by design’ criteria,” the researchers noted.
Lead study author Alberto Signoroni, PhD, Associate Professor of Computer Science, University of Brescia, wrote in Nature that many of the plates identified by medical technologists as “contaminated” were labeled as “positive” by DeepColony. “We maximized true negatives while allowing for some false positives, so that DeepColony [can] focus on the most relevant or critical cases,” he said.
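For readers who want to see how per-category agreement figures of this kind can be computed, here is a minimal Python sketch. It assumes agreement is measured as the share of plates in each technologist-assigned category that the model labeled the same way. The article does not spell out the exact calculation, so this definition and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def agreement_by_category(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return, for each technologist label, the fraction of plates
    on which the model assigned the same label.

    Each pair is (technologist_label, model_label) for one plate.
    """
    totals: Dict[str, int] = defaultdict(int)
    matches: Dict[str, int] = defaultdict(int)
    for tech_label, model_label in pairs:
        totals[tech_label] += 1
        if tech_label == model_label:
            matches[tech_label] += 1
    return {label: matches[label] / totals[label] for label in totals}


# Toy data for illustration only, not taken from the study.
toy_pairs = [
    ("positive", "positive"),
    ("positive", "positive"),
    ("contaminated", "positive"),      # model errs toward "positive"
    ("contaminated", "contaminated"),
    ("negative", "negative"),
]
toy_result = agreement_by_category(toy_pairs)
```

In the toy data, one of the two plates that technologists called contaminated is labeled positive by the model, which lowers agreement for that category only. This is the same pattern the researchers attribute to the system's precautionary design.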
Will DeepColony replace medical technologists in clinical laboratories any time soon? Not likely. But the University of Brescia study indicates the direction AI in healthcare is headed, with high accuracy and increasing speed. The day may not be far off when pathologists and microbiologists regularly employ AI algorithms to diagnose disease.
Radboud University researchers fear oncology, molecular biology, pharmacology, and other cell-centric medical research efforts are at risk due to verification that at least 30,000 studies published in 33,000 scientific journals included data derived from misidentified or contaminated cell lines
Many research findings that underpin the science behind various diagnostic technologies used regularly by clinical laboratories and anatomic pathology groups may not be valid. This is because a large number of published studies may have used misidentified or contaminated cell lines.
Biomedical scientists have long known that many research papers report on the wrong cells because of cell line misidentification. And yet, few studies have measured the true scope of the problem. Until now. Researchers at Radboud University in the Netherlands have determined that this problem may have influenced the findings of thousands of published research studies, on which many other research studies were then built.
Clinical laboratories and anatomic pathology groups use assays and diagnostic tests that were developed from these research studies. Identifying how many published papers contain findings that cannot be reproduced could therefore affect how and when physicians order certain medical laboratory tests and how far they rely on the results.
Cancer research is based on cell line studies as well. Thus, it may prove necessary to restudy existing published findings and revise them as appropriate. In turn, these new findings might change how and when some cancer tests are ordered and the results interpreted.
“We considered a reference to this original article as a good proxy for the usage of a cell line,” the researchers noted in their study published in the journal PLOS ONE. “Since typically the original papers are focused on reporting the establishment of the cell line only.”
The researchers focused on cell lines misidentified because of contamination by HeLa cells, a widely used line of “immortalized” cells. HeLa cells have been used in scientific research for decades. They were the first mass-producible cells that could be used in vitro, making them highly desirable for biomedical research.
However, the process of creating immortalized cells involves mutation, during which contamination can be introduced by other cells. As a result, an immortalized cell line can be identified as one type of cell when it is in fact another.
Research scientists have been aware of this problem for about as long as immortalized cells have been in use. They attempt to take it into account when completing their analyses, though not always successfully.
The Radboud researchers found 32,655 records of primary literature based on contaminated cell lines. To arrive at that figure, they cross-referenced the ICLAC Register of Misidentified Cell Lines with a range of databases to determine if articles were available for each of the 451 cell lines listed in Table One of the ICLAC Register.
With this information, they searched published articles in the Web of Science database using cell line identifiers. They noted both primary literature and any citation report entries for each cell line.
The researchers noted in their published study, “As we only searched for cell lines known to be misidentified, this constitutes a conservative estimate of the scale of contamination in the primary literature. Moreover, to avoid false positives, we excluded several cell lines, such as the ones with non-unique identifiers or the cell lines for which verified stock is still in circulation.”
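The filtering and counting steps described above can be sketched in a few lines of Python. This illustrates the general approach only and is not the researchers' code. The CellLine fields, the function name, and the placeholder cell line names and article IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class CellLine:
    """One register entry. Field names are hypothetical."""
    name: str
    unique_identifier: bool       # False if the name is ambiguous in searches
    verified_stock_exists: bool   # True if authentic stock still circulates


def primary_contaminated_articles(
    register: Iterable[CellLine],
    articles_by_cell_line: Dict[str, Set[str]],
) -> Set[str]:
    """Collect unique article IDs that mention a known misidentified line.

    Lines with non-unique identifiers or with verified stock still in
    circulation are skipped, mirroring the exclusions the authors describe.
    """
    contaminated: Set[str] = set()
    for line in register:
        if not line.unique_identifier or line.verified_stock_exists:
            continue
        contaminated.update(articles_by_cell_line.get(line.name, set()))
    return contaminated


# Placeholder data for illustration only.
toy_register = [
    CellLine("LINE-A", unique_identifier=True, verified_stock_exists=False),
    CellLine("LINE-B", unique_identifier=False, verified_stock_exists=False),
    CellLine("LINE-C", unique_identifier=True, verified_stock_exists=True),
    CellLine("LINE-D", unique_identifier=True, verified_stock_exists=False),
]
toy_articles = {
    "LINE-A": {"art-1", "art-2"},
    "LINE-B": {"art-3"},
    "LINE-C": {"art-4"},
    "LINE-D": {"art-2", "art-5"},
}
toy_primary = primary_contaminated_articles(toy_register, toy_articles)
```

Using a set means an article that mentions two misidentified lines is counted once. Skipping the ambiguous and still-verified lines is what makes the resulting count a conservative estimate.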
Their estimate for secondary contaminated literature, meaning papers that cite the primary articles, is larger still. “In total, we can conservatively estimate the citations to the primary contaminated primary literature at over 500,000, excluding self-citations,” the authors noted in their PLOS ONE article. “Thereby leaving traces in a substantial share of the biomedical literature.” They concluded, “… the amount of research potentially building on false grounds remains worrisome.”
Impact of Contaminated Cell Lines on Research, Clinical Laboratory Communities
Many of the assays and diagnostic tests performed by clinical laboratories and pathology groups were developed using cell line research. Should further scrutiny into the ability to duplicate and verify study findings fail to produce positive outcomes, it might call into question the validity and appropriate use of these tests.
For the research community, these findings represent yet another call to promote accountability and define standards for verifying authenticity of cell lines to further strengthen research findings.
The Radboud researchers ranked the number of contaminated articles they discovered by research area. Top affected areas include:
Oncology
Molecular Biology
Pharmacology
Cell Biology
Immunology
The distribution of contaminated primary literature over the research areas as defined by Web of Science. Only the 25 most affected research areas are included. (Graphic copyright: PLOS ONE.)
Addressing the Problem of Cell Line Contamination and Misidentification
Adapting the ever-growing body of published medical literature to reflect the known misidentifications, as well as the possibility of invalid results, will be a major undertaking. Ultimately, resolving this problem could require changes to practices and procedures currently used by research facilities and medical laboratories.
While authenticating cell lines adds to the cost of research projects, the money spent on research that is later invalidated by misidentified cell lines is far greater.
In a 2015 Retraction Watch article, Leonard P. Freedman, PhD, President, Global Biological Standards Institute, noted, “An NIH RePORT search identified 9,000 active projects using cell lines, totaling $3.7 billion. Required use of authentication techniques would affect over $900 million in research dollars annually.”
Additionally, failure to adopt authentication as a part of standard operations brings other consequences. “A 2004 survey reported that just one-third of laboratories authenticate their cell lines,” Freedman noted. “10 years later, a Sigma-Aldrich survey found that only 37% of respondents ‘validate the purity and identity before first use’ of cell lines. Understanding the existing barriers that prevent implementation of universal cell authentication is central to changing this sad state of affairs.”
Mixed Recommendations for Fixing Inaccurate Published Studies
Of course, none of this will change the vast body of archived literature that might contain errors due to misidentification. Recommendations for addressing this aspect of the problem vary. The Radboud study authors suggest posting notes on any previously published articles stating that misidentified cell lines were used.
However, in a STAT article, Ivan Oransky, MD, and Adam Marcus, Managing Editor, Gastroenterology and Endoscopy News, co-founders of Retraction Watch, recommend more severe measures. “When we polled readers of Retraction Watch last December about the issue, 55% said journals should correct papers known to describe contaminated or misidentified cell lines, and more than 40% said retraction was the right choice.”
As cell lines continue to power the innovations of modern biomedical research, the Radboud study will surely heighten concerns about cell-line authentication and the research findings that depend on it. For pathology groups and medical laboratories, staying abreast of these developments will help ensure data validity and reduce reputational and liability concerns.