News, Analysis, Trends, Management Innovations for
Clinical Laboratories and Pathology Groups

Hosted by Robert Michel


University Hospitals Birmingham Claims Its New AI Model Detects Certain Skin Cancers with Nearly 100% Accuracy

But dermatologists and other cancer doctors still say AI is not ready to operate without oversight by clinical physicians

Dermatopathologists and the anatomic pathology profession in general have a new example of how artificial intelligence’s (AI’s) ability to detect cancer with accuracy comparable to a trained pathologist has greatly improved. At the latest European Academy of Dermatology and Venereology (EADV) Congress, scientists presented a study in which researchers with the University Hospitals Birmingham NHS Foundation Trust used an AI platform to assess 22,356 people over 2.5 years.

According to an EADV press release, the AI software demonstrated a “100% (59/59 cases identified) sensitivity for detecting melanoma—the most serious form of skin cancer.” The AI software also “correctly detected 99.5% (189/190) of all skin cancers and 92.5% (541/585) of pre-cancerous lesions.”  

“Of the basal cell carcinoma cases, a single case was missed out of 190, which was later identified at a second read by a dermatologist ‘safety net.’ This further demonstrates the need to have appropriate clinical oversight of the AI,” the press release noted.

AI is being utilized more frequently within the healthcare industry to diagnose and treat a plethora of illnesses. This recent study performed by scientists in the United Kingdom demonstrates that new AI models can be used to accurately diagnose some skin cancers, but that “AI should not be used as a standalone detection tool without the support of a consultant dermatologist,” the press release noted.

“The role of AI in dermatology and the most appropriate pathway are debated,” said Kashini Andrew, MBBS, MSc (above), Specialist Registrar at University Hospitals Birmingham NHS Foundation Trust. “Further research with appropriate clinical oversight may allow the deployment of AI as a triage tool. However, any pathway must demonstrate cost-effectiveness, and AI is currently not a stand-alone tool in dermatology. Our data shows the great promise of AI in future provision of healthcare.” Clinical laboratories and dermatopathologists in the United States will want to watch the further development of this AI application. (Photo copyright: LinkedIn.)

How the NHS Scientists Conducted Their Study

Researchers tested their algorithm for almost three years to determine its ability to detect cancerous and pre-cancerous growths. A group of dermatologists and medical photographers entered patient information into their algorithm and trained it to detect abnormalities. The collected data came from 22,356 patients with suspected skin cancers and included photos of known cancers.

The scientists then repeatedly recalibrated the software to ensure it could distinguish between non-cancerous lesions and potential cancers or malignancies. Dermatologists then reviewed the final data from the algorithm and compared it to diagnoses from health professionals.

“This study has demonstrated how AI is rapidly improving and learning, with the high accuracy directly attributable to improvements in AI training techniques and the quality of data used to train the AI,” said Kashini Andrew, MBBS, MSc, Specialist Registrar at University Hospitals Birmingham NHS Foundation Trust, and co-author of the study, in the EADV press release.

Freeing Up Physician Time

The EADV Congress where the NHS researchers presented their findings took place in October in Berlin. The first model of their AI software was tested in 2021 and that version was able to detect:

  • 85.9% (195 out of 227) of melanoma cases,
  • 83.8% (903 out of 1,078) of all skin cancers, and
  • 54.1% (496 out of 917) of pre-cancerous lesions.

After fine-tuning, the latest version of the algorithm was even more promising, with results that included the detection of:

  • 100% (59 out of 59) of melanoma cases,
  • 99.5% (189 out of 190) of all skin cancers, and
  • 92.5% (541 out of 585) of pre-cancerous lesions.

“The latest version of the software has saved over 1,000 face-to-face consultations in the secondary care setting between April 2022 and January 2023, freeing up more time for patients that need urgent attention,” Andrew said in the press release.

Still, the researchers admit that AI should not be used as the only detection method for skin cancers.

“We would like to stress that AI should not be used as a standalone tool in skin cancer detection and that AI is not a substitute for consultant dermatologists,” stated Irshad Zaki, B Med Sci (Hons), Consultant Dermatologist at University Hospitals Birmingham NHS Foundation Trust and one of the authors of the study, in the press release.

“The role of AI in dermatology and the most appropriate pathway are debated. Further research with appropriate clinical oversight may allow the deployment of AI as a triage tool,” said Andrew in the press release. “However, any pathway must demonstrate cost-effectiveness, and AI is currently not a stand-alone tool in dermatology. Our data shows the great promise of AI in future provision of healthcare.”

Two People in the US Die of Skin Cancer Every Hour

According to the Skin Cancer Foundation, skin cancer is the most common cancer in the United States as well as the rest of the world. More people in the US are diagnosed with skin cancer every year than all other cancers combined.

When detected early, the five-year survival rate for melanoma is 99%, but more than two people in the US die of skin cancer every hour. At least one in five Americans will develop skin cancer by the age of 70 and more than 9,500 people are diagnosed with the disease every day in the US.

The cost of treating skin cancers in the United States is estimated at $8.1 billion annually, with approximately $3.3 billion of that amount for melanoma and the remaining $4.8 billion for non-melanoma skin cancers.

More research is needed before University Hospitals Birmingham’s new AI model can be used clinically in the diagnosis of skin cancers. However, its level of accuracy is unprecedented in AI diagnostics and marks a noteworthy step forward for diagnostic AI tools that clinical laboratories and dermatopathologists may one day use.

—JP Schlingman

Related Information:

The App That is 100% Effective at Spotting Some Skin Cancers—as Study Shows Melanoma No Longer the Biggest Killer

AI Software Shows Significant Improvement in Skin Cancer Detection, New Study Shows

Skin Cancer Facts and Statistics

Google DeepMind Says Its New Artificial Intelligence Tool Can Predict Which Genetic Variants Are Likely to Cause Disease

AMA Issues Proposal to Help Circumvent False and Misleading Information When Using Artificial Intelligence in Medicine

UCLA’s Virtual Histology Could Eliminate Need for Invasive Biopsies for Some Skin Conditions and Cancers

IT Experts Demonstrate How AI and Computer Microphones Can Be Used to Figure Out Passwords and Break into Customer Accounts

Clinical laboratories and pathology groups should be alert to this new digital threat; telehealth sessions and video conferencing calls are particularly vulnerable to acoustic AI attacks

Banks may be the first to get hit by a new form of hacking because of all the money they hold in deposit accounts, but experts say healthcare providers—including medical laboratories—are comparably lucrative targets because of the value of patient data. The point of this hacking spear is artificial intelligence (AI) with increased capabilities to penetrate digital defenses.

AI is developing rapidly. Are healthcare organizations keeping up? The hackers sure are. An article from GoBankingRates titled, “How Hackers Are Using AI to Steal Your Bank Account Password,” reveals startling new AI capabilities that could enable bad actors to compromise information technology (IT) security and steal from customers’ accounts.

Though the article covers how the AI could conduct cyberattacks on bank information, similar techniques can be employed to gain access to patients’ protected health information (PHI) and clinical laboratory databases as well, putting all healthcare consumers at risk.

The new AI cyberattack employs an acoustic Side Channel Attack (SCA). An SCA is an attack enabled by leakage of information from a physical computer system. The “acoustic” SCA listens to keystrokes through a computer’s microphone to guess a password with 95% accuracy.

That’s according to a UK study published on IEEE Xplore in the proceedings of the IEEE European Symposium on Security and Privacy Workshops, titled, “A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards.”

“With recent developments in deep learning, the ubiquity of microphones and the rise in online services via personal devices, acoustic side channel attacks present a greater threat to keyboards than ever,” wrote UK study authors Joshua Harrison, MEng, Durham University; Ehsan Toreini, University of Surrey; and Maryam Mehrnezhad, PhD, University of London.

Hackers could be recording keystrokes during video conferencing calls as well, where an accuracy of 93% is achievable, the authors added.
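
For readers curious how such an attack works in practice, the sketch below shows the general pattern the researchers describe: each recorded keystroke is converted into a mel spectrogram and classified by a neural network. This is a minimal illustration only, with a toy model and an assumed label set; it is not the authors’ actual pipeline, which used a much deeper image-classification architecture.

```python
# Minimal sketch of acoustic keystroke classification: turn each recorded
# keystroke clip into a mel spectrogram and classify it with a small CNN.
# Toy model and assumed label set -- not the published study's architecture.
import torch
import torch.nn as nn
import torchaudio.transforms as T

N_KEYS = 36  # hypothetical label set, e.g., a-z plus 0-9

class KeystrokeClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Turn a raw audio clip into a mel spectrogram "image."
        self.mel = T.MelSpectrogram(sample_rate=44100, n_mels=64)
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8),
        )
        self.head = nn.Linear(32 * 8 * 8, N_KEYS)

    def forward(self, waveform):                # (batch, samples)
        spec = self.mel(waveform).unsqueeze(1)  # (batch, 1, n_mels, frames)
        return self.head(self.conv(spec).flatten(1))  # per-key logits

model = KeystrokeClassifier()
clip = torch.randn(1, 44100)                # stand-in for one keystroke clip
predicted_key = model(clip).argmax(dim=1)   # index of the most likely key
```

Trained on enough labeled keystroke recordings, a classifier of this general kind is what lets an attacker reconstruct typed passwords from sound alone.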

This nefarious technological advance could spell trouble for healthcare security. Busy healthcare facilities, clinical laboratories, and telehealth appointments could all potentially be compromised by acoustic SCA attacks.

“The ubiquity of keyboard acoustic emanations makes them not only a readily available attack vector, but also prompts victims to underestimate (and therefore not try to hide) their output,” wrote Joshua Harrison, MEng (above), and his team in their IEEE Xplore paper. “For example, when typing a password, people will regularly hide their screen but will do little to obfuscate their keyboard’s sound.” Since computer keyboards and microphones in healthcare settings like hospitals and clinical laboratories are completely ubiquitous, the risk that this AI technology will be used to invade and steal patients’ protected health information is high. (Photo copyright: CNBC.)

Why Do Hackers Target Healthcare?

Ransomware attacks in healthcare are costly and dangerous. According to InstaMed, a healthcare payments and billing company owned by J.P. Morgan, healthcare data breaches increased by 29.5% in 2021, with breaches costing over $9 million. And beyond the financial implications, these attacks put sensitive patient data at risk.

Healthcare can be seen as one of the most desirable markets for hackers seeking sensitive information. As InstaMed points out, credit card hacks are usually quickly figured out and stopped. However, “medical records can contain multiple pieces of personally identifiable information. Additionally, breaches that expose this type of data typically take longer to uncover and are harder for an organization to determine in magnitude.”

With AI advancing at such a high rate, healthcare organizations may be unable to adapt older network systems quickly—leaving them vulnerable.

“Legacy devices have been an issue for a while now,” Alexandra Murdoch, medical data analyst at GlobalData PLC, told Medical Device Network. “Usually big medical devices, such as imaging equipment or MRI machines are really expensive and so hospitals do not replace them often. So as a result, we have in the network these old devices that can’t really be updated, and because they can’t be updated, they can’t be protected.”

Vulnerabilities of Telehealth

In “Penn Medicine Study Shows Telemedicine Can Cut Employer Healthcare Costs by 25%,” Dark Daily reported on a study conducted by the Perelman School of Medicine at the University of Pennsylvania (Penn Medicine) which suggested there could be significant financial advantages for hospitals that conduct telehealth visits. This, we projected, would be a boon to clinical laboratories that perform medical testing for telemedicine providers.

But telehealth, according to the UK researchers, may also be one way hackers get past safeguards and into critical hospital systems.

“When trained on keystrokes recorded using the video-conferencing software Zoom, an accuracy of 93% was achieved, a new best for the medium. Our results prove the practicality of these side channel attacks via off-the-shelf equipment and algorithms,” the UK researchers wrote in IEEE Xplore.

“[AI] has worrying implications for the medical industry, as more and more appointments go virtual, the implications of deepfakes is a bit concerning if you only interact with a doctor over a Teams or a Zoom call,” David Higgins, Senior Director at information security company CyberArk, told Medical Device Network.

Higgins elaborated on why healthcare is a highly targeted industry for hackers.

“For a credit card record, you are looking at a cost of one to two dollars, but for a medical record, you are talking much more information because the gain for the purposes of social engineering becomes very lucrative. It’s so much easier to launch a ransomware attack, you don’t even need to be a coder, you can just buy ransomware off of the dark web and use it.”

Steps Healthcare Organizations Should Take to Prevent Cyberattacks

Hackers will do whatever they can to get their hands on medical records because stealing them is so lucrative. And this may only be the beginning, Higgins noted.

“I don’t think we are going to see a slowdown in attacks. What we are starting to see is that techniques to make that initial intrusion are becoming more sophisticated and more targeted,” he told Medical Device Network. “Now with things like AI coming into the mix, it’s going to become much harder for the day-to-day individual to spot a malicious email. Generative AI is going to fuel more of that ransomware and sadly it’s going to make it easier for more people to get past that first intrusion stage.”

To combat these attacks, patient data needs to be encrypted, devices updated, and medical staff well-trained to spot cyberattacks before they get out of hand. The SCA attacks demonstrated on bank accounts could easily transfer to attacks on healthcare organizations’ patient records.

Clinical laboratories, anatomic pathology groups, and other healthcare facilities would be wise to invest in cybersecurity, training for workers, and updated technology. The hackers are going to stay on top of the technology; healthcare leaders need to be one step ahead of them.

—Ashley Croce

Related Information:

How Hackers Are Using AI to Steal Your Bank Account Password

A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards

AI Can Steal Passwords with 95% Accuracy by ‘Listening’ to Keystrokes, Alarming Study Finds

New ‘Deep Learning Attack’ Deciphers Laptop Keystrokes with 95% Accuracy

Can A.I. Steal Your Password? Study Finds 95% Accuracy by Listening to Keyboard Typing

Ransomware in Healthcare: What You Need to Know

Hospital 2040: How Healthcare Cybercrime is Predicted to Escalate

30 Crucial Cybersecurity Statistics (2023): Data, Trends and More

Penn Medicine Study Shows Telemedicine Can Cut Employer Healthcare Costs by 25%

Orchid Health Announces Release of First Commercially-Available Whole Genome Sequencing Service for Certain Diseases in Preimplantation Embryos

Clinical laboratory managers should note that this company’s new diagnostic offering involving screening embryos for specific genetic conditions is not without controversy

Is the world ready for whole genome sequencing (WGS) of preimplantation embryos to help couples undergoing in vitro fertilization (IVF) treatments know if their embryos have potential genetic health problems? Orchid Health, a clinical preimplantation genetic testing (PGT) laboratory in San Francisco that conducts genetic screening, believes the answer is yes! But the cost is high, and the process is not without controversy.

According to an article in Science, Orchid’s service—a sequencing of the whole human genome of preimplantation embryos at $2,500 per embryo tested—“will look not just for single-gene mutations that cause disorders such as cystic fibrosis, but also more extensively for medleys of common and rare gene variants known to predispose people to neurodevelopmental disorders, severe obesity, and certain psychiatric conditions such as schizophrenia.”

However, Science also noted that some genomics researchers “claim the company inappropriately uses their data to generate some of its risk estimates,” adding that the “Psychiatric Genomics Consortium (PGC), an international group of more than 800 researchers working to decode the genetic and molecular underpinnings of mental health conditions, says Orchid’s new test relies on data [PGC] produced over the past decade, and that the company has violated restrictions against the data’s use for embryo screening.”

There are some who assert that a whole genome sequence of an embryo—given today’s state of genetic technology and knowledge—could generate information that cannot be interpreted accurately in ways that help parents and doctors make informed prenatal testing decisions. At the same time, criticisms expressed by the PGC raise reasonable points.

Perhaps this is a sign of the times. Orchid Health is the latest genetic testing company looking to get ahead of its competitors with new diagnostic offerings. Meanwhile, knowledgeable and credible experts question the appropriateness of this testing, given the genetic knowledge that exists today.


“This is a major advance in the amount of information parents can have,” Orchid’s founder and CEO Noor Siddiqui (above) told CNBC. “The way that you can use that information is really up to you, but it gives a lot more control and confidence into a process that, for all of history, has just been totally left to chance.” Should Orchid Health’s analysis prove useful, pediatricians could order further clinical laboratory prenatal testing to confirm and diagnose potential genetic diseases for parents. (Photo copyright: General Assembly.)

Orchid Receives World-class Support

Despite the pushback from some genetic researchers, Orchid has attracted several world-class geneticists and genetics investors to its board of advisors.

The WGS test, according to Orchid, detects genetic errors in embryos that are linked to severe illnesses before a pregnancy even begins. And by sequencing 99% of an embryo’s DNA, the test can spot potential health risks that could affect a future baby.

According to its website, the PGT lab company uses the WGS data to identify both monogenic (single-gene) and polygenic (multiple-gene) diseases.

The company also claims its genetic screening can predict the risk of brain health issues in the unborn, such as Alzheimer’s disease, bipolar disorder, and schizophrenia, as well as heart health issues such as atrial fibrillation and coronary artery disease.

Other health problems such as celiac disease and Type I/II diabetes also can be forecasted with the test, Orchid claims. 

Not All Genetics Experts Agree

Orchid is not without its critics. Knowledgeable, credible experts have questioned the appropriateness of this type of genetic testing. They fear it could become a modern-day form of eugenics.

Andrew McQuillin, PhD, Professor of Molecular Psychiatry at University College London, has concerns about Orchid’s preimplantation genetic testing. He maintains that it is difficult to control how such data is used, and that even the most accurate sequencing techniques do not predict disease risk very well. 

“[Polygenic risk scores are] useful in the research context, but at the individual level, they’re not actually terribly useful to predict who’s going to develop schizophrenia or not,” McQuillin told Science. “We can come up with guidance on how these things should be used. The difficulty is that official guidance like that doesn’t feature anywhere in the marketing from these companies.”
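
To make McQuillin’s point concrete, a polygenic risk score is, at its core, a weighted sum of the risk alleles a genome carries. The sketch below is a minimal illustration with invented variant IDs, effect sizes, and dosages; it is not Orchid’s method and uses none of the PGC’s data.

```python
# Minimal sketch of a polygenic risk score (PRS): a weighted sum of a
# genome's risk-allele counts using effect sizes from an association study.
# All variant IDs, weights, and dosages below are invented for illustration.
gwas_weights = {        # hypothetical variant -> effect size (log odds)
    "rs0000001": 0.12,
    "rs0000002": -0.05,
    "rs0000003": 0.30,
}
genotype = {            # hypothetical risk-allele dosage (0, 1, or 2 copies)
    "rs0000001": 2,
    "rs0000002": 1,
    "rs0000003": 0,
}

# PRS = sum over variants of (effect size x allele dosage)
prs = sum(w * genotype.get(variant, 0) for variant, w in gwas_weights.items())
print(f"Polygenic risk score: {prs:.2f}")
# Note: a PRS ranks relative risk across a population; it does not say
# whether any one individual (or embryo) will develop the condition.
```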

McQuillin also stated that researchers must have an extensive discussion regarding the implications of this type of embryo screening.

“We need to take a look at whether this is really something we should be doing. It’s the type of thing that, if it becomes widespread, in 40 years’ time, we will ask, ‘What on Earth have we done?’” McQuillin emphasized.

Redefining Reproduction

It takes about three weeks for couples to receive their report back from Orchid after completing the whole genome sequence of a preimplantation embryo. A board-certified genetic counselor then consults with the parents to help them understand the results. 

Founder and CEO Noor Siddiqui hopes Orchid will be able to scale up its operations and introduce more automation to the testing process to reduce the cost per embryo.

“We want to make this something that’s accessible to everyone,” she told CNBC.

“I think this has the potential to totally redefine reproduction,” she added. “I just think that’s really exciting to be able to make people more confident about one of the most important decisions of their life, and to give them a little bit more control.”

Clinical laboratories have long been involved in prenatal screening to gain insight into risk levels associated with certain genetic disorders. Even some of that testing comes with controversy and ambiguous findings. Whether Orchid Health’s PGT process delivers accurate, reliable diagnostic insights regarding preimplantation embryos remains to be seen.

—JP Schlingman

Related Information:

Genetics Group Slams Company for Using Its Data to Screen Embryos’ Genomes

Reproductive Startup Launches Test to Identify an Embryo’s Genetic Defects Before an IVF Pregnancy Begins

What Is the Difference Between Monogenic and Polygenic Diseases?

First Clinical Validation of Whole Genome Screening on Standard Trophectoderm Biopsies of Preimplantation Embryos

Orchid Tests Embryos for Genetic Diseases. It Just Raised $12 Million with This 11-Slide Pitch Deck

Electronic Health Records Vendors Now Adding Generative AI to Their Products

One goal of these new functions is to streamline physician workflows. However, these new EHRs may interface differently with clinical laboratory information systems

Artificial intelligence (AI) developers are making great contributions in clinical laboratory, pathology, radiology, and other areas of healthcare. Now, Electronic Health Record (EHR) developers are looking into ways to incorporate a new type of AI—called “Generative AI”—into their EHR products to assist physicians with time-consuming and repetitive administrative tasks and help them focus on patient-centered care. 

Generative AI uses complex algorithms and statistical models to learn patterns from collected data. It then generates new content, including text, images, and audio/video information.

According to the federal Government Accountability Office (GAO), generative AI “has potential applications across a wide range of fields, including education, government, medicine, and law.” The GAO also noted that “a research hospital is piloting a generative AI program to create responses to patient questions and reduce the administrative workload of healthcare providers.”

Reducing the workload on doctors and other medical personnel is a key goal of the EHR developers.

Generative AI uses deep learning neural networks modeled after the human brain and composed of layers of connected nodes that process data. It employs two neural networks: a generator [generative network], which creates new content, and a discriminator [discriminative network], which evaluates the quality of that content.

The collected information is entered into the network where each individual node processes the data and passes it on to the next layer. The last layer in the process produces the final output. 
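
That generator/discriminator pairing can be sketched in a few lines, as below. This is a toy illustration in the style of a generative adversarial network, with arbitrary dimensions; it is not any EHR vendor’s architecture, and many commercial EHR features are built on large language models rather than classic generator/discriminator pairs.

```python
# Toy sketch of the generator/discriminator pairing described above, in the
# style of a generative adversarial network. Dimensions are arbitrary and
# hypothetical; this is not any EHR vendor's architecture.
import torch
import torch.nn as nn

LATENT, DATA = 16, 64  # assumed noise and content dimensions

# Generator: maps random noise to a candidate piece of new content.
generator = nn.Sequential(
    nn.Linear(LATENT, 128), nn.ReLU(), nn.Linear(128, DATA),
)

# Discriminator: scores how plausible a piece of content looks (0 to 1).
discriminator = nn.Sequential(
    nn.Linear(DATA, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid(),
)

noise = torch.randn(8, LATENT)   # a batch of random inputs
fake = generator(noise)          # the generator creates new content
quality = discriminator(fake)    # the discriminator evaluates its quality
# In training, the two networks compete: the generator learns to raise these
# scores while the discriminator learns to tell generated content from real.
```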

Many EHR companies are working toward adding generative AI to their platforms.

As our sister publication The Dark Report points out in its December 26 “Top 10 Biggest Lab Stories for 2023,” almost every product or service presented to a clinical laboratory or pathology group will soon include an AI-powered solution.


“We believe that generative AI has the potential of being a personal assistant for every doctor, and that’s what we’re working on,” Girish Navani (above), co-founder and CEO of eClinicalWorks, told EHRIntelligence. “It could save hours. You capture the essence of the entire conversation without touching a keyboard. It is transformational in how it works and how well it presents the information back to the provider.” Clinical laboratory information systems may also benefit from connecting with generative AI-based EHRs. (Photo copyright: eClinicalWorks.)

Generative AI Can Help with Physician Burnout

One of the beneficial features of generative AI is that it has the ability to “listen” to a doctor’s conversation with a patient while recording it and then produce clinical notes. The physician can then review, edit, and approve those notes to enter into the patient’s EHR record, thus streamlining administrative workflows.
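
A minimal sketch of that ambient-documentation pattern follows: transcribe the visit audio, then have a language model draft a note for physician review. This is illustrative only, not any vendor’s actual pipeline, and the model names are assumptions.

```python
# Minimal sketch of the ambient-documentation pattern: transcribe the visit
# audio, then ask a language model to draft a clinical note for the
# physician to review. Model names are assumptions, not a vendor's pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def draft_clinical_note(audio_path: str) -> str:
    # Step 1: speech-to-text on the recorded doctor-patient conversation.
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file
        )
    # Step 2: ask a language model for a draft note from the transcript.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Draft a concise clinical note summarizing this "
                        "doctor-patient conversation transcript."},
            {"role": "user", "content": transcript.text},
        ],
    )
    # The draft is a starting point: the physician still reviews, edits,
    # and approves the note before it enters the patient's EHR record.
    return response.choices[0].message.content
```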

“The clinician or support team essentially has to take all of the data points that they’ve got in their head and turn that into a narrative human response,” Phil Lindemann, Vice President of Data and Analytics at Epic, told EHRIntelligence. “Generative AI can draft a response that the clinician can then review, make changes as necessary, and then send to the patient.”

By streamlining and reducing workloads, EHRs that incorporate generative AI may help reduce physician burnout, which has been increasing since the COVID-19 pandemic.

A recent study published in the Journal of the American Medical Informatics Association (JAMIA) titled, “Association of Physician Burnout with Perceived EHR Work Stress and Potentially Actionable Factors,” examined physician burnout associated with EHR workload factors at UC San Diego Health System. The researchers found that nearly half of surveyed doctors reported “burnout symptoms” and an increase in stress levels due to EHR processes.

“Language models have a huge potential in impacting almost every workflow,” Girish Navani, co-founder and CEO of eClinicalWorks, told EHRIntelligence. “Whether it’s reading information and summarizing it or creating the right type of contextual response, language models can help reduce cognitive load.”

Generative AI can also translate information into many different languages. 

“Health systems spend a lot of time trying to make patient education and different things available in certain languages, but they’ll never have every language possible,” Lindemann said. “This technology can take human language, translate it at any reading level in any language, and have it understandable.”

MEDITECH is working on a generative AI project to simplify clinical documentation with an emphasis on hospital discharge summaries that can be very laborious and time-consuming for clinicians.

“Providers are asked to go in and review previous notes and results and try to bring that all together,” Helen Waters, Executive Vice President and COO of MEDITECH, told EHRIntelligence. “Generative AI can help auto-populate the discharge note by bringing in the discrete information that would be most relevant to substantiate that narrative and enable time savings for those clinicians.”

Many Applications for Generative AI in Healthcare

According to technology consulting and solutions firm XenonStack, generative AI has many potential applications in healthcare including:

  • Medical simulation
  • Drug discovery
  • Medical chatbots
  • Medical imaging
  • Medical research
  • Patient care
  • Disease diagnosis
  • Personalized treatment plans

The technology is currently in its early stages and does present challenges, such as lack of interpretability, the need for large datasets and more transparency, and ethical concerns, all of which will need to be addressed. 

“We see it as a translation tool,” Lindemann told EHRIntelligence. “It’s not a panacea, but there’s going to be really valuable use cases, and the sooner the community can agree on that, the more useful the technology’s going to be.”

Since generative AI can be used to automate manual work processes, clinical laboratories and anatomic pathology groups should be alert to opportunities to interface their LISs with referring physicians’ EHRs. Such interfaces may enable the use of the generative AI functions to automate manual processes in both the doctors’ offices and the labs.

—JP Schlingman

Related Information:

How Four EHR Vendors Are Leveraging Generative AI in Clinical Workflows

NextGen Healthcare Unveils NextGen Ambient Assist, an AI Solution Designed to Boost Provider Efficiency

Science and Tech Spotlight: Generative AI

What is Generative AI? Everything You Need to Know

Generative AI Could Revolutionize Health Care—But Not if Control is Ceded to Big Tech

Generative AI in Healthcare and Its Uses—Complete Guide

Association of Physician Burnout with Perceived EHR Work Stress and Potentially Actionable Factors

University of Florida Study Determines That ChatGPT Made Errors in Advice about Urology Cases

Research results call into question the safety and dependability of using artificial intelligence in medical diagnosis, a development that should be watched by clinical laboratory scientists

ChatGPT, an artificial intelligence (AI) chatbot that returns answers to written prompts, has been tested and found wanting by researchers at the University of Florida College of Medicine (UF Health), who looked into how well it could answer typical patient questions on urology. Not good enough, according to the researchers who conducted the study.

AI is quickly becoming a powerful new tool in diagnosis and medical research. Some digital pathologists and radiologists use it for data analysis and to speed up diagnostic modality readings. It’s even been said that AI will improve how physicians treat disease. But with all new discoveries there comes controversy, and that’s certainly the case with AI in healthcare.

Many voices in opposition to AI’s use in clinical medicine claim the technology is too new and cannot be trusted with patients’ health. Now, UF Health’s study seems to have confirmed that belief—at least with ChatGPT.

The study revealed that answers ChatGPT provided “fell short of the standard expected of physicians,” according to a UF Health news release, which called ChatGPT’s answers “flawed.”

The questions posed were common ones that patients ask during a visit to a urologist.

The researchers believe their study is the first of its kind to focus on AI and the urology specialty, one that “highlights the risk of asking AI engines for medical information even as they grow in accuracy and conversational ability,” UF Health noted in the news release.

The researchers published their findings in the journal Urology in a paper titled, “Caution! AI Bot Has Entered the Patient Chat: ChatGPT Has Limitations in Providing Accurate Urologic Healthcare Advice.”


“I am not discouraging people from using chatbots,” said Russell S. Terry, MD (above), an assistant professor in the UF College of Medicine’s department of urology and the study’s senior author, in a UF Health news release. “But don’t treat what you see as the final answer. Chatbots are not a substitute for a doctor.” Pathologists and clinical laboratory managers will want to monitor how developers improve the performance of chatbots and other applications using artificial intelligence. (Photo copyright: University of Florida.)

UF Health ChatGPT Study Details

UF Health’s study featured 13 of the topics patients most frequently raise with their urologists during office visits. The researchers asked ChatGPT each question three times “since ChatGPT can formulate different answers to identical queries,” they noted in the news release.
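
For illustration, that repeated-query design can be reproduced in a few lines against a chatbot API. The sketch below is a stand-in with a placeholder model name and example question, not the researchers’ actual protocol or materials.

```python
# Sketch of the repeated-query design: because ChatGPT can give different
# answers to the same prompt, each question is submitted several times and
# every response is saved for independent expert review. Model name and
# example question are placeholders, not the study's materials.
from openai import OpenAI

client = OpenAI()
N_REPEATS = 3  # the study asked each question three times

def collect_responses(question: str) -> list[str]:
    responses = []
    for _ in range(N_REPEATS):
        reply = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": question}],
        )
        responses.append(reply.choices[0].message.content)
    return responses

# answers = collect_responses("How are recurrent UTIs in women treated?")
```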

The questions covered a range of common urological conditions.

The researchers then “evaluated the answers based on guidelines produced by the three leading professional groups for urologists in the United States, Canada, and Europe, including the American Urological Association (AUA). Five UF Health urologists independently assessed the appropriateness of the chatbot’s answers using standardized methods,” UF Health noted.

Notably, many of the results were inaccurate. According to UF Health, only 60% of the 39 evaluated responses were deemed appropriate. Beyond those results, the researchers noted in their Urology paper, “[ChatGPT] misinterprets clinical care guidelines, dismisses important contextual information, conceals its sources, and provides inappropriate references.”

When asked, ChatGPT was for the most part unable to accurately provide the sources it referenced for its answers. Apparently, the chatbot was not programmed to provide such sources, the UF Health news release stated.

“It provided sources that were either completely made up or completely irrelevant,” Terry noted in the news release. “Transparency is important so patients can assess what they’re being told.”

Further, “Only 7 (54%) of 13 topics and 21 (54%) of 39 responses met the BD [Brief DISCERN] cut-off score of ≥16 to denote good-quality content,” the researchers wrote in their paper. BD is a validated healthcare information assessment questionnaire that “provides users with a valid and reliable way of assessing the quality of written information on treatment choices for a health problem,” according to the DISCERN website.
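
Applying that cutoff is straightforward arithmetic. The sketch below assumes the Brief DISCERN instrument’s six items are each scored 1 to 5 and summed, with a total of 16 or higher denoting good-quality content, per the cutoff the researchers cite.

```python
# Minimal sketch of the Brief DISCERN (BD) cutoff applied in the study,
# assuming six items each scored 1-5 and summed, with a total of 16 or
# higher denoting good-quality content (the cutoff cited in the paper).
def is_good_quality(item_scores: list[int]) -> bool:
    assert len(item_scores) == 6 and all(1 <= s <= 5 for s in item_scores)
    return sum(item_scores) >= 16

print(is_good_quality([3, 3, 2, 3, 2, 3]))  # sums to 16 -> True
```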

ChatGPT often “omitted key details or incorrectly processed their meaning, as it did by not recognizing the importance of pain from scar tissue in Peyronie’s disease. As a result … the AI provided an improper treatment recommendation,” the UF Health study paper noted.

Is Using ChatGPT for Medical Advice Dangerous to Patients?

Terry noted that the chatbot performed better in some areas over others, such as infertility, overactive bladder, and hypogonadism. However, frequently recurring UTIs in women was one topic of questions for which ChatGPT consistently gave incorrect results.

“One of the more dangerous characteristics of chatbots is that they can answer a patient’s inquiry with all the confidence of a veteran physician, even when completely wrong,” UF Health reported.

“In only one of the evaluated responses did the AI note it ‘cannot give medical advice’ … The chatbot recommended consulting with a doctor or medical adviser in only 62% of its responses,” UF Health noted.

For their part, ChatGPT’s developers “tell users the chatbot can provide bad information and warn users after logging in that ChatGPT ‘is not intended to give advice,’” UF Health added.

Future of Chatbots in Healthcare

In UF Health’s Urology paper, the researchers state, “Chatbot models hold great promise, but users should be cautious when interpreting healthcare-related advice from existing AI models. Additional training and modifications are needed before these AI models will be ready for reliable use by patients and providers.”

UF Health conducted its study in February 2023. Thus, the news release points out, results could be different now due to ChatGPT updates. Nevertheless, Terry urges users to get second opinions from their doctors.

“It’s always a good thing when patients take ownership of their healthcare and do research to get information on their own,” he said in the news release. “But just as when you use Google, don’t accept anything at face value without checking with your healthcare provider.”

That’s always good advice. Still, UF Health notes that “While this and other chatbots warn users that the programs are a work in progress, physicians believe some people will undoubtedly still rely on them.” Time will tell whether trusting AI for medical advice turns out well for those patients.

The study reported above is a useful warning to clinical laboratory managers and pathologists that current technologies used in ChatGPT, and similar AI-powered solutions, have not yet achieved the accuracy and reliability of trained medical diagnosticians when answering common questions about different health conditions asked by patients.

—Kristin Althea O’Connor

Related Information:

UF College of Medicine Research Shows AI Chatbot Flawed when Giving Urology Advice

Caution! AI Bot Has Entered the Patient Chat: ChatGPT Has Limitations in Providing Accurate Urologic Healthcare Advice
