USCAP Archives

Stanford Researchers Use Text and Images from Pathologists’ Twitter Accounts to Train New Pathology AI Model

Mar 6, 2024 | Laboratory Instruments & Laboratory Equipment, Laboratory Management and Operations, Laboratory News, Laboratory Pathology

Researchers intend their new AI image retrieval tool to help pathologists locate similar case images to reference for diagnostics, research, and education

Researchers at Stanford University turned to an unusual source—the X social media platform (formerly known as Twitter)—to train an artificial intelligence (AI) system that can look at clinical laboratory pathology images and then retrieve similar images from a database. This is an indication that pathologists are increasingly collecting and storing images of representative cases in their social media accounts. They then consult those libraries when working on new cases that have unusual or unfamiliar features.

The Stanford Medicine scientists trained their AI system—known as Pathology Language and Image Pretraining (PLIP)—on the OpenPath pathology dataset, which contains more than 200,000 images paired with natural language descriptions. The researchers collected most of the data by retrieving tweets in which pathologists posted images accompanied by comments.

“It might be surprising to some folks that there is actually a lot of high-quality medical knowledge that is shared on Twitter,” said researcher James Zou, PhD, Assistant Professor of Biomedical Data Science and senior author of the study, in a Stanford Medicine SCOPE blog post, which added that “the social media platform has become a popular forum for pathologists to share interesting images—so much so that the community has widely adopted a set of 32 hashtags to identify subspecialties.”

“It’s a very active community, which is why we were able to curate hundreds of thousands of these high-quality pathology discussions from Twitter,” Zou said.

The Stanford researchers published their findings in Nature Medicine in a paper titled “A Visual-Language Foundation Model for Pathology Image Analysis Using Medical Twitter.”

“The main application is to help human pathologists look for similar cases to reference,” James Zou, PhD (above), Assistant Professor of Biomedical Data Science, senior author of the study, and his colleagues wrote in Nature Medicine. “Our approach demonstrates that publicly shared medical information is a tremendous resource that can be harnessed to develop medical artificial intelligence for enhancing diagnosis, knowledge sharing, and education.” Leveraging pathologists’ use of social media to store case images for future reference has worked out well for the Stanford Medicine study. (Photo copyright: Stanford University.)

Retrieving Pathology Images from Tweets

“The lack of annotated publicly-available medical images is a major barrier for innovations,” the researchers wrote in Nature Medicine. “At the same time, many de-identified images and much knowledge are shared by clinicians on public forums such as medical Twitter.”

In this case, the goal “is to train a model that can understand both the visual image and the text description,” Zou said in the SCOPE blog post.

Because X is popular among pathologists, the United States and Canadian Academy of Pathology (USCAP) and the Pathology Hashtag Ontology project have recommended a standard series of hashtags, including 32 hashtags for subspecialties, the study authors noted.

Examples include:

#EyePath for Ophthalmic Pathology,
#GIPath for Gastrointestinal and Liver Pathology,
#HemePath for Hematopathology, and
#IDpath for Infectious Disease (clinical) Pathology.

“Pathology is perhaps even more suited to Twitter than many other medical fields because for most pathologists, the bulk of our daily work revolves around the interpretation of images for the diagnosis of human disease,” wrote Jerad M. Gardner, MD, a dermatopathologist and section head of bone/soft tissue pathology at Geisinger Medical Center in Danville, Pa., in a blog post about the Pathology Hashtag Ontology project. “Twitter allows us to easily share images of amazing cases with one another, and we can also discuss new controversies, share links to the most cutting edge literature, and interact with and promote the cause of our pathology professional organizations.”

The researchers used the 32 subspecialty hashtags to retrieve English-language tweets posted from 2006 to 2022. Images in the tweets were “typically high-resolution views of cells or tissues stained with dye,” according to the SCOPE blog post.

The researchers collected a total of 232,067 tweets and 243,375 image-text pairs across the 32 subspecialties, they reported. They augmented this with 88,250 replies that received the highest number of likes and had at least one keyword from the ICD-11 codebook. The SCOPE blog post noted that the rankings by “likes” enabled the researchers to screen for high-quality replies.

They then refined the dataset by removing duplicates, retweets, non-pathology images, and tweets marked by Twitter as being “sensitive.” They also removed tweets containing question marks, as this was an indicator that the practitioner was asking a question about an image rather than providing a description, the researchers wrote in Nature Medicine.

They cleaned the text by removing hashtags, Twitter handles, HTML tags, emojis, and links to websites, the researchers noted.
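The filtering and cleaning steps described above can be sketched roughly as follows. This is an illustrative sketch only, not the researchers' code: the record fields (`image_id`, `text`, `is_retweet`, `sensitive`), the function names, and the emoji character ranges are all assumptions, and the non-pathology image filter is omitted because it would require an image classifier.

```python
import re

# Rough emoji ranges, assumed for illustration; real emoji coverage is broader.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")

def clean_text(text):
    """Strip HTML tags, links, handles, hashtags, and emojis from a tweet."""
    text = re.sub(r"<[^>]+>", " ", text)       # HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # links to websites
    text = re.sub(r"[@#]\w+", " ", text)       # Twitter handles and hashtags
    text = EMOJI.sub(" ", text)                # emojis (approximate)
    return " ".join(text.split())              # collapse leftover whitespace

def build_pairs(tweets):
    """Return de-duplicated (image_id, caption) pairs from hypothetical tweet records."""
    seen = set()
    pairs = []
    for tweet in tweets:
        # Drop retweets and tweets flagged as sensitive.
        if tweet["is_retweet"] or tweet["sensitive"]:
            continue
        caption = clean_text(tweet["text"])
        # A question mark suggests the author is asking, not describing.
        if not caption or "?" in caption:
            continue
        key = (tweet["image_id"], caption)
        if key in seen:  # duplicate image-text pair
            continue
        seen.add(key)
        pairs.append(key)
    return pairs
```

Checking for the question mark after cleaning is a design choice made here so that a "?" inside a link's query string does not discard an otherwise descriptive tweet.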

The final OpenPath dataset included:

116,504 image-text pairs from Twitter posts,
59,869 from replies, and
32,041 image-text pairs scraped from the internet or obtained from the LAION dataset.

The latter is an open-source database from Germany that can be used to train text-to-image AI software such as Stable Diffusion.

Training the PLIP AI Platform

Once they had the dataset, the next step was to train the PLIP AI model. This required a technique known as contrastive learning, the researchers wrote, in which the AI learns to associate features from the images with portions of the text.

As explained in Baeldung, an online technology publication, contrastive learning is based on the idea that “it is easier for someone with no prior knowledge, like a kid, to learn new things by contrasting between similar and dissimilar things instead of learning to recognize them one by one.”

“The power of such a model is that we don’t tell it specifically what features to look for. It’s learning the relevant features by itself,” Zou said in the SCOPE blog post.
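To make the idea concrete, here is a minimal sketch of a contrastive objective over paired image and text embeddings. The article does not specify the loss function, so the symmetric cross-entropy form, the `temperature` value of 0.07, and all function names below are assumptions chosen for illustration. Each image embedding is pulled toward the embedding of its own caption and pushed away from the captions of the other images in the batch.

```python
import math

def normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; image_embs[i] is paired with text_embs[i]."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    # Similarity of every image to every caption, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    transposed = [list(column) for column in zip(*logits)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))
```

With toy two-dimensional embeddings, correctly paired inputs such as `contrastive_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])` give a loss close to zero, while swapping the captions gives a much larger loss. In actual training, the embeddings would come from image and text encoders whose weights are updated to reduce this loss.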

The resulting AI PLIP tool will enable “a clinician to input a new image or text description to search for similar annotated images in the database—a sort of Google Image search customized for pathologists,” SCOPE explained.

“Maybe a pathologist is looking at something that’s a bit unusual or ambiguous,” Zou told SCOPE. “They could use PLIP to retrieve similar images, then reference those cases to help them make their diagnoses.”
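The search SCOPE describes can be pictured as a nearest-neighbor lookup over embeddings. The sketch below assumes that every annotated image in the database has already been mapped to a vector by the trained model, and that the query (an image or a text description) has been mapped into the same space. The `retrieve` function, the case identifiers, and the use of cosine similarity are illustrative assumptions, not details from the article.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_emb, database, k=3):
    """Return the ids of the k database entries most similar to the query.

    database is a list of (case_id, embedding) tuples.
    """
    scored = [(cosine(query_emb, emb), case_id) for case_id, emb in database]
    scored.sort(key=lambda item: item[0], reverse=True)  # most similar first
    return [case_id for _, case_id in scored[:k]]

# Toy usage with made-up two-dimensional embeddings.
database = [
    ("case_a", [1.0, 0.0]),
    ("case_b", [0.0, 1.0]),
    ("case_c", [0.9, 0.1]),
]
print(retrieve([1.0, 0.05], database, k=2))
```

A brute-force scan like this is adequate for a small collection; a database with hundreds of thousands of images would more plausibly use an approximate nearest-neighbor index.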

The Stanford University researchers continue to collect pathology images from X. “The more data you have, the more it will improve,” Zou said.

Pathologists will want to keep an eye on the Stanford Medicine research team’s progress. The PLIP AI tool may be a boon to diagnostics and improve patient outcomes and care.

—Stephen Beale

Related Information:

New AI Tool for Pathologists Trained by Twitter (Now Known as X)

A Visual-Language Foundation Model for Pathology Image Analysis Using Medical Twitter

AI + Twitter = Foundation Visual-Language AI for Pathology

Pathology Foundation Model Leverages Medical Twitter Images, Comments

A Visual-Language Foundation Model for Pathology Image Analysis Using Medical Twitter (Preprint)

Pathology Language and Image Pre-Training (PLIP)

Introducing the Pathology Hashtag Ontology

Inside the Recent CLMA and USCAP Meetings

Sep 16, 2010 | Laboratory News

Last week, The Dark Report was in San Diego and Houston to attend the annual meetings of the United States & Canadian Academy of Pathology (USCAP) and the Clinical Laboratory Management Association (CLMA). Time spent in the exhibit halls of both meetings spoke volumes about the changing trends in the laboratory profession.

First was the USCAP meeting, conducted in San Diego, California. This is a growing meeting and attracts more than 3,000 pathologists from countries around the world. One can hear many different languages spoken as one walks among the crowd between sessions. The exhibit hall of USCAP is also growing. It featured 245 exhibitors and represented a good cross section of companies selling instrument systems, consumables, and services to anatomic pathology laboratories.

Of particular note were two things seen in USCAP’s exhibition hall. First, there was an intriguing spread of companies offering digital solutions for anatomic pathology. Technology is advancing and, even if the current generation of products falls a bit short of the functionality desired by customers, it is clear that lots of money is being invested to advance all aspects of pathology informatics and digital imaging. Second, molecular pathology was definitely a major product sector at this exhibition. Whether it was the marketing of new diagnostic assays or companies offering services in molecular pathology, there was high interest in how molecular pathology could be used to provide higher quality diagnostic support to pathologists and their referring clinicians.

Following the USCAP meeting, I flew to Houston, Texas, to catch the CLMA annual meeting. Just as laboratory consolidation in the hospital industry over the past decade has steadily concentrated laboratory management duties into the hands of fewer people, CLMA has seen a corresponding shift in the numbers of attendees and the composition of vendors in its exhibition hall. One obvious difference from past years is the lower profile of several in vitro diagnostic (IVD) companies at this year’s event. Yet a survey of vendors throughout the exhibition hall indicated that the people passing through the exhibition were qualified buyers, and that vendors expected new business to result from their participation at the exhibition.

Just as at the USCAP exhibition hall, CLMA’s exhibitor line-up featured a growing number of software and informatics vendors compared to past years. I take this as a sign that laboratory directors and pathologists are taking active steps to use information technology to guide their management of laboratory operations and work flow. The range of middleware solutions and vendors on the exhibition floor appears to be a response by vendors to demand for those functions from laboratory customers.

Attending the annual meetings of both USCAP and CLMA yielded one more observation: there is plenty of optimism about the future of laboratory medicine among attendees and vendors at both events. Despite the rapid pace of change in healthcare and unfavorable reimbursement trends, pathologists and laboratory managers believe that new diagnostic tests and advances in laboratory medicine are giving them important new tools to help patients and their physicians.

Your traveling editor,
Robert Michel

Send your comments and observations to Robert at rmichel@darkdaily.com.
