Locally run large language models (LLMs) could efficiently extract data from radiology reports while preserving patient privacy, according to a study from the National Institutes of Health Clinical Center. Vicuna-13B, a freely available LLM that can be run locally, labeled chest X-ray findings about as well as conventional labeling tools. It also avoids the privacy concerns raised by proprietary models such as ChatGPT and GPT-4, which require sending data to an outside company for processing. The approach could let researchers extract useful data from medical records without exposing patient information, opening new opportunities for AI research and clinical applications in healthcare.
Locally run large language models (LLMs) may provide a workable solution for information extraction from text-based radiological reports while protecting patient privacy, according to a recent study from the National Institutes of Health Clinical Center (NIH CC). The results of this study, which were published in the journal Radiology by the Radiological Society of North America (RSNA), suggest that LLMs, which are deep learning models trained to understand and produce text in a human-like manner, have potential applications in the healthcare industry.
Prominent LLMs such as ChatGPT and GPT-4 have attracted wide attention recently, but privacy concerns limit their use in healthcare.
According to senior author Dr. Ronald M. Summers, a senior investigator in the Radiology and Imaging Sciences Department at the NIH, “ChatGPT and GPT-4 are proprietary models that necessitate sending data to OpenAI for processing, which would require the painstaking task of de-identifying patient data. Removing all patient health information is a labor-intensive and impractical process for handling large volumes of reports.”
In this study, led by Dr. Pritam Mukherjee, a staff scientist at the NIH CC, researchers explored the feasibility of employing a locally operated LLM known as Vicuna-13B to annotate crucial findings in chest X-ray reports from both the NIH and the Medical Information Mart for Intensive Care (MIMIC) Database, a publicly accessible dataset containing de-identified electronic health records.
Dr. Summers pointed out, “Initial evaluations have revealed that Vicuna, a freely available LLM, approaches the performance levels of ChatGPT in tasks like multilingual question answering.”
The study dataset consisted of 3,269 chest X-ray reports from MIMIC and 25,596 reports from the NIH. Researchers employed two prompts for two distinct tasks, instructing the LLM to identify and label the presence or absence of 13 specific findings in the chest X-ray reports. The LLM’s performance was then compared with that of two widely used non-LLM labeling tools.
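To make the labeling step concrete, here is a minimal Python sketch of how a prompt for a locally run model might be assembled and how its reply might be turned into present/absent labels. It is an illustration under stated assumptions, not the study's method: the prompt wording, the `build_prompt` and `parse_labels` helpers, the reply format, and the three example findings are all hypothetical (the article does not list the 13 findings or the prompts used), and the call to the model itself is left out.

```python
from typing import Dict, List

# Hypothetical, illustrative subset; the article does not list the 13 findings.
FINDINGS: List[str] = ["Cardiomegaly", "Pleural Effusion", "Pneumothorax"]


def build_prompt(report: str, findings: List[str]) -> str:
    """Assemble a labeling prompt (assumed wording) for one report."""
    lines = [
        "Read the chest X-ray report below.",
        "For each finding, answer 'yes' if the report says it is present, otherwise 'no'.",
        "Reply with one line per finding in the form '<finding>: <yes|no>'.",
        "",
        "Findings:",
    ]
    lines += [f"- {name}" for name in findings]
    lines += ["", "Report:", report.strip()]
    return "\n".join(lines)


def parse_labels(response: str, findings: List[str]) -> Dict[str, bool]:
    """Parse '<finding>: yes/no' lines from a model reply.

    Findings missing from the reply default to absent (an assumption).
    """
    labels = {name: False for name in findings}
    lookup = {name.lower(): name for name in findings}
    for line in response.splitlines():
        key, sep, value = line.partition(":")
        # Tolerate a leading "- " bullet and differences in letter case.
        name = lookup.get(key.strip().lstrip("- ").lower())
        if sep and name is not None:
            labels[name] = value.strip().lower().startswith("yes")
    return labels
```

In practice the string returned by `build_prompt` would be passed to whatever local inference setup hosts the model, and the model's text reply would be passed to `parse_labels`.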
A statistical analysis of the LLM’s outputs demonstrated a significant level of agreement with the non-LLM computer programs. Dr. Summers stated, “Our study showed that the LLM’s performance was on par with the existing reference standards. With the appropriate prompt and task, we achieved alignment with the currently utilized labeling tools.”
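The article does not say which statistic was used to measure agreement. One common choice for comparing two labelers on a present/absent task is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below assumes that statistic purely for illustration; the `cohen_kappa` function and the sample labels are hypothetical.

```python
from typing import Sequence


def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two binary labelers over the same reports."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    # Fraction of reports on which the two labelers agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement, from each labeler's rate of positive labels.
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both labelers gave the same single label to every report.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up labels for one finding across four reports.
llm_labels = [True, True, False, False]
tool_labels = [True, False, False, False]
kappa = cohen_kappa(llm_labels, tool_labels)
```

A kappa of 1 means perfect agreement, and a value near 0 means agreement no better than chance. For a multi-finding task, the same computation would be repeated once per finding.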
Dr. Summers emphasized that locally operated LLMs could be invaluable in creating extensive datasets for AI research while upholding patient confidentiality. He remarked, “LLMs have revolutionized the field of natural language processing, enabling us to tackle challenges that were previously difficult with traditional pre-large language models.”
He further explained that LLM tools could be applied to extract critical information from other text-based radiology reports and medical records, serving as a valuable resource for identifying disease biomarkers. “My lab has been focused on extracting features from diagnostic images,” he noted. “With tools like Vicuna, we can extract features from text and merge them with image features for input into advanced AI models that might provide answers to clinical queries. LLMs that are freely accessible, privacy-conscious, and available for local use are game-changers, enabling us to accomplish tasks that were previously beyond our reach.”