
AI models, including OpenAI’s ChatGPT, routinely fabricate information, a failure known as hallucination. The mistakes range from trivial to dangerous, raising legal and cybersecurity concerns. LLMs aren’t malicious; the way they are trained makes inaccuracies inevitable. Mitigations center on refining training and deployment, such as reinforcement learning from human feedback (RLHF). Despite the challenges, harnessing the creative potential in hallucination may outweigh the drawbacks. For now, treating AI model predictions with skepticism seems the best approach.
The issue of hallucination plagues large language models (LLMs) like OpenAI’s ChatGPT, leading them to fabricate information.
These errors can range from trivial instances, such as asserting that the Golden Gate Bridge was moved to Egypt in 2016, to serious and problematic cases.
For instance, an Australian mayor recently considered legal action against OpenAI after ChatGPT falsely claimed his involvement in a major bribery scandal. Researchers have uncovered that these LLM hallucinations can be exploited to distribute harmful code to unsuspecting software developers. Additionally, these models frequently provide inaccurate mental health and medical advice, like suggesting that wine consumption can prevent cancer.
This phenomenon of inventing “facts” is known as hallucination, and it stems from the way today’s LLMs, and all generative AI models, are developed and trained.
Model Training
Generative AI models lack true intelligence; they are statistical systems that predict data such as words, images, music, or speech. By processing numerous examples, often sourced from the internet, AI models learn the likelihood of data based on patterns and contextual information.
For instance, given the phrase “Looking forward…” in a typical email, an LLM might complete it with “… to hearing back,” based on patterns in its training data. This doesn’t signify genuine anticipation.
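To make this concrete, here is a minimal sketch of next-token prediction using GPT-2 via the Hugging Face transformers library. GPT-2 is chosen only because it is small and freely available; production LLMs are far larger, but they rank candidate continuations by probability in the same way.

```python
# Minimal sketch: ask a small causal language model which tokens it
# considers most likely to follow the prompt "Looking forward".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Looking forward"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits        # shape: (1, seq_len, vocab_size)

# Probabilities for the very next token, given the prompt so far.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)

for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.3f}")
```

The model is not “anticipating” anything; it is simply reporting which continuations were statistically common in its training data.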
“The present framework for training LLMs involves concealing prior words for context,” explained Sebastian Berns, a Ph.D. researcher at Queen Mary University of London. He mentioned that the model predicts which words should replace the concealed ones, similar to predictive text on iOS.
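The conceal-a-word-and-predict-it idea Berns describes can be illustrated with a masked language model such as BERT, which fills in a hidden token from the surrounding context. This is a toy illustration only; the exact training setup of any particular LLM may differ.

```python
# Toy illustration of masked-word prediction: the model proposes words
# to replace the concealed [MASK] token, ranked by probability.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for candidate in fill_mask("Looking forward to [MASK] back."):
    print(f"{candidate['token_str']}: {candidate['score']:.3f}")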
This probability-based method works well on a large scale, but it isn’t foolproof. LLMs can generate grammatically correct yet nonsensical statements, propagate falsehoods present in their training data, or combine conflicting information sources, including fictional ones.
LLMs aren’t acting with malice; they have no concept of true and false. They’ve simply learned to associate certain words or phrases with certain concepts, even when those associations aren’t accurate.
Addressing Hallucination
Can hallucinations be resolved? It hinges on the definition of “resolved.”
Vu Ha, an applied researcher at the Allen Institute for Artificial Intelligence, believes that LLMs will always experience hallucinations. However, he thinks there are ways to reduce these instances, depending on how LLMs are trained and deployed.
For instance, a question-answering system could pair an LLM with a well-curated knowledge base, retrieving relevant passages so the model can ground its answers in them. The higher the quality of the knowledge base, the more accurate the results, compared with letting the model draw on less carefully curated data.
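A rough sketch of that retrieval step follows, assuming a tiny in-memory knowledge base and a placeholder ask_llm call; neither refers to any specific product’s API, and real systems typically use dense embeddings rather than TF-IDF.

```python
# Sketch of retrieval-augmented question answering: pull the most relevant
# passage from a curated knowledge base and prepend it to the prompt so the
# LLM grounds its answer in it.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "The Golden Gate Bridge spans the Golden Gate strait in California.",
    "It opened to traffic in 1937 and has never been relocated.",
]

def retrieve(question: str, top_k: int = 1) -> list[str]:
    """Return the top_k passages most similar to the question."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(knowledge_base + [question])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [knowledge_base[i] for i in ranked]

question = "Was the Golden Gate Bridge ever moved to Egypt?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# response = ask_llm(prompt)  # hypothetical LLM call
print(prompt)
```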
Berns suggested a technique that has partially succeeded in reducing hallucinations: reinforcement learning from human feedback (RLHF). This approach involves training a reward model using human feedback to fine-tune the LLM’s responses.
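The reward-model half of RLHF can be sketched as follows, with a small linear layer and random features standing in for a real transformer and its response embeddings; this is illustrative only, not the training code of any particular system.

```python
# Sketch of reward-model training for RLHF: given a human-labelled pair of
# responses to the same prompt, push the preferred response's score above
# the rejected one's via a pairwise ranking loss.
import torch
import torch.nn as nn

reward_model = nn.Linear(768, 1)   # placeholder for an LLM with a scalar head
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Placeholder embeddings of a preferred and a rejected response.
chosen_features = torch.randn(1, 768)
rejected_features = torch.randn(1, 768)

chosen_reward = reward_model(chosen_features)
rejected_reward = reward_model(rejected_features)

# Pairwise ranking loss: maximize the margin between chosen and rejected.
loss = -torch.nn.functional.logsigmoid(chosen_reward - rejected_reward).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()

# The trained reward model then scores the LLM's outputs during a
# reinforcement-learning fine-tuning stage (e.g., PPO).
```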
RLHF isn’t a cure-all, though: the space of possible LLM outputs is so vast that the technique is unlikely to achieve complete alignment. And imperfect as they are, hallucinations could be harnessed for creativity, offering unexpected ideas that a human mind might not generate on its own.
Ha argued that today’s LLMs are held to an unfair standard; human memory and our representations of truth are fallible, too. What causes cognitive dissonance is that LLM outputs look accurate on the surface yet reveal errors on closer examination.
For now, the best way to address hallucination is to treat generative AI models’ predictions with skepticism rather than to rely solely on technical fixes.