Algomo: Preventing Hallucinations in AI-Powered Customer Support Chatbots with Lynx
Background
Aida is an AI agent that personalizes your website and engages visitors through a chatbot in real time. For each visitor, Aida customizes landing pages and holds personalized conversations instantly, drawing on multiple data sources and an array of personalization tools.
To achieve this, Algomo leverages data from multiple sources, including intents, documents, and websites. The LLM responses must be accurate and must not hallucinate or misrepresent document contents. Continuously evaluating and reducing failures in the system is therefore critical to delivering high-quality outputs for the chatbot's users.
Detecting hallucinations in a multilingual setting poses additional challenges for GenAI systems. Reviewing hallucinations and LLM failures manually costs significant time and effort, and often requires domain expertise. A scalable, automated approach to failure detection is therefore critical.
Algomo AI Customer Support workflow with post-response hallucination detection
At Algomo, we use a layered solution: in the first layer, a combination of offline and online evaluations determines a hallucination score. In the subsequent layer, we use the Lynx-large-70B model to evaluate the results of the preceding evaluators. Based on the final decision provided by Lynx, we receive alerts for potentially hallucinated messages in the given conversation.
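As a rough illustration of this two-layer flow, the sketch below gates on a cheap first-layer score and only escalates suspect answers to Lynx for a final PASS/FAIL verdict. The endpoint, model identifier, threshold, prompt wording, and the stand-in layer-1 heuristic are assumptions for the example, not Algomo's production evaluators.

```python
import json
from openai import OpenAI

# Assumed setup: an OpenAI-compatible server (e.g. vLLM) hosting a Lynx checkpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
LYNX_MODEL = "PatronusAI/Llama-3-Patronus-Lynx-70B-Instruct"  # assumed model id
SUSPECT_THRESHOLD = 0.5  # assumed layer-1 cutoff for escalating to Lynx

# Simplified approximation of a faithfulness-judging prompt for Lynx.
LYNX_PROMPT = (
    "Given the following QUESTION, DOCUMENT and ANSWER, determine whether the "
    "ANSWER is faithful to the DOCUMENT.\n\n"
    "QUESTION: {question}\nDOCUMENT: {document}\nANSWER: {answer}\n\n"
    'Respond in JSON: {{"REASONING": "<reasoning>", "SCORE": "PASS" or "FAIL"}}'
)


def first_layer_score(document: str, answer: str) -> float:
    """Layer 1 stand-in: fraction of answer tokens not found in the document.

    A real deployment would combine offline and online evaluators here.
    """
    doc_tokens = set(document.lower().split())
    ans_tokens = answer.lower().split()
    unsupported = sum(1 for t in ans_tokens if t not in doc_tokens)
    return unsupported / max(len(ans_tokens), 1)


def lynx_flags_hallucination(question: str, document: str, answer: str) -> bool:
    """Layer 2: ask Lynx for the final PASS/FAIL verdict."""
    completion = client.chat.completions.create(
        model=LYNX_MODEL,
        messages=[{"role": "user", "content": LYNX_PROMPT.format(
            question=question, document=document, answer=answer)}],
        temperature=0.0,
    )
    verdict = json.loads(completion.choices[0].message.content)
    return verdict.get("SCORE") == "FAIL"


def should_alert(question: str, document: str, answer: str) -> bool:
    """Alert only when layer 1 is suspicious and Lynx confirms a hallucination."""
    if first_layer_score(document, answer) < SUSPECT_THRESHOLD:
        return False
    return lynx_flags_hallucination(question, document, answer)
```

In this arrangement the cheaper first-layer evaluators handle most traffic, and Lynx is reserved for the final decision on borderline responses, which is what drives the alerting described above.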
By adding Lynx-large-70B to our workflow, we nearly doubled the precision of our hallucination detection solution, from 0.375 to 0.69, on our internal evaluation set of 140 samples.
Lynx is able to detect both obvious and subtle hallucinations, including in multilingual AI applications. By leveraging Lynx, Algomo is able to deliver reliable, high-quality generations to our customers.