Limited factual accuracy and the difficulty of keeping training data up to date are two crucial issues in the field of Generative AI. As we wrote in a post dedicated to this topic, at Aptus.AI we are well aware that Generative AI models are very good at generating linguistically correct and apparently reliable texts, but far less good at ensuring that those texts are factually precise and accurate. They are very complex probabilistic models that predict the next word (more precisely, the next token) on the basis of a probability calculation. Their sophistication allows them to return answers of such quality that they all seem correct but, as we know, this is not always the case. Language Models generate sentences based on the data they have, namely the data used to train them. This can lead to so-called “hallucinations”: answers that appear to be true, perhaps even very plausible, but that are not factually correct or are simply not up to date with the latest news.
“Hallucinations”, then, is the term used for the most obvious errors that LMs make when returning factual answers. This topic, widely discussed today, has been the focus of research and development at Aptus.AI for several years now. Long before the current hype, we wrote that the true power of Natural Language Processing lies in knowing its limits. It is equally fundamental to follow the technological developments that attempt to overcome these limits, as we have done since the introduction of Transformer models and the creation of Geppetto, the first Italian-speaking text generation model (based on GPT-2), developed in collaboration with the Bruno Kessler Foundation, the University of Groningen and the ILC-CNR. This is why Google Bard's mistake, which wiped roughly 100 billion dollars off the market value of Google's parent company, and the low accuracy of Bing AI, Microsoft's Conversational AI based on ChatGPT, did not surprise us. On the contrary, evidence like this prompted us to study Generative AI models and innovative techniques to make them reliable. This is exactly how we came to integrate Retrieval Augmented Generation into our Language Models, offering up-to-date and reliable answers to the users of our RegTech SaaS through Daitomic Chat, an innovative Conversational AI service dedicated to regulatory analysis.
After this necessary introduction, we can finally enter the world of Retrieval Augmented Generation (RAG). In the AI section of Meta's blog, this methodology was presented in a clear and effective manner. RAG provides an architecture that adds one step to standard Generative AI models, which receive one sequence of words as input and return another as output. With the RAG methodology, the input is still passed to the text generator, but it is also used to retrieve a set of relevant documents on the topic from an additional source. The two sources of knowledge, what the model learned during training and what it finds in the retrieved documents, complement each other: by combining them, the model can generate correct answers even when these are not found verbatim in any of the documents. Above all, models using Retrieval Augmented Generation provide unprecedented flexibility, because obtaining up-to-date answers does not require retraining them: it is enough to replace the documents used to retrieve the information.
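To make the extra retrieval step concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the architecture described in Meta's post or the one used in Daitomic: the retriever is a simple word-count cosine similarity instead of a learned dense retriever, the example documents are invented, and the function names (`retrieve`, `build_prompt`) are hypothetical. The generator itself is left out; the sketch stops at the prompt that would be handed to it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep only alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two word-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    # Rank the documents by similarity to the question and keep the top k.
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    # The question goes to the generator together with the retrieved text.
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Invented example documents: only the first one changes between versions.
old_docs = [
    "The reporting deadline is 31 March.",
    "Branches must keep records for five years.",
]
new_docs = [
    "The reporting deadline is 30 June.",
    "Branches must keep records for five years.",
]

question = "What is the reporting deadline?"
prompt_old = build_prompt(question, old_docs)
prompt_new = build_prompt(question, new_docs)
print(prompt_new)
```

Swapping `old_docs` for `new_docs` changes the context the generator sees without touching the model, which is the flexibility described above.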
It is now clear that Retrieval Augmented Generation addresses two crucial needs of Generative AI models: returning information that is both factually accurate and up to date. Conversational AI services, therefore, must be able to access large amounts of information and, above all, the right information. A way to achieve this was proposed in a recent study on RAG application methodologies, which investigated how to prevent text generation models from producing hallucinations and factually inaccurate output. Starting from the RAG architecture, it is possible to implement what is called Active Retrieval Augmented Generation, a methodology in which the model itself chooses when to retrieve information, and what to retrieve, while it is generating the text. This would make a Conversational AI model capable of handling longer texts and more complex scenarios. The solution still has to be validated and applied to real use cases, but the first results are already promising.
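The idea of deciding when and what to retrieve during generation can be sketched as a loop. The following Python example is a toy under loud assumptions, not the method of the study mentioned above: the "model" is a hard-coded stub that returns a sentence plus a made-up confidence score, the retriever is a keyword match over a single fact, and all names (`active_generate`, `toy_draft_step`, `toy_retrieve`) and the 0.6 threshold are hypothetical.

```python
def active_generate(draft_step, retrieve, question, max_steps=5, threshold=0.6):
    """Generate sentence by sentence, retrieving only when confidence is low."""
    answer, context, retrievals = [], [], 0
    for _ in range(max_steps):
        sentence, confidence = draft_step(question, answer, context)
        if sentence is None:
            break
        if confidence < threshold:
            # Low confidence: use the tentative sentence as the search query,
            # then regenerate the same step with the retrieved context.
            context = context + retrieve(sentence)
            retrievals += 1
            sentence, confidence = draft_step(question, answer, context)
            if sentence is None:
                break
        answer.append(sentence)
    return " ".join(answer), retrievals


# Toy knowledge source (assumption: one fact is enough for the demo).
FACTS = ["Article 17 GDPR sets out the right to erasure."]


def toy_retrieve(query):
    # Return every fact that shares at least one word with the query.
    words = set(query.lower().split())
    return [f for f in FACTS if words & set(f.lower().split())]


def toy_draft_step(question, answer, context):
    # Stub standing in for a language model: returns (sentence, confidence).
    if len(answer) == 0:
        return "The GDPR is an EU regulation.", 0.9
    if len(answer) == 1:
        if context:
            return "The right to erasure is set out in Article 17.", 0.95
        # Without context the stub "hallucinates" with low confidence.
        return "The right to erasure is set out in Article 12.", 0.3
    return None, 1.0


answer, retrievals = active_generate(
    toy_draft_step, toy_retrieve, "Where is the right to erasure defined?"
)
print(answer)
print("retrieval calls:", retrievals)
```

In this toy run the first sentence is kept as drafted, while the second one triggers a single retrieval and is regenerated with the retrieved fact in context.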
At this point, the importance of RAG for building truly effective Conversational AI models is quite evident: the effectiveness of these tools depends on the accuracy and factual reliability of the answers they return, especially in specialized fields and on very technical subjects, such as law. To be usable, legal information must be certain and up to date, since any change, even the slightest, can alter the entire context. Precisely for this reason, RAG has proved to be a fundamental ally in the growth of Daitomic, our RegTech SaaS, which now includes a Conversational AI service. Its name is Daitomic Chat, since it is a real instant chat that lets users ask the regulations directly for the legal information they need. And it is Retrieval Augmented Generation that allows Daitomic Chat to avoid hallucinations and to stay up to date in real time. How? By using users’ questions not only as input for the text generation model, but also to tell the Artificial Intelligence which reference articles to retrieve the information from. For example, if a user reading the GDPR (General Data Protection Regulation, EU Regulation 2016/679) asks Daitomic Chat “what is the right to be forgotten?”, the model will answer after consulting Article 17 of the document, where the concept is defined. This, as we have seen, provides two advantages: it avoids hallucinations, namely wrong answers, and it offers up-to-date information on the latest regulatory changes, simply by pointing to the updated document (or part of a document) from which to take the information. Are you curious to try the potential of Daitomic Chat? Join the waiting list!
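For readers who want a concrete picture of the GDPR example above, here is a small Python sketch of article-level retrieval. It is not Daitomic Chat's implementation: the scoring is a plain word overlap, the function names (`pick_article`, `answer_context`) are hypothetical, and the article bodies are placeholder summaries written for this example, not the legal text. Only the article numbers and titles come from the GDPR itself.

```python
import re


def tokens(text):
    # Set of lowercase alphanumeric words in the text.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_article(question, articles):
    # Choose the article whose title and text share the most words with the question.
    q = tokens(question)
    return max(
        articles,
        key=lambda n: len(q & tokens(articles[n]["title"] + " " + articles[n]["text"])),
    )


def answer_context(question, articles):
    # Return the chosen article number and the prompt a generator would receive.
    number = pick_article(question, articles)
    art = articles[number]
    prompt = (
        f"Source: GDPR, Article {number} - {art['title']}\n"
        f"{art['text']}\n\nQuestion: {question}\nAnswer:"
    )
    return number, prompt


# Titles are the real GDPR article titles; texts are placeholder summaries.
gdpr = {
    "15": {
        "title": "Right of access by the data subject",
        "text": "Placeholder summary: the data subject can obtain a copy of their personal data.",
    },
    "17": {
        "title": "Right to erasure ('right to be forgotten')",
        "text": "Placeholder summary: the data subject can ask the controller to erase their personal data.",
    },
    "20": {
        "title": "Right to data portability",
        "text": "Placeholder summary: the data subject can receive their data in a reusable format.",
    },
}

question = "what is the right to be forgotten?"
number, prompt = answer_context(question, gdpr)
print(prompt)

# Updating one part of the document is enough to update the answer's source.
gdpr["17"]["text"] = "Placeholder summary (amended): hypothetical updated wording."
_, updated_prompt = answer_context(question, gdpr)
```

The last two lines mirror the second advantage described above: replacing the text of a single article changes what the generator reads, with no retraining involved.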