Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models

Paper · arXiv 2311.09210 · Published November 15, 2023
Retrieval-Augmented Generation (RAG) · Reading and Summarization

Retrieval-augmented language models (RALMs) represent a significant advancement in mitigating factual hallucination by leveraging external knowledge sources. However, the reliability of the retrieved information is not always guaranteed, and the retrieval of irrelevant data can mislead the response generation. Moreover, standard RALMs frequently neglect their intrinsic knowledge due to interference from the retrieved information. When the retrieved information is irrelevant, RALMs should ideally fall back on their intrinsic knowledge or, in the absence of both intrinsic and retrieved knowledge, respond with "unknown" to avoid hallucination. In this paper, we introduce CHAIN-OF-NOTE (CON), a novel approach to improving the robustness of RALMs when facing noisy, irrelevant documents and when handling unknown scenarios. The core idea of CON is to generate sequential reading notes for each retrieved document, enabling a thorough evaluation of their relevance to the given question and integrating this information to formulate the final answer. Our experimental results show that GPT-4, when equipped with CON, outperforms the CHAIN-OF-THOUGHT approach. In addition, we used GPT-4 to create 10K CON training examples, which were then used to train a LLaMa-2 7B model.
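To make the note-taking idea concrete, the following is a minimal sketch of how a prompt in the spirit of CON might be assembled: one reading note per retrieved passage, followed by a final answer, with "unknown" as the fallback. The function name `build_con_prompt` and the instruction wording are assumptions made for illustration; they are not the prompt used in the paper.

```python
from typing import List


def build_con_prompt(question: str, documents: List[str]) -> str:
    """Assemble a note-then-answer prompt (hypothetical wording, not the paper's)."""
    lines = [
        # Instruction text is an illustrative assumption.
        "Task: read each retrieved passage, write a short note on whether it "
        "helps answer the question, then give a final answer.",
        'If neither the passages nor your own knowledge suffice, answer "unknown".',
        "",
        f"Question: {question}",
        "",
    ]
    # List the retrieved passages in retrieval order.
    for i, doc in enumerate(documents, start=1):
        lines.append(f"Passage {i}: {doc}")
    lines.append("")
    # One note slot per passage, so relevance is judged document by document.
    for i in range(1, len(documents) + 1):
        lines.append(f"Note {i}:")
    lines.append("Final answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_con_prompt(
        "What is the capital of France?",
        ["Paris is the capital of France.", "Bananas are yellow."],
    ))
```

The resulting string would be sent to whatever language model plays the reader role. Keeping one note slot per passage is what forces an explicit relevance judgment before the answer is written.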

Introduction. Retrieval-augmented language models (RALMs) represent a novel framework that significantly advances large language models (Touvron et al., 2023; OpenAI, 2023) by addressing key limitations such as reducing factual hallucinations (Ji et al., 2023; Zhang et al., 2023a), injecting up-to-date knowledge in a plug-and-play manner (Dhingra et al., 2022; Vu et al., 2023), and enhancing domain-specific expertise (Li et al., 2023; Qin et al., 2023). These enhancements primarily stem from integrating large language models (LLMs) with external knowledge sources (Guu et al., 2020; Lewis et al., 2020; Borgeaud et al., 2022; Shi et al., 2023c). In a typical RALM setup, a query is first processed by a retriever that searches a vast evidence corpus for pertinent documents. A reader then examines these documents, extracting useful information and formulating the final output answer. However, the current RALM framework has several issues. First, there is no guarantee that the information retrieval (IR) system will always yield the most pertinent or trustworthy information.
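The retrieve-then-read setup described above can be sketched in a few lines. This is a toy illustration only: the word-overlap scoring, the function names `retrieve` and `answer`, and the stub reader are all assumptions, standing in for a real retriever and an LLM reader.

```python
from typing import Callable, List


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Return the k passages sharing the most words with the query (toy scorer)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def answer(
    query: str,
    corpus: List[str],
    reader: Callable[[str, List[str]], str],
    k: int = 2,
) -> str:
    """Retrieve passages, then hand them to the reader to produce an answer."""
    docs = retrieve(query, corpus, k)
    return reader(query, docs)


if __name__ == "__main__":
    corpus = [
        "Paris is the capital of France",
        "Bananas are yellow",
        "The capital of Japan is Tokyo",
    ]
    # Stub reader: simply echoes the top passage. A real reader would be an LLM.
    echo_reader = lambda q, docs: docs[0]
    print(answer("capital of France", corpus, echo_reader, k=1))
    # The retriever always returns k passages, even for an unrelated query.
    print(retrieve("who painted the Mona Lisa", corpus, k=1))
```

Note that the retriever returns its top-k passages regardless of whether any of them is actually relevant, which is the first issue raised above: the reader receives something either way and has to judge for itself.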

Discussion / Conclusion. In this paper, we introduce the CHAIN-OF-NOTE (CON) framework, a novel methodology designed to enhance the robustness of RALMs. The central concept of CON is the generation of sequential reading notes for each retrieved document. This process allows for an in-depth assessment of document relevance to the posed question and aids in synthesizing this information to craft the final answer. Our experiments show that GPT-4, when equipped with CON, outperforms the CHAIN-OF-THOUGHT approach. In addition, we used GPT-4 to create 10K CON training examples, which were then used to train a LLaMa-2 7B model. Our experiments across four open-domain QA benchmarks show that RALMs equipped with CON significantly outperform standard fine-tuned RALMs.