RAG
RAG is a technique that retrieves relevant documents before generating an answer, helping reduce hallucinations.
RAG (Retrieval-Augmented Generation) is a technique in which an AI model does not answer from memory alone. It first searches for documents related to the question and then generates an answer based on their contents. It is like an open-book exam: instead of answering from memorized material, you open the textbook, check the relevant page, and then write your answer.
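The retrieve-then-generate flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the documents in `DOCS` are made up, retrieval is a simple word-overlap count rather than the vector search most real systems use, and the final call to a language model is left out, so the sketch stops at building the prompt that would be sent to it. All function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical in-house documents; in practice these come from a document store.
DOCS = {
    "vacation-policy": "Employees receive 15 days of paid vacation per year.",
    "expense-policy": "Expense reports must be submitted within 30 days of purchase.",
    "remote-work": "Remote work is allowed up to three days per week.",
}

def tokenize(text):
    """Lowercase the text and turn it into a bag of words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def score(question, doc):
    """Count the word occurrences shared by the question and the document."""
    return sum((tokenize(question) & tokenize(doc)).values())

def retrieve(question, docs, k=1):
    """Return the ids of the k documents with the highest overlap score."""
    ranked = sorted(docs, key=lambda doc_id: score(question, docs[doc_id]), reverse=True)
    return ranked[:k]

def build_prompt(question, docs, k=1):
    """Place the retrieved text in front of the question (the 'open book')."""
    context = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in retrieve(question, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

question = "How many days of paid vacation do employees get?"
prompt = build_prompt(question, DOCS)
print(prompt)  # this prompt would then be passed to the language model
```

The model only sees what `retrieve` returns, which is why the quality of that step matters so much for the final answer.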
LLMs have three well-known weaknesses: they do not know information that appeared after their training cutoff, they cannot answer questions about in-house documents they were never trained on, and they hallucinate, making up plausible-sounding answers about things they do not know. RAG is a practical way to mitigate all three at once, and it has become a standard pattern for corporate AI adoption, such as in-house document chatbots and customer center automation.
However, adding RAG does not make hallucinations disappear. If the search retrieves the wrong document, the answer will also be wrong, so document organization and search quality control determine success or failure.
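One common way to keep an eye on search quality is to measure how often the expected document shows up among the top results for a set of labeled questions. The sketch below assumes such a labeled set exists. The questions, document ids, and the canned `toy_retrieve` function are all hypothetical stand-ins for a real retriever.

```python
def hit_rate_at_k(retrieve, labeled_questions, k=3):
    """Fraction of questions whose expected document appears in the top-k results."""
    hits = 0
    for question, expected_doc_id in labeled_questions:
        if expected_doc_id in retrieve(question)[:k]:
            hits += 1
    return hits / len(labeled_questions)

# Hypothetical rankings standing in for the output of a real search system.
CANNED_RANKINGS = {
    "How many vacation days do I get?": ["vacation-policy", "remote-work"],
    "When are expense reports due?": ["remote-work", "expense-policy"],
    "Can I work from home?": ["expense-policy", "vacation-policy"],
}

def toy_retrieve(question):
    """Return the canned ranking for a question (assumption: replaces real search)."""
    return CANNED_RANKINGS[question]

# Each pair is (question, id of the document that should be retrieved).
LABELED = [
    ("How many vacation days do I get?", "vacation-policy"),
    ("When are expense reports due?", "expense-policy"),
    ("Can I work from home?", "remote-work"),
]

rate_at_1 = hit_rate_at_k(toy_retrieve, LABELED, k=1)
rate_at_2 = hit_rate_at_k(toy_retrieve, LABELED, k=2)
print(f"hit rate@1 = {rate_at_1:.2f}, hit rate@2 = {rate_at_2:.2f}")
```

A question that never finds its document at any k, like the third one here, points to a problem in the documents or the search itself rather than in the answer-generation step.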
✅ Why it matters
- The model can answer using knowledge it was not trained on, such as the latest information and in-house documents.
- Verification of answers is easier because supporting documents can be presented.
- It is much cheaper and faster than retraining the model.
⚠️ Limits and debates
- If the search is inaccurate, the answer will also be incorrect.
- It only reduces hallucinations; it does not eliminate them completely.
- Building and maintaining the pipeline takes a lot of work: document organization, segmentation (chunking), search tuning, and so on.
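To give a sense of what the segmentation step involves, here is a minimal sketch of splitting a document into overlapping chunks of words. The function name `chunk_words` and the `size` and `overlap` parameters are assumptions made for illustration. Real pipelines often split on sentences, headings, or tokens instead, and the right sizes usually have to be tuned per document set.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

# Tiny made-up input so the overlap is easy to see.
sample = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(sample, size=5, overlap=2)
for chunk in chunks:
    print(chunk)
```

The overlap keeps a sentence that straddles a boundary from being cut off in both chunks, at the cost of storing and searching some text twice.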