Hi friends,
In my recent exploration of Large Language Models (LLMs), I encountered a situation that perfectly illustrates their limitations and the potential of Retrieval Augmented Generation (RAG) to overcome these challenges.
In today’s newsletter, I’ll briefly discuss the current problems with LLMs, why I have high hopes for RAG, and why I’m actively experimenting with it.
♠️ Quick heads up: my Black Friday promotion is ending soon. For a few more days, you can still get my two practical books, OpenAI Crash Course and Building PWAs with Supabase and React, at a 50% discount using the code BLACKFRIDAY_2024.
I wanted to make a local RAG application using Ollama with the local llama3.2:latest model.
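For context, talking to the local model is just a short script. Here’s a minimal sketch using the ollama Python client (the prompt wording is illustrative; it assumes Ollama is running locally and the model has been pulled):

```python
# Minimal sketch: chatting with a local model through the ollama Python client.
# Assumes the Ollama server is running and `ollama pull llama3.2` has been done.
import ollama

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "What is RAG in the context of LLMs?"}],
)
print(response["message"]["content"])
```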
When I asked Ollama about RAG, it told me that RAG is a role-based action game.
After I got back into my chair and clarified what I was talking about, Ollama still didn’t get the memo.
It “corrected” itself by telling me that RAG is actually a reality-augmented generation, text-based game.
I copied part of the chat for your amusement:
This experience highlighted why skepticism towards LLMs is warranted and how RAG can enhance their reliability.
The Problem with LLMs
LLMs are powerful tools but often provide answers based on statistical likelihood rather than verified information. This can lead to inaccuracies, especially when the model doesn’t have access to the necessary resources or context.
Here are some key issues:
Lack of Source Verification: LLMs typically don’t cite sources, making it difficult to verify the accuracy of their responses.
Statistical Guessing: They tend to offer the most statistically probable answer, even if it’s incorrect.
Inability to Admit Uncertainty: LLMs rarely acknowledge when they don’t know something, which can lead to misleading information. Remember the memes where people convinced an LLM that 2 + 2 = 5 is correct?
The Solution: Incorporating RAG
RAG addresses these issues by integrating a retrieval system that sources information from a predefined dataset, so answers are grounded in actual data rather than the model’s best guess. Here’s how RAG can improve LLMs (a minimal code sketch follows this list):
Source-Based Responses: RAG allows LLMs to cite where they found the information, enhancing trust and reliability.
Contextual Accuracy: By drawing on a specific dataset, RAG keeps responses relevant to the actual context.
Professional Application: In settings like legal teams or software development, RAG can provide precise answers grounded in existing documents or codebases.
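To make that concrete, here’s a minimal sketch of the retrieve-then-generate loop using the same local Ollama setup. The document snippets, file names, and the nomic-embed-text embedding model are illustrative assumptions; a real system would chunk documents and use a proper vector store.

```python
# Minimal RAG sketch with the ollama Python client.
# Assumes Ollama is running locally with llama3.2 and an embedding model pulled.
import ollama

# A toy "predefined dataset" -- in practice these would be chunks of your documents.
documents = {
    "contracts/acme_2019.txt": "The ACME contract from 2019 includes a 30-day termination clause.",
    "docs/onboarding.md": "New hires get repository access after completing security training.",
}

def embed(text: str) -> list[float]:
    # nomic-embed-text is an assumption; any local embedding model works the same way.
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

# Index the dataset once.
index = {source: embed(text) for source, text in documents.items()}

def answer(question: str) -> str:
    # Retrieve the most relevant document for the question.
    q = embed(question)
    source = max(index, key=lambda s: cosine(q, index[s]))
    # Ask the model to answer only from that document and to cite it.
    prompt = (
        f"Answer using only this source.\n\nSource ({source}):\n{documents[source]}\n\n"
        f"Question: {question}\nCite the source in your answer."
    )
    reply = ollama.chat(model="llama3.2", messages=[{"role": "user", "content": prompt}])
    return reply["message"]["content"]

print(answer("What is the termination clause in the ACME contract?"))
```

The key point is the prompt: the model is told to answer only from the retrieved source and to cite it, rather than to guess.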
Practical Applications of RAG
Would you rely on a legal team backed by ChatGPT?
Neither would I.
Legal teams have many documents. If you need to find specific information from years ago, a RAG-enhanced LLM can search through the documents and provide an accurate answer, citing the source.
We can already see this approach implemented by the major LLM providers, though only for publicly available information:
In software development, RAG-powered documentation is bliss.
You no longer have to dig through pages and pages of docs. Simply ask the chatbot and get a correct answer, with sources cited:
Even wilder, imagine not having to update static user-facing documentation because your library’s use cases can already be found in the source code.
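As a rough illustration of that idea, the retrieval corpus could simply be the library’s source tree. The paths here are hypothetical and the chunking is deliberately naive; retrieval then works exactly like the earlier sketch:

```python
# Hypothetical sketch: using a library's own source tree as the retrieval corpus,
# so "documentation" questions are answered from the code itself.
from pathlib import Path

import ollama

def build_code_index(repo_root: str, embed_model: str = "nomic-embed-text") -> dict:
    index = {}
    for path in Path(repo_root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        # One embedding per file keeps the sketch short; real systems chunk files.
        vector = ollama.embeddings(model=embed_model, prompt=text[:2000])["embedding"]
        index[str(path)] = {"text": text, "vector": vector}
    return index

# index = build_code_index("path/to/your/library")  # then retrieve as in the earlier sketch
```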
Key Benefits of RAG
Enhanced Trust: By providing source-based answers, RAG builds trust in LLMs.
Faster Adoption: With reliable, verifiable answers, users can more quickly adopt LLMs into their workflows.
Maintained Accuracy: Because the retrieval dataset can be updated, responses stay accurate over time, even as the underlying context changes.
Looking Forward
I see RAG as the stepping stone to fully trusting LLM responses, without second-guessing or Googling a bit more just in case the LLM hallucinated…
As we continue to develop and refine these technologies, the potential for RAG to transform how we interact with LLMs is immense.
By ensuring that responses are accurate and verifiable, RAG can help us rise above the limitations of current LLMs and unlock new possibilities for their use in professional and technical environments.
Imagine one day you say with confidence: I trust the AI.
📰 Weekly shoutout
📣 Share
There’s no easier way to help this newsletter grow than by sharing it with the world. If you liked it, found something helpful, or know someone who knows someone to whom this could be helpful, share it:
🏆 Subscribe
Actually, there’s one even easier thing you can do to help this newsletter grow: subscribe. I’ll keep putting in the work and distilling what I learn as a software engineer and consultant. Simply sign up here:
Thanks for the shoutout, Akos. And for some extra reading on RAG here on Substack, you can check out Mostly Harmless Ideas by Alejandro Piad. He has a series of posts that dive a bit deeper, plus some step-by-step tutorials.