12 Types of RAG

An Interactive Guide to "When to Use Which One"

The original diagram outlines 12 distinct types of Retrieval-Augmented Generation (RAG). This guide translates that diagram into text: each entry below gives the category, the technique, and its typical use case.

Naive RAG

1. Standard RAG

The foundational approach. Retrieves documents, adds them to the prompt context, and generates an answer. Use for simple Q&A tasks with a small, clean knowledge base.
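
The retrieve, augment, generate flow can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the token-overlap scorer stands in for a real embedding search, the function names (`retrieve`, `build_prompt`) and sample documents are hypothetical, and the final call to a language model is left out.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a stand-in for an embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many query tokens they share, keep the top k."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved text in the prompt ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}\nAnswer:"
    )

docs = [
    "The Eiffel Tower is in Paris.",
    "Mount Fuji is the tallest mountain in Japan.",
    "Paris is the capital of France.",
]
question = "Where is the Eiffel Tower?"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, top)  # this string would be sent to the generator
```

In a real system the prompt would be passed to a language model; everything before that step is ordinary search and string assembly.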

Advanced RAG Pre-processing

2. Sliding Window

Breaks documents into smaller, overlapping chunks (e.g., 256 tokens with a 128-token overlap). Use for long documents to maintain local context and avoid losing information at chunk boundaries.
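
The chunking step above might look like the following sketch. It assumes the document has already been tokenized into a list; the function name `sliding_window` and the small sizes in the example are illustrative only.

```python
def sliding_window(tokens: list[str], size: int = 256, overlap: int = 128) -> list[list[str]]:
    """Split tokens into chunks of `size`, each sharing `overlap` tokens with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks

tokens = [f"t{i}" for i in range(10)]
chunks = sliding_window(tokens, size=4, overlap=2)
# chunks[0] ends with t2, t3 and chunks[1] starts with t2, t3
```

A larger overlap lowers the chance of splitting a fact across two chunks, at the cost of storing and searching more chunks.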

Advanced RAG Pre-processing

3. Sentence Window

Retrieves a single sentence and adds the surrounding sentences as context. Use for highly specific, fact-based queries where the exact sentence is key, but context is needed for meaning.
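
As a rough sketch of the idea: match on individual sentences, then hand the generator the matched sentence plus its neighbors. The regex sentence splitter and the token-overlap matcher are simplifying assumptions, and the function names and sample text are hypothetical.

```python
import re

def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def best_sentence(query: str, sentences: list[str]) -> int:
    """Index of the sentence sharing the most tokens with the query."""
    scores = [len(words(query) & words(s)) for s in sentences]
    return scores.index(max(scores))

def sentence_window(sentences: list[str], hit: int, window: int = 1) -> str:
    """Return the hit sentence with `window` sentences on each side."""
    lo = max(0, hit - window)
    hi = min(len(sentences), hit + window + 1)
    return " ".join(sentences[lo:hi])

text = (
    "The device shipped in 2019. Its battery lasts 12 hours. "
    "Charging takes 90 minutes. The warranty covers two years."
)
sentences = split_sentences(text)
hit = best_sentence("How many hours does the battery last?", sentences)
context = sentence_window(sentences, hit, window=1)
```

Matching happens on the single sentence, so the search stays precise, while the wider window is what the generator actually reads.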

Advanced RAG Pre-processing

4. Hierarchical RAG

Creates a hierarchy of summaries. Retrieves from summaries first, then "drills down" to retrieve from the underlying raw text chunks. Use for large, complex document sets.
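
A two-level version of this drill-down could be sketched as follows. The corpus layout (a summary string mapped to its raw chunks), the token-overlap scorer, and the name `hierarchical_retrieve` are all assumptions made for illustration; a real system would typically generate the summaries with a model and search both levels with embeddings.

```python
import re

def overlap(query: str, text: str) -> int:
    """Number of shared lowercase word tokens."""
    tok = lambda s: set(re.findall(r"[a-z0-9]+", s.lower()))
    return len(tok(query) & tok(text))

# Hypothetical corpus: each section summary maps to its underlying raw chunks.
corpus = {
    "Billing guide: invoices, refunds, and payment methods.": [
        "Refunds are issued within 5 business days.",
        "Invoices are emailed on the first of each month.",
    ],
    "Security guide: passwords, two-factor login, and API keys.": [
        "API keys can be rotated from the settings page.",
        "Two-factor login uses a time-based code.",
    ],
}

def hierarchical_retrieve(query: str, corpus: dict[str, list[str]],
                          top_sections: int = 1, top_chunks: int = 1) -> list[str]:
    # Stage 1: pick the best-matching summaries.
    sections = sorted(corpus, key=lambda s: overlap(query, s), reverse=True)[:top_sections]
    # Stage 2: search only the raw chunks beneath those summaries.
    candidates = [chunk for s in sections for chunk in corpus[s]]
    return sorted(candidates, key=lambda c: overlap(query, c), reverse=True)[:top_chunks]

result = hierarchical_retrieve("How do I rotate API keys?", corpus)
```

Because stage 2 only looks inside the selected sections, the raw-chunk search stays small even when the full document set is large.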

Advanced RAG Post-processing

5. Contextual Re-ranking

Retrieves a large number of documents (e.g., top 50) and then uses a separate, more precise model (typically a cross-encoder re-ranker) to re-order them by relevance. Use to improve precision when initial retrieval is "noisy."
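
The two-stage pattern can be illustrated with toy scorers. Both scoring functions here are stand-ins chosen for this sketch: `cheap_score` (shared tokens) plays the fast first-stage retriever, and `precise_score` (shared word pairs) plays the re-ranking model. Neither is what a production system would use.

```python
import re

def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def cheap_score(query: str, doc: str) -> int:
    """First stage: count shared tokens, ignoring word order."""
    return len(set(tokens(query)) & set(tokens(doc)))

def bigrams(seq: list[str]) -> set[tuple[str, str]]:
    return set(zip(seq, seq[1:]))

def precise_score(query: str, doc: str) -> float:
    """Second stage stand-in: fraction of query word pairs that appear in the doc."""
    q = bigrams(tokens(query))
    return len(q & bigrams(tokens(doc))) / len(q) if q else 0.0

def retrieve_then_rerank(query: str, docs: list[str], first_k: int = 50, final_k: int = 3) -> list[str]:
    candidates = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:first_k]
    return sorted(candidates, key=lambda d: precise_score(query, d), reverse=True)[:final_k]

docs = [
    "python tips: how to feed a python",
    "steps to install python on linux",
    "gardening basics",
]
query = "how to install python"
first_stage = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)
reranked = retrieve_then_rerank(query, docs, first_k=2, final_k=1)
```

In this example the first stage scores the snake-feeding document and the installation document equally (three shared tokens each); the second stage separates them because only the installation document contains the query's word pairs in order.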

Modular RAG Search

6. Multi-query

Generates multiple different queries from a single user query to broaden the search. Use when a user's query is ambiguous or could be interpreted in multiple ways.
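
A sketch of the fan-out, under loud assumptions: in practice a language model writes the query variants, whereas here a hand-written synonym table (`SYNONYMS`) fakes that step. All names and documents are hypothetical.

```python
import re

# Hypothetical stand-in for LLM-generated rewrites of the user's query.
SYNONYMS = {"car": ["automobile", "vehicle"]}

def expand_query(query: str) -> list[str]:
    """Return the original query plus one variant per known synonym."""
    variants = [query]
    for word, alternatives in SYNONYMS.items():
        pattern = rf"\b{re.escape(word)}\b"
        if re.search(pattern, query):
            variants += [re.sub(pattern, alt, query) for alt in alternatives]
    return variants

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Toy retriever: rank by shared tokens, drop documents with no overlap."""
    tok = lambda s: set(re.findall(r"[a-z0-9]+", s.lower()))
    scored = [(len(tok(query) & tok(d)), d) for d in docs]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked if score > 0][:k]

def multi_query_retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Run every variant and return the de-duplicated union of results."""
    seen: list[str] = []
    for variant in expand_query(query):
        for doc in retrieve(variant, docs, k):
            if doc not in seen:
                seen.append(doc)
    return seen

docs = [
    "car insurance rates",
    "automobile repair manual",
    "vehicle registration form",
    "bread recipes",
]
single = retrieve("car repair", docs, k=1)
combined = multi_query_retrieve("car repair", docs, k=1)
```

With one query the toy retriever returns only the insurance document; the "automobile repair" variant is what surfaces the repair manual.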

Modular RAG Search

7. RAG-Fusion

Generates multiple queries, retrieves documents for each, and then intelligently merges and re-ranks all results. Use to get a comprehensive answer from diverse sources.
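
The text above does not name the merge method. RAG-Fusion implementations commonly use Reciprocal Rank Fusion (RRF), so the sketch below assumes RRF: each document earns 1 / (k + rank) from every result list it appears in, and the totals decide the final order. The constant k = 60 is the conventional default; the document IDs are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# One ranked list per generated query (hypothetical results).
rankings = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(rankings)
```

RRF only needs rank positions, not raw scores, which makes it convenient when the per-query result lists come from retrievers whose scores are not comparable.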

Modular RAG Memory

8. CoRAG

Corrective RAG (more commonly abbreviated CRAG). Uses retrieved documents to verify and correct its own generated answer, iterating until the answer is factually grounded. Use for high-stakes, accuracy-critical tasks.
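
The generate, verify, retry loop described above can be sketched as plain control flow. Everything model-related is stubbed: `toy_generate` and `toy_is_grounded` are hypothetical placeholders for a language model and a grounding check, and the check used here (every number in the draft must appear in the evidence) is far cruder than a real verifier.

```python
import re
from typing import Callable, Optional

def corrective_loop(
    question: str,
    evidence: list[str],
    generate: Callable[[str, list[str], str], str],
    is_grounded: Callable[[str, list[str]], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Draft an answer, check it against the evidence, and retry with feedback."""
    feedback = ""
    for _ in range(max_rounds):
        draft = generate(question, evidence, feedback)
        if is_grounded(draft, evidence):
            return draft
        feedback = f"Previous draft was not supported by the evidence: {draft}"
    return None  # give up rather than return an unverified answer

calls: list[str] = []

def toy_generate(question: str, evidence: list[str], feedback: str) -> str:
    """Stub model: wrong on the first try, corrected once it receives feedback."""
    calls.append(feedback)
    return "The Eiffel Tower opened in 1889." if feedback else "The Eiffel Tower opened in 1887."

def toy_is_grounded(draft: str, evidence: list[str]) -> bool:
    """Stub verifier: every number in the draft must occur in some evidence text."""
    return all(any(n in e for e in evidence) for n in re.findall(r"\d+", draft))

evidence = ["The Eiffel Tower opened in 1889."]
answer = corrective_loop("When did the Eiffel Tower open?", evidence, toy_generate, toy_is_grounded)
```

Returning `None` after `max_rounds` is one possible design choice for accuracy-critical settings: the caller can then fall back to "no answer" instead of an ungrounded one.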

Modular RAG Response

9. Self-RAG

Uses retrieved documents to reflect on and improve its own generation process in real time, deciding when to retrieve more or when to answer. Use for complex, multi-step queries.

Modular RAG Augmentation

10. Chain-of-Thought (CoT)

Integrates step-by-step reasoning into the generation process. Uses retrieved context to inform each step. Use for complex problems that require logical deduction.

Modular RAG Augmentation

11. Tree of Thought (ToT)

Explores multiple reasoning paths (branches) simultaneously and evaluates them, pruning weak paths. Use for highly complex problems with no single, clear answer path.
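
The branch, evaluate, prune cycle can be shown on a toy search problem. This is only an analogy for the control flow: in an actual Tree of Thought setup a language model proposes and scores the intermediate "thoughts", whereas here the thoughts are arithmetic steps and the score is distance to a target number. The function name and operations are invented for the example.

```python
from typing import Optional

def tree_of_thought(start: int, target: int, beam_width: int = 2, max_depth: int = 4) -> Optional[list[str]]:
    """Beam search: expand every kept path, score the results, keep the best few."""
    ops = {"+3": lambda x: x + 3, "*2": lambda x: x * 2}
    frontier: list[tuple[int, list[str]]] = [(start, [])]
    for _ in range(max_depth):
        # Branch: apply every operation to every surviving path.
        candidates = [
            (fn(value), path + [name])
            for value, path in frontier
            for name, fn in ops.items()
        ]
        # Evaluate and prune: closest to the target first, keep `beam_width` paths.
        candidates.sort(key=lambda c: abs(target - c[0]))
        frontier = candidates[:beam_width]
        if frontier[0][0] == target:
            return frontier[0][1]
    return None

path = tree_of_thought(start=1, target=14)
# path == ["+3", "+3", "*2"], i.e. 1 -> 4 -> 7 -> 14
```

Pruning is what keeps the search affordable, but it can also discard a path that would have succeeded later, so the beam width is a trade-off between cost and coverage.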

Modular RAG Augmentation

12. RAG-Fusion (Augmentation)

RAG-Fusion appears a second time in the diagram, here under Augmentation rather than Search, linking the two categories. In this role it fuses the augmented context gathered from multiple query sources before the final generation step. Use to synthesize diverse information.