Deepak Gupta

Glossary · last updated 2026-05-21

RAG (Retrieval-Augmented Generation)

Also known as: Retrieval-Augmented Generation

The architecture pattern in which an AI engine retrieves source documents from a corpus before generating an answer, then conditions the response on the retrieved context. It is the technical foundation underneath both AEO (Answer Engine Optimization) and GEO (Generative Engine Optimization).

RAG is the architectural pattern that makes both grounded AI search and most enterprise AI applications work. Mechanically, it runs in three steps. When a query comes in, a retriever (vector search, keyword search, hybrid search, or live web search) finds the most relevant documents in the corpus. The retrieved passages are then added to the prompt as context. Finally, the language model generates an answer conditioned on those passages. The model is "augmented" by the retrieval step: it can answer questions about content it never saw during training, because retrieval injects current, specific information at inference time.
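
The three steps above (retrieve, add to the prompt, generate) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular engine is built: the retriever is a toy keyword-overlap ranker standing in for vector or hybrid search, `generate` is a hypothetical placeholder for a real language model call, and the function names and prompt wording are invented for the example.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy keyword retriever: rank documents by how often query terms occur.

    A production retriever would typically use vector or hybrid search.
    """
    query_terms = set(tokenize(query))
    scored = []
    for doc in corpus:
        counts = Counter(tokenize(doc))
        score = sum(counts[term] for term in query_terms)
        if score > 0:  # drop documents that share no terms with the query
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query: str, passages: list[str]) -> str:
    """Add the retrieved passages to the prompt as numbered context."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


def generate(prompt: str) -> str:
    """Hypothetical stand-in for a language model call."""
    return f"<model output conditioned on {len(prompt)} prompt characters>"


def answer(query: str, corpus: list[str], k: int = 2) -> str:
    """Retrieve, build the augmented prompt, then generate."""
    passages = retrieve(query, corpus, k)
    return generate(build_prompt(query, passages))


corpus = [
    "RAG retrieves documents from a corpus before generating an answer.",
    "Vector search ranks documents by embedding similarity.",
    "The office cafeteria opens at eight.",
]
```

Calling `answer("How does RAG generate an answer?", corpus)` retrieves only the first document, because the other two share no terms with the query, and passes it to the placeholder model inside the prompt.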

For AI search engines, the corpus is the web. For enterprise AI applications, the corpus is whatever internal documents the company has indexed. The mechanics are the same; only the corpus differs. ChatGPT Search, Perplexity, Claude with web search, Gemini, AI Overviews, and Bing Copilot are all variations on RAG against the web.
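
One way to picture "same mechanics, different corpus" is a single prompt-building function that accepts any retriever. The sketch below is illustrative only: `keyword_retriever` and `rag_prompt` are hypothetical names, the retriever is a toy word-overlap ranker, and the two small dictionaries (with made-up `example.com` URLs and wiki paths) stand in for a web index and an internal document store.

```python
from typing import Callable

# A retriever maps a query to a list of (source, text) pairs.
Retriever = Callable[[str], list[tuple[str, str]]]


def keyword_retriever(corpus: dict[str, str], k: int = 1) -> Retriever:
    """Build a toy retriever over any {source: text} corpus."""

    def retrieve(query: str) -> list[tuple[str, str]]:
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(text.lower().split())), source, text)
            for source, text in corpus.items()
        ]
        scored = [item for item in scored if item[0] > 0]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(source, text) for _, source, text in scored[:k]]

    return retrieve


def rag_prompt(query: str, retrieve: Retriever) -> str:
    """Same prompt-building step regardless of where passages come from."""
    sources = "\n".join(f"{source}: {text}" for source, text in retrieve(query))
    return f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"


# Hypothetical stand-in for a web index.
web_corpus = {
    "https://example.com/rag": "rag retrieves documents before generating an answer",
    "https://example.com/seo": "seo targets ranked lists of links",
}

# Hypothetical stand-in for indexed internal documents.
internal_corpus = {
    "wiki/vacation-policy": "employees accrue vacation days monthly",
    "wiki/expense-policy": "expense reports are due within thirty days",
}

web_prompt = rag_prompt("what is rag", keyword_retriever(web_corpus))
internal_prompt = rag_prompt(
    "how do vacation days accrue", keyword_retriever(internal_corpus)
)
```

Only the argument passed to `keyword_retriever` changes between the two calls; the prompt construction and the generation step that would follow are identical.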

For publishers thinking about AEO and GEO: RAG is the reason structured, well-attributed, semantically clean content wins. The retrieval step prefers content the retriever can match precisely to the query; the generation step prefers content with quotable, citable sentences. Optimizing for RAG-driven engines means optimizing for both halves: being retrievable and being quotable.