Deepak Gupta

Glossary · last updated 2026-05-27

LLM (Large Language Model)

Also known as: Large Language Model, foundation model

A neural network trained on a large text corpus to predict and generate language. It is the model class behind ChatGPT, Claude, Gemini, and most other text-generating AI products: the L in LLMO and the engine doing the work in every answer engine.

Large language models such as GPT-4 and GPT-5 (OpenAI), Claude 3 and 4 (Anthropic), Gemini 1.5 and 2 (Google), Llama (Meta), and Mistral are transformer-based neural networks trained on hundreds of billions to trillions of tokens of text. They learn statistical patterns of language that let them generate coherent text in response to a prompt.

For an AEO / GEO programme, LLMs are the runtime behind every answer engine. Understanding how they work shapes how you write for them:

They predict the next token given context. The "context" includes the user's question and any retrieved documents (the RAG step). Documents the retriever pulls in have a much larger effect on the answer than documents the model "memorised" during training.
They favour confident, specific, well-attributed sentences. A page with quotable claims is more likely to be reproduced in the generated answer than a page of hedged generality.
They're prone to [hallucination](/glossary/hallucination/). When grounding is missing or weak, LLMs invent plausible-sounding content. Engines partly mitigate this with aggressive citation requirements; publishers help by being the high-confidence source the model grounds against.
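
The retrieval step described above can be sketched as follows. This is a minimal illustration under loud assumptions: the retriever here ranks pages by simple word overlap, whereas production answer engines use their own (largely undisclosed) ranking and embedding systems, and the prompt wording, the function names `retrieve` and `build_prompt`, and the sample pages are all hypothetical. The point it shows is that only retrieved pages reach the model's context, so a page that is never retrieved cannot be cited.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Return up to k document ids, ranked by word overlap with the question.

    Assumption: word overlap stands in for a real engine's ranking.
    Documents with no overlap are never returned.
    """
    q = tokenize(question)
    scored = [(len(q & tokenize(body)), doc_id) for doc_id, body in documents.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:k]]


def build_prompt(question: str, documents: dict[str, str], k: int = 2) -> str:
    """Assemble the context an LLM would see: instructions, sources, then the question."""
    ids = retrieve(question, documents, k)
    sources = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}\n"
    )


# Hypothetical pages competing to be retrieved.
documents = {
    "page-a": "An LLM predicts the next token given its context.",
    "page-b": "Our bakery opens at eight every morning.",
    "page-c": "Retrieved documents are placed in the context of the LLM.",
}

prompt = build_prompt("How does an LLM use context?", documents)
print(prompt)
```

In this sketch the off-topic page never enters the prompt, so the model has no chance to quote or cite it, however well written it is.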

The training-corpus question (what LLMs "know" from training) is mostly a distraction for publishers in 2026. Major LLM vendors publish little about their training data, opt-out signals are honoured inconsistently, and the grounded retrieval step dominates AI-search behaviour. Optimise for being retrieved and cited; training-corpus inclusion tends to follow as a byproduct.