Glossary · last updated 2026-05-27
Vector search
Also known as: embedding search, dense retrieval
Retrieval that compares the vector embeddings of a query and candidate documents, returning the documents whose embeddings are most similar to the query's. The mechanical core of semantic search and RAG.
A vector embedding is a numerical representation of a text passage (typically a floating-point vector with 512 to 3,072 dimensions) produced by a model trained to place semantically similar passages near one another in vector space. Vector search compares the embedding of a query against the embeddings of candidate documents using a similarity or distance measure (typically cosine similarity); the closest matches are retrieved.
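As a rough illustration of the comparison step, the sketch below ranks a few documents by cosine similarity to a query. The three-dimensional vectors and document names are invented for the example; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs with the highest similarity."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, hand-written for illustration only.
docs = {
    "pkce-explainer": [0.9, 0.1, 0.0],
    "oauth-overview": [0.6, 0.6, 0.2],
    "session-timeouts": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

results = top_k(query, docs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

This scans every document, which is fine for a handful of vectors; large indexes typically rely on approximate nearest-neighbor structures instead of a full scan.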
Vector search is the retrieval mechanism behind semantic search and RAG systems. In an answer engine: the query is embedded; the engine's vector index returns the top-N closest documents; those documents are passed to the LLM for grounding and generation. The retrieval quality directly determines answer quality: if the wrong documents are retrieved, no amount of LLM capability can salvage the answer.
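The embed, retrieve, and generate steps can be sketched as a small pipeline. Everything here is a stand-in: `fake_embed`, `echo_llm`, the passages, and the vectors are hypothetical placeholders for a real embedding model, LLM, and index, and the prompt format is just one plausible choice.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def retrieve(query_vec, index, top_n):
    """Return the ids of the top_n documents closest to the query vector."""
    ranked = sorted(
        index,
        key=lambda doc_id: cosine(query_vec, index[doc_id]),
        reverse=True,
    )
    return ranked[:top_n]


def answer(query, embed, index, passages, generate, top_n=2):
    """Embed the query, retrieve passages, and hand them to the generator."""
    query_vec = embed(query)
    doc_ids = retrieve(query_vec, index, top_n)
    context = "\n".join(
        f"[{i}] {passages[doc_id]}" for i, doc_id in enumerate(doc_ids, start=1)
    )
    prompt = f"Sources:\n{context}\n\nQuestion: {query}"
    return generate(prompt)


# Hypothetical passages and hand-written toy vectors.
passages = {
    "pkce": "PKCE binds the authorization code to a verifier held by the client.",
    "oauth": "OAuth is a framework for delegated authorization.",
    "timeouts": "Session timeouts limit how long an idle session stays valid.",
}
index = {
    "pkce": [0.9, 0.1, 0.0],
    "oauth": [0.6, 0.6, 0.2],
    "timeouts": [0.0, 0.2, 0.9],
}


def fake_embed(text):
    # Stand-in for an embedding model: always returns the same toy vector.
    return [1.0, 0.0, 0.0]


def echo_llm(prompt):
    # Stand-in for an LLM call: returns the prompt so it can be inspected.
    return prompt


result = answer(
    "how does PKCE prevent code interception",
    fake_embed, index, passages, echo_llm, top_n=1,
)
print(result)
```

With `top_n=1`, only the PKCE passage reaches the generator; anything that ranks lower never enters the prompt, which is why a retrieval miss cannot be repaired downstream.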
What this means for AEO/GEO: the documents your content competes against for retrieval are the ones whose embeddings are most similar to the query's, regardless of keyword overlap. To be retrieved:
- Write passages that are specific to a question. Generic statements have generic embeddings that don't cluster well against specific questions. A passage answering exactly "how does PKCE prevent code interception" embeds closer to that query than a passage broadly explaining OAuth.
- Structure paragraphs around discrete claims. Vector retrieval often operates at the chunk level (paragraphs or 200-500 token windows). A paragraph with one clear claim retrieves more cleanly than one mixing several.
- Match the user's phrasing. Even with semantic search, embeddings still cluster more tightly when phrasings overlap. Writing question-shaped headings ("How long should session timeouts be?") that mirror how users phrase queries improves retrieval.
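To make the chunk-level point concrete, here is a minimal sketch of how an engine might split a page before embedding it. It is an assumption, not a description of any particular engine: paragraphs are split on blank lines, and whitespace-separated words stand in for tokens, which real tokenizers count differently.

```python
def chunk_text(text, max_tokens=500):
    """Split text into paragraph chunks, windowing any paragraph that is too long.

    Words are used as a rough proxy for tokens (an assumption for this sketch).
    """
    chunks = []
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        if not words:
            continue  # skip empty paragraphs
        # Break an overlong paragraph into consecutive windows.
        for start in range(0, len(words), max_tokens):
            chunks.append(" ".join(words[start:start + max_tokens]))
    return chunks


sample = "one two three\n\nfour five six seven eight"
chunks = chunk_text(sample, max_tokens=3)
print(chunks)  # ['one two three', 'four five six', 'seven eight']
```

Each chunk is embedded on its own, so a paragraph that mixes several claims yields one vector that has to represent all of them at once.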
Most publishers don't directly control vector indexing; the engines do. But the structural choices above influence whether your content gets retrieved.
Related