Deepak Gupta

Glossary · last updated 2026-05-27

Vector search

Also known as: embedding search, dense retrieval

Retrieval that compares the vector embedding of a query with the embeddings of candidate documents, returning the documents whose embeddings are most similar to the query's. The mechanical core of semantic search and RAG.

A vector embedding is a numerical representation of a text passage (typically a floating-point vector of 512 to 3,072 dimensions) produced by a model trained to place semantically similar passages near each other in vector space. Vector search compares the embedding of a query against the embeddings of candidate documents using a similarity measure (typically cosine similarity); the closest matches are retrieved.
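
As a rough illustration of the comparison step, the sketch below computes cosine similarity between a query vector and two passage vectors. The three-dimensional vectors and the names `query`, `specific_passage`, and `unrelated_passage` are made up for the example; real embeddings come from a model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings", chosen by hand for illustration only.
query = [0.9, 0.1, 0.0]
specific_passage = [0.8, 0.2, 0.1]   # points in nearly the same direction as the query
unrelated_passage = [0.0, 0.2, 0.9]  # points in a very different direction

print("specific: ", cosine_similarity(query, specific_passage))
print("unrelated:", cosine_similarity(query, unrelated_passage))
```

The passage whose vector points in nearly the same direction as the query scores close to 1, while the unrelated passage scores close to 0.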

Vector search is the retrieval mechanism behind semantic search and RAG systems. In an answer engine: the query is embedded; the engine's vector index returns the top-N closest documents; those documents are passed to the LLM for grounding and generation. The retrieval quality directly determines answer quality: if the wrong documents are retrieved, no amount of LLM capability can salvage the answer.
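
The embed, retrieve, generate flow described above can be sketched as a brute-force top-N lookup over a small in-memory index. Everything here is an assumption for illustration: the document ids, the hand-written vectors, and the `top_n` helper are hypothetical, and production engines generally use approximate nearest-neighbor indexes rather than scanning every document.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine similarity; assumes non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_n(query_vec, index, n=2):
    """Return the n (doc_id, score) pairs most similar to query_vec, best first."""
    scored = ((doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in index.items())
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])

# Hypothetical pre-computed document embeddings keyed by document id.
index = {
    "pkce-code-interception": [0.9, 0.1, 0.1],
    "oauth-overview": [0.6, 0.5, 0.2],
    "session-timeouts": [0.1, 0.2, 0.9],
}

# Stand-in for the embedded user query; a real system would call an embedding model.
query_vec = [1.0, 0.0, 0.1]

results = top_n(query_vec, index, n=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Only the documents returned by `top_n` would be handed to the LLM, which is why a miss at this step cannot be repaired later in the pipeline.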

What this means for AEO/GEO: your content competes for retrieval against the documents whose embeddings are most similar to the query, regardless of keyword overlap. To be retrieved:

Write passages that are specific to a question. Generic statements produce generic embeddings that do not sit close to any specific question. A passage answering exactly "how does PKCE prevent code interception" embeds closer to that query than a passage broadly explaining OAuth.
Structure paragraphs around discrete claims. Vector retrieval often operates at the chunk level (paragraphs or 200-500 token windows). A paragraph with one clear claim retrieves more cleanly than one mixing several.
Match the user's phrasing. Even with semantic search, embeddings cluster more tightly when phrasings overlap. Question-shaped headings ("How long should session timeouts be?") that mirror how users phrase queries can improve retrieval.
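
To make the chunk-level point concrete, here is a minimal sketch of paragraph-aligned chunking. It is not how any particular engine splits pages. The `chunk_by_paragraph` helper is hypothetical, and it counts whitespace-separated words as a stand-in for tokens, whereas real systems use the embedding model's own tokenizer.

```python
def chunk_by_paragraph(text, max_tokens=500):
    """Split text into one chunk per paragraph.

    Paragraphs longer than max_tokens are cut into consecutive windows.
    Assumption: whitespace-separated words approximate tokens.
    """
    chunks = []
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        # Blank paragraphs yield no words, so the loop below adds nothing for them.
        for start in range(0, len(words), max_tokens):
            chunks.append(" ".join(words[start:start + max_tokens]))
    return chunks

# Hypothetical page with two single-claim paragraphs.
page = (
    "PKCE prevents code interception by binding the authorization code "
    "to a secret only the original client holds.\n\n"
    "Session timeouts should reflect how sensitive the application is."
)

for i, chunk in enumerate(chunk_by_paragraph(page)):
    print(i, chunk)
```

Each paragraph becomes its own chunk and gets its own embedding, so a paragraph built around one claim yields a vector that represents that claim rather than an average of several.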

Most publishers don't directly control vector indexing; the engines do. But the structural choices above influence whether your content gets retrieved.