Deepak Gupta

Implementation · AEO + GEO · 13 min read · last updated 2026-05-21

Schema.org for AEO and GEO: which structured data actually matters

Schema.org has hundreds of types. Only a small subset moves the needle for AI engines, and the priority order is not obvious.

Schema.org has hundreds of types and well over a thousand properties. AI engines don't weight them equally. This guide is the working priority order from a publisher's perspective: which types matter, which don't, and where the common implementation mistakes hide.

The general principle: declare schema that accurately describes the page, focus on the entity-graph foundation (Article + Person + Organization), add structured types where the content genuinely matches, and skip types that don't accurately describe what's on the page. Stuffed or inaccurate schema actively hurts.

Tier 1: the foundation

Article + Author (Person)

The single most important schema type for AEO and GEO. Article schema declares that the page is an article (versus a product page, a navigation page, and so on) and gives the engine the metadata it needs to understand the page as a citable artifact. The author property, pointing to a Person entity, ties the content to a real human.

The Person should be referenced by a stable @id (e.g. https://example.com/#person) and resolved to a full Person node on a page on your domain (typically the home page or an about page). The Person node should include name, url, image, jobTitle, worksFor (Organization), and sameAs (an array of verified profile URLs: LinkedIn, X, GitHub, etc.).

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article title",
  "datePublished": "2026-05-21",
  "dateModified": "2026-05-21",
  "author": { "@id": "https://example.com/#person" },
  "publisher": { "@id": "https://example.com/#organization" },
  "mainEntityOfPage": "https://example.com/this-article/"
}
```

Why this matters: AI engines weight Person-attributed content meaningfully higher than unattributed content for E-E-A-T signals. An article without an author is treated as anonymous; an article with an author tied to a verifiable Person inherits that Person's authority signal.
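As a rough sketch of the canonical Person node that the author reference resolves to, the block below fills in the properties listed above. Every value is a placeholder: the name, job title, image path, and profile URLs are invented for illustration, and the @id values are assumed to match the ones used in the Article example.

```json
{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://example.com/#person",
  "name": "Example Author",
  "url": "https://example.com/about/",
  "image": "https://example.com/images/example-author.jpg",
  "jobTitle": "Example Job Title",
  "worksFor": { "@id": "https://example.com/#organization" },
  "sameAs": [
    "https://www.linkedin.com/in/example-author",
    "https://x.com/exampleauthor",
    "https://github.com/example-author"
  ]
}
```

This node would live once, on the home page or about page, and every article would point at it by @id rather than repeating it.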

Organization

The publisher-side complement to Person. Declares the entity behind the publication, ties it to people via founder or employee, and ties it to any parent entity via parentOrganization.

The graph of Person → Organization → Article is what AI engines parse as "this content was created by a real entity with verifiable identity, published by a real organization." A page with all three properly linked carries the strongest baseline signal.
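One way to express the Person → Organization → Article graph in a single payload is the JSON-LD @graph container. The following is a minimal sketch under stated assumptions: the organization name, logo path, and the "#article" fragment are hypothetical, and the founder link is only appropriate if the Person really is the founder.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Publisher",
      "url": "https://example.com/",
      "logo": "https://example.com/images/logo.png",
      "founder": { "@id": "https://example.com/#person" }
    },
    {
      "@type": "Person",
      "@id": "https://example.com/#person",
      "name": "Example Author",
      "worksFor": { "@id": "https://example.com/#organization" }
    },
    {
      "@type": "Article",
      "@id": "https://example.com/this-article/#article",
      "headline": "Article title",
      "author": { "@id": "https://example.com/#person" },
      "publisher": { "@id": "https://example.com/#organization" }
    }
  ]
}
```

The links run in both directions (Person to Organization via worksFor, Organization to Person via founder), so each node can be reached from the others by @id alone.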

Tier 2: high-leverage extractive schema

FAQPage

The most-extracted schema type for AEO. FAQPage with Question + Answer pairs gets pulled into featured snippets, AI Overviews, and voice-assistant responses at a higher rate than equivalent unstructured content.

The implementation rule: only mark up content that genuinely is a question and its answer. Don't mark up arbitrary headings as questions. Don't mark up a section header followed by paragraphs of context as Q&A. Stuffed FAQPage schema is the most common AEO anti-pattern, and Google has been actively devaluing it since 2023, when it restricted FAQ rich results to a narrow set of authoritative government and health sites.
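A minimal FAQPage sketch, assuming the page visibly contains these exact questions and answers. The two Q&A pairs are illustrative, with wording borrowed from this guide; they are not a recommended set.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Which structured data format should I use?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "JSON-LD is the preferred format. Microdata and RDFa still work."
      }
    },
    {
      "@type": "Question",
      "name": "Should every page carry FAQPage schema?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "No. Only mark up content that genuinely is a question and its answer."
      }
    }
  ]
}
```

Each Question's name and each Answer's text should match the on-page copy; markup that diverges from the visible content falls under the "schema that lies" anti-pattern.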

HowTo

Step-by-step procedural content. Heavily extracted by AI Overviews and voice assistants for procedural queries ("how to X"). The implementation rule is similar to FAQPage: only mark up genuinely procedural content with discrete steps. Generic "tips for" articles are not HowTo content.
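For genuinely procedural content, a HowTo payload might look like the sketch below. The task and its three steps are hypothetical, loosely based on the workflow this guide describes.

```json
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to add Article schema to a page",
  "step": [
    {
      "@type": "HowToStep",
      "position": 1,
      "name": "Add the Article node",
      "text": "Add a JSON-LD block with headline, datePublished, and dateModified."
    },
    {
      "@type": "HowToStep",
      "position": 2,
      "name": "Link the entity graph",
      "text": "Reference the canonical Person and Organization by @id."
    },
    {
      "@type": "HowToStep",
      "position": 3,
      "name": "Validate",
      "text": "Run the page through a structured data validator."
    }
  ]
}
```

If the content cannot be broken into discrete, ordered steps like these, it is probably not HowTo content.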

DefinedTerm and DefinedTermSet

Glossary entries. Highly LLM-friendly because they give engines unambiguous (term, definition) pairs. The DefinedTermSet wrapper declares "this is the glossary"; each DefinedTerm declares an entry within it.

GEO Compass uses this aggressively: every glossary page renders DefinedTerm JSON-LD with the term name, alternate names, description, URL, and the DefinedTermSet it belongs to. The result is that AI engines can ingest the glossary as a structured key-value resource rather than parsing prose.

This is the schema type most underused by publishers in 2026; adoption is low and the LLM-friendliness is high.
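A glossary entry along the lines described above could be sketched as follows. This is not GEO Compass's actual markup; the URLs, the "#termset" identifier, and the description text are placeholders.

```json
{
  "@context": "https://schema.org",
  "@type": "DefinedTerm",
  "@id": "https://example.com/glossary/example-term/#term",
  "name": "Example Term",
  "alternateName": ["ET", "Example Terminology"],
  "description": "Placeholder definition text for the term.",
  "url": "https://example.com/glossary/example-term/",
  "inDefinedTermSet": {
    "@type": "DefinedTermSet",
    "@id": "https://example.com/glossary/#termset",
    "name": "Example Glossary",
    "url": "https://example.com/glossary/"
  }
}
```

The inDefinedTermSet link is what lets an engine treat individual entry pages as members of one glossary rather than as unrelated definitions.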

Tier 3: situational but useful

BreadcrumbList

Site structure. Helps engines understand the hierarchy of your content. Not heavily extracted but provides context that improves entity disambiguation.
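A BreadcrumbList for a three-level hierarchy might be sketched like this; the section names and URLs are invented for illustration.

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://example.com/"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Guides",
      "item": "https://example.com/guides/"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Schema.org for AEO and GEO",
      "item": "https://example.com/guides/this-article/"
    }
  ]
}
```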

ItemList + Product/Review

For comparison content and reviews. Increasingly extracted into AI answers for buyer-intent queries. ItemList declares a ranked or unranked list (perfect for "top 10" listicle content); each item can be a Product, Review, or SoftwareApplication entity with structured properties.

The "tools" portal in GEO Compass's sibling network uses this for every listicle: each tool is a SoftwareApplication with properties, the listicle is an ItemList, and individual ratings are Reviews.
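The ItemList pattern for a listicle can be sketched as below. This is an assumption about the general shape, not the sibling network's actual markup; the tool names, categories, and ratings are hypothetical.

```json
{
  "@context": "https://schema.org",
  "@type": "ItemList",
  "name": "Example list of tools",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "item": {
        "@type": "SoftwareApplication",
        "name": "Example Tool A",
        "applicationCategory": "BusinessApplication",
        "operatingSystem": "Web",
        "url": "https://example.com/tools/example-tool-a/",
        "review": {
          "@type": "Review",
          "author": { "@id": "https://example.com/#person" },
          "reviewRating": {
            "@type": "Rating",
            "ratingValue": 4,
            "bestRating": 5
          }
        }
      }
    },
    {
      "@type": "ListItem",
      "position": 2,
      "item": {
        "@type": "SoftwareApplication",
        "name": "Example Tool B",
        "applicationCategory": "BusinessApplication",
        "operatingSystem": "Web",
        "url": "https://example.com/tools/example-tool-b/"
      }
    }
  ]
}
```

The Review is attached only to the item that actually has a written review on the page; the second item carries none.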

Speakable

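Marks the parts of a page suitable for text-to-speech extraction by voice assistants. It is declared through the speakable property, whose SpeakableSpecification value points at page elements by cssSelector or xpath. Less critical than it was in 2018-2020 but still useful for voice-AEO. Apply it to the elements you want read aloud (typically the H1, the summary paragraph, and FAQ answers).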
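A sketch of speakable markup on an article follows. The class names in cssSelector are hypothetical and would need to match the page's real markup.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article title",
  "url": "https://example.com/this-article/",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": ["h1", ".article-summary", ".faq-answer"]
  }
}
```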

TechArticle (vs Article)

Subtype of Article for technical content. Adds properties such as dependencies and proficiencyLevel. Useful when your content is genuinely technical documentation; not necessary for general-audience writing.
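For technical documentation, the Article example can be adapted roughly as follows. The headline, dependency text, and proficiency level are placeholder values.

```json
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Example technical guide title",
  "datePublished": "2026-05-21",
  "dateModified": "2026-05-21",
  "dependencies": "Example runtime 1.0 or later",
  "proficiencyLevel": "Beginner",
  "author": { "@id": "https://example.com/#person" },
  "publisher": { "@id": "https://example.com/#organization" }
}
```

Because TechArticle is a subtype of Article, the author and publisher references work exactly as they do on a plain Article.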

Tier 4: entity-graph extenders

sameAs

Property on Person and Organization linking to verified profiles on external platforms (LinkedIn, X, GitHub, ORCID, Wikipedia, Wikidata). The sameAs list is what lets AI engines resolve your entity confidently across platforms.

A Person with 5-7 verified sameAs links is disambiguated with far more confidence than a Person with none. For E-E-A-T, this is one of the highest-leverage properties to invest in.

worksFor, alumniOf, knowsAbout

Properties on Person that establish professional context. worksFor ties Person to a current Organization; alumniOf to a prior one; knowsAbout to topics the Person has expertise in. AI engines use these for "who is this author and why should I trust them on this topic" inference.
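These professional-context properties extend the canonical Person node. A hedged sketch, with a made-up institution and topic list:

```json
{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://example.com/#person",
  "name": "Example Author",
  "worksFor": { "@id": "https://example.com/#organization" },
  "alumniOf": {
    "@type": "CollegeOrUniversity",
    "name": "Example University",
    "url": "https://university.example.edu/"
  },
  "knowsAbout": [
    "Structured data",
    "Answer engine optimization",
    "Generative engine optimization"
  ]
}
```

Keeping the same @id as the canonical node means an engine can merge these properties into the one Person entity instead of creating a second one.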

Anti-patterns to avoid

Contradictory schema. Article + Product schema on the same page with mismatched metadata confuses engines. Pick one primary type per page.

Schema that lies. Claiming HowTo when the content isn't procedural; FAQ when there are no real questions; Review when there's no review. Engines have been getting better at detecting this and devalue pages that do it.

Stuffed schema. Declaring every possible type "just in case" doesn't help and may hurt. Schema should accurately describe what's on the page, not aspirationally describe what you wish were there.

Schema without an @id graph. Person and Organization referenced inline on every page rather than by @id results in N separate Person nodes from an engine's perspective, fragmenting the entity. Always use @id references to a canonical Person/Organization node and resolve that node on a dedicated page.

Missing dateModified. datePublished is the baseline every article should carry, but dateModified is what most engines actually read for the freshness signal. Update dateModified when you materially change the content.

Microdata or RDFa instead of JSON-LD. JSON-LD is the preferred format. Microdata and RDFa still work but JSON-LD is what every major engine and validator optimizes for in 2026.

Practical checklist

[ ] Every article page has Article schema with headline, datePublished, dateModified, author, publisher, mainEntityOfPage
[ ] Author and publisher reference a canonical Person and Organization by @id
[ ] The canonical Person and Organization nodes are resolved on a dedicated page (typically the home page) with full properties
[ ] sameAs property on the Person includes 5-7 verified external profile URLs
[ ] FAQPage schema only on pages with real Q&A content (not stuffed)
[ ] HowTo schema only on pages with real procedural content (not stuffed)
[ ] Listicle/comparison content uses ItemList with Product or SoftwareApplication entities
[ ] Glossary content uses DefinedTerm and DefinedTermSet
[ ] Schema validates with Google's Rich Results Test and the Schema Markup Validator (validator.schema.org)
[ ] All schema in JSON-LD format (not Microdata or RDFa)

The schema work is foundational. Get the Tier 1 entity graph right; add the Tier 2 extractive types where they accurately describe content; layer Tier 3 and 4 as situations call for them. Most AEO and GEO programs underinvest in this layer because the work is invisible, but the AI engines see it, and the entity-graph cleanliness is what determines whether you're cited as "you" or as "some related entity."
