Structured Data & Schema Markup
JSON-LD, the schema types that earn rich results (Organization, FAQ, HowTo, Product, Recipe, Course, BreadcrumbList), and how schema feeds entity recognition for AI.
Schema.org markup is no longer optional. In 2026 it does two jobs: it earns rich results in Google SERPs, and it tells AI Overviews, Gemini, ChatGPT Search, and Perplexity which entities your page is about without forcing them to infer it from prose. Pages with clean structured data get cited 2–3x more often in AI grounding passes, per BrightEdge’s 2025 AI citation study. This module is a JSON-LD-only playbook because Microdata and RDFa are now legacy formats that Google still parses but no longer recommends.
TL;DR
- JSON-LD is the only format you should write today. Microdata and RDFa are still parsed; Google’s documentation has recommended JSON-LD exclusively since 2017.
- Required and recommended properties are not interchangeable. Every rich-result type has Google-specific required properties; missing one removes the page from rich-result eligibility.
- Schema is entity glue for AI. @id, sameAs, and consistent Organization markup tell LLMs and the Knowledge Graph what your business is, separate from what your content says.
The mental model
Schema markup is like the metadata sticker on a museum exhibit. The exhibit (your page content) is the thing visitors see; the sticker tells the curator (Google, AI crawlers) the artist, the year, the medium, the provenance — facts that help them file the work correctly and put it next to relevant exhibits. Without the sticker, the curator has to guess from the painting itself, which they can do, but slowly and with mistakes.
The 2026 version of this analogy: AI Overviews and AI Mode are increasingly the “curator” assembling answers from many exhibits. They prefer pages where the metadata is unambiguous because they don’t have time to read every exhibit in full. A page with Article schema, an author linked to a Person, and mainEntityOfPage pointing at the canonical URL is one the curator can quote with confidence. A page with no schema is still readable, but the curator has to reverse-engineer who wrote it and what it claims — and may quote a clearer competitor instead.
The point is not the rich result. It is entity disambiguation. Schema is how you tell the web “this page is about that exact thing, not the thing with the same name.”
Deep dive: the 2026 reality
The schema types that earn Google rich results in 2026, ranked by frequency and impact:
| Type | Earns rich result | High-value verticals | Notes |
|---|---|---|---|
| Organization | Knowledge panel | All sites | Required for brand entity in Knowledge Graph |
| Person | Knowledge panel | Authors, executives | Pair with sameAs to LinkedIn, Wikidata |
| Article / NewsArticle | Top Stories, AI Overview citation | News, blog | headline, datePublished, author recommended (Google lists no required properties) |
| FAQPage | Reduced visibility since Aug 2023 | Most sites | Now limited to authoritative health/government sites |
| HowTo | Removed from rich results Sept 2023 | Tutorials | Still useful for AI grounding, no SERP visual |
| Product | Product snippets, Merchant listings | E-commerce | Offer, AggregateRating, Review |
| Review / AggregateRating | Review stars | Reviews, products | Must come from first-party reviewers |
| LocalBusiness | Local pack, knowledge panel | Local services | address, geo, openingHours |
| BreadcrumbList | SERP breadcrumb display | Hierarchical sites | Use real path, not category-only |
| VideoObject | Video snippet, key moments | Video sites | name, thumbnailUrl, uploadDate required; contentUrl recommended |
| Event | Event listings | Tickets, venues | startDate, location, offers |
| Recipe | Recipe carousel | Food sites | recipeIngredient, recipeInstructions |
| JobPosting | Google for Jobs | Career sites | Strict freshness rules; remove expired postings |
| Course | Course listings (limited) | Education | Often paired with LearningResource |
| SoftwareApplication | App snippet | SaaS, apps | applicationCategory, aggregateRating |
| Speakable | Voice search summary | News | cssSelector to read-aloud blocks |
FAQ and HowTo were quietly demoted in 2023. Google’s August 8, 2023 update restricted FAQ rich results to “well-known, authoritative government and health websites”, so most marketing sites lost the visual treatment overnight. The same update limited HowTo rich results to desktop, and they were removed entirely in September 2023. Both schemas are still worth implementing because AI Overviews and AI Mode use them for grounding, but if you were maintaining FAQPage purely for the expanded SERP listing, that ship has sailed.
The @id pattern is what makes schema interoperable across pages. A page-level Article with author should reference the author’s Person node by @id, not redeclare every author property. Done right, your site’s schema becomes a graph: Organization → Articles → Authors → Reviewers, each entity declared once and referenced everywhere.
{
"@context": "https://schema.org",
"@graph": [
{ "@type": "Organization", "@id": "https://example.com/#org", "name": "Example" },
{ "@type": "Person", "@id": "https://example.com/team/jane#person", "name": "Jane Smith" },
{ "@type": "Article",
"headline": "...",
"author": { "@id": "https://example.com/team/jane#person" },
"publisher": { "@id": "https://example.com/#org" }
}
]
}
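The "declared once, referenced everywhere" rule can be checked mechanically. The following is a minimal sketch, not a validator: it assumes that a dict containing only an @id key is a reference and that any other dict carrying an @id is a declaration. The function name `find_dangling_refs` and the sample `graph_doc` are illustrative.

```python
def find_dangling_refs(doc):
    """Return @id values that are referenced somewhere but never declared."""
    declared, referenced = set(), set()

    def walk(value):
        if isinstance(value, dict):
            # Assumption: a dict whose only key is @id is a pure reference.
            if set(value) == {"@id"}:
                referenced.add(value["@id"])
                return
            # Any other dict carrying an @id declares that entity.
            if "@id" in value:
                declared.add(value["@id"])
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for child in value:
                walk(child)

    walk(doc.get("@graph", []))
    return referenced - declared


# Same shape as the graph above: one Organization, one Person, one Article.
graph_doc = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": "https://example.com/#org", "name": "Example"},
        {"@type": "Person", "@id": "https://example.com/team/jane#person", "name": "Jane Smith"},
        {
            "@type": "Article",
            "headline": "...",
            "author": {"@id": "https://example.com/team/jane#person"},
            "publisher": {"@id": "https://example.com/#org"},
        },
    ],
}

print(find_dangling_refs(graph_doc))  # prints set()
```

An empty result means every reference resolves inside the same block. References that are declared on a different page of the site would show up as dangling here, so treat a non-empty result as a prompt to look, not as proof of an error.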
Speakable is the schema.org property (its value is a SpeakableSpecification) that Google uses for voice search and now Gemini Live audio responses. It marks specific sections of a page as suitable for read-aloud; Google’s guidance is roughly 20-30 seconds of audio per section, or two to three sentences. The feature is still labeled beta, is English-only, and is limited to news publishers, but the markup is harmless to add elsewhere.
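As a rough illustration of the markup shape, the sketch below attaches a speakable block to an existing node. The helper name `with_speakable`, the `.article-summary` selector, and the example URL are assumptions for illustration; `speakable`, `SpeakableSpecification`, and `cssSelector` are the schema.org terms.

```python
import json


def with_speakable(node, css_selectors):
    """Return a copy of a JSON-LD node with a speakable block attached."""
    marked = dict(node)
    marked["speakable"] = {
        "@type": "SpeakableSpecification",
        # Selectors should point at short, self-contained summary blocks.
        "cssSelector": list(css_selectors),
    }
    return marked


article = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Example headline",
    "url": "https://example.com/news/example",  # hypothetical URL
}

print(json.dumps(with_speakable(article, [".article-summary"]), indent=2))
```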
Knowledge Graph entity building is where structured data overlaps with brand SEO. Your Organization schema, replicated identically across every page, paired with sameAs to high-trust references (Wikidata, LinkedIn, Crunchbase, Bloomberg), is what Google uses to construct the Knowledge Graph entity behind your brand. AI Overviews now cite the entity, not the page; if Google has no entity for your brand, you cannot be cited.
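One way to keep the Organization node identical on every page is to generate it from a single function and render it into the template. This is a sketch under assumptions: the function names, the `#organization` fragment, and the example URLs are illustrative, and the template integration is left out.

```python
import json


def organization_node(site_url, name, logo_url, same_as):
    """Build the Organization node with a stable @id derived from the site URL."""
    base = site_url.rstrip("/")
    return {
        "@type": "Organization",
        "@id": f"{base}/#organization",
        "name": name,
        "url": f"{base}/",
        "logo": logo_url,
        # Sorting and de-duplicating keeps the output byte-identical across pages.
        "sameAs": sorted(set(same_as)),
    }


def render_jsonld(node):
    """Serialize a node into a script tag suitable for a page template."""
    payload = json.dumps({"@context": "https://schema.org", **node},
                         ensure_ascii=False, indent=2)
    # Escape "</" so a stray "</script>" inside a string cannot close the tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


org = organization_node(
    "https://acme.com/",
    "Acme Widgets",
    "https://acme.com/logo.png",
    ["https://www.linkedin.com/company/acme-widgets",
     "https://www.wikidata.org/wiki/Q12345678"],
)
print(render_jsonld(org))
```

Deriving the @id from the site URL, not from the page URL, is what lets every other node on every page point at the same entity.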
Visualizing it
flowchart TD
A[Your HTML page] --> B[JSON-LD in head or body]
B --> C[Google parser extracts entities]
C --> D{Required properties present?}
D -->|No| E[Rich result ineligible, schema ignored]
D -->|Yes| F[Eligible for rich result]
F --> G[Knowledge Graph link via sameAs and id]
G --> H[AI Overviews and AI Mode use entity to ground answer]
C --> I[Bing, ChatGPT Search via Bing index]
I --> J[ChatGPT cites page]
C --> K[Perplexity own crawler + Brave]
K --> L[Perplexity cites page]
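The "Required properties present?" decision in the flowchart can be approximated with a lookup table. The sketch below is illustrative only: the `REQUIRED` table covers a handful of types as an assumption about Google's current documentation, which is the authoritative source and changes over time.

```python
# Illustrative subset; check Google's documentation for the current lists.
REQUIRED = {
    "Event": ("name", "startDate", "location"),
    "JobPosting": ("title", "description", "datePosted",
                   "hiringOrganization", "jobLocation"),
    "Recipe": ("name", "image"),
    "BreadcrumbList": ("itemListElement",),
}


def missing_required(node):
    """List the required properties that are absent or empty on a node."""
    node_type = node.get("@type")
    # Assumption: @type is a single string; multi-typed nodes are not handled.
    if not isinstance(node_type, str):
        return []
    required = REQUIRED.get(node_type, ())
    return [prop for prop in required if node.get(prop) in (None, "", [], {})]


def is_eligible(node):
    """True when the type is known to the table and nothing required is missing."""
    return node.get("@type") in REQUIRED and not missing_required(node)


event = {"@type": "Event", "name": "Widget Expo", "startDate": "2026-05-01"}
print(missing_required(event))  # prints ['location']
print(is_eligible(event))       # prints False
```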
Bad vs. expert
The bad approach
<!-- Stale Microdata with missing required fields, mixed with old hreviews -->
<div itemscope itemtype="http://schema.org/Product">
<h1 itemprop="name">Acme Widget</h1>
<span itemprop="description">A great widget.</span>
<div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
<span itemprop="ratingValue">4.9</span>
</div>
</div>
<!-- Separate FAQ Microdata block on a sales page that has no real FAQ -->
<div itemscope itemtype="http://schema.org/FAQPage">
<div itemscope itemprop="mainEntity" itemtype="http://schema.org/Question">
<h3 itemprop="name">Why choose Acme?</h3>
<div itemscope itemprop="acceptedAnswer" itemtype="http://schema.org/Answer">
<span itemprop="text">Because we are the best!</span>
</div>
</div>
</div>
The Product is missing image, offers, and reviewCount; Google ignores AggregateRating when neither reviewCount nor ratingCount is present. The description is three words and adds no context. The HTTP schema.org URL still works but signals stale code. Worst of all, the FAQPage is fabricated: it wraps promotional copy in Question/Answer markup on a page that offers no real FAQ. Google requires FAQ content to be genuine and visible on the page, and violations can trigger a structured-data manual action (“Spammy structured markup”). A manual action removes rich-result eligibility until the markup is fixed and a reconsideration request is approved.
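For JSON-LD, the visibility requirement on FAQ content can be roughly checked by comparing each marked-up question against the page's visible text. This is a heuristic sketch using only the standard library; the class and function names are illustrative, and text split awkwardly across inline tags may produce false positives.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def normalize(text):
    """Collapse whitespace and ignore case so minor formatting does not matter."""
    return " ".join(text.split()).casefold()


def unbacked_questions(html, faq_node):
    """Return FAQ question names that never appear in the page's visible text."""
    parser = VisibleText()
    parser.feed(html)
    visible = normalize(" ".join(parser.chunks))
    questions = faq_node.get("mainEntity", [])
    if isinstance(questions, dict):
        questions = [questions]
    return [q["name"] for q in questions
            if normalize(q.get("name", "")) not in visible]


faq = {
    "@type": "FAQPage",
    "mainEntity": [{"@type": "Question", "name": "Why choose Acme?"}],
}
page = ('<h1>Acme Widget</h1><p>A great widget.</p>'
        '<script type="application/ld+json">{"name": "Why choose Acme?"}</script>')

print(unbacked_questions(page, faq))  # prints ['Why choose Acme?']
```

The question appears only inside the script tag, so it is reported as unbacked.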
The expert approach
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@graph": [
{
"@type": "Organization",
"@id": "https://acme.com/#organization",
"name": "Acme Widgets",
"url": "https://acme.com/",
"logo": {
"@type": "ImageObject",
"url": "https://acme.com/logo.png",
"width": 600,
"height": 60
},
"sameAs": [
"https://www.wikidata.org/wiki/Q12345678",
"https://www.linkedin.com/company/acme-widgets",
"https://en.wikipedia.org/wiki/Acme_Widgets"
]
},
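{
"@type": "WebSite",
"@id": "https://acme.com/#website",
"url": "https://acme.com/",
"name": "Acme Widgets",
"publisher": { "@id": "https://acme.com/#organization" }
},
{
"@type": "WebPage",
"@id": "https://acme.com/widgets/x100/#webpage",
"url": "https://acme.com/widgets/x100/",
"isPartOf": { "@id": "https://acme.com/#website" },
"breadcrumb": { "@id": "https://acme.com/widgets/x100/#breadcrumbs" }
},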
{
"@type": "BreadcrumbList",
"@id": "https://acme.com/widgets/x100/#breadcrumbs",
"itemListElement": [
{ "@type": "ListItem", "position": 1, "name": "Home", "item": "https://acme.com/" },
{ "@type": "ListItem", "position": 2, "name": "Widgets", "item": "https://acme.com/widgets/" },
{ "@type": "ListItem", "position": 3, "name": "X100" }
]
},
{
"@type": "Product",
"@id": "https://acme.com/widgets/x100/#product",
"name": "Acme X100 Widget",
"description": "Hand-machined brass widget rated for 50,000 cycles. Built in Akron, Ohio since 1962.",
"sku": "ACM-X100",
"gtin13": "0123456789012",
"brand": { "@id": "https://acme.com/#organization" },
"image": [
"https://acme.com/widgets/x100/hero.avif",
"https://acme.com/widgets/x100/detail.avif"
],
"offers": {
"@type": "Offer",
"url": "https://acme.com/widgets/x100/",
"priceCurrency": "USD",
"price": "129.00",
"priceValidUntil": "2026-12-31",
"availability": "https://schema.org/InStock",
"itemCondition": "https://schema.org/NewCondition",
"seller": { "@id": "https://acme.com/#organization" }
},
"aggregateRating": {
"@type": "AggregateRating",
"ratingValue": "4.8",
"reviewCount": "247",
"bestRating": "5",
"worstRating": "1"
}
}
]
}
</script>
Every entity has an @id so the graph is internally consistent. Organization is declared once and referenced as brand and seller. BreadcrumbList matches the visible breadcrumb. Product carries the three properties Google requires for merchant listings (name, image, offers) plus the recommended ones that strengthen the listing (description, sku, gtin13, brand, aggregateRating). gtin13 is the global product identifier that lets Google merge your listing with retailer feeds. priceValidUntil prevents stale-price warnings in Search Console. The whole block validates against both Schema.org and the Rich Results Test.
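Because gtin13 is used to match listings across feeds, a mistyped digit is worth catching before deploy. The sketch below applies the standard GTIN-13 check-digit rule (weights alternate 1 and 3 across the first twelve digits); the function name is illustrative.

```python
def gtin13_is_valid(code):
    """Check length, digits, and the trailing check digit of a GTIN-13 string."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(ch) for ch in code]
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


print(gtin13_is_valid("0123456789012"))  # prints True
print(gtin13_is_valid("0123456789013"))  # prints False
```

The value used in the example above passes: the weighted sum of its first twelve digits is 98, which gives a check digit of 2.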
Do this today
- Open the Rich Results Test at search.google.com/test/rich-results. Paste your homepage URL and any product, article, or local-business URL. Note every rich-result type listed and any error/warning.
- Open the Schema.org Validator at validator.schema.org. This catches Schema.org-level errors that Google’s tester ignores (e.g., orphaned @id, malformed @graph).
- In Google Search Console → Enhancements, review every report (Products, Sitelinks Searchbox, Breadcrumbs, FAQ, HowTo, Logos, Articles, Videos, Events, JobPosting, etc.). Each “Invalid items” row is a remediation ticket.
- Build your site’s Organization JSON-LD once in a partial template and inject it on every page. Include name, url, logo, sameAs (Wikidata, LinkedIn, Crunchbase), and a stable @id.
- Add BreadcrumbList to every non-homepage URL using the actual visible breadcrumb path. If the last item carries a URL, it must match the page’s self-referencing canonical.
- For e-commerce: ensure every product page has Product with name, image, description, sku, gtin13 (or mpn), brand, offers, and aggregateRating if you have first-party reviews. Set priceValidUntil to the end of the year at minimum.
- For content: add Article or NewsArticle with headline (≤110 chars), datePublished, dateModified, author referencing a Person @id, and publisher referencing the Organization @id.
- Audit existing FAQPage and HowTo schema. Remove fabricated FAQ blocks immediately. Keep real FAQ markup even though SERP visibility is gone; AI Overviews and Gemini still consume it.
- Validate Person entries with sameAs to LinkedIn and (where applicable) Wikidata. The Person @id should be a URL that returns 200; the author bio page works well.
- Set up a schema regression test in CI. Use @google/schemarketing or a Playwright script that runs your top 20 templates through the Rich Results Test on every deploy and fails the build on new errors.
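The last step can start smaller than a full Rich Results Test run. The following is an offline sketch, not a replacement for Google's tools: it extracts JSON-LD from rendered HTML with the standard library and reports which expected schema types have gone missing from a template. The names `JsonLdExtractor`, `schema_types`, and `missing_types` are illustrative, and fetching the rendered HTML is left to your CI setup.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect and parse every application/ld+json script block in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # Malformed JSON raises here, which is itself a useful build failure.
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def schema_types(html):
    """Return the set of @type values declared at the top level or in @graph."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    types = set()
    for block in extractor.blocks:
        nodes = block.get("@graph", [block]) if isinstance(block, dict) else block
        for node in nodes:
            declared = node.get("@type")
            types.update(declared if isinstance(declared, list)
                         else [declared] if declared else [])
    return types


def missing_types(html, expected_types):
    """Expected types that the rendered template no longer emits."""
    return sorted(set(expected_types) - schema_types(html))


sample = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@graph": [
  {"@type": "Organization", "@id": "https://acme.com/#organization"},
  {"@type": "Product", "@id": "https://acme.com/widgets/x100/#product"}
]}
</script></head><body></body></html>"""

print(missing_types(sample, {"Organization", "Product", "BreadcrumbList"}))
# prints ['BreadcrumbList']
```

In CI, a non-empty result for any template would fail the build; the expected set per template is something you maintain alongside the templates.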