Module 040 Advanced 17 min read

Structured Data & Schema Markup

JSON-LD, the schema types that earn rich results (Organization, FAQ, HowTo, Product, Recipe, Course, BreadcrumbList), and how schema feeds entity recognition for AI.

By SEO Mastery Editorial

Schema.org markup is no longer optional. In 2026 it does two jobs: it earns rich results in Google SERPs, and it tells AI Overviews, Gemini, ChatGPT Search, and Perplexity which entities your page is about without forcing them to infer it from prose. BrightEdge’s 2025 AI citation study reports that pages with clean structured data are cited 2–3x more often in AI grounding passes. This module is a JSON-LD-only playbook: Google still parses Microdata and RDFa, but JSON-LD is the format it recommends.

TL;DR

  • JSON-LD is the only format you should write today. Microdata and RDFa are still parsed, but Google’s documentation recommends JSON-LD wherever possible.
  • Required properties are not optional. Every rich-result type has Google-specific required properties; missing one makes the page ineligible for that rich result. Recommended properties improve the result but do not gate eligibility.
  • Schema is entity glue for AI. @id, sameAs, and consistent Organization markup tell LLMs and the Knowledge Graph what your business is, separate from what your content says.

The mental model

Schema markup is like the metadata sticker on a museum exhibit. The exhibit (your page content) is the thing visitors see; the sticker tells the curator (Google, AI crawlers) the artist, the year, the medium, the provenance — facts that help them file the work correctly and put it next to relevant exhibits. Without the sticker, the curator has to guess from the painting itself, which they can do, but slowly and with mistakes.

The 2026 version of this analogy: AI Overviews and AI Mode are increasingly the “curator” assembling answers from many exhibits. They prefer pages where the metadata is unambiguous because they don’t have time to read every exhibit in full. A page with Article schema, an author linked to a Person, and mainEntityOfPage pointing at the canonical URL is one the curator can quote with confidence. A page with no schema is still readable, but the curator has to reverse-engineer who wrote it and what it claims — and may quote a clearer competitor instead.

The point is not the rich result. It is entity disambiguation. Schema is how you tell the web “this page is about that exact thing, not the thing with the same name.”

Deep dive: the 2026 reality

The schema types that earn Google rich results in 2026, ranked by frequency and impact:

| Type | Earns rich result | High-value verticals | Notes |
| --- | --- | --- | --- |
| Organization | Knowledge panel | All sites | Anchors the brand entity in the Knowledge Graph |
| Person | Knowledge panel | Authors, executives | Pair with sameAs to LinkedIn, Wikidata |
| Article / NewsArticle | Top Stories, AI Overview citation | News, blog | headline, datePublished, author (recommended; Google lists no required properties for Article) |
| FAQPage | Reduced visibility since Aug 2023 | Most sites | Now limited to authoritative health/government sites |
| HowTo | Removed from rich results Sept 2023 | Tutorials | Still useful for AI grounding, no SERP visual |
| Product | Product snippets, Merchant listings | E-commerce | Offer, AggregateRating, Review |
| Review / AggregateRating | Review stars | Reviews, products | Must come from first-party reviewers |
| LocalBusiness | Local pack, knowledge panel | Local services | address, geo, openingHours |
| BreadcrumbList | SERP breadcrumb display | Hierarchical sites | Use real path, not category-only |
| VideoObject | Video snippet, key moments | Video sites | name, thumbnailUrl, uploadDate required; contentUrl recommended |
| Event | Event listings | Tickets, venues | startDate, location, offers |
| Recipe | Recipe carousel | Food sites | recipeIngredient, recipeInstructions |
| JobPosting | Google for Jobs | Career sites | Strict freshness rules; remove expired postings |
| Course | Course listings (limited) | Education | Often paired with LearningResource |
| SoftwareApplication | App snippet | SaaS, apps | applicationCategory, aggregateRating |
| Speakable | Voice search summary | News | cssSelector to read-aloud blocks |
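
The “required properties” idea in the table can be sketched as a small lookup. This is a minimal Python sketch, not Google’s validator: the `REQUIRED` map and the `missing_properties` helper are hypothetical names, and the map is deliberately partial. Check every entry against Google’s current per-type documentation before relying on it.

```python
# Hypothetical, deliberately partial map of properties to check per type.
# Verify each entry against Google's current documentation before use.
REQUIRED = {
    "Product": {"name", "image", "offers"},
    "Recipe": {"name", "image"},
    "Event": {"name", "startDate", "location"},
    "VideoObject": {"name", "thumbnailUrl", "uploadDate"},
    "BreadcrumbList": {"itemListElement"},
}


def missing_properties(node: dict) -> set[str]:
    """Return the checked properties that are absent or empty on one JSON-LD node."""
    types = node.get("@type", [])
    if isinstance(types, str):
        types = [types]
    missing: set[str] = set()
    for type_name in types:
        for prop in REQUIRED.get(type_name, set()):
            # Treat empty strings, lists, and objects the same as a missing key.
            if node.get(prop) in (None, "", [], {}):
                missing.add(prop)
    return missing


video = {
    "@type": "VideoObject",
    "name": "Widget teardown",
    "contentUrl": "https://example.com/teardown.mp4",
}
print(sorted(missing_properties(video)))  # ['thumbnailUrl', 'uploadDate']
```

Types that are not in the map return an empty set, so the check only ever reports on types you have explicitly chosen to enforce.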

FAQ and HowTo were quietly demoted in 2023. Google’s August 8, 2023 update restricted FAQ rich results to “well-known, authoritative” government and health sites; most marketing sites lost the visual treatment overnight. HowTo rich results were removed entirely in September 2023. Both schemas are still worth implementing because AI Overviews and AI Mode can use them for grounding, but if you were maintaining FAQPage purely for the expanded SERP listing, that ship has sailed.

The @id pattern is what makes schema interoperable across pages. A page-level Article with author should reference the author’s Person node by @id, not redeclare every author property. Done right, your site’s schema becomes a graph: Organization → Articles → Authors → Reviewers, each entity declared once and referenced everywhere.

{
  "@context": "https://schema.org",
  "@graph": [
    { "@type": "Organization", "@id": "https://example.com/#org", "name": "Example" },
    { "@type": "Person", "@id": "https://example.com/team/jane#person", "name": "Jane Smith" },
    { "@type": "Article",
      "headline": "...",
      "author": { "@id": "https://example.com/team/jane#person" },
      "publisher": { "@id": "https://example.com/#org" }
    }
  ]
}
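
A graph like the one above only works if every `{"@id": ...}` reference points at a node that is actually declared. The following is a minimal Python sketch of that check; `walk` and `dangling_references` are hypothetical helper names, and the sample document deliberately omits the Person node so the check has something to report.

```python
import json


def walk(value):
    """Yield every JSON object (dict) nested anywhere inside value."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from walk(child)
    elif isinstance(value, list):
        for child in value:
            yield from walk(child)


def dangling_references(doc: dict) -> set[str]:
    """Return @id values that are referenced but never declared."""
    declared, referenced = set(), set()
    for obj in walk(doc):
        if "@id" not in obj:
            continue
        if set(obj) == {"@id"}:
            # An object holding only @id is a pointer to another node.
            referenced.add(obj["@id"])
        else:
            # Any other object carrying @id declares that node.
            declared.add(obj["@id"])
    return referenced - declared


doc = json.loads("""
{
  "@context": "https://schema.org",
  "@graph": [
    {"@type": "Organization", "@id": "https://example.com/#org", "name": "Example"},
    {"@type": "Article", "headline": "Hypothetical headline",
     "publisher": {"@id": "https://example.com/#org"},
     "author": {"@id": "https://example.com/team/jane#person"}}
  ]
}
""")
print(dangling_references(doc))  # {'https://example.com/team/jane#person'}
```

This only checks references within a single document. A reference to a node declared on another page is legitimate, so treat the output as a prompt to look, not as a hard failure.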

Speakable is the markup Google documents for read-aloud answers in voice search, and it is a plausible input for Gemini Live audio responses. It marks specific sections of a page as suitable for text-to-speech; Google suggests roughly 20 to 30 seconds of audio per section, or about two to three sentences. The feature is still in beta, English-only, and limited to news publishers, but the markup is harmless to add elsewhere.
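
As an illustration, speakable markup attaches a SpeakableSpecification to a page and points it at the read-aloud blocks with CSS selectors. A minimal Python sketch that emits such a node follows; the function name, the URL, and the `.article-summary` selector are assumptions chosen for the example, not values from any real site.

```python
import json


def speakable_webpage(url: str, name: str, selectors: list[str]) -> dict:
    """Build a WebPage node whose speakable property targets the given CSS selectors."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": list(selectors),
        },
    }


node = speakable_webpage(
    "https://example.com/news/widget-recall/",  # hypothetical URL
    "Widget recall announced",
    [".article-summary"],  # hypothetical selector for the summary block
)
print(json.dumps(node, indent=2))
```

The selectors should match short, self-contained passages that still make sense when heard without the rest of the page.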

Knowledge Graph entity building is where structured data overlaps with brand SEO. Your Organization schema, replicated identically across every page and paired with sameAs links to high-trust references (Wikidata, LinkedIn, Crunchbase, Bloomberg), helps Google construct the Knowledge Graph entity behind your brand. AI Overviews increasingly reason about entities rather than individual pages; if Google has no entity for your brand, you are much less likely to be cited.
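
One way to keep the Organization node identical on every page is to generate it from a single function instead of hand-editing templates. The sketch below assumes a Python-rendered site; `organization_script` and all of its arguments are hypothetical, and the sameAs URLs are placeholders.

```python
import json


def organization_script(name: str, url: str, logo_url: str, same_as: list[str]) -> str:
    """Render one Organization node as a JSON-LD script tag with a stable @id."""
    node = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": url.rstrip("/") + "/#organization",
        "name": name,
        "url": url,
        "logo": logo_url,
        # Sort and de-duplicate so every page emits byte-identical markup.
        "sameAs": sorted(set(same_as)),
    }
    # Escape "</" so a stray "</script>" inside a value cannot close the tag early.
    payload = json.dumps(node, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


html = organization_script(
    "Example",
    "https://example.com/",
    "https://example.com/logo.png",
    [
        "https://www.linkedin.com/company/example",  # placeholder profile
        "https://www.wikidata.org/wiki/Q0",          # placeholder entity
    ],
)
print(html)
```

Because the @id is derived from the site URL, every other node can reference the organization without repeating its properties.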

Visualizing it

flowchart TD
  A[Your HTML page] --> B[JSON-LD in head or body]
  B --> C[Google parser extracts entities]
  C --> D{Required properties present?}
  D -->|No| E[Rich result ineligible, schema ignored]
  D -->|Yes| F[Eligible for rich result]
  F --> G[Knowledge Graph link via sameAs and id]
  G --> H[AI Overviews and AI Mode use entity to ground answer]
  C --> I[Bing, ChatGPT Search via Bing index]
  I --> J[ChatGPT cites page]
  C --> K[Perplexity own crawler + Brave]
  K --> L[Perplexity cites page]

Bad vs. expert

The bad approach

<!-- Stale Microdata with missing required fields, mixed with old hreviews -->
<div itemscope itemtype="http://schema.org/Product">
  <h1 itemprop="name">Acme Widget</h1>
  <span itemprop="description">A great widget.</span>
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    <span itemprop="ratingValue">4.9</span>
  </div>
</div>

<!-- Separate FAQ Microdata block on a sales page that has no real FAQ -->
<div itemscope itemtype="http://schema.org/FAQPage">
  <div itemscope itemprop="mainEntity" itemtype="http://schema.org/Question">
    <h3 itemprop="name">Why choose Acme?</h3>
    <div itemscope itemprop="acceptedAnswer" itemtype="http://schema.org/Answer">
      <span itemprop="text">Because we are the best!</span>
    </div>
  </div>
</div>

The Product is missing image and offers, and the AggregateRating has neither reviewCount nor ratingCount; Google ignores a rating that carries no count. The description is three words and adds no context. The HTTP schema.org URL still works but signals stale code. Worst of all, the FAQPage is fabricated: the page has no genuine FAQ, and the single “question” is sales copy, which Google’s FAQPage guidelines prohibit. Misleading markup like this can trigger a structured-data manual action (“Spammy structured markup”), and an affected site loses rich-result eligibility until the markup is fixed and a reconsideration request is approved.
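
To find leftovers like these across a template library, a short scan for Microdata attributes is usually enough. This is a minimal Python sketch using only the standard library; `MicrodataFinder` and `find_microdata` are hypothetical names, and the sample string is a trimmed version of the markup above.

```python
from html.parser import HTMLParser


class MicrodataFinder(HTMLParser):
    """Collect the itemtype of every element that opens a Microdata item."""

    def __init__(self):
        super().__init__()
        self.item_types = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # itemscope is a valueless attribute, so only its presence matters.
        if "itemscope" in attr_map and attr_map.get("itemtype"):
            self.item_types.append(attr_map["itemtype"])


def find_microdata(html: str) -> list[str]:
    """Return the Microdata item types found in an HTML string, in document order."""
    finder = MicrodataFinder()
    finder.feed(html)
    finder.close()
    return finder.item_types


sample = """
<div itemscope itemtype="http://schema.org/Product">
  <h1 itemprop="name">Acme Widget</h1>
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    <span itemprop="ratingValue">4.9</span>
  </div>
</div>
"""
types = find_microdata(sample)
print(types)
# Flag items that still use the legacy http:// vocabulary URL.
print([t for t in types if t.startswith("http://")])
```

Each hit is a candidate for migration to a single JSON-LD block, which also removes the risk of Microdata and JSON-LD disagreeing on the same page.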

The expert approach

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://acme.com/#organization",
      "name": "Acme Widgets",
      "url": "https://acme.com/",
      "logo": {
        "@type": "ImageObject",
        "url": "https://acme.com/logo.png",
        "width": 600,
        "height": 60
      },
      "sameAs": [
        "https://www.wikidata.org/wiki/Q12345678",
        "https://www.linkedin.com/company/acme-widgets",
        "https://en.wikipedia.org/wiki/Acme_Widgets"
      ]
    },
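    {
      "@type": "WebSite",
      "@id": "https://acme.com/#website",
      "url": "https://acme.com/",
      "name": "Acme Widgets",
      "publisher": { "@id": "https://acme.com/#organization" }
    },
    {
      "@type": "WebPage",
      "@id": "https://acme.com/widgets/x100/#webpage",
      "url": "https://acme.com/widgets/x100/",
      "isPartOf": { "@id": "https://acme.com/#website" },
      "breadcrumb": { "@id": "https://acme.com/widgets/x100/#breadcrumbs" },
      "mainEntity": { "@id": "https://acme.com/widgets/x100/#product" }
    },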
    {
      "@type": "BreadcrumbList",
      "@id": "https://acme.com/widgets/x100/#breadcrumbs",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://acme.com/" },
        { "@type": "ListItem", "position": 2, "name": "Widgets", "item": "https://acme.com/widgets/" },
        { "@type": "ListItem", "position": 3, "name": "X100" }
      ]
    },
    {
      "@type": "Product",
      "@id": "https://acme.com/widgets/x100/#product",
      "name": "Acme X100 Widget",
      "description": "Hand-machined brass widget rated for 50,000 cycles. Built in Akron, Ohio since 1962.",
      "sku": "ACM-X100",
      "gtin13": "0123456789012",
      "brand": { "@id": "https://acme.com/#organization" },
      "image": [
        "https://acme.com/widgets/x100/hero.avif",
        "https://acme.com/widgets/x100/detail.avif"
      ],
      "offers": {
        "@type": "Offer",
        "url": "https://acme.com/widgets/x100/",
        "priceCurrency": "USD",
        "price": "129.00",
        "priceValidUntil": "2026-12-31",
        "availability": "https://schema.org/InStock",
        "itemCondition": "https://schema.org/NewCondition",
        "seller": { "@id": "https://acme.com/#organization" }
      },
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.8",
        "reviewCount": "247",
        "bestRating": "5",
        "worstRating": "1"
      }
    }
  ]
}
</script>

Every top-level node has an @id so the graph is internally consistent. Organization is declared once and referenced as brand and seller. BreadcrumbList matches the visible breadcrumb. Product carries the properties Google requires for Merchant listings (name, image, and offers with price and priceCurrency) plus the recommended description, sku, and aggregateRating. gtin13 is the global product identifier that lets Google merge your listing with retailer feeds. priceValidUntil prevents stale-price warnings in Search Console. Run the whole block through both the Schema.org Validator and the Rich Results Test before shipping.
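
Because a mistyped gtin13 cannot be matched against retailer feeds, it is worth checking the check digit before publishing. The sketch below implements the standard GTIN-13 checksum in Python; the function name is hypothetical, and the first test value is the placeholder identifier used in the block above.

```python
def gtin13_is_valid(code: str) -> bool:
    """Check the length, character set, and check digit of a GTIN-13 string."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(ch) for ch in code]
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    # The check digit brings the weighted sum up to a multiple of ten.
    return (10 - total % 10) % 10 == digits[12]


print(gtin13_is_valid("0123456789012"))  # True
print(gtin13_is_valid("0123456789013"))  # False
```

For the first value the weighted sum is 98, so the check digit is 2, which matches the final digit of the identifier.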

Do this today

  1. Open the Rich Results Test at search.google.com/test/rich-results. Paste your homepage URL and any product, article, or local-business URL. Note every rich-result type listed and any error/warning.
  2. Open the Schema.org Validator at validator.schema.org. This catches Schema.org-level errors that Google’s tester ignores (e.g., orphaned @id, malformed @graph).
  3. In Google Search Console, open the Enhancements and Shopping sections and review every report listed for your property (for example Breadcrumbs, Product snippets, Merchant listings, FAQ, Videos, Events, Job postings). Each “Invalid items” row is a remediation ticket.
  4. Build your site’s Organization JSON-LD once in a partial template and inject it on every page. Include name, url, logo, sameAs (Wikidata, LinkedIn, Crunchbase), and a stable @id.
  5. Add BreadcrumbList to every non-homepage URL using the actual visible breadcrumb path. If the last item carries an item URL, it must match the page’s self-referencing canonical.
  6. For e-commerce: ensure every product page has Product with name, image, description, sku, gtin13 (or mpn), brand, offers, and aggregateRating if you have first-party reviews. Set priceValidUntil to a future date that reflects how long the price actually holds; the end of the year is a common default.
  7. For content: add Article or NewsArticle with headline (≤110 chars), datePublished, dateModified, author referencing a Person @id, and publisher referencing the Organization @id.
  8. Audit existing FAQPage and HowTo schema. Remove fabricated FAQ blocks immediately. Keep real FAQ markup even though SERP visibility is gone — AI Overviews and Gemini still consume it.
  9. Validate Person entries with sameAs to LinkedIn and (where applicable) Wikidata. The Person @id should be a URL that 200s — the author bio page works perfectly.
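  10. Set up a schema regression test in CI. Google does not document a public API for the Rich Results Test, so use a script (Playwright works for JavaScript-rendered pages) that renders your top 20 templates on every deploy, extracts the JSON-LD, checks the required properties, and fails the build on new errors. The Search Console URL Inspection API can report rich-result status for URLs that are already indexed.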
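
The regression test in step 10 could start as a local check that calls no Google service. This is a minimal Python sketch under stated assumptions: it reads already-rendered HTML, the `REQUIRED` map is a hypothetical and partial list you maintain yourself, and `check_html` is an illustrative name.

```python
import json
from html.parser import HTMLParser

# Hypothetical, partial list of properties to enforce per type.
REQUIRED = {"Product": ("name", "image", "offers")}


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.raw_blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.raw_blocks.append("".join(self._buffer))
            self._in_jsonld = False


def nodes_in(block):
    """Yield individual nodes from a JSON-LD value, flattening lists and @graph."""
    if isinstance(block, list):
        for item in block:
            yield from nodes_in(item)
    elif isinstance(block, dict):
        if "@graph" in block:
            yield from nodes_in(block["@graph"])
        else:
            yield block


def check_html(html: str) -> list[str]:
    """Return one error string per unparsable block or missing property."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    errors = []
    for raw in extractor.raw_blocks:
        try:
            block = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(f"invalid JSON-LD: {exc.msg}")
            continue
        for node in nodes_in(block):
            types = node.get("@type", [])
            for type_name in [types] if isinstance(types, str) else types:
                for prop in REQUIRED.get(type_name, ()):
                    if prop not in node:
                        errors.append(f"{type_name}: missing {prop}")
    return errors


page = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@graph": [
  {"@type": "Product", "name": "Acme X100 Widget", "image": ["https://acme.com/x.avif"]}
]}
</script></head><body></body></html>"""
print(check_html(page))  # ['Product: missing offers']
```

In a pipeline, a wrapper would run `check_html` over each rendered template and exit with a non-zero status when the returned list is not empty.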
