Entity-Based SEO & Topical Authority
Entities, the Knowledge Graph, and how topical maps build "trust on a topic." Pillar pages, inner sections, outer sections, and the silo strategy that actually works.
Google stopped ranking strings of characters years ago. It ranks entities — concepts, people, places, products, events — and the relationships between them. If your site is a loose pile of keyword pages with no entity coherence, you are competing in 2026 with a 2012 model.
TL;DR
- Entities are the unit of meaning, not keywords. Google’s Knowledge Graph maps your content to canonical IDs (Wikidata Q-codes, Google MIDs); pages that strengthen those mappings rank, and pages that don’t are demoted in favor of ones that do.
- Topical authority is built by completeness, not volume. Sites that cover the full entity graph around a central topic — pillar, inner sections, outer sections — earn the “trusted on this subject” signal that AI Overviews and AI Mode now use to pick sources.
- Silos still work, but the modern silo is semantic, not just URL-based. Internal linking, structured data, and consistent entity references inside the body matter more than `/category/subcategory/` directory structure ever did.
The mental model
Topical authority is like academic tenure: it isn’t granted for one good paper, it’s granted when the field decides you’ve published deeply across every adjacent question and other experts cite you. You earn it by covering the map, not the mountain.
Picture every topic as a graph. At the center sits the central entity: the thing your site is supposed to be about, like “personal finance” or “dog training” or “headless CMS.” Around it sits a ring of pillar entities like “investing,” “credit cards,” and “retirement,” each a subtopic substantial enough to deserve its own hub page. Beyond that comes a third ring of inner-section entities (“Roth IRA,” “401(k) rollovers,” “index funds”), and finally a periphery of outer-section entities: long-tail concepts, edge cases, comparisons, definitions, FAQs.
A site that covers only the inner two rings looks like a 2013 SEO blog. A site that covers all four — and links them coherently — is what Google calls topically authoritative. The Helpful Content system, MUM, and the Gemini-powered AI Overviews ranker all favor the latter when they pick a citation.
The trap most teams fall into: they chase keyword difficulty rather than entity coverage. They publish ten posts on “best credit cards” because that’s the volume keyword, and zero on “soft pull vs hard pull,” “credit utilization timing,” or “secured card graduation policies” — the entities that make the cluster trustworthy.
Deep dive: the 2026 reality
Google’s understanding of entities is anchored in three internal systems plus a public mirror.
The Knowledge Graph is the public-facing layer — what powers knowledge panels and the snippets you see in AI Overviews. Every entity has a stable MID (machine ID, e.g. /m/0dl567 for Taylor Swift). Wikidata’s Q-codes (Q26876 for the same artist) provide the cross-language pivot. As of the 2025 documentation update, Google publicly confirmed using Wikidata as one of three priority entity-resolution sources alongside Wikipedia and a curated internal store.
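That public layer is queryable directly. A minimal sketch against Google’s Knowledge Graph Search API (a real, documented endpoint; `YOUR_API_KEY` is a placeholder and error handling is omitted):

```python
# Sketch: resolve an entity name to its Knowledge Graph MID via Google's
# public Knowledge Graph Search API. The endpoint and response shape follow
# the documented public API; YOUR_API_KEY is a placeholder.
from urllib.parse import urlencode

KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"

def kg_search_url(query, api_key, limit=1):
    """Build the lookup URL for one entity query."""
    return KG_ENDPOINT + "?" + urlencode(
        {"query": query, "key": api_key, "limit": limit}
    )

def extract_mid(response):
    """Pull the MID from an API response; entity IDs come back prefixed
    with 'kg:', e.g. 'kg:/m/0dl567' for Taylor Swift."""
    items = response.get("itemListElement", [])
    if not items:
        return None
    return items[0]["result"]["@id"].removeprefix("kg:")

# Usage (needs a real key and network access):
# import requests
# data = requests.get(kg_search_url("Taylor Swift", "YOUR_API_KEY")).json()
# extract_mid(data)  # "/m/0dl567"
```

If `extract_mid` returns `None` for an entity you consider central to your site, Google has no canonical node for it yet, which is itself useful intelligence.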
MUM (Multitask Unified Model, in production since 2021 and substantially upgraded in 2024) reasons across modalities and languages to figure out what a query is about even when it doesn’t contain the canonical entity name. “What’s the best portable typewriter from the 1960s under $200?” gets resolved into the entity set {portable typewriter, 1960s, retail price band} and matched to pages that cover those entities together — not pages with the highest exact-match keyword density.
The Helpful Content system, fully merged into the core algorithm in March 2024, uses entity coverage as one of its quality signals. A site with 80 articles all targeting permutations of “best [thing]” but no foundational coverage of what the thing is gets demoted sitewide. Coverage is checked against a model of what a knowledgeable site on the topic would contain.
For AI surfaces, the math is even more brutal. AI Overviews (launched May 2024) and AI Mode (Google’s fully conversational search experience, rolled out broadly in 2025) both pick citations partly by topical breadth. Perplexity and ChatGPT Search behave similarly, weighting domains they’ve previously cited as entity-coherent. GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, and Google-Extended all harvest your entity graph; the more consistent your internal entity references, the more of your pages end up in their training and retrieval sets.
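To see which of those crawlers actually harvest your pages, grep your access logs for their published user-agent tokens. A minimal sketch over invented combined-log-format lines:

```python
# Sketch: count AI-crawler hits per URL from a standard combined-format
# access log, keyed on the bots' published user-agent names. The sample
# log lines are invented for illustration.
import re
from collections import Counter

AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot",
           "Google-Extended")

def ai_bot_hits(log_lines):
    """Return a Counter keyed by (bot, path) for every AI-crawler request."""
    hits = Counter()
    for line in log_lines:
        # Capture the request path and the final quoted field (user agent).
        match = re.search(r'"(?:GET|POST) (\S+)[^"]*".*"([^"]*)"\s*$', line)
        if not match:
            continue
        path, user_agent = match.groups()
        for bot in AI_BOTS:
            if bot in user_agent:
                hits[(bot, path)] += 1
    return hits

sample = [
    '1.2.3.4 - - [01/Mar/2026:10:00:00 +0000] "GET /credit/ HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '5.6.7.8 - - [01/Mar/2026:10:01:00 +0000] "GET /credit/score/ HTTP/1.1" '
    '200 812 "-" "PerplexityBot/1.0"',
]
print(ai_bot_hits(sample))
```

Pages the bots never touch are pages that will never appear in their retrieval sets, no matter how good the content is.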
A practical consequence: a site that publishes 500 thin keyword posts now ranks worse than a site that publishes 80 well-linked pages covering the same entity graph completely. The 2024 Reddit-content boom in SERPs is a symptom of this — Reddit threads are entity-rich (every comment references concrete people, products, decisions) even when individually short.
Visualizing it
```mermaid
flowchart TD
    A["Central entity: Personal Finance"] --> B["Pillar: Investing"]
    A --> C["Pillar: Credit"]
    A --> D["Pillar: Retirement"]
    B --> B1["Inner: Index Funds"]
    B --> B2["Inner: Brokerages"]
    B1 --> B1a["Outer: Vanguard vs Fidelity"]
    B1 --> B1b["Outer: Expense Ratio Math"]
    C --> C1["Inner: Credit Cards"]
    C --> C2["Inner: Credit Score"]
    C2 --> C2a["Outer: Soft vs Hard Pull"]
    C2 --> C2b["Outer: Utilization Timing"]
    D --> D1["Inner: Roth IRA"]
    D --> D2["Inner: 401k"]
    D1 --> D1a["Outer: Backdoor Roth"]
```
Each node is a URL. Every edge is an internal link. The graph is the site.
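The graph view isn’t just a metaphor; it’s checkable. A minimal sketch, with invented URLs, that diffs a planned topical map against the pages and internal links a crawl actually found:

```python
# Sketch: diff a planned topical map against what the site actually has.
# All URLs are invented for illustration; a real LINKS set would come from
# a crawler export (e.g. Screaming Frog or Sitebulb).
TOPICAL_MAP = {
    "/finance/":      ["/investing/", "/credit/", "/retirement/"],
    "/investing/":    ["/investing/index-funds/", "/investing/brokerages/"],
    "/credit/":       ["/credit/cards/", "/credit/score/"],
    "/credit/score/": ["/credit/score/soft-vs-hard-pull/",
                       "/credit/score/utilization-timing/"],
}

def coverage_gaps(topical_map, published_urls, internal_links):
    """Return (missing_pages, missing_links): planned nodes with no live
    URL, and planned parent->child edges absent from the crawl."""
    planned = set(topical_map) | {c for kids in topical_map.values() for c in kids}
    missing_pages = planned - set(published_urls)
    missing_links = {
        (parent, child)
        for parent, kids in topical_map.items()
        for child in kids
        if (parent, child) not in internal_links
    }
    return missing_pages, missing_links

# Everything is published except the Retirement pillar, and one planned
# edge was never linked:
PUBLISHED = {
    "/finance/", "/investing/", "/credit/", "/credit/score/",
    "/investing/index-funds/", "/investing/brokerages/", "/credit/cards/",
    "/credit/score/soft-vs-hard-pull/", "/credit/score/utilization-timing/",
}
LINKS = {
    ("/finance/", "/investing/"), ("/finance/", "/credit/"),
    ("/finance/", "/retirement/"),
    ("/investing/", "/investing/index-funds/"),
    ("/investing/", "/investing/brokerages/"),
    ("/credit/", "/credit/cards/"), ("/credit/", "/credit/score/"),
    ("/credit/score/", "/credit/score/soft-vs-hard-pull/"),
}
missing_pages, missing_links = coverage_gaps(TOPICAL_MAP, PUBLISHED, LINKS)
print(missing_pages)   # {'/retirement/'}
print(missing_links)   # {('/credit/score/', '/credit/score/utilization-timing/')}
```

Each missing page is an editorial gap; each missing link is a silo leak.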
Bad vs. expert
The bad approach
Most sites build keyword-first. They open Ahrefs, sort by volume × difficulty, and publish whatever surfaces. The output looks like this:
```html
<!-- /best-credit-cards-2026/ -->
<h1>Best Credit Cards 2026</h1>
<p>Looking for the best credit cards in 2026? Our list of the top credit cards
for 2026 will help you find the best credit card for your needs in 2026.</p>

<!-- /best-credit-cards-for-travel/ -->
<h1>Best Credit Cards for Travel</h1>
<p>Looking for the best travel credit cards? Our list of the top travel
credit cards will help you...</p>

<!-- /best-credit-cards-for-cashback/ -->
<h1>Best Credit Cards for Cashback</h1>
<p>Looking for the best cashback credit cards? Our list...</p>
```
There is no foundational page on “what is a credit card,” no glossary, no entity-level explainer of soft vs hard pulls, no Person schema on the author, no internal links pointing at the rare entities each list mentions. From Google’s perspective, this site has 50 lottery tickets on the same number. The Helpful Content classifier reads it as a thin keyword farm and pushes it down.
The expert approach
Build the entity graph first, then attack keywords inside it. Start every cluster with a pillar page that defines the central concept and links out to every inner-section page, each of which links to its outer-section pages. Add structured data that resolves to known entities, and link author pages with Person schema that resolves to a canonical identity.
```html
<!-- /credit/ — pillar -->
<article>
  <h1>Credit: A Complete Guide</h1>
  <p>Credit is your borrowing capacity, scored by bureaus...</p>

  <h2>Foundations</h2>
  <ul>
    <li><a href="/credit/score/">How credit scores are calculated</a></li>
    <li><a href="/credit/report/">What's on a credit report</a></li>
    <li><a href="/credit/utilization/">Credit utilization</a></li>
  </ul>

  <h2>Cards</h2>
  <ul>
    <li><a href="/credit/cards/">Choosing a credit card</a></li>
    <li><a href="/credit/cards/secured/">Secured cards</a></li>
    <li><a href="/credit/cards/travel/">Travel rewards cards</a></li>
  </ul>
</article>
```
```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How Credit Utilization Affects Your Score",
  "author": {
    "@type": "Person",
    "name": "Maya Chen",
    "url": "https://example.com/team/maya-chen/",
    "sameAs": [
      "https://www.linkedin.com/in/mayachen/",
      "https://www.wikidata.org/wiki/Q123456789"
    ],
    "jobTitle": "Senior Personal Finance Editor",
    "knowsAbout": ["Credit", "Personal Finance", "Consumer Lending"]
  },
  "about": [
    { "@type": "Thing", "name": "Credit utilization", "sameAs": "https://en.wikipedia.org/wiki/Credit_utilization" },
    { "@type": "Thing", "name": "Credit score", "sameAs": "https://en.wikipedia.org/wiki/Credit_score" }
  ],
  "mentions": [
    { "@type": "Organization", "name": "FICO", "sameAs": "https://en.wikipedia.org/wiki/FICO" }
  ]
}
```
The sameAs arrays anchor your entities to Wikidata and Wikipedia, which is how Google reconciles them to MIDs. The pillar’s link structure tells the crawler which pages the editorial team considers authoritative on each subtopic. The Person schema gives the author a stable identity across articles, which is what E-E-A-T’s “Experience” signal looks for.
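Auditing those anchors scales better in code than by eye. A small sketch that walks a JSON-LD object and collects every `sameAs` URL, so each can then be checked for resolution (the sample document abbreviates the Article schema above):

```python
# Sketch: recursively gather every sameAs URL from a JSON-LD object so the
# anchors can be audited (e.g. HEAD-request each one to confirm it resolves).
# The sample document is an abbreviated version of the Article schema above.
import json

def collect_sameas(node, found=None):
    """Walk nested dicts/lists and collect all sameAs values into one set."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        same = node.get("sameAs", [])
        # sameAs may be a single string or a list of URLs.
        found.update([same] if isinstance(same, str) else same)
        for value in node.values():
            collect_sameas(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_sameas(item, found)
    return found

doc = json.loads("""{
  "@type": "Article",
  "author": {"@type": "Person",
             "sameAs": ["https://www.wikidata.org/wiki/Q123456789"]},
  "about": [{"@type": "Thing",
             "sameAs": "https://en.wikipedia.org/wiki/Credit_utilization"}]
}""")
print(sorted(collect_sameas(doc)))
```

A broken or redirected `sameAs` target is worse than none: it tells the reconciler your entity mapping is stale.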
| Approach | Coverage | Internal links | Schema | Outcome |
|---|---|---|---|---|
| Keyword-first | Holes everywhere | Random | Boilerplate Article | Demoted by HCS |
| Entity-first | Complete graph | Pillar to inner to outer | Person, Article, about, mentions | Cited in AI Overviews |
Do this today
- Open Google Search Console and click Search results. Filter by your top three URL prefixes and export the Queries report. In a spreadsheet, group queries by the entity they reference (use a `LEFT(query, FIND(" ", query)-1)` heuristic if the dataset is big). Empty entity groups are your gaps.
- In Ahrefs, run Keywords Explorer on your central entity, switch to the Matching terms report, and toggle Questions. Export the top 1,000 questions; cluster them by the head noun in each (e.g., “Roth IRA,” “401k”). Each cluster is a candidate inner section.
- Open Wikipedia for your central entity and copy the table of contents. Every H2 in that ToC is a known sub-entity. Cross-check against your sitemap; if you have no page targeting a Wikipedia H2, add it to the editorial backlog.
- Build a topical map spreadsheet with four columns: Central entity, Pillar, Inner section, Outer section. Populate it from steps 1–3. This is your 12-month editorial plan.
- For every existing pillar page, audit the internal-link block. It must link to every inner section it claims to cover. Use Screaming Frog with `Configuration > Include` set to your pillar URL prefix to crawl just that silo, and visualize missing links in `Site Visualization > Force-Directed Diagram`.
- Add `sameAs` Wikidata links to your top 25 author pages. Find the Wikidata ID by searching the author’s name on `wikidata.org`; if they don’t have one and they meet notability, create it.
- Validate your Person, Article, and WebSite schema on the Schema Markup Validator at `validator.schema.org`. Then run the same URLs through Google’s Rich Results Test to confirm Google parses them.
- In Sitebulb, run a crawl with Internal Link Score enabled. Sort pages by entity cluster and inspect any inner-section page with fewer than 5 internal links; those are orphan-adjacent and won’t accumulate authority.
- Track branded entity queries monthly in GSC: filter Queries to ones containing your brand name plus a topic (“yourbrand credit utilization”). Growth in this segment is the leading indicator that you’re being learned as the authority on that topic.
- In Ahrefs Site Explorer, open the Top topics report. The list reflects the entities Google associates with your domain. Anything in your editorial plan that doesn’t appear there after six months means your linking and schema aren’t reinforcing it strongly enough; fix the silo before publishing more.
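Steps 1 and 2 both reduce to grouping queries by the entity they mention. A Python sketch of that clustering, equivalent to the spreadsheet `LEFT(...)` heuristic but with a small alias table (the aliases are illustrative; in practice they come from your topical map spreadsheet):

```python
# Sketch: cluster exported GSC/Ahrefs queries by the entity they reference.
# The alias table is illustrative; a real one is derived from your topical
# map spreadsheet.
from collections import defaultdict

ENTITY_ALIASES = {
    "roth": "Roth IRA",
    "401k": "401(k)",
    "utilization": "Credit utilization",
    "index fund": "Index funds",
}

def cluster_queries(queries):
    """Group queries under the first alias they contain, else 'unmapped'."""
    clusters = defaultdict(list)
    for q in queries:
        entity = next(
            (name for token, name in ENTITY_ALIASES.items()
             if token in q.lower()),
            "unmapped",
        )
        clusters[entity].append(q)
    return dict(clusters)

print(cluster_queries([
    "roth ira income limits 2026",
    "backdoor roth conversion steps",
    "credit utilization timing",
    "best travel card",
]))
```

Every query that lands in `unmapped` is either noise or an entity your map doesn’t know about yet; review that bucket first.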