URL Structure
Short, descriptive, keyword-aware URLs. Folder hierarchy, trailing-slash decisions, parameters, lowercase, hyphens — and how to change a URL safely when you must.
URLs are the most permanent decision you make about a piece of content. Title tags, meta descriptions, even body copy can be rewritten freely; URLs accumulate inbound links, browser bookmarks, social-share embeds, and crawl history. Pick badly and you pay off the debt for years; change one without the right redirects and you erase its earned authority overnight.
TL;DR
- Short, descriptive, lowercase, hyphenated. No dates in URLs unless the content is dated; no IDs unless they’re stable; no parameters when a static path will do.
- Pick one canonical form: trailing slash or no slash, www or apex, http or https — and 301 every other version to it. Mixed canonicals are the single most common source of crawl-budget waste on otherwise healthy sites.
- Changing a URL costs you authority; do it only when the old one is fundamentally wrong, and always with a permanent (301) redirect. A clean URL change with a redirect typically retains 90–99% of link equity; a sloppy one loses 30–50%.
The mental model
A URL is an address. The right test is: would a stranger reading the address understand roughly what’s at the end of it, and could you tell it to someone over the phone without spelling? /best-crm-for-saas/ passes both tests. /?p=2147483647 passes neither. /best_crm_for_SAAS/ is unreadable and requires spelling out underscores and capitalization.
URLs are also the primary navigation key for users, search engines, and AI crawlers. Google uses URL structure as a weak ranking signal, but a stronger signal for understanding what a page is about before crawling it. AI surfaces — GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google-Extended — all use URL paths to bucket content during indexing. A clean URL is a free hint to every system reading your site.
The folder hierarchy in your URLs models your site’s topic graph. /credit/cards/secured/ tells crawlers and users that “secured cards” is a subset of “cards” which is a subset of “credit.” This grouping reinforces the topical authority work in module 13.
Deep dive: the 2026 reality
URL as ranking signal. Google has confirmed (Mueller, 2017 and re-confirmed 2023) that words in URLs are a “very small ranking factor.” The practical impact is small but real, and the indirect benefits — CTR, link anchor text reuse, social share copy — make URL hygiene worth the effort.
Length. A 2024 Backlinko analysis of 11M SERPs: top-ranking URLs were about 9 characters shorter on average than position-50 URLs. The mechanism is mostly cleanliness, not length per se — short URLs tend to be clean URLs.
Trailing slash. Google treats /page/ and /page as different URLs, and the difference is real for crawl budget if both versions return 200. Pick one as canonical, 301 the other, and never mix:
| Decision | When | Why |
|---|---|---|
| Trailing slash | Most CMSes, content sites | Default for Apache, WordPress, Astro |
| No trailing slash | Static sites, JAMstack, Vercel | Cleaner, easier to reason about |
| Mixed | Never | Halves your crawl efficiency |
www vs. apex. No SEO difference, but pick one. Apex (example.com) is more popular in 2026 and is the recommended pattern for static hosting (Vercel, Netlify, Cloudflare Pages). Whichever you pick, 301 the other to it.
HTTPS. Universal in 2026. Any HTTP URL on a public site is a hard fail — both Google and modern browsers flag them.
Parameters. URL parameters (?page=2&sort=price) are crawled but not preferred. Use them for true ephemeral filters (search results, faceted navigation), never for canonical content. Set parameters to noindex or, better, use canonical URLs to point parameter pages back to the static version. Faceted navigation is one of the largest crawl-budget leaks on e-commerce sites; the solution is a canonical link plus careful robots.txt rules. The old GSC URL Parameters tool was deprecated in 2022 — rel=canonical is the modern replacement.
Hyphens vs. underscores. Hyphens are word separators to Google; underscores are not (confirmed Mueller 2018). secured-cards is parsed as “secured” + “cards”; secured_cards is parsed as “secured_cards.” Use hyphens.
Case. Lowercase always. URLs are technically case-sensitive on most servers; /Page/ and /page/ returning the same content with both at status 200 creates duplicate content. Force-lowercase via your reverse proxy.
Stop words. Articles (“the,” “a,” “and”) add length without value. Strip them: /credit-utilization-and-your-score/ becomes /credit-utilization-score/. Don’t go cryptic — preserve readability — but trim what’s not load-bearing.
Dates in URLs. Avoid unless the content is genuinely date-bound (news, time-stamped reports). For evergreen content, dated URLs become liabilities the moment they go stale; you’ve seen /best-crm-2022/ ranking for nothing in 2026 because it can’t be refreshed without breaking links.
Stable IDs. When you must use an ID, make it human-readable and stable. /articles/abc123-clean-slug/ is fine; /?p=987 is opaque and fragile.
URL changes. Sometimes you must change a URL — a typo, a rebrand, a structural reorg. The rule:
- 301 (permanent) every old URL to its new home.
- Update all internal links in your CMS to the new URL — don’t rely on the redirect chain for internal traffic.
- Update canonical tags and the sitemap.xml the same day.
- Submit the updated sitemap to GSC and Bing Webmaster Tools.
- Allow 30–90 days for full re-association of authority.
A clean change retains 90–99% of link equity. A sloppy one (302 instead of 301, missing pages, redirect chains > 1 hop) can lose 30–50%.
Visualizing it
flowchart TD
Old["/old-url/ (with backlinks, ranks)"] --> Change["URL change required"]
Change --> Redirect["301 from /old-url/ to /new-url/"]
Redirect --> Internal["Update all internal links"]
Internal --> Canonical["Update rel canonical and sitemap.xml"]
Canonical --> Submit["Resubmit sitemap in GSC + Bing"]
Submit --> Wait["30 to 90 days for full re-association"]
Wait --> Result["90 to 99 percent of authority retained"]
Bad vs. expert
The bad approach
/index.php?p=234&cat=12&utm_source=oldcampaign
/Best_CRM_2022_for_SAAS_Founders_compare-now-cheap.html
/blog/2024/05/22/our-thoughts-on-the-new-crm-launch-from-our-team/
/PRODUCT/CATEGORY/Item.html
http://www.example.com/page/ <-- mixed http, www, trailing slash
The first hands all canonical leverage to query parameters. The second is keyword-stuffed and capitalized. The third buries the slug under three date folders and uses a verbose, conversational slug. The fourth has uppercase. The fifth uses HTTP, www, and a trailing slash where the rest of the site uses none of those.
The expert approach
/best-crm-for-saas/ <-- short, descriptive, lowercase
/credit/cards/secured/ <-- folder hierarchy reflects topic graph
/blog/why-we-rebuilt-our-onboarding-in-rust/ <-- /blog/ prefix only if you have a blog section
/products/widgetbox-pro/ <-- product slug, no SKU in URL
/team/maya-chen/ <-- single-folder author URL
The reverse proxy enforces the canonical form for everything else.
# Force HTTPS, apex (no www), and lowercase
server {
listen 80;
server_name www.example.com example.com;
return 301 https://example.com$request_uri;
}
server {
listen 443 ssl http2;
server_name www.example.com;
return 301 https://example.com$request_uri;
}
server {
listen 443 ssl http2;
server_name example.com;
# Force trailing slash on directory routes
rewrite ^([^.]*[^/])$ $1/ permanent;
# Force lowercase
if ($request_uri ~ [A-Z]) {
rewrite ^(.*)$ $1 permanent;
}
# Old dated CRM URLs collapse to canonical
rewrite ^/best-crm-software-(2020|2021|2022|2023|2024|2025)/?$ /best-crm-for-saas/ permanent;
}
<!-- For URLs that must use parameters (search, filters), set canonical -->
<link rel="canonical" href="https://example.com/products/" />
| Element | Bad | Expert |
|---|---|---|
| Length | 90+ chars, keyword-stuffed | 25–60 chars, descriptive |
| Case | Mixed or uppercase | Lowercase |
| Separators | Underscores or spaces | Hyphens |
| Trailing slash | Mixed | Consistent, redirected |
| www / apex | Mixed | One canonical, redirected |
| Parameters | Used as canonical | Static path canonical, parameters noindex/canonical |
| Dates | Embedded in evergreen | Only for genuinely dated content |
Do this today
- In Screaming Frog, run a full crawl. Click URL tab > Filter: Uppercase, Underscores, Parameters, Over 115 Characters. Each list is a fix backlog.
- Confirm one canonical form for trailing slash and www / apex. Test by visiting both versions in incognito Chrome — exactly one should return 200; the other must 301-redirect to it.
- Open your reverse proxy config (nginx, Cloudflare Rules, Vercel
vercel.json, Netlify_redirects) and add rules to enforce the canonical form. Test withcurl -I https://yourdomain.com/pageand confirm the response headers. - In Google Search Console > Settings > Domain Property, confirm your domain is set up at the apex level (covers www and apex automatically). If not, migrate.
- Run a redirect audit with httpstatus.io or Screaming Frog’s Response Codes tab. Fix any chains > 1 hop (
A -> B -> Cshould beA -> CandB -> C). - For URLs containing dates in evergreen content, plan a 301 collapse.
/best-crm-2022/→/best-crm/. Update the canonical URL in the CMS, then ship the redirect. - In Google Search Console > Pages, look for Duplicate, Google chose different canonical than user. Each such URL is a canonical or redirect issue.
- For e-commerce or faceted navigation pages, audit your rel=canonical on filter and sort pages. They should canonicalize to the parent category, not to themselves.
- Update your sitemap.xml to contain only canonical URLs (lowercase, correct trailing slash, no parameters). Resubmit in GSC under Sitemaps.
- Document your URL conventions in a
/docs/url-conventions.mdfile in the repo. Pin it as the source of truth so future engineers and writers don’t reintroduce variance. Include: trailing slash policy, casing, hyphenation, folder hierarchy, redirect rules.
Mark complete
Toggle to remember this module as mastered. Saved to your browser only.
More in this part