Div soup, meaning pages built entirely of nested divs with class names, works visually but gives AI engines little to go on about content structure. Where does the article end and the sidebar begin? Is that navigation or main content? Semantic HTML5 elements (article, main, nav, aside, header, footer, section) make those boundaries explicit, so a parser can extract the main content and skip boilerplate without guessing from class names. This guide covers the swap.
| Element | Purpose | Implicit ARIA role |
|---|---|---|
| <header> | Page or section intro, branding | banner (if page-level) |
| <nav> | Major navigation block | navigation |
| <main> | Primary content of the document | main |
| <article> | Self-contained content (post, product, comment) | article |
| <section> | Thematic grouping with heading | region (only when it has an accessible name) |
| <aside> | Tangentially related (sidebar, callout) | complementary |
| <footer> | Footer of document or section | contentinfo (if page-level) |
| <address> | Contact info for nearest article/body | group |
| <time> | Date/time with machine-readable datetime | time |
| <figure> + <figcaption> | Image/diagram with caption | figure |
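Two rows in the table have no example elsewhere in this guide, so here is a minimal sketch of both. The heading id `faq-heading` and the `jane@example.com` address are placeholders invented for illustration, not taken from any real page. It also shows one nuance: per the ARIA in HTML specification, a section is exposed as a region landmark only when it has an accessible name, which `aria-labelledby` supplies here.

```html
<article>
  <h1>Article title</h1>

  <!-- aria-labelledby points at the heading, giving the section an
       accessible name so it is exposed as a region landmark -->
  <section aria-labelledby="faq-heading">
    <h2 id="faq-heading">FAQ</h2>
    <p>...</p>
  </section>

  <footer>
    <!-- address applies to the nearest article ancestor: the author's
         contact details, not a general postal address -->
    <address>
      Contact <a href="mailto:jane@example.com">Jane</a>
    </address>
  </footer>
</article>
```

An unnamed section is still valid markup; it simply does not show up as a landmark, so naming only the few sections worth jumping to is usually enough.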
<div class="page">
  <div class="header">
    <div class="logo">...</div>
    <div class="nav-bar">
      <div class="nav-item">...</div>
    </div>
  </div>
  <div class="content">
    <div class="article">
      <div class="title">...</div>
      <div class="body">...</div>
    </div>
    <div class="sidebar">...</div>
  </div>
  <div class="footer">...</div>
</div>
<body>
  <header>
    <a href="/" class="logo">Acme</a>
    <nav aria-label="Main">
      <ul>
        <li><a href="/audit-tools.html">Products</a></li>
        <li><a href="/seo-audit-platform.html">About</a></li>
      </ul>
    </nav>
  </header>
  <main>
    <article>
      <header>
        <h1>Article title</h1>
        <p>By <a href="/learning-hub.html">Jane</a>,
          <time datetime="2026-05-18">18 May 2026</time></p>
      </header>
      <section>
        <h2>Introduction</h2>
        <p>...</p>
      </section>
      <section>
        <h2>Methodology</h2>
        <p>...</p>
      </section>
      <footer>
        <p>Tags: <a href="/learning-hub.html">CRM</a></p>
      </footer>
    </article>
    <aside aria-label="Related">
      <h2>Related articles</h2>
      <ul>...</ul>
    </aside>
  </main>
  <footer>
    <p>© 2026 Acme. <a href="/seo-auth/privacy.html">Privacy</a></p>
  </footer>
</body>
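Article: self-contained content that would still make sense if syndicated on its own. Examples: blog post, news item, product card, comment, forum reply.
Section: a thematic chunk within larger content, normally with its own heading. Examples: Introduction, Methodology, or FAQ within an article.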
<!-- Article contains multiple sections -->
<article>
  <h1>Complete CRM buyer's guide</h1>
  <section>
    <h2>What is a CRM?</h2>
    <p>...</p>
  </section>
  <section>
    <h2>How to evaluate options</h2>
    <p>...</p>
  </section>
</article>

<!-- Multiple articles on a listing page -->
<main>
  <h1>Latest posts</h1>
  <article>
    <h2><a href="/learning-hub.html">Post 1</a></h2>
    <p>Excerpt...</p>
  </article>
  <article>
    <h2><a href="/learning-hub.html">Post 2</a></h2>
    <p>Excerpt...</p>
  </article>
</main>
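The definition above also lists comments and forum replies as articles. A rough sketch of that case, following the nested-article pattern the HTML specification describes for user comments: the commenter name, date, and text below are invented placeholders.

```html
<article>
  <h1>Post title</h1>
  <p>...</p>

  <section>
    <h2>Comments</h2>
    <!-- Each comment is its own article nested inside the post's article;
         nesting signals that the comment relates to the outer article -->
    <article>
      <footer>
        <p>Posted by Sam on
          <time datetime="2026-05-19">19 May 2026</time></p>
      </footer>
      <p>Comment text...</p>
    </article>
  </section>
</article>
```

Wrapping the comments in a section keeps them grouped under one heading, so the post body and the discussion stay separable.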
Use one h1 per page. Start each main section with an h2, and don't skip levels (h2 → h4 is wrong).
<main>
  <article>
    <h1>Complete CRM buyer's guide</h1>
    <section>
      <h2>Pricing</h2>
      <section>
        <h3>Subscription vs perpetual</h3>
        <p>...</p>
      </section>
      <section>
        <h3>Per-user vs flat-rate</h3>
        <p>...</p>
      </section>
    </section>
    <section>
      <h2>Implementation</h2>
      <p>...</p>
    </section>
  </article>
</main>
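For contrast, a small sketch of the skipped-level mistake next to its fix. It reuses the Pricing headings from the example above; nothing else is assumed.

```html
<!-- Skipped level: the outline jumps from h2 straight to h4 -->
<h2>Pricing</h2>
<h4>Subscription vs perpetual</h4>

<!-- Fixed: the sub-topic sits exactly one level below its parent -->
<h2>Pricing</h2>
<h3>Subscription vs perpetual</h3>
```

If the h4 was chosen for its smaller font size, keep the h3 and adjust the size in CSS instead.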
<figure>
  <img src="/charts/cwv-improvement.png"
       alt="CWV improvement chart showing 40% LCP reduction over 3 months"
       width="800" height="400" />
  <figcaption>
    Core Web Vitals improvement after CDN migration: LCP dropped 40%,
    FCP dropped 35%, TTFB dropped 75% (Jan-Mar 2026)
  </figcaption>
</figure>
<!-- figcaption is the visible caption; alt describes the image when it cannot be seen -->
<!-- Both are plain text in the markup, so both are available to AI engines -->
<p>Published <time datetime="2026-05-18T09:00:00Z">18 May 2026</time></p>
<p>Updated <time datetime="2026-08-22T14:30:00Z">22 August 2026</time></p>
<!-- Machine-readable datetime attribute lets AI engines parse the exact instant -->
<!-- Visible text can be human-friendly -->
<nav aria-label="Main">
  <ul>
    <li><a href="/audit-tools.html">Products</a></li>
    <!-- ... -->
  </ul>
</nav>
<nav aria-label="Breadcrumb">
  <ol>
    <li><a href="/">Home</a></li>
    <li><a href="/audit-tools.html">Products</a></li>
    <li aria-current="page">Widget</li>
  </ol>
</nav>
<nav aria-label="On this page">
  <ul>
    <li><a href="#intro">Introduction</a></li>
    <li><a href="#pricing">Pricing</a></li>
  </ul>
</nav>
<!-- aria-label differentiates multiple navs to AI engines and screen readers -->
Start with <header>, <main>, and <footer> at minimum. That single change marks where your boilerplate ends and your content begins for any parser that reads landmarks. Then swap inner divs for article, section, nav, and aside as you touch each template.
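As a rough sketch of that first step, here is the div-soup layout from earlier with only the three outer wrappers swapped. The inner class names are carried over from that example and are assumed to stay in place so existing CSS keeps working; a real template will differ.

```html
<body>
  <header>
    <!-- Inner divs untouched for now -->
    <div class="logo">...</div>
    <div class="nav-bar">
      <div class="nav-item">...</div>
    </div>
  </header>
  <main>
    <!-- Candidates for article and aside in a later pass -->
    <div class="article">...</div>
    <div class="sidebar">...</div>
  </main>
  <footer>...</footer>
</body>
```

Styles that targeted the old wrapper classes (.header, .content, .footer) need to be pointed at the new elements, or the class attributes can be kept on them during the transition.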