Semantic HTML: Writing Code That LLMs Understand

Key Takeaways & Executive Summary

Div soup kills your GEO (generative engine optimization) visibility. Semantic HTML gives AI web scrapers the hierarchy and structure they need to parse your content efficiently.

The Problem with Modern Web Dev

With the rise of React, Tailwind, and component-based frameworks, many websites have devolved into a nested mess of <div> tags. While this looks fine to human eyes, it is a nightmare for AI web scrapers operating under strict latency constraints.

Core Concept: Semantic HTML

The use of HTML markup to reinforce the semantics, or meaning, of the information in webpages rather than merely to define its presentation.

How LLMs Parse the Web

When an AI assistant such as ChatGPT or Claude browses your site, its retrieval pipeline typically loads the page (often in a headless browser), extracts the DOM, and runs a readability-style pass to strip out navbars and footers. These pipelines rely heavily on standard HTML tags to decide which parts of the page are important.
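
To make that parsing step concrete, here is a minimal sketch of a readability-style pass written with Python's standard `html.parser`. It is a toy stand-in, not the pipeline any particular assistant actually runs: the `SKIP_TAGS` and `KEEP_TAGS` sets and the `extract_blocks` helper are assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Assumption for this sketch: these tags are page chrome and are dropped entirely.
SKIP_TAGS = {"nav", "footer", "script", "style"}
# Assumption for this sketch: only these tags carry content worth keeping.
KEEP_TAGS = {"h1", "h2", "h3", "p", "li"}


class SemanticExtractor(HTMLParser):
    """Collect (tag, text) pairs for semantic elements, ignoring page chrome.

    Kept tags nested inside other kept tags are not handled; this is a sketch.
    """

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.current = None   # semantic tag currently being read
        self.buffer = []      # text fragments for the current tag
        self.blocks = []      # extracted (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in KEEP_TAGS:
            self.current = tag
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == self.current:
            # Collapse runs of whitespace before storing the text.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.blocks.append((tag, text))
            self.current = None

    def handle_data(self, data):
        if self.skip_depth == 0 and self.current is not None:
            self.buffer.append(data)


def extract_blocks(html):
    """Return the labeled content blocks found in an HTML string."""
    parser = SemanticExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Fed a page built from `<h1>`, `<p>`, and `<li>` elements, this extractor returns labeled blocks such as `("h1", "Title")`. Fed the same text wrapped only in styled `<div>` tags, it returns an empty list, because nothing in the markup says which text is a heading or a list item.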

Bad (Div Soup)                                           | Good (Semantic)
<div class='text-2xl font-bold'>Title</div>              | <h1>Title</h1>
<div class='flex'><div class='w-1/2'>Point A</div></div> | <ul><li>Point A</li></ul>
<div class='grid'>...</div> (for data)                   | <table><thead>...</thead></table>
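
The last row of the comparison is the easiest to demonstrate in code. The sketch below, again using Python's standard `html.parser`, shows how little work it takes to recover rows and columns from a real `<table>`. The `TableRows` class and `table_to_rows` helper are hypothetical names for this example, and the sample data in the usage note is made up.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the cells of each <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self.row = None    # cells of the row being read, or None outside <tr>
        self.cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("th", "td") and self.row is not None:
            self.cell = []

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self.cell is not None:
            # Collapse whitespace and store the finished cell.
            self.row.append(" ".join("".join(self.cell).split()))
            self.cell = None
        elif tag == "tr" and self.row is not None:
            self.rows.append(self.row)
            self.row = None
            self.cell = None

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)


def table_to_rows(html):
    """Return the table content of an HTML string as a list of rows."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling `table_to_rows` on `<table><tr><th>Feature</th><th>Product A</th></tr><tr><td>SSO</td><td>Yes</td></tr></table>` yields `[["Feature", "Product A"], ["SSO", "Yes"]]`. The same data laid out as `<div class='grid'>` children yields an empty list, and a parser would have to guess the column boundaries from CSS classes instead.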

Strategic Action Plan

  1. Strict Heading Hierarchy: One H1 per page. H2s for main sections. H3s for sub-sections. Never skip heading levels.
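  2. Use Data Tables: LLMs handle tables well. If you are comparing your product to a competitor, always use an HTML <table>, not a flexbox grid. A table's rows and columns are explicit in the markup, so the relationships between cells survive when the page is converted to text for an AI context window.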
  3. Article and Section tags: Wrap your main content in <article> and logically divide it with <section> tags.
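
The heading rule in the plan above can be checked mechanically. Here is a minimal sketch of such a check in Python, assuming the page is available as an HTML string. The `check_headings` function and its message wording are hypothetical, and it covers only two rules: exactly one H1, and no skipped levels when moving deeper.

```python
import re
from html.parser import HTMLParser

# Matches the tag names h1 through h6 and captures the level digit.
HEADING = re.compile(r"^h([1-6])$")


class HeadingCollector(HTMLParser):
    """Record the level of every heading in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        match = HEADING.match(tag)
        if match:
            self.levels.append(int(match.group(1)))


def check_headings(html):
    """Return a list of problems with the heading hierarchy (empty if none)."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    levels = collector.levels

    problems = []
    h1_count = levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    # Going deeper by more than one level at a time skips a heading level.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(f"heading level skipped: h{previous} followed by h{current}")
    return problems
```

A page with the outline H1, H2, H3, H2 passes with an empty list, while H1 followed directly by H3 is reported as a skipped level. A check like this could run in a build step or test suite so that regressions in the outline are caught before publishing.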