You’ve been hearing about llms.txt for months. Maybe your SEO plugin nudged you to enable it inside the dashboard. Maybe a peer asked if you’d shipped one yet. Maybe an SEO YouTuber called it the next big thing for AI visibility, or maybe you saw it land in a newsletter you actually trust. You’re here because you want to know if it actually matters for your business — before you spend an afternoon on it.
And if you’ve started looking into it seriously, you’ve probably noticed something else: the SEO industry can’t agree on the answer. Semrush and Ahrefs are telling you to skip the file entirely. Yoast and RankMath quietly built features so the file ships for you — and even they admit no major LLM uses it today. SEO communities like r/SEO are split right alongside the tool vendors.
So who’s right, for your business specifically?
That’s the question this article exists to answer, for one specific audience: established B2B advisory firms in the $500K–$10M revenue range, sophisticated enough to read SEO blogs but tired of vendor pieces that won’t commit to a verdict.
Here’s what you’ll get: a 2-minute honest definition of what llms.txt actually is, primary-source data on which AI systems honor it (spoiler: none of the major ones), a frame to resolve the industry split for your situation — and a clear path forward whether you decide to ship, skip, or wait.
⚡ TL;DR — Key Takeaways
- The file itself is a curation map, not enforcement: llms.txt is a Markdown file at your site root that lists the URLs you’d most want AI systems to read first. It doesn’t block training, override robots.txt, or enforce anything — it’s voluntary opt-in on the AI side.
- None of the six major LLM providers reference it in their official crawler docs: not OpenAI, not Anthropic, not Google, not Perplexity, not Meta, not Microsoft. We checked. Every one of them controls AI bot access via robots.txt only. Google’s John Mueller stated publicly: “FWIW no AI system currently uses llms.txt.”
- The SEO industry is split, but agrees on the facts: Semrush and Ahrefs say skip the file entirely. Yoast and RankMath built features and ship it as a hedge — yet even they explicitly admit no LLM uses it today. The disagreement is on action, not on what’s actually happening with adoption.
- Our verdict for $500K–$10M B2B advisory firms: skip for most readers (the higher-leverage infrastructure work usually isn’t done yet); ship as a hedge if your schema layer is already clean and your content surface is small enough to curate by hand; wait if you want to see real-world adoption data first.
- The decision frame works for any of those answers: three explicit conditions per path, so you can apply the frame in 5 minutes against your last AI Visibility Assessment Tool result. No matter which path you pick, the higher-impact AI search visibility work — schema, robots.txt review, AI bot real-fetch testing, server-log review — moves the needle more than llms.txt can today.
📋 Table of Contents
- What llms.txt Is — the Spec, in 2 Minutes
- Who’s Actually Honoring It: a Brutally Honest Adoption Status
- The Real Industry Split: Skeptical Camp vs Hedge Camp
- The Honest Verdict: Ship, Skip, or Wait — for This Audience
- What to Audit Instead: the Real AI Search Infrastructure Questions
- Frequently Asked Questions
- The Bottom Line
What llms.txt Is — the Spec, in 2 Minutes
How the file is structured
llms.txt is a plain Markdown file you place at the root of your domain — https://yoursite.com/llms.txt — human-readable, so you can author it directly without a plugin or generator. The format is simple: a heading with your site’s name, an optional summary paragraph, then curated sections of links to the content you’d most want an AI system to read when answering questions about your business.
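A minimal example, following the structure the spec describes — an H1 title, a blockquote summary, then themed link sections. The firm name, URLs, and section names here are placeholders, not a template you must follow:

```
# Example Advisory Firm

> Boutique B2B advisory for mid-market companies: fractional CFO services, due diligence, and exit planning.

## Services
- [Fractional CFO](https://yoursite.com/services/fractional-cfo): Scope, engagement model, and pricing
- [Due Diligence](https://yoursite.com/services/due-diligence): What a typical engagement covers

## Key Articles
- [How We Price Engagements](https://yoursite.com/pricing-guide): Our pricing model, explained
- [When to Hire a Fractional CFO](https://yoursite.com/when-to-hire): Decision criteria for founders
```

Each link line pairs a URL with a one-line description, which is the part an AI system would actually use to decide what to fetch.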
The full technical proposal — including the llms-full.txt companion that inlines the actual content rather than just listing links — is at llmstxt.org, drafted by Jeremy Howard. Read it once if you want the spec details; it’s short.
What it’s designed to do — and what it doesn’t do
llms.txt is a curation file. It tells AI systems: “Here’s the stuff on my site I think you should pay attention to first.” Think of it as the museum-guide map for your content, not the security guard.
What it doesn’t do: block training, override robots.txt, control crawler behavior, or enforce anything. It’s voluntary opt-in on the AI side. If the AI doesn’t know to look for it — and as we’ll see in the next section, none of the major ones currently do — the file may as well not exist.
This is the infrastructure layer of AI search visibility. It sits next to robots.txt (which controls who’s allowed to crawl) and sitemap.xml (which lists what exists). For the content layer — schema markup, on-page entity clarity, and how individual articles get optimized for AI surfaces — see our companion piece on content-layer optimization for LLMs.
Who’s Actually Honoring It: a Brutally Honest Adoption Status
Here’s the part nobody in the SEO industry quite says straight: as of today, none of the six major LLM providers reference llms.txt in their official crawler documentation. Not OpenAI. Not Anthropic. Not Google. Not Perplexity. Not Meta. Not Microsoft. Every one of them controls AI bot access through robots.txt directives — and only robots.txt directives. We checked all six. Zero adoption.
The detail per provider is worth knowing, because it shapes what you can actually do.
OpenAI
OpenAI runs three crawlers, each with a different job: GPTBot (collects training data for the underlying model), OAI-SearchBot (powers the live web search inside ChatGPT), and ChatGPT-User (fetches a webpage when a user asks ChatGPT to “summarize this URL”). All three honor robots.txt. None of OpenAI’s published documentation mentions llms.txt.
Anthropic (Claude)
Anthropic also runs three: ClaudeBot for training, Claude-User which (per Anthropic’s own documentation) “supports Claude AI users — when individuals ask questions to Claude, it may access websites using a Claude-User agent”, and Claude-SearchBot which improves search relevance inside Claude’s answers. Same pattern: robots.txt is the only control surface. llms.txt is not mentioned. Anthropic does host their own developer-docs llms-full.txt at platform.claude.com/llms-full.txt — but that’s them publishing one for their own developer audience, not a statement that ClaudeBot reads anyone else’s.
Google (Google-Extended + Gemini)
Google’s AI training control is Google-Extended — and there’s a technical detail most articles miss: Google-Extended doesn’t have its own user-agent string. It uses Googlebot’s existing UA. The token Google-Extended only appears as a robots.txt directive in a “control capacity.” To opt out of AI training, you write User-agent: Google-Extended then Disallow: / in your robots.txt. And in late 2025, Google’s John Mueller stated publicly: “FWIW no AI system currently uses llms.txt.” Google has not committed to a future timeline either.
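The opt-out itself is two lines of robots.txt — this is the directive described in the paragraph above, exactly as Google documents it:

```
User-agent: Google-Extended
Disallow: /
```

Note that this blocks AI training use only; because Google-Extended rides on Googlebot’s UA, your normal search crawling and indexing are unaffected.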
Perplexity, Meta, and Microsoft
Perplexity runs PerplexityBot (which they explicitly say “is not used to crawl content for AI foundation models — it indexes for live search results”) and Perplexity-User. They host their own /llms.txt for their docs. Worth noting: Cloudflare de-listed Perplexity as a verified bot in 2025 for stealth-crawling sites that had explicitly disallowed PerplexityBot. So even on the standards Perplexity does claim to honor, enforcement has been uneven — which weakens any voluntary opt-in standard further.
Meta runs the most extensive AI crawler suite of any provider — five distinct bots: Meta-ExternalAgent (training), Meta-WebIndexer (Meta AI search), Meta-ExternalAds (ad-product improvement), Meta-ExternalFetcher (agentic on-demand fetching), and the long-standing FacebookExternalHit (preview-card crawler). Zero mentions of llms.txt across any of it.
Microsoft runs bingbot, which serves both traditional Bing search and Microsoft Copilot answers — same UA, dual purpose. Notably, Copilot’s newer agentic features run on standard Edge browser user-agents with no bot signal at all — meaning even if llms.txt were honored at scale, agentic AI traffic of that kind would be indistinguishable from a human visitor, and no opt-in standard could touch it.
🎯 What the SEO Tools Agree On
The four major SEO vendors that wrote about llms.txt — Semrush, Ahrefs, Yoast, and RankMath — all publicly acknowledge that no major LLM uses the file today. Even Yoast, who ships llms.txt automatically inside their plugin, frames it as “a low-effort, no-risk addition that helps prepare your site for a future where structured LLM access becomes more standardized.” Even RankMath, who built their own llms.txt module, calls it “a proposal and not an official standard” with “still limited” platform adoption. The industry isn’t divided on the facts. It’s divided on what to do about them.
The Real Industry Split: Skeptical Camp vs Hedge Camp
So if every vendor agrees on the facts, why isn’t there one consensus answer? Because the four vendors split into two camps — neither of which resolves the decision for your specific business.
The Skeptical camp: Semrush + Ahrefs say skip
Semrush published their take in November 2025, after running their own server-log testing from March through October. Their conclusion was unambiguous: “Using llms.txt is probably not worth your time right now, unless you’re just curious.” They cite zero AI crawler hits across seven months of monitoring, NerdyData adoption stats showing roughly 951 domains had llms.txt files as of mid-2025 (~0.0% of the web), and Mueller’s denial.
Ahrefs went further in their March 2026 update: “No major LLM provider currently supports llms.txt. Not OpenAI. Not Anthropic. Not Google.” And they compared the format to the now-defunct keywords meta tag — “a false solution marketers latched onto” — recommending readers use real bot-analytics tooling instead of speculative infrastructure files.
Both pieces are honest. Both cite primary-source data. Both tell you not to ship.
The Hedge camp: Yoast + RankMath say ship-and-forget
Yoast became the first SEO plugin to ship a free llms.txt generator, in June 2025. Their framing: “Right now, no major LLM provider officially supports llms.txt … there’s no evidence that any crawler is actively using them in retrieval or training. Still, it’s a low-effort, no-risk addition that helps prepare your site for a future where structured LLM access becomes more standardized.”
RankMath built their own module shortly after, with similar reasoning: the standard isn’t official, adoption is limited, and the file might still help if AI platforms shift. Both vendors built features anyway — to give site owners optionality.
Both pieces are honest. Both cite real data. Both tell you to ship.
Why both camps are right — and why neither answers your question
The disagreement isn’t on the facts. Both camps acknowledge zero current adoption. The disagreement is on action: the Skeptical camp values the time you’d spend not shipping; the Hedge camp values the optionality you’d get by shipping. Both are defensible positions for the average reader of an SEO blog. Neither was written specifically for a $500K–$10M B2B advisory firm with a small content surface, an AI-curious audience, and limited internal infrastructure capacity. That’s the gap this article fills.
(The same split shows up reader-side, by the way: SEO communities like r/SEO are split right along the tool-vendor lines — the community-side breakdown ships as our companion piece, What r/SEO Actually Thinks About llms.txt, immediately after this article.)
The Honest Verdict: Ship, Skip, or Wait — for This Audience
Here’s the decision frame for a $500K–$10M B2B advisory firm. Not for everyone. For you, with your operating constraints.
Ship if all three of these are true
- You already have a clean, schema-marked-up site. Article schema, Organization, Person/Author, FAQPage where appropriate. If you don’t have schema in place, ship the schema first — it’s a layer with proven AI-citation impact today, unlike llms.txt.
- Your content surface is small enough to curate by hand. Thirty to sixty high-value pages? You can curate a sharp llms.txt in under an hour. Five hundred articles? You need a generator and an opinion about what to include — and a generator-based file defeats the curation premise of the spec.
- You can absorb the 30-minute setup cost without trading off something else. If shipping llms.txt means NOT auditing your robots.txt, NOT fixing schema gaps, NOT writing the next article in your cluster — skip it. The opportunity cost is real even when the dollar cost is zero.
If all three are true, ship it. You’re buying optionality, not visibility. Document your file’s structure so when adoption shifts (if it does), you can iterate quickly.
Skip if you’re resource-constrained
If shipping llms.txt means your team won’t get to the work that has measurable AI-visibility impact this quarter — robots.txt review, schema layer, server-log review for AI bot blocks, content gaps in your cluster — skip it. Spend the half-day on the layer that moves something. llms.txt will still be there in 90 days if you decide to revisit.
This is the right call for most readers in this audience band. Not because llms.txt is bad — but because the higher-impact infrastructure work usually isn’t done yet, and that’s where the leverage is.
Wait if you want to see real-world data first
Wait if you want to see whether adoption shifts in the next quarter, or whether one of the major LLM providers makes an announcement, or whether independent measurement starts to catch AI bot fetches in the wild. There’s no penalty for waiting. The file isn’t disappearing.
How to decide in 5 minutes
Pull up your last AI Visibility Assessment Tool result, or run one if you haven’t. If your schema-layer score is below 70%, skip llms.txt. If it’s above 85% and you have a small content surface, ship it. If it’s in between, wait — the lift from finishing the schema layer is bigger than anything llms.txt can deliver this quarter.
Trying to Figure Out Which AI Search Signals Actually Matter?
If you’re trying to figure out which AI search signals actually matter for your business — not just llms.txt — that’s the strategic gap our GEO service is built to close. We audit the layers that move the needle today, not the ones that might matter someday.
What to Audit Instead: the Real AI Search Infrastructure Questions
If you skip llms.txt, here are the five questions that actually move AI search visibility — in priority order.
1. Is your schema layer in place?
Article schema, Organization, Person/Author for every byline, FAQPage where appropriate, Service for service pages. Validate via Google’s Rich Results Test. This is the layer where AI systems pull citation evidence today.
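As a sketch of what that layer looks like in the head of an article page, here is a minimal Article JSON-LD block. The author name, firm name, and URLs are invented placeholders; validate your real markup with the Rich Results Test before shipping:

```
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "When to Hire a Fractional CFO",
  "author": {
    "@type": "Person",
    "name": "Jane Advisor",
    "url": "https://yoursite.com/about/jane-advisor"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Example Advisory Firm",
    "url": "https://yoursite.com"
  },
  "datePublished": "2026-01-15"
}
</script>
```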
2. Does your robots.txt distinguish citation bots from training bots?
If you blanket-block AI crawlers, you also block the bots that bring you AI citations. Allow OAI-SearchBot, Claude-SearchBot, Perplexity-User, and Meta-WebIndexer (the citation bots). Block GPTBot, ClaudeBot, Google-Extended, and Meta-ExternalAgent (the training bots) only if you have a content-licensing reason to. Most B2B advisory firms shouldn’t blanket-block — citation traffic is exactly what you want.
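A robots.txt sketch of that split, assuming you do have a licensing reason to block the training bots (if you don’t, leave them allowed too):

```
# Citation bots — allow: these bring AI-answer traffic
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Perplexity-User
Allow: /

# Training bots — block ONLY with a content-licensing reason
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```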
3. Do AI crawlers actually receive your content when they fetch it?
Run a curl test from each major AI bot’s user-agent against your homepage and your top five pages. If you get back a WAF challenge stub, a CAPTCHA, or a 403, your content is invisible no matter what llms.txt you ship. (We’ve seen this happen on PowerfulCombo’s own site: a leftover security plugin from a previous host was silently serving CAPTCHAs to AI bots for months. Quiet kind of disaster.)
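One way to script that check — a sketch that uses example.com as a stand-in for your own pages; the user-agent strings are the providers’ documented bot names:

```shell
#!/usr/bin/env bash
# Real-fetch test: request a page as each AI bot would and record the HTTP
# status. A 403 (or a 200 whose body is a CAPTCHA/challenge stub) means the
# bot never sees your real content.
SITE="https://example.com/"   # replace with your homepage and top pages

AGENTS=("GPTBot" "OAI-SearchBot" "ChatGPT-User" "ClaudeBot" "Claude-User" "PerplexityBot")

results=""
for ua in "${AGENTS[@]}"; do
  # -w prints the status code even when the request fails (000 = no connection)
  status=$(curl -s --max-time 10 -o /dev/null -w "%{http_code}" -A "$ua" "$SITE")
  results+="$ua -> HTTP $status"$'\n'
done
printf '%s' "$results"
```

Status codes alone aren’t conclusive: also eyeball one response body per agent, because WAF challenge pages often come back as a clean 200.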
4. Is your hosting layer interfering?
Security plugins from previous hosting providers, Cloudflare WAF rules set up months ago, LiteSpeed bot-challenges enabled by default — any of these can silently block AI bots. Audit on every host migration.
5. Are AI bots actually crawling — and what are they fetching?
Server logs are the only honest answer. If no AI bot has visited your site in 30 days, fix the layers above before worrying about llms.txt.
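If you want a quick look without a log-analytics tool, a sketch like this works against a standard combined-format access log. The sample log, its path, and the two example hits are made up for the demo; point the function at your real log instead:

```shell
#!/usr/bin/env bash
# count_ai_bot_hits: tally AI-bot user agents in a combined-format access log.
# In combined format the user-agent is the 6th quote-delimited field.
count_ai_bot_hits() {
  grep -iE "GPTBot|OAI-SearchBot|ClaudeBot|Claude-User|PerplexityBot|Meta-ExternalAgent|bingbot" "$1" \
    | awk -F'"' '{print $6}' | sort | uniq -c | sort -rn
}

# Demo on a tiny fabricated sample; in practice run something like:
#   count_ai_bot_hits /var/log/nginx/access.log
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [01/Jan/2026:00:00:00 +0000] "GET / HTTP/1.1" 200 123 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"
5.6.7.8 - - [01/Jan/2026:00:01:00 +0000] "GET /about HTTP/1.1" 200 456 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"
EOF
count_ai_bot_hits /tmp/sample_access.log
```

An empty result over a 30-day window is your answer: fix the layers above first.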
Get a Concrete AI Visibility Audit
We built the AI Visibility Assessment Tool to make these questions concrete. Tier 1 audits your schema layer today; Tier 2 (in build) covers the bot infrastructure questions including llms.txt — get notified when it ships.
The Bottom Line
llms.txt today is a 30-minute hedge, not a critical infrastructure decision. The SEO industry agrees on that fact even when it disagrees on what to do about it. For a $500K–$10M B2B advisory firm, the right call depends on three things: whether your schema layer is in place, whether your content surface is small enough to curate by hand, and whether you can spend the half-hour without trading off work that has measurable impact this quarter.
For most readers in this audience band, that math comes out as skip for now — and use the time to fix the higher-leverage infrastructure layers we covered above. For some, it’s ship as a hedge. For a few, it’s wait. The decision frame works for any of those answers.
We’ve staked our verdict on the decision frame. If your math comes out skip, that’s the right call. If it comes out ship, our companion deployment guides cover the practical work. Either way, you’ve saved an afternoon of guessing.
Frequently Asked Questions
The questions readers ask most after working through the verdict frame.
What is an llms.txt file?
A plain Markdown file placed at your website’s root (yourdomain.com/llms.txt) that lists curated links to your most important content for AI systems to read first. It’s a curation file, not a security boundary — voluntary opt-in, no enforcement. The full technical proposal is at llmstxt.org.
Is llms.txt being used by major LLMs in 2026?
No — and that’s the cleanest answer this article gives. None of OpenAI, Anthropic, Google, Perplexity, Meta, or Microsoft references llms.txt in their official crawler documentation. All six control AI bot access through robots.txt only. Google’s John Mueller stated publicly: “FWIW no AI system currently uses llms.txt.” Even the SEO plugins that ship llms.txt (Yoast, RankMath) explicitly admit no major LLM uses it today.
What is the difference between robots.txt and llms.txt?
robots.txt is the standardized, RFC-9309-defined web protocol every crawler is expected to honor — it controls who’s allowed to crawl what. llms.txt is a proposed (not standardized) Markdown file that tries to curate content for AI systems to prefer reading. Different jobs entirely: robots.txt gates access; llms.txt suggests focus. Right now, only robots.txt is enforced. llms.txt is voluntary on the AI side and largely ignored.
Should small businesses ship llms.txt today?
Most shouldn’t, in our view — not because the file is bad, but because the half-hour of setup is better spent on schema-layer work, robots.txt review, or AI bot real-fetch testing, all of which have measurable impact today. Ship llms.txt if your schema layer is already clean, your content surface is small enough to curate by hand, and you have 30 minutes that aren’t trading off something with bigger leverage.
Does adding llms.txt improve my AI search ranking?
There’s no published evidence that it does. Semrush’s seven-month testing found zero AI bot crawl activity on /llms.txt. Ahrefs found no evidence of retrieval, traffic, or accuracy lift. The honest answer right now is: probably not, today.
How long does it take to set up llms.txt on WordPress?
About 30 minutes if you curate by hand: write a heading with your site name, a short summary, and group your top 20–40 URLs into 3–5 themed sections. About 5 minutes if you use a plugin (Yoast, RankMath, or a generator) — but the curation premise of the file weakens when a plugin auto-generates it. We’ll publish a step-by-step WordPress walkthrough as a companion piece for readers who decide to ship.