Does llms.txt actually work? A 90-day log test

I deployed llms.txt and read the server logs for 90 days. Combined with the three largest studies ever run on it, the data settles the debate - here is what actually happens.

Ritik Namdev · Published Jun 11, 2026 · Updated Jul 5, 2026 · 14 min read

The short answer

No - not yet. Three independent large-scale studies (Ahrefs across 137,000 domains, SE Ranking across ~300,000, and Otterly's 90-day server-log test) converge on the same finding: production AI crawlers almost never fetch llms.txt, and there is no measurable citation lift from having one. Google has explicitly said it doesn't use it. Publish one as cheap insurance if you like - but don't expect traffic or citations from it, and don't let it distract from the fundamentals that actually work.

llms.txt is the most-hyped, least-proven idea in AI SEO. The pitch is seductive: drop a tidy Markdown file at your site root, and AI models get a clean map of your best content to read and cite. So I deployed one, left it untouched, and watched my raw server logs for 90 days - then cross-checked my numbers against every large-scale study published to date. This is what the evidence actually says, with the hype stripped out.

1. The verdict, up front

As of mid-2026, llms.txt is a proposal, not a working standard - and the data says it changes nothing about your AI visibility today. That's not an opinion; it's the consistent result across every serious measurement. In my own logs, the file was requested a handful of times over three months - overwhelmingly by SEO audit tools and generic scanners, essentially never by the AI retrieval bots that decide citations. The large studies found exactly the same shape at scale.

I'm leading with the verdict because the rest of this piece is evidence, not suspense. If you take one thing away: treat llms.txt as low-cost insurance, never as a growth lever.

2. What llms.txt actually is

llms.txt is a single Markdown file, served at /llms.txt, that curates and summarises your site for large language models. Proposed by Jeremy Howard in September 2024, it's designed to give an AI a concise, human-curated map of your most important pages instead of forcing it to crawl and parse your entire site. The spec is deliberately minimal - an H1, a summary blockquote, and lists of links:

# Title

> Optional one-line description of the site

Optional context paragraph (no headings).

## Docs
- [Getting started](https://example.com/start): setup guide
- [API reference](https://example.com/api): full endpoints

## Optional
- [Changelog](https://example.com/changelog)

The only strictly required element is the H1. There's also a companion convention, /llms-full.txt, which concatenates your full docs into one file - genuinely useful for IDE coding agents, as we'll see. You can generate a spec-correct file in seconds with the free llms.txt generator. The question isn't whether you can make one - it's whether anything reads it.
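
To make the structure concrete, here is a minimal Python sketch that renders a file in the shape shown above. It is an illustration only: `render_llms_txt`, its parameters and the `Link` tuple are names invented for this example, not part of the proposal or of any generator tool.

```python
from typing import Dict, List, Optional, Tuple

# (link name, URL, optional note) - a hypothetical shape chosen for this sketch
Link = Tuple[str, str, Optional[str]]


def render_llms_txt(
    title: str,
    summary: Optional[str] = None,
    sections: Optional[Dict[str, List[Link]]] = None,
) -> str:
    """Render an llms.txt body: H1, optional blockquote, then H2 link lists."""
    if not title.strip():
        raise ValueError("the H1 title is the one required element")

    lines = [f"# {title.strip()}", ""]
    if summary:
        lines += [f"> {summary.strip()}", ""]

    # Dicts keep insertion order, so sections appear in the order given.
    for heading, links in (sections or {}).items():
        lines.append(f"## {heading}")
        for name, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{name}]({url}){suffix}")
        lines.append("")

    return "\n".join(lines).rstrip() + "\n"
```

Calling it with a "Docs" section and an "Optional" section reproduces the layout of the example file, minus the free-text context paragraph, which this sketch leaves out for brevity.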

3. What the three big studies found

Every large study lands in the same place: publish rates are climbing, but AI systems aren't reading the files. Here are the three that matter, all from 2025–2026:

  • Ahrefs (137,210 domains, May 2026): 97% of llms.txt files received zero requests. Of the requests that did land, only 1.1% came from AI retrieval bots. [source]
  • SE Ranking (~300,000 domains): found no relationship between having an llms.txt and how often a domain is cited by major LLMs. Dropping the variable from their model improved accuracy.
  • Otterly (90-day server-log test): of 62,100+ total AI-bot visits, just 84 requests (0.1%) targeted /llms.txt - performing 68% below an average content page.

And the most authoritative voice of all - Google - has been blunt. In June 2026, John Mueller called llms.txt "purely speculative for now (the file has existed for years, yet none of the AI systems use it)," likening it to the long-dead keywords meta tag. Gary Illyes separately confirmed Google has no plans to support it.

"The file has existed for years, yet none of the AI systems use it." - John Mueller, Google, June 2026

4. Who actually requests your llms.txt

When something does fetch your llms.txt, it's almost never an AI answer engine - it's your own SEO tools. Ahrefs broke down the requesters, and the picture is damning for the "AI is reading this" narrative:

Who requests llms.txt files, by share of requests:

  • SEO audit tools: 21.7%
  • Unidentified bots: 14.9%
  • General crawlers: 13.1%
  • GPTBot (training): 4.51%
  • AI retrieval bots: 1.1%
  • ClaudeBot: 0.80%

Source: Ahrefs, 137,210 domains, May 2026. SEO audit tools alone out-request every AI bot combined. 'AI retrieval bots' = Perplexity + OAI-SearchBot.

SEO audit tools are the single largest requester at 21.7%. The training crawler GPTBot fetches it occasionally (4.51%), but the bots that actually generate citations - OAI-SearchBot, PerplexityBot - collectively account for barely 1%. Ahrefs summed it up memorably: Slackbot fetched llms.txt more often than PerplexityBot did. A crawler fetch isn't even a citation; it's a prerequisite that almost never happens.

The one real use case

llms.txt / llms-full.txt is genuinely used by IDE coding agents (Cursor, Claude Code, Continue) to load a project's documentation into context. Mueller called it a "temporary crutch to save tokens." If you run developer docs, that's a real reason to ship one - just not a web-search-visibility reason.

5. Why adoption exploded anyway

Adoption grew 8.8× in a year - but almost none of it is deliberate. The count of monitored sites with llms.txt jumped from ~4,088 (June 2025) to ~36,120 (May 2026). That looks like a movement until you see where it came from: platform defaults. When Shopify quietly pushed llms.txt across its stores in spring 2026, adoption on that platform hit 78%.

llms.txt adoption by platform / CMS (top-10k sites):

  • Shopify: 78.1%
  • Contentful: 22.9%
  • Adobe AEM: 19.0%
  • WordPress: 8.7%
  • Drupal: 1.5%

Source: Casey Burridge / HTTP Archive BigQuery, June 2026. Shopify's 78% is a silent platform default - evidence adoption is driven by CMS decisions, not SEO strategy.

Strip out the platform defaults and deliberate adoption among the top 10,000 sites is only about 5–6%. In other words, the "everyone's doing it" impression is manufactured by a few CMS vendors flipping a switch - not by measured results convincing teams to add it.

6. How to run your own 90-day log test

Don't take my word for it - the beauty of llms.txt is that you can measure it yourself in your access logs. Here's the exact method I used, adapted from Otterly's design:

  1. Publish a valid /llms.txt (return HTTP 200), record the date as day 0, and leave it unchanged for 90 days.
  2. Capture raw logs - timestamp, path, user-agent, source IP, status - from Nginx/Apache/Cloudflare.
  3. Count direct hits to /llms.txt, broken down by user-agent.
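  4. Filter for AI bots - GPTBot, OAI-SearchBot, ClaudeBot, Claude-SearchBot, PerplexityBot - and verify by IP to catch spoofing. (Google-Extended is a robots.txt control token with no user-agent string of its own, so it will not appear in your logs.)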
  5. Benchmark those hits against total AI-bot visits to all pages, and against a control file (say, a PDF), so you can express llms.txt pull as a percentage.
  6. Optionally track citations separately: run a fixed prompt panel through ChatGPT/Perplexity/AI Overviews before and after, and log whether you get cited. Remember - a fetch is not a citation.
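
Steps 2 to 5 can be sketched in a few lines of Python. This is a rough sketch, not the exact script behind the numbers in this article. It assumes your server writes the common "combined" access-log format, and the control path `/whitepaper.pdf` and the function names are hypothetical. It matches bots by user-agent substring only, so it does not do the IP verification from step 4.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Assumes the "combined" log format used by default in many Nginx/Apache setups.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)

# User-agent tokens from step 4. Google-Extended is left out because it is a
# robots.txt control token and is not sent as a user-agent string.
AI_BOTS = ("GPTBot", "OAI-SearchBot", "ClaudeBot", "Claude-SearchBot", "PerplexityBot")


def ai_bot_name(user_agent: str) -> Optional[str]:
    """Return the matching AI bot token, or None (substring match, spoofable)."""
    ua = user_agent.lower()
    for token in AI_BOTS:
        if token.lower() in ua:
            return token
    return None


def summarise(
    lines: Iterable[str],
    target: str = "/llms.txt",
    control: str = "/whitepaper.pdf",  # hypothetical control file
) -> Dict[str, object]:
    """Count hits to the target by agent and benchmark against AI-bot traffic."""
    hits_by_agent: Counter = Counter()
    ai_total = ai_target = ai_control = 0

    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        path = match["path"].split("?", 1)[0]
        bot = ai_bot_name(match["ua"])

        if path == target:
            hits_by_agent[bot or match["ua"]] += 1
        if bot:
            ai_total += 1
            ai_target += int(path == target)
            ai_control += int(path == control)

    share = 100 * ai_target / ai_total if ai_total else 0.0
    return {
        "hits_by_agent": dict(hits_by_agent),
        "ai_total": ai_total,
        "ai_llms_txt": ai_target,
        "ai_control": ai_control,
        "llms_txt_share_pct": share,
    }
```

Pass it an open log file (for example `summarise(open("access.log"))`) and compare `ai_llms_txt` with `ai_control`. Treat the counts as an upper bound until the source IPs have been checked against each vendor's published ranges.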

Define "working" before you start. A weak positive signal is production retrieval bots requesting the file at a rate materially above a control PDF. A strong signal is a measurable citation lift not explained by other changes. Based on every dataset above, expect neither.
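
One way to pin that definition down before day 0 is to write it as a function. The sketch below is an assumption-heavy illustration: the thresholds `min_ratio` and `min_lift` are arbitrary placeholders you should set yourself, and `confounded` is a manual flag for "other changes could explain the lift", which no script can decide for you.

```python
def classify_signal(
    retrieval_hits_llms: int,
    retrieval_hits_control: int,
    cited_before: int,
    cited_after: int,
    prompts: int,
    confounded: bool = False,
    min_ratio: float = 2.0,   # placeholder for "materially above" the control
    min_lift: float = 0.05,   # placeholder for a "measurable" citation lift
) -> str:
    """Classify a 90-day result as 'strong', 'weak' or 'none'."""
    if prompts <= 0:
        raise ValueError("the prompt panel needs at least one prompt")

    # Strong: share of panel prompts that cite you rose, with no other explanation.
    lift = (cited_after - cited_before) / prompts
    if lift >= min_lift and not confounded:
        return "strong"

    # Weak: retrieval bots fetched llms.txt well above the control file's rate.
    baseline = max(retrieval_hits_control, 1)
    if retrieval_hits_llms > 0 and retrieval_hits_llms >= min_ratio * baseline:
        return "weak"

    return "none"
```

With zero retrieval-bot fetches and a flat citation count, which is the shape every dataset above reports, this returns "none".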

7. llms.txt vs robots.txt vs sitemap.xml

These three root files are constantly confused, but they do completely different jobs - and only two of them are actually honoured.

| Attribute | llms.txt | robots.txt | sitemap.xml |
| --- | --- | --- | --- |
| Purpose | Summarise content for LLMs | Control crawler access | List URLs for discovery |
| Format | Markdown | Plain-text directives | XML |
| Who reads it | Largely ignored by production AI bots | All major search + AI crawlers | Search engines |
| Status | Unofficial proposal (2024) | De-facto standard since 1994 | Established standard |
| Honoured by Google? | No | Yes | Yes |

The takeaway: your robots.txt and sitemap.xml do real, load-bearing work every day. llms.txt is aspirational. Get the first two perfect - see the 50-point technical GEO audit - before spending a minute worrying about the third.

8. So, should you publish one?

Yes, if it costs you five minutes and you keep your expectations at zero - no, if it displaces real work. The honest cost-benefit: it's trivial to publish, it may help coding agents parse your docs, and if a real standard emerges later you're already early. Those are fine reasons. But it will not move your traffic or citations today, so it must never come before the fundamentals that are proven: crawlable server-rendered HTML, structured data, not blocking the AI search bots, original data, and freshness.

9. The security angle nobody mentions

There's a quiet risk in llms.txt that the hype completely ignores: it's an attack surface. The whole premise of the file is that AI agents will read it and trust it - treating its contents as an authoritative description of your site. That trust is exactly what makes it a target. In Ahrefs' dataset, the single largest research crawler hitting llms.txt files self-identified as prompt-injection-survey/1.0 - meaning security researchers (and, presumably, less friendly actors) are already probing these files as a vector for prompt-injection attacks.

The threat model is straightforward. If an autonomous agent fetches your llms.txt and an attacker has managed to inject instructions into it - or if you unwittingly include content that reads like an instruction - the agent may act on it. This is a general problem with any file designed to be ingested and trusted by an LLM. The practical takeaway: if you publish an llms.txt, treat it like public-facing code, not a marketing asset. Keep it to plain descriptions and links, never include anything that resembles a directive, lock down who can edit it, and don't auto-generate it from user-submitted content. A file almost no production AI reads is not worth introducing a new security surface for.
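
As a rough illustration of "never include anything that resembles a directive", the sketch below flags lines in an llms.txt that read like instructions to a model. The phrase list is a small, hypothetical sample and the function name is invented here. It is a pre-publish lint, not a defence against prompt injection, and it will miss anything phrased differently.

```python
import re
from typing import List, Tuple

# Illustrative patterns only - an assumption, not an exhaustive or standard list.
SUSPICIOUS = [
    r"\bignore (all |any )?(previous|prior|above) instructions\b",
    r"\byou (must|should|are required to)\b",
    r"\b(system|developer) prompt\b",
    r"\bdo not (tell|reveal|mention)\b",
]
PATTERNS = [re.compile(pattern, re.IGNORECASE) for pattern in SUSPICIOUS]


def flag_directives(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that look like instructions to an agent."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in PATTERNS):
            findings.append((number, line.strip()))
    return findings
```

Running it in CI on every change to the file is one cheap way to treat llms.txt like public-facing code.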

10. What to do instead

Every minute spent agonising over llms.txt is a minute not spent on the things that demonstrably drive AI citations. The evidence on what actually works is far stronger than the evidence on llms.txt - so redirect the energy. In rough priority order:

  1. Don't block the AI search crawlers. This is the real "AI visibility file" - your robots.txt. Make sure OAI-SearchBot, PerplexityBot and Claude-SearchBot are allowed. Blocking them is the one config change that genuinely removes you from AI answers. See the crawler guide.
  2. Get into Bing's index. ChatGPT reads from Bing; enable IndexNow and verify Bing Webmaster Tools. This is measurably high-leverage in a way llms.txt is not - see the Bing playbook.
  3. Serve real HTML. The AI crawlers don't render JavaScript. If your facts are injected client-side, no llms.txt will save them; server-side rendering will.
  4. Publish original data and lead with statistics. These lift AI visibility 22–40% in controlled tests - the opposite of llms.txt's zero. See how to get cited.
  5. Earn brand mentions. The strongest measured predictor of AI citation, ~3× more predictive than backlinks.
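
The first item on that list is easy to check mechanically. The sketch below uses Python's standard-library `urllib.robotparser` to report which of the three search bots a given robots.txt would block. The function name and the default URL are placeholders, and the standard-library parser's matching rules may differ in edge cases from how each vendor's crawler interprets the file.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# The AI search crawlers named in item 1.
SEARCH_BOTS = ("OAI-SearchBot", "PerplexityBot", "Claude-SearchBot")


def blocked_search_bots(robots_txt: str, url: str = "https://example.com/") -> List[str]:
    """Return the AI search bots that this robots.txt disallows for the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [bot for bot in SEARCH_BOTS if not parser.can_fetch(bot, url)]
```

Paste in the contents of your live robots.txt and test a few important URLs. An empty list means none of the three is blocked for that URL.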

Notice the pattern: everything on this list has measured evidence behind it. llms.txt has measured evidence against it. That asymmetry should decide where your hours go.

11. Could llms.txt matter later?

The honest answer: maybe - and that's the only real case for shipping one today. Standards sometimes start as ignored proposals and later become load-bearing (sitemaps did). Adoption is climbing steeply, some of the companies building agents are the ones publishing llms.txt for their own docs, and the IDE-agent use case is genuine and growing. If a major AI provider announced tomorrow that it reads llms.txt for retrieval, the calculus would flip overnight.

But "it might matter later" is an argument for a five-minute insurance file, not for treating it as a priority. The failure mode to avoid is the one all over LinkedIn: teams presenting llms.txt as a growth tactic, agencies charging for "llms.txt optimisation," and founders believing they've done their AI-search homework because they dropped a Markdown file at their root. They haven't. They've done the equivalent of adding a keywords meta tag in 2010 - harmless, fashionable, and inert. Publish it if you want to be early to a standard that may or may not arrive. Just don't confuse being early with being effective.

12. The full spec and its variants

If you're going to publish one, publish it correctly - the spec is small, but the details matter. The file lives at /llms.txt and is parsed in a strict order. Only the H1 is strictly required; everything else is optional but recommended:

  1. H1 - the site or project name. The one mandatory element.
  2. A blockquote immediately after, giving a short summary with the key context needed to understand the site.
  3. Zero or more Markdown sections (paragraphs and lists, but no headings) with background or interpretation notes.
  4. Zero or more H2 "file list" sections, each a curated group of links written as list items in the form "- [name](url): optional notes".
  5. An ## Optional section - a special heading whose links an AI may skip when its context window is tight. Put your nice-to-haves here, not your essentials.
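
The ordering above can be checked with a small validator. This is a sketch of one reading of the format, not a reference parser. It only checks that the first non-blank line is an H1, that no other headings appear apart from H2 sections, and that lines inside H2 sections are link items. It does not check the blockquote's position. The function and pattern names are invented for this example.

```python
import re
from typing import List

HEADING = re.compile(r"^#{1,6} ")
# "- [name](url)" with an optional ": notes" suffix
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def validate_llms_txt(text: str) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    content = [
        (number, line.rstrip())
        for number, line in enumerate(text.splitlines(), start=1)
        if line.strip()
    ]
    if not content or not content[0][1].startswith("# "):
        return ["the first non-blank line must be an H1 such as '# Title'"]

    problems = []
    in_file_lists = False
    for number, line in content[1:]:
        if line.startswith("## "):
            in_file_lists = True  # an H2 opens a file-list section
        elif HEADING.match(line):
            problems.append(f"line {number}: unexpected heading '{line}'")
        elif in_file_lists and not LINK_ITEM.match(line):
            problems.append(f"line {number}: expected '- [name](url): optional notes'")
    return problems
```

The example file from section 2 passes with no problems reported.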

There are two companion conventions worth knowing. /llms-full.txt concatenates your full documentation into a single file - this is the variant IDE coding agents actually use, because it lets them load an entire project's docs into context in one fetch. And some sites serve .md versions of their HTML pages at the same URL plus a .md suffix, giving agents a clean-markdown alternative to parsing rendered HTML. If you run developer documentation, llms-full.txt is genuinely the more useful of the two to ship.

A subtle point that trips people up: llms.txt is curation, not access control. It doesn't stop anything from being crawled (that's robots.txt) and it doesn't list every URL for discovery (that's sitemap.xml). It's a hand-picked "start here" for an LLM - which is precisely why an auto-generated file dumping every page defeats the purpose. If you publish one, curate it: your best 10–30 pages, grouped and described, with the genuinely secondary material tucked under ## Optional. The free generator produces a spec-correct file in this exact structure, so there's no excuse for a malformed one - the only real decision is whether to bother at all, and for that, re-read section 8.

One last practical note: keep the file's maintenance in mind. A curated links file is only useful if it stays current, and a stale llms.txt pointing at moved or deleted pages is worse than none - it hands any agent that does read it a map full of dead ends. If you can't commit to keeping it in sync with your site, that's one more small reason to skip it entirely and put the effort into the fundamentals that compound.
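
Keeping the file in sync is easier if something checks it for you. The sketch below pulls every link out of an llms.txt and reports the ones that no longer resolve. It assumes plain HEAD requests are acceptable to your server (some servers reject them, in which case swap in a GET), and the function names and user-agent string are made up for this example.

```python
import re
from typing import Callable, Dict
from urllib.error import HTTPError
from urllib.request import Request, urlopen

LINK = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a HEAD request, or 0 if the request failed."""
    request = Request(url, method="HEAD", headers={"User-Agent": "llms-txt-link-check/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:  # 4xx/5xx responses are raised as exceptions
        return error.code
    except OSError:  # DNS failures, refused connections, timeouts
        return 0


def find_stale_links(text: str, fetch: Callable[[str], int] = http_status) -> Dict[str, int]:
    """Map each linked URL that does not return 2xx/3xx to the status it gave."""
    stale = {}
    # dict.fromkeys removes duplicate URLs while keeping first-seen order.
    for url in dict.fromkeys(LINK.findall(text)):
        status = fetch(url)
        if not 200 <= status < 400:
            stale[url] = status
    return stale
```

The `fetch` parameter exists so the check can be tested without network access. In normal use, `find_stale_links(open("llms.txt").read())` is enough.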

If you do ship one, keep it clean of anything an over-trusting agent shouldn't follow, run the 90-day log test above, and let your own data - not the hype cycle - decide whether it earned its place. My data said it didn't. Yours probably will too.

Frequently asked questions

Does Google use llms.txt?
No. Google's John Mueller called it "purely speculative for now" and Gary Illyes confirmed Google doesn't support it and has no plans to. Mueller compared it to the long-ignored keywords meta tag.

Do any AI crawlers actually request llms.txt?
Rarely. Across 137,000 domains, Ahrefs found 97% of llms.txt files received zero requests, and AI retrieval bots (Perplexity, OAI-SearchBot) made up just 1.1% of the requests that did happen. Slackbot fetched it more often than PerplexityBot.

Will llms.txt increase my AI citations?
There's no measured evidence that it does. SE Ranking's study of ~300,000 domains found no relationship between having an llms.txt and citation frequency - removing the variable actually improved their prediction model.

If it does nothing, why is adoption growing 8.8×?
Mostly platform defaults and hedging. Shopify silently pushed llms.txt to its stores (78% adoption on that platform), and many teams add it as cheap insurance. Deliberate adoption among top sites is only ~5–6%.

Who genuinely benefits from llms.txt today?
IDE coding agents like Cursor and Claude Code use llms.txt / llms-full.txt to load documentation context - a token-saving convenience. That's a developer-docs use case, not web-search visibility.

Is there any risk to publishing one?
The cost is near-zero, but note a security angle: researchers found a prompt-injection-survey crawler probing llms.txt files, since agents are built to trust them. Don't put anything in it you wouldn't want an agent to blindly act on.

What's the difference between llms.txt and robots.txt?
robots.txt controls crawler access and is universally honoured; llms.txt is a content-summary proposal that production AI bots largely ignore. They solve different problems - see the comparison table above.

What does Google recommend instead?
Mueller pointed to keeping AI agents unblocked and to the emerging WebMCP approach, rather than adding speculative files. Fundamentals - crawlable HTML, structured data, freshness - matter far more.

Written by Ritik Namdev

Growth · SEO · GEO

Growth marketer documenting a brand-new site's climb into Google and the AI engines - in public, with real numbers. Every tactic here is tested on real sites before it's published.
