How to Write an llms.txt File (and What It Actually Does in 2026)
By CiteDaily - updated 2026-06-15
Some links on this page are affiliate links. If you buy through them we may earn a commission at no extra cost to you. This never changes our verdicts.
If you have spent any time in GEO circles in the past year, you have heard that you "need an llms.txt file." This guide does two things: it shows you exactly how to write one, and it gives you a straight answer on whether it actually does anything in 2026. New to the broader topic? Start with what is GEO.
What is llms.txt?
llms.txt was proposed by Jeremy Howard (of Answer.AI and fast.ai) and published on September 3, 2024. The idea is straightforward: give large language models a clean markdown summary of your site, stripped of the HTML, JavaScript, navigation and ads that make raw web pages noisy. Because models work within a limited context window, a concise, well-structured file is easier for them to ingest than a full crawl of your site.
A useful analogy is robots.txt or an XML sitemap, but written for LLMs instead of search crawlers. One important caveat with that analogy: llms.txt is not an access-control tool. It blocks nothing and grants nothing. It is purely a curated description you offer; whether any model reads or uses it is entirely up to that model.
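To make the access-control point concrete, here is a minimal sketch using Python's standard-library robots.txt parser. The robots.txt contents and the example.com URLs are made up for illustration; GPTBot is used only as an example crawler token. The point is that the allow or deny decision comes entirely from robots.txt, and publishing an llms.txt changes none of it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep one AI crawler out of /private/, allow the rest.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Access is decided by robots.txt rules only.
    print(allowed(ROBOTS_TXT, "GPTBot", "https://example.com/private/report"))
    # /llms.txt is just another URL; it is fetchable unless robots.txt says otherwise.
    print(allowed(ROBOTS_TXT, "GPTBot", "https://example.com/llms.txt"))
```

If you want to keep a crawler away from something, the rule belongs in robots.txt; llms.txt is only a description.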
The format
The specification is deliberately minimal, and the file itself must be valid, standard markdown. The structure is:
- An H1 with the name of the site or project. This is the only strictly required element.
- A blockquote with a one-to-three sentence summary of what the site is.
- Optional free prose giving more context.
- One or more H2 sections, each containing a markdown list of links, where every link is followed by a colon and a short description.
- An ## Optional section for secondary material that a model is free to skip.
Here is a minimal example that follows the format:
# Project Name
> A one to three sentence summary of what this project or site is and who it is for.
An optional paragraph of plain-prose context that helps a model understand the rest.
## Core pages
- [Page title](https://example.com/page): short description after the colon
- [Another page](https://example.com/other): what a reader or model finds here
## Optional
- [Secondary page](https://example.com/extra): lower-priority content an AI may skip
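If you want to sanity-check a file against the structure described above, a small script is enough. The following is a rough sketch, not an official validator: the function name parse_llms_txt and the regular expression are our own, it only reads the first line of the blockquote, and it tolerates a link with no description as a leniency chosen for illustration.

```python
import re

# One list item: "- [title](url)" optionally followed by ": description".
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)

EXAMPLE = """\
# Project Name

> A one to three sentence summary of what this project or site is and who it is for.

An optional paragraph of plain-prose context that helps a model understand the rest.

## Core pages

- [Page title](https://example.com/page): short description after the colon
- [Another page](https://example.com/other): what a reader or model finds here

## Optional

- [Secondary page](https://example.com/extra): lower-priority content an AI may skip
"""


def parse_llms_txt(text: str) -> dict:
    """Parse an llms.txt string into title, summary and sections of links."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("# "):
        raise ValueError("file must start with an H1 title")
    result = {"title": lines[0][2:].strip(), "summary": None, "sections": {}}
    current = None  # name of the H2 section we are inside, if any
    for line in lines[1:]:
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is None:
            # Before the first H2: keep the first blockquote line, skip free prose.
            if line.startswith("> ") and result["summary"] is None:
                result["summary"] = line[2:].strip()
        else:
            match = LINK_RE.match(line)
            if match is None:
                raise ValueError(f"not a link list item: {line!r}")
            result["sections"][current].append(
                (match["title"], match["url"], match["desc"] or "")
            )
    return result


if __name__ == "__main__":
    parsed = parse_llms_txt(EXAMPLE)
    print(parsed["title"], list(parsed["sections"]))
```

A file that fails here is not necessarily unusable, but a failure is a good hint that a heading or link line has drifted from the format.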
There are two conventions worth knowing. /llms.txt is the curated index described above: a short, hand-picked map of your most important pages. /llms-full.txt is a different, heavier convention: a single file containing your full documentation inline, meant to be dropped straight into a model's context. Use the curated index by default and reach for the full version only when you have a deliberate reason.
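To show the difference in weight between the two files, here is a hedged sketch of how a full version could be assembled by concatenating documents inline. The function build_llms_full, the per-document H2 layout and the max_chars budget are all assumptions for illustration; the convention itself does not prescribe this exact layout.

```python
def build_llms_full(site_name, docs, max_chars=None):
    """Join (title, markdown_body) pairs into one inline document.

    max_chars is a hypothetical size budget: a full file is meant to be
    dropped into a model's context, so an oversized one defeats the purpose.
    """
    parts = [f"# {site_name}"]
    for title, body in docs:
        parts.append(f"## {title}\n\n{body.strip()}")
    text = "\n\n".join(parts) + "\n"
    if max_chars is not None and len(text) > max_chars:
        raise ValueError(f"llms-full.txt is {len(text)} chars, budget is {max_chars}")
    return text


if __name__ == "__main__":
    # Made-up documents standing in for real documentation pages.
    docs = [
        ("Install", "Run the installer.\n"),
        ("Usage", "Call the API."),
    ]
    print(build_llms_full("Example Docs", docs))
```

Even in this toy form the full file grows with every page you add, which is why the curated index is the better default.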
For reference implementations, FastHTML (Jeremy Howard's own project), Anthropic and Stripe all publish llms.txt files you can study to see how real teams structure theirs.
How to write one for your site
You can produce a solid file in well under an hour:
- List your most important public pages. Think about what a newcomer, or a model answering a question about you, genuinely needs: your core product or content pages, key guides, pricing, docs.
- Write a factual, specific summary of what your site is and who it is for. Models favour concrete description over marketing language, so "independent reviews and comparisons of GEO tools" beats "the ultimate AI visibility platform."
- Group the links into clear H2 sections so the structure is obvious.
- Put secondary material under ## Optional so a model can prioritise.
- Place the file at the root of your domain so it resolves at /llms.txt.
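The steps above can also be scripted, which helps if your page list changes often. This is a minimal sketch under our own assumptions: the Link class, the render_llms_txt function and every example.com URL are hypothetical, and you would still write the summary and descriptions by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    title: str
    url: str
    description: str


def render_llms_txt(name, summary, sections, optional=None):
    """Render an llms.txt string.

    sections maps an H2 heading to a list of Link objects; optional is a
    list of Link objects that goes under a final "## Optional" heading.
    """
    blocks = [f"# {name}", f"> {summary}"]
    all_sections = dict(sections)
    if optional:
        all_sections["Optional"] = list(optional)
    for heading, links in all_sections.items():
        items = "\n".join(
            f"- [{link.title}]({link.url}): {link.description}" for link in links
        )
        blocks.append(f"## {heading}\n\n{items}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    text = render_llms_txt(
        "Example Site",
        "Independent reviews and comparisons of GEO tools.",
        {
            "Core pages": [
                Link("Pricing", "https://example.com/pricing", "plans and prices"),
                Link("Docs", "https://example.com/docs", "setup and API guides"),
            ]
        },
        optional=[
            Link("Changelog", "https://example.com/changelog", "release history")
        ],
    )
    print(text)
```

Save the output as llms.txt in your web root so it is served from the root of your domain.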
For what it's worth, CiteDaily already publishes its own llms.txt, built on exactly these principles.
Does it actually work? The honest 2026 picture
This is where most advice gets vague. Here is what is actually on the record.
Google says it does not use it. This is the clearest signal we have. Gary Illyes confirmed in July 2025 that Google does not use llms.txt, and John Mueller compared it to the long-discredited meta keywords tag. There is no evidence it influences Google's ranking, AI Overviews or AI Mode.
The big AI labs document robots.txt, not llms.txt. OpenAI, Anthropic and Perplexity describe their crawlers through robots.txt. None of them documents llms.txt as a discovery or citation input. In other words, no major answer engine has said "publish this and we will use it."
Log studies show negligible traffic. In a 90-day experiment, Otterly.ai found that only about 0.1% of AI crawler requests touched /llms.txt, effectively a rounding error (Otterly.ai).
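You can run the same kind of check on your own server logs. The sketch below is not Otterly.ai's method; it is a simple illustration that assumes access-log lines with a quoted request and a user-agent string. The bot tokens are an illustrative, non-exhaustive list, and the sample log lines are invented and simplified.

```python
import re

# Illustrative user-agent tokens for AI crawlers; extend for your own logs.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Matches the quoted request part of a log line, e.g. "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*"')

# Invented sample lines: three AI crawler requests and one ordinary visitor.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Mar/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "GPTBot/1.0"',
    '203.0.113.6 - - [01/Mar/2026:10:00:05 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "ClaudeBot/1.0"',
    '203.0.113.7 - - [01/Mar/2026:10:00:09 +0000] "GET /docs?page=2 HTTP/1.1" 200 9001 "-" "PerplexityBot/1.0"',
    '198.51.100.9 - - [01/Mar/2026:10:00:12 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "ExampleBrowser/1.0"',
]


def llms_txt_share(log_lines):
    """Return the fraction of AI crawler requests whose path is /llms.txt."""
    bot_requests = 0
    llms_hits = 0
    for line in log_lines:
        if not any(token in line for token in AI_BOT_TOKENS):
            continue  # not one of the crawlers we are counting
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # line has no parsable request
        bot_requests += 1
        path = match["path"].split("?", 1)[0]  # drop any query string
        if path == "/llms.txt":
            llms_hits += 1
    return llms_hits / bot_requests if bot_requests else 0.0


if __name__ == "__main__":
    print(f"{llms_txt_share(SAMPLE_LOG):.1%} of AI crawler requests hit /llms.txt")
```

For scale, a share of 0.1% means roughly one request in every thousand. Whatever number you get, it only counts fetches; it says nothing about whether the file shaped an answer.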
Adoption is still small and skewed technical. SE Ranking's study of 300,000 domains in early 2026 found roughly 10% had an llms.txt file, concentrated among technical and software sites rather than the broader web.
To be fair to the other side, the picture is not purely negative. Profound has reported that Microsoft and OpenAI crawlers do fetch llms.txt files, and Google has included an llms.txt in its agent-to-agent (A2A) protocol. The key distinction is that a fetch is not a citation. A bot requesting a file tells you nothing about whether the contents influenced an answer.
One concrete pitfall to avoid: some advice pushes you to publish a markdown copy of every single page. Done at scale, that can create large-scale duplicate content, dilute your crawl budget and actively hurt your SEO. Keep your /llms.txt curated.
So should you bother?
A nuanced yes-and-no.
Yes, if it is low effort and you have documentation or key pages worth exposing to AI agents and developer tools. That is the use case with the strongest documented value today: tools like Cursor, GitHub Copilot and Claude pull external docs in real time, and a clean llms.txt that points to yours genuinely helps them.
No, if you are counting on it to increase your citations in ChatGPT, Perplexity or Google. There is no evidence it does that, and the effort pays off far better elsewhere: building real authority and getting cited by trusted third-party sources. That is the entire argument of our playbook, how to get recommended by ChatGPT in 2026.
If you want to actually measure whether AI engines mention you, that is a different and more useful exercise. Start with our guide on how to check if your brand appears in ChatGPT, then browse our GEO tool reviews to track it at scale. Two affordable options with free or low-cost tiers for monitoring your prompts are Rankscale and Orchly. Publishing an llms.txt is a fine afternoon's work; just spend the rest of the week on the things that actually move your visibility.
Sources
- llms.txt specification — Jeremy Howard, Answer.AI (September 3, 2024): llmstxt.org.
- Otterly.ai, "The llms.txt experiment" — share of AI crawler requests hitting llms.txt: otterly.ai.
- Google's position that it does not use llms.txt — Gary Illyes and John Mueller, July 2025 (on-the-record statements).
- Adoption (~10%) — SE Ranking, 300,000-domain study, early 2026.
- Crawler fetch observations — Profound (Microsoft and OpenAI crawlers fetching llms.txt) and Google's inclusion of an llms.txt in its agent-to-agent (A2A) protocol.
Frequently asked questions
- Do ChatGPT and Perplexity read llms.txt?
- Not as an official signal. OpenAI, Anthropic and Perplexity document their crawlers through robots.txt and do not list llms.txt as a citation or discovery input. Some bots have been observed fetching the file, but fetching is not the same as using it to decide citations.
- Does llms.txt help my Google ranking or AI Overviews?
- No. Google has stated on the record that it does not use llms.txt; Gary Illyes confirmed this in July 2025 and John Mueller compared it to the discredited meta keywords tag. There is no evidence it influences AI Overviews or AI Mode.
- Is llms.txt the same as robots.txt?
- No. robots.txt controls which crawlers may access your site; llms.txt gives AI systems a curated summary of your content. llms.txt blocks nothing and does not control access.
- Should I create a markdown copy of every page?
- Be careful. Duplicating every page as indexable markdown can create large-scale duplicate content that dilutes crawl budget and can hurt your SEO. Keep /llms.txt as a curated index; use /llms-full.txt only deliberately.
- What is the strongest real use case for llms.txt today?
- Documentation for developer tools and agents. Tools like Cursor, GitHub Copilot and Claude retrieve external docs in real time, and a clean llms.txt pointing to your docs helps them. That is where the documented value is strongest right now.