llms.txt is a plain text file placed at the root of your website that tells AI language models what your site is about, lists your key pages, and declares your AI usage permissions. It follows a simple markdown-like format and is read by AI models when they crawl your site.

What is an MCP Server Card?

An MCP Server Card is a JSON file at /.well-known/mcp/server-card.json that describes your site's tools and capabilities to AI assistants using the Model Context Protocol. It helps AI assistants discover what your site can do and how to interact with it.

What is an A2A Agent Card?

An A2A Agent Card is a JSON file at /.well-known/agent-card.json that describes your site to other AI agents using the Agent-to-Agent protocol. It lists your agent's name, capabilities, skills and supported interfaces.

How do I allow AI crawlers in robots.txt?

Add a separate User-agent block for each AI crawler, with Allow: / on the line after it. For example, write User-agent: GPTBot and then Allow: / on the next line. Do this for GPTBot, ClaudeBot, PerplexityBot, Google-Extended and anthropic-ai.

AI Agent Readiness: Complete Discovery Setup Guide

The web is changing. Alongside human visitors and search engine crawlers, a new category of visitor is emerging — AI agents. These are automated systems that browse, read, analyse and interact with websites on behalf of users. This guide covers everything you need to know to make your site fully discoverable and accessible to AI agents.

What is AI Agent Readiness?

AI Agent Readiness is the practice of publishing the right signals, files and headers that AI agents need to discover and interact with your site. It covers four areas:

Discovery Files: JSON and text files that tell agents who you are, what you offer and how to interact with your APIs.

Content Access: Serving content in formats agents prefer, including markdown negotiation, Content-Signal directives, llms.txt and llms.md.

Security Headers: CSP, HSTS and Link headers that signal trustworthiness and point agents to your discovery resources.

Crawler Permissions: Explicitly allowing AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) in your robots.txt.

Discovery Files

1. llms.txt

The most important file. Place it at /llms.txt at your domain root. It tells AI language models what your site is about, lists your most important pages, and declares how you want your content used.

# Your Site Name
> One sentence description of what your site does.

## Key pages
- [Page Title](https://example.com/page): Brief description.

## About
Two or three paragraphs about your site, mission and content.

## AI usage permissions
AI models are permitted to index, summarise and cite content from this site.
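
If you maintain the page list in code, the template above can be generated rather than hand-edited. The following is a minimal Python sketch, assuming a hypothetical `Page` record and `render_llms_txt` helper (neither is part of any llms.txt tooling); it only reproduces the section layout shown in the template.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One entry in the "Key pages" section (hypothetical helper type)."""
    title: str
    url: str
    summary: str


def render_llms_txt(site_name: str, tagline: str, pages: list[Page],
                    about: str, permissions: str) -> str:
    """Render an llms.txt body using the section layout from the template."""
    lines = [f"# {site_name}", f"> {tagline}", "", "## Key pages"]
    lines += [f"- [{p.title}]({p.url}): {p.summary}" for p in pages]
    lines += ["", "## About", about, "", "## AI usage permissions", permissions]
    return "\n".join(lines) + "\n"


example = render_llms_txt(
    "Your Site Name",
    "One sentence description of what your site does.",
    [Page("Page Title", "https://example.com/page", "Brief description.")],
    "Two or three paragraphs about your site, mission and content.",
    "AI models are permitted to index, summarise and cite content from this site.",
)
print(example)
```

Write the returned string to /llms.txt at your web root as part of your build or deploy step.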

2. API Catalog (RFC 9727)

Create /.well-known/api-catalog and serve it with the Content-Type application/linkset+json. This tells AI agents what APIs your site exposes.

{
  "linkset": [{
    "anchor": "https://example.com",
    "service-desc": [{"href": "https://example.com/llms.txt", "type": "text/plain"}],
    "service-doc": [{"href": "https://example.com/docs", "type": "text/html"}]
  }]
}
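
Before deploying, it can help to sanity-check the catalog response. The sketch below is one possible check in Python, assuming you already have the response body and Content-Type as strings. The function name `check_api_catalog` is hypothetical, and the checks cover only the structure used in the example above, not the full RFC.

```python
import json

EXPECTED_TYPE = "application/linkset+json"


def check_api_catalog(body: str, content_type: str) -> list[str]:
    """Return a list of problems found in an api-catalog response (empty if none)."""
    problems = []
    # Compare only the media type, ignoring parameters such as profile or charset.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != EXPECTED_TYPE:
        problems.append(f"unexpected Content-Type: {content_type!r}")
    try:
        doc = json.loads(body)
    except json.JSONDecodeError as exc:
        return problems + [f"body is not valid JSON: {exc.msg}"]
    linkset = doc.get("linkset") if isinstance(doc, dict) else None
    if not isinstance(linkset, list) or not linkset:
        return problems + ['missing or empty "linkset" array']
    for i, context in enumerate(linkset):
        if not isinstance(context, dict) or "anchor" not in context:
            problems.append(f"linkset[{i}] has no anchor")
            continue
        for rel, targets in context.items():
            if rel == "anchor":
                continue
            if not isinstance(targets, list):
                problems.append(f"linkset[{i}].{rel} is not an array")
                continue
            for target in targets:
                if not isinstance(target, dict) or "href" not in target:
                    problems.append(f"linkset[{i}].{rel} has a target without href")
    return problems


sample = """{
  "linkset": [{
    "anchor": "https://example.com",
    "service-desc": [{"href": "https://example.com/llms.txt", "type": "text/plain"}],
    "service-doc": [{"href": "https://example.com/docs", "type": "text/html"}]
  }]
}"""
print(check_api_catalog(sample, "application/linkset+json"))
```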

3. MCP Server Card

Create /.well-known/mcp/server-card.json. This enables AI assistants using the Model Context Protocol to discover your tools and capabilities.

{
  "serverInfo": {"name": "Your Site", "version": "1.0.0", "description": "What your site does"},
  "transport": {"type": "http", "endpoint": "https://example.com/api/"},
  "capabilities": {"tools": true, "resources": true},
  "documentation": "https://example.com/docs"
}

4. A2A Agent Card

Create /.well-known/agent-card.json. This enables agent-to-agent discovery and interaction.

{
  "name": "Your Site",
  "version": "1.0.0",
  "description": "What your agent does",
  "url": "https://example.com",
  "skills": [
    {"id": "main-skill", "name": "Skill Name", "description": "What this skill does"}
  ]
}

5. Agent Skills Index

Create /.well-known/agent-skills/index.json listing your site's capabilities per the Agent Skills Discovery RFC v0.2.0.

{
  "$schema": "https://agentskills.io/schema/v0.2.0/index.json",
  "skills": [
    {"name": "Skill Name", "type": "skill-md", "description": "What it does", "url": "https://example.com/guide"}
  ]
}
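
If your site is built as static files, the discovery documents can be written into the web root in one step. This is a minimal Python sketch under that assumption. The paths are the ones named in this guide, while `DISCOVERY_PATHS` and `write_discovery_files` are hypothetical names chosen for illustration.

```python
import json
import tempfile
from pathlib import Path

# Paths named in this guide, relative to the web root.
DISCOVERY_PATHS = {
    "api_catalog": ".well-known/api-catalog",
    "mcp_card": ".well-known/mcp/server-card.json",
    "a2a_card": ".well-known/agent-card.json",
    "skills_index": ".well-known/agent-skills/index.json",
}


def write_discovery_files(web_root: Path, documents: dict[str, dict]) -> list[Path]:
    """Serialise each document as JSON at its discovery path and return the paths."""
    written = []
    for key, document in documents.items():
        target = web_root / DISCOVERY_PATHS[key]
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(document, indent=2) + "\n", encoding="utf-8")
        written.append(target)
    return written


# Demonstration against a throwaway directory instead of a real web root.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_discovery_files(Path(tmp), {
        "a2a_card": {"name": "Your Site", "version": "1.0.0", "url": "https://example.com"},
        "skills_index": {"skills": []},
    })
    relative = [p.relative_to(tmp).as_posix() for p in paths]
    reloaded = json.loads(paths[0].read_text(encoding="utf-8"))
print(relative)
```

Note that /.well-known/api-catalog has no file extension, so your server needs an explicit rule to send it as application/linkset+json.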

Content Access

llms.md

Create /llms.md — a markdown-formatted version of your llms.txt. Some agents prefer markdown over plain text.

Markdown Negotiation

Configure your server to return markdown content when an agent sends Accept: text/markdown. In NGINX:

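location = / {
  # Serve the markdown version when the Accept header mentions text/markdown.
  if ($http_accept ~* "text/markdown") {
    rewrite ^ /llms.md last;
  }
  # @fallback is a named location that must be defined elsewhere in the server block.
  try_files /index.html @fallback;
}
location ~* \.md$ {
  # Used when mime.types has no entry for .md files.
  default_type text/markdown;
  # Let NGINX append the charset instead of adding a second Content-Type header.
  charset utf-8;
  charset_types text/markdown;
}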
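
The NGINX rule matches any Accept header that contains text/markdown, including one that ranks markdown below HTML. If you handle negotiation in application code instead, a minimal Python sketch could take q-values into account. The function names `parse_accept` and `wants_markdown` are illustrative and not part of any library, and the tie-breaking rule (markdown wins when ranked at least as high as HTML) is an assumption.

```python
def parse_accept(header: str) -> dict[str, float]:
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    weights = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        weights[media_type] = q
    return weights


def wants_markdown(accept_header: str) -> bool:
    """True when text/markdown is listed explicitly and ranked at least as high as HTML."""
    weights = parse_accept(accept_header)
    markdown_q = weights.get("text/markdown", 0.0)
    # Fall back to wildcard ranges when text/html is not listed explicitly.
    html_q = weights.get("text/html", weights.get("text/*", weights.get("*/*", 0.0)))
    return markdown_q > 0 and markdown_q >= html_q


print(wants_markdown("text/markdown, text/html;q=0.9"))
print(wants_markdown("text/html,application/xhtml+xml,*/*;q=0.8"))
```

A typical browser Accept header never names text/markdown explicitly, so human visitors keep receiving HTML.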

Content-Signal in robots.txt

Add this directive to your robots.txt to declare your AI content preferences:

Content-Signal: ai-train=no, search=yes, ai-input=no
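
To confirm that the directive is present and readable in a deployed robots.txt, you could parse it back out. This is a minimal Python sketch, assuming the comma-separated key=yes/no syntax shown above. The function name `parse_content_signals` is hypothetical, and the sketch ignores which User-agent group the line belongs to.

```python
def parse_content_signals(robots_txt: str) -> dict[str, bool]:
    """Collect Content-Signal preferences from robots.txt text as {signal: allowed}."""
    signals = {}
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "content-signal":
            continue
        for pair in value.split(","):
            key, eq, setting = pair.partition("=")
            setting = setting.strip().lower()
            if eq and setting in ("yes", "no"):
                signals[key.strip().lower()] = setting == "yes"
    return signals


robots_example = """User-agent: *
Content-Signal: ai-train=no, search=yes, ai-input=no
Allow: /
"""
print(parse_content_signals(robots_example))
```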

Security Headers

Three headers matter most for agent readiness:

| Header | Purpose | Example |
| --- | --- | --- |
| `Content-Security-Policy` | Signals your site is security-conscious | `default-src 'self'; script-src 'self'` |
| `Strict-Transport-Security` | Enforces HTTPS; trusted sites use HTTPS | `max-age=31536000; includeSubDomains` |
| `Link` | Points agents to your discovery resources | `</llms.txt>; rel="service-doc"` |

⚠️ Remove X-XSS-Protection if present. The header is deprecated, and modern agents may flag it as a misconfiguration.
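
To spot-check a response you have already fetched, the header advice in this section can be expressed as a small audit. This is a minimal Python sketch, assuming the response headers are available as a plain dict. `audit_headers` and the two constant names are hypothetical, and the check looks only at presence, not at the header values.

```python
REQUIRED_HEADERS = ("content-security-policy", "strict-transport-security", "link")
DEPRECATED_HEADERS = ("x-xss-protection",)


def audit_headers(headers: dict[str, str]) -> dict[str, list[str]]:
    """Report which recommended headers are missing and which deprecated ones are set."""
    # Header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in headers}
    return {
        "missing": [h for h in REQUIRED_HEADERS if h not in present],
        "deprecated": [h for h in DEPRECATED_HEADERS if h in present],
    }


report = audit_headers({
    "Content-Security-Policy": "default-src 'self'; script-src 'self'",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-XSS-Protection": "1; mode=block",
})
print(report)
```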

AI Crawler Permissions

Explicitly allow the major AI crawlers in your robots.txt:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: anthropic-ai
Allow: /

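💡 A crawler follows only the most specific User-agent group that matches it. If a crawler has its own group, the wildcard User-agent: * block with Allow: / no longer applies to that crawler, so any Disallow rules in its own group win. Add an explicit block for each AI crawler so your intent is unambiguous.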
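
You can test how a robots.txt treats each crawler before you publish it. The sketch below uses Python's standard `urllib.robotparser` module. The helper name `crawler_access` and the example rules are assumptions for illustration, and real crawlers may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "anthropic-ai"]


def crawler_access(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Return, for each AI crawler token, whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


# GPTBot has its own group here, so the wildcard group does not apply to it.
robots = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /
"""
result = crawler_access(robots)
print(result)
```

In this example GPTBot is blocked from the home page by its own group, while the other four tokens fall back to the wildcard group and are allowed.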

Frequently Asked Questions

Do I need all 16 checks to pass?

Not all checks carry equal weight. The highest-value checks are llms.txt (15 points), followed by markdown negotiation, the API catalog, the MCP card and the A2A card (10 points each). The crawler checks (2-3 points each) matter but are lower priority. Aim for a score of 80 or higher to be well positioned.

How long does it take to implement?

llms.txt, robots.txt changes and crawler permissions can be done in under an hour. The /.well-known/ JSON files take another hour. Markdown negotiation requires server config changes — typically 30 minutes for a developer. Total: half a day for a complete implementation.

Will this affect my regular SEO?

Positively. Allowing AI crawlers, adding structured discovery files and improving security headers are all signals that Google and other traditional search engines also value. Agent readiness and traditional SEO are complementary.

What is the difference between llms.txt and robots.txt?

robots.txt tells crawlers what they can and cannot access. llms.txt tells AI language models what your site is about and how you want your content used. They serve different purposes and you need both.

🛰️ Check your AI Agent Readiness score

Run a free audit on your domain. See which of the 16 checks you pass and get specific fix instructions for any that fail.
