The Robots & Sitemap Tester validates your robots.txt and XML sitemaps, the two files that control how search engines crawl your site. Misconfigurations here silently block indexing of entire sections. A single line in robots.txt can deindex thousands of pages. This index covers every finding the tester raises.
Findings fall into these categories. Pick yours:
Disallow: / blocks the entire site. Typical causes are staging rules left in place in production and Disallow: /wp-admin/ patterns that accidentally include public content. How to audit before disaster.
User-agent without a trailing colon, comments mixed with directives. Google's robots.txt parser is forgiving but quietly ignores broken rules. The validator patterns.
The Sitemap: directive in robots.txt should point at the absolute URL of your sitemap or sitemap index. Common bugs: relative URL, missing protocol, pointing at sitemap.html instead of .xml.
lastmod must reflect actual content change, not file regeneration. priority is mostly ignored by Google, but inflated values look spammy. What's worth setting and what isn't.
Disallow: stops the crawl but doesn't deindex. noindex deindexes, but only if the crawler can read the meta tag. Common bug: with disallow and noindex on the same URL, Google can't see the noindex, so the page stays indexed with the disallow message.

Where these files live varies by stack:
app/sitemap.ts and app/robots.ts: Metadata API, dynamic generation, the SSG-build-time sitemap pattern for large sites.

The tester fetches your robots.txt, validates syntax, identifies blocked content, fetches every declared sitemap, validates XML, checks that each URL responds with 200, flags lastmod inconsistencies, and tests crawler-trap risk. For the full reference, see the Robots & Sitemap Guide.
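The sitemap side of those checks can be approximated locally. The sketch below is not the tester's implementation; it is a minimal illustration using only the Python standard library. The function name audit_sitemap is hypothetical, and the "every lastmod is identical" rule is an assumed heuristic for spotting regeneration timestamps. Fetching the file and checking that each URL responds with 200 are left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def audit_sitemap(xml_text: str) -> list[str]:
    """Return findings for one <urlset> sitemap (hypothetical helper)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"invalid XML: {exc}"]

    findings = []
    lastmods = []
    for entry in root.findall("sm:url", NS):
        # <loc> must be an absolute http(s) URL.
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        parsed = urlparse(loc)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            findings.append(f"loc is not an absolute URL: {loc!r}")

        lastmod = entry.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if lastmod:
            lastmods.append(lastmod)

    # Assumed heuristic: if several URLs share exactly one lastmod value,
    # it probably records file regeneration rather than content change.
    if len(lastmods) > 1 and len(set(lastmods)) == 1:
        findings.append(f"all lastmod values are identical: {lastmods[0]}")
    return findings
```

Calling audit_sitemap on the text of a sitemap returns an empty list when none of these three checks fire. A sitemap index file (sitemapindex root) would need its own loop over sitemap entries, which this sketch does not cover.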
Run the tester. One bad line in robots.txt can deindex entire site sections; confirm yours doesn't.
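For a quick local sanity check of the same idea, the sketch below uses Python's standard urllib.robotparser to list which paths a robots.txt blocks, and scans for Sitemap: directives that are not absolute URLs. The helper names blocked_paths and bad_sitemap_directives are hypothetical, and the standard-library parser is only an approximation: its rule matching can differ from Google's in edge cases such as overlapping Allow and Disallow rules.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt: str, paths: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the paths that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


def bad_sitemap_directives(robots_txt: str) -> list[str]:
    """Return Sitemap: values that are not absolute http(s) URLs."""
    bad = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so "https://..." stays intact.
        key, _, value = line.partition(":")
        if key.strip().lower() != "sitemap":
            continue
        url = value.strip()
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad.append(url)
    return bad


if __name__ == "__main__":
    # Example input: a staging-style robots.txt with a relative sitemap URL.
    robots = "User-agent: *\nDisallow: /\nSitemap: /sitemap.xml\n"
    print(blocked_paths(robots, ["/", "/blog/post"]))
    print(bad_sitemap_directives(robots))
```

Feeding it the public paths you expect to be crawlable is the useful direction: any path that comes back from blocked_paths is one to investigate before deploying the file.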
Run Robots & Sitemap Tester