How to Fix Sitemap Declaration

The Sitemap directive in robots.txt tells crawlers where to find your XML sitemap. Common mistakes: a relative URL instead of an absolute one, a missing protocol, a link to an HTML sitemap page instead of the XML file, or no declaration at all. Search engines can sometimes find sitemaps without help, but declaring them in robots.txt AND submitting them via Search Console is best practice. This guide covers the correct format, multi-sitemap patterns, and the submission workflow.

1. Confirm the sitemap exists and is valid

Step 1: Fetch the sitemap
curl -I https://example.com/sitemap.xml

# Expected:
# HTTP/2 200
# Content-Type: application/xml or text/xml

Step 2: Validate XML structure
curl -s https://example.com/sitemap.xml | xmllint --noout -

# No output = well-formed XML
# Parse errors are listed if it is malformed
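
The two checks above can also be scripted. What follows is a minimal Python sketch, assuming the sitemap body has already been downloaded into a string. `inspect_sitemap` is a hypothetical helper name, not part of any tool mentioned in this guide. It goes one step past a well-formedness check by confirming that the root element is a sitemap `urlset` or `sitemapindex`.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def inspect_sitemap(xml_text: str) -> dict:
    """Hypothetical helper: report the sitemap kind and its <loc> entries.

    Raises xml.etree.ElementTree.ParseError if the XML is malformed.
    """
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{SITEMAP_NS}}}urlset":
        kind = "urlset"
    elif root.tag == f"{{{SITEMAP_NS}}}sitemapindex":
        kind = "sitemapindex"
    else:
        kind = "unknown"  # e.g. an HTML page served at the sitemap URL
    # Collect every namespaced <loc> value, skipping empty elements.
    locs = [el.text.strip() for el in root.iter(f"{{{SITEMAP_NS}}}loc") if el.text]
    return {"kind": kind, "loc_count": len(locs), "locs": locs}
```

A result with `kind` equal to `"unknown"` or a `loc_count` of zero suggests the URL does not serve a usable sitemap, even when the HTTP status is 200.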

2. Add Sitemap to robots.txt

# Correct: absolute URL with protocol
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml

Multiple sitemaps

# One Sitemap line per file
Sitemap: https://example.com/sitemap-pages.xml
Sitemap: https://example.com/sitemap-posts.xml
Sitemap: https://example.com/sitemap-products.xml

# Better: single sitemap index
Sitemap: https://example.com/sitemap-index.xml
# The index file then references all child sitemaps

3. Common mistakes

Mistake 1: Relative URL

# BAD: silently invalid
Sitemap: /sitemap.xml

# RIGHT: absolute URL
Sitemap: https://example.com/sitemap.xml

Mistake 2: Missing protocol

# BAD
Sitemap: example.com/sitemap.xml

# RIGHT
Sitemap: https://example.com/sitemap.xml

Mistake 3: Wrong host (HTTP vs HTTPS, www vs non-www)

# Must match the actual canonical host
# Site uses https://www.example.com canonical:

# BAD
Sitemap: http://example.com/sitemap.xml

# RIGHT
Sitemap: https://www.example.com/sitemap.xml

Mistake 4: Pointing at HTML

# BAD: HTML sitemap page, not XML
Sitemap: https://example.com/sitemap.html

# RIGHT: machine-readable XML
Sitemap: https://example.com/sitemap.xml

Mistake 5: Sitemap blocked by robots.txt

# BAD: sitemap.xml is itself blocked
User-agent: *
Disallow: /sitemap

Sitemap: https://example.com/sitemap.xml

# Some crawlers may refuse to fetch
# RIGHT: ensure sitemap path is allowed (default is allowed)
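
The five mistakes above lend themselves to an automated check. The sketch below is one possible approach, not a definitive linter: `lint_sitemap_lines` and its `canonical_origin` parameter are hypothetical names, and the HTML test is a simple file-extension heuristic. It relies only on Python's standard `urllib.parse` and `urllib.robotparser` modules.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def lint_sitemap_lines(robots_txt: str, canonical_origin: str) -> list[str]:
    """Hypothetical helper: list problems found in the Sitemap lines.

    canonical_origin is assumed to look like "https://www.example.com".
    """
    problems = []
    canonical = urlsplit(canonical_origin)

    # Parse the rules once so blocked sitemap paths can be detected.
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    # Collect the value of every "Sitemap:" line (case-insensitive).
    values = []
    for line in robots_txt.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "sitemap":
            values.append(value.strip())

    if not values:
        return ["no Sitemap declaration found"]

    for value in values:
        url = urlsplit(value)
        # Mistakes 1 and 2: relative URL or missing protocol.
        if url.scheme not in ("http", "https") or not url.netloc:
            problems.append(f"{value}: not an absolute URL with protocol")
            continue
        # Mistake 3: scheme or host differs from the canonical origin.
        if (url.scheme, url.netloc) != (canonical.scheme, canonical.netloc):
            problems.append(f"{value}: scheme or host differs from {canonical_origin}")
        # Mistake 4: extension heuristic for an HTML sitemap page.
        if url.path.lower().endswith((".html", ".htm")):
            problems.append(f"{value}: looks like an HTML page, not XML")
        # Mistake 5: the sitemap path is disallowed for all user agents.
        if not rules.can_fetch("*", value):
            problems.append(f"{value}: blocked by a Disallow rule")
    return problems
```

An empty list means none of these specific checks fired. It does not prove the sitemap itself is valid.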

4. The sitemap index pattern

For sites with multiple sitemaps, use an index file:

<?xml version="1.0" encoding="UTF-8"?>
<!-- /sitemap.xml (the index) -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2024-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-posts-2024.xml</loc>
    <lastmod>2024-01-20</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2024-01-20</lastmod>
  </sitemap>
</sitemapindex>

# robots.txt only references the index
Sitemap: https://example.com/sitemap.xml
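
An index file like this is usually generated rather than written by hand. As a rough sketch, assuming the child sitemap URLs and their last-modified dates are already known, Python's standard `xml.etree.ElementTree` can serialize one. `build_sitemap_index` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(children: list[tuple[str, str]]) -> bytes:
    """Hypothetical helper: serialize (loc, lastmod) pairs as a sitemap index."""
    # Setting xmlns as a plain attribute declares the default namespace.
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in children:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        ET.SubElement(entry, "lastmod").text = lastmod
    # xml_declaration=True (Python 3.8+) emits <?xml ...?> as the first bytes.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


index_bytes = build_sitemap_index([
    ("https://example.com/sitemap-pages.xml", "2024-01-15"),
    ("https://example.com/sitemap-posts-2024.xml", "2024-01-20"),
    ("https://example.com/sitemap-products.xml", "2024-01-20"),
])
```

Writing `index_bytes` to the path that robots.txt references keeps the declaration and the generated index in sync.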

5. Submit to Search Console

Step 1: Google Search Console
  1. Open Search Console → property → Sitemaps
  2. "Add a new sitemap" → enter sitemap URL path (e.g. sitemap.xml)
  3. Click Submit
  4. Status usually updates within hours, though it can take longer: "Success" with URL count, or an error message
Step 2: Bing Webmaster Tools
  1. Bing Webmaster Tools → site → Sitemaps
  2. Submit sitemap URL
  3. Status updates over time, as in Search Console
Step 3: Yandex (optional)
Yandex Webmaster offers similar sitemap submission. It is worth using if you target Russian-speaking markets.

6. IndexNow protocol (modern alternative)

For sites with frequent updates, IndexNow lets you notify search engines instantly when URLs change. Bing, Yandex, Seznam support it; Google does not (Google has its own Indexing API for select use cases).

# POST to IndexNow endpoint
POST https://api.indexnow.org/indexnow
Content-Type: application/json

{
  "host": "example.com",
  "key": "your-key-here",
  "urlList": [
    "https://example.com/new-page",
    "https://example.com/updated-page"
  ]
}

# Place key verification file at:
# https://example.com/your-key-here.txt
# containing the key value as plain text
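
The request above can be assembled with Python's standard library. This is a minimal sketch that builds the request without sending it. `build_indexnow_request` is a hypothetical helper name, and the host check is an extra sanity check added for illustration, not something taken from this guide.

```python
import json
import urllib.request
from urllib.parse import urlsplit

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) the IndexNow POST request."""
    # Illustrative sanity check: refuse URLs whose host differs from `host`.
    foreign = [u for u in urls if urlsplit(u).netloc != host]
    if foreign:
        raise ValueError(f"URLs outside {host}: {foreign}")
    # Same JSON fields as the example request: host, key, urlList.
    body = json.dumps({"host": host, "key": key, "urlList": urls}).encode("utf-8")
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_indexnow_request(
    "example.com",
    "your-key-here",
    ["https://example.com/new-page", "https://example.com/updated-page"],
)
```

Passing `request` to `urllib.request.urlopen` would send it. Do that only once the key file is in place at the site root.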

7. Verify the declaration

Step 1: Check robots.txt with curl
curl -s https://example.com/robots.txt | grep -i Sitemap

# Expected output:
# Sitemap: https://example.com/sitemap.xml

Step 2: Search Console status
Search Console → Sitemaps → "Submitted sitemaps". Each sitemap is listed with its URL count, status (Success / Couldn't fetch / Has issues), and last read date.
Step 3: Re-run the Robots & Sitemap Tester
The declaration findings should be cleared, the sitemap should parse successfully, and the URL count should match expectations.
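
The `grep` check in Step 1 can also be done with Python's built-in robots.txt parser, which is convenient inside a deploy script or test suite. A small sketch, assuming the robots.txt body is already available as text and Python 3.8 or later is in use. `declared_sitemaps` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt: str) -> list[str]:
    """Hypothetical helper: return the Sitemap URLs declared in robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


sample = """User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""
found = declared_sitemaps(sample)
```

Comparing `found` against the list of sitemap URLs you expect catches a missing or stale declaration before a crawler does.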

8. Common platform patterns

WordPress (Yoast / Rank Math)

# Yoast generates: /sitemap_index.xml
# Rank Math generates: /sitemap_index.xml

# Add to robots.txt manually if plugin doesn't:
Sitemap: https://example.com/sitemap_index.xml

Next.js

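// app/robots.ts (App Router)
import type { MetadataRoute } from 'next';

// Next.js serves the returned object as /robots.txt
export default function robots(): MetadataRoute.Robots {
  return {
    rules: { userAgent: '*', allow: '/' },
    sitemap: 'https://example.com/sitemap.xml',
  };
}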

Astro

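// astro.config.mjs
import { defineConfig } from 'astro/config';
import sitemap from '@astrojs/sitemap';

export default defineConfig({
  // `site` is required: the integration uses it to build absolute URLs
  site: 'https://example.com',
  integrations: [sitemap()],
});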
// Generates /sitemap-index.xml and /sitemap-0.xml

💡 The single rule: use an absolute URL with protocol, on its own line in robots.txt, AND submit via Search Console. Doing both gives every crawler, including ones you didn't think of, a way to find the sitemap. Two minutes of setup can prevent weeks of indexing delays.
