
How to Fix Shopify Crawl Issues

Shopify generates far more crawlable URLs than most merchants realise: every collection combined with every tag is a unique URL, every product variant has its own parameterised URL, and on top of that come /collections/all/ and every blog tag combination. Left unmanaged, these low-value URLs absorb crawl budget, and Google can be slow to reach your real product pages. This guide covers Shopify-specific crawl management; pair it with the general crawl guide.

Step-by-step: How to fix Shopify crawl issues

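  1. Audit current crawl. Search Console → Indexing → Pages (formerly the Coverage report). Review the not-indexed reasons, especially 'Crawled - currently not indexed' and 'Discovered - currently not indexed'. Large counts under either usually mean low-value URLs are competing with your products for crawl budget.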
  2. Identify URL pattern problems. Collection-tag URLs (/collections/COLLECTION/TAG) multiply quickly: 50 collections × 20 tags = 1000 URLs. Most are low-value. Variant URLs (/products/SLUG?variant=12345) duplicate parent content.
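  3. Handle collection tags. Three approaches: A) block crawling in robots.txt.liquid with 'User-agent: *' and 'Disallow: /collections/*/' (be careful: this is a crawl block, not a noindex, and the pattern also matches legitimate URLs, such as collection URLs written with a trailing slash and products nested at /collections/COLLECTION/products/SLUG); B) more selectively, edit robots.txt.liquid to disallow only specific tag patterns; C) canonicalise tag-filtered URLs back to the parent collection.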
  4. Handle /collections/all/. This lists every product in the catalogue (alphabetically by default), so it is usually low-value for crawl. If you don't use it, add a noindex meta tag inside the <head> of your theme layout: {% if template.name == 'collection' and collection.handle == 'all' %}<meta name='robots' content='noindex'>{% endif %}.
  5. Handle variant URLs. Shopify usually canonicalises variant URLs to the parent product URL automatically. Verify in view-source by checking the rel='canonical' link on a variant URL. If you have many high-value variants (size/colour with distinct search demand), consider setting them up as separate products instead.
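  6. Configure robots.txt.liquid. Online Store → Themes → Edit code → add a new template of type robots.txt, which creates templates/robots.txt.liquid. Shopify's default rules already disallow utility paths such as /search, /cart and /checkout (confirm at yourstore.com/robots.txt), so use the template to add rules for the remaining problem patterns, for example internal-search-result or tag-filter URLs. Don't block /collections/ wholesale: that stops Google crawling your category pages.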
  7. Monitor crawl stats. Search Console → Settings → Crawl stats. Watch the trend: after your interventions, total crawl requests should stabilise and the indexed product count should increase.
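
To see which of the URL patterns from steps 2 to 6 dominate your store, you can bucket a URL export (from a crawler or your sitemap) by pattern. The following is a minimal Python sketch, not a definitive audit tool: the function names, the category labels and the example.com URLs are assumptions made for illustration, and the regular expressions only cover the patterns named in this guide.

```python
import re
from collections import Counter
from typing import Iterable
from urllib.parse import parse_qs, urlsplit


def classify_url(url: str) -> str:
    """Return an illustrative label for the Shopify URL pattern a URL matches."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    query = parse_qs(parts.query)

    # Utility pages that should never compete for crawl budget (step 6).
    if re.fullmatch(r"/(search|cart|checkout)(/.*)?", path):
        return "utility"
    # The all-products listing (step 4).
    if path == "/collections/all":
        return "collections-all"
    # /products/SLUG?variant=12345 duplicates the parent product (step 5).
    if re.fullmatch(r"/products/[^/]+", path) and "variant" in query:
        return "variant"
    # /collections/COLLECTION/TAG multiplies quickly (steps 2 and 3).
    if re.fullmatch(r"/collections/[^/]+/[^/]+", path):
        return "collection-tag"
    if re.fullmatch(r"/collections/[^/]+", path):
        return "collection"
    if re.fullmatch(r"/products/[^/]+", path):
        return "product"
    return "other"


def summarise(urls: Iterable[str]) -> Counter:
    """Count how many URLs fall into each pattern."""
    return Counter(classify_url(u) for u in urls)


if __name__ == "__main__":
    sample = [
        "https://example.com/collections/dresses",
        "https://example.com/collections/dresses/red",
        "https://example.com/collections/all/",
        "https://example.com/products/tee?variant=12345",
        "https://example.com/products/tee",
        "https://example.com/search?q=tee",
    ]
    for label, count in summarise(sample).most_common():
        print(label, count)
```

If 'collection-tag' or 'variant' dwarfs 'product' and 'collection' in your own export, that is the pattern to tackle first.
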
Tip. Document your Shopify configuration choices in a single internal doc (theme version, installed apps, custom code edits). When something breaks after a theme or app update, you have a baseline to compare against.
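
Because a broad Disallow rule can block more than intended (steps 3 and 6), it helps to test candidate rules against sample paths before editing robots.txt.liquid. The sketch below assumes the wildcard convention used by major crawlers: '*' matches any run of characters, a trailing '$' anchors the end of the URL, and everything else is a prefix match. The helper names and sample paths are hypothetical, and the code checks single patterns only; it does not resolve Allow/Disallow precedence.

```python
import re
from typing import Iterable, List


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path (plus query)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives prefix matching.
    return re.match(regex, path) is not None


def matching_rules(rules: Iterable[str], path: str) -> List[str]:
    """List every candidate Disallow pattern that would match the path."""
    return [rule for rule in rules if rule_matches(rule, path)]


if __name__ == "__main__":
    candidate_rules = ["/collections/*/", "/search/"]
    sample_paths = [
        "/collections/dresses",               # parent collection
        "/collections/dresses/red",           # tag-filtered collection
        "/collections/dresses/products/tee",  # product nested under a collection
        "/search?q=tee",                      # internal search result
    ]
    for path in sample_paths:
        print(path, "->", matching_rules(candidate_rules, path) or "not blocked")
```

Two things are worth checking in the output for your own rules: whether a tag rule also catches nested product URLs, and whether a trailing slash in a rule such as '/search/' causes it to miss '/search?q=...' URLs.
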

Frequently Asked Questions

How many crawlable URLs is too many for a Shopify store?

Rule of thumb: total indexable URLs should be no more than 5x the product count. A 500-product store with 8,000 indexable URLs (16x) has a crawl problem, most often caused by collection-tag combinations multiplying.
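
The arithmetic behind this rule of thumb is simple enough to script. This is a small sketch under the guide's own numbers; the function names are illustrative, and the 5x threshold is the rule of thumb above rather than a limit published by Google.

```python
def collection_tag_urls(collections: int, tags_per_collection: int) -> int:
    """Estimate how many collection-tag URLs a store can expose."""
    return collections * tags_per_collection


def url_to_product_ratio(indexable_urls: int, product_count: int) -> float:
    """Ratio of indexable URLs to products."""
    if product_count <= 0:
        raise ValueError("product_count must be positive")
    return indexable_urls / product_count


def has_crawl_bloat(indexable_urls: int, product_count: int, max_ratio: float = 5.0) -> bool:
    """True when the store exceeds the rule-of-thumb ratio."""
    return url_to_product_ratio(indexable_urls, product_count) > max_ratio


if __name__ == "__main__":
    # 50 collections x 20 tags, as in the step-by-step example.
    print(collection_tag_urls(50, 20))
    # The 500-product store with 8,000 indexable URLs.
    print(url_to_product_ratio(8000, 500), has_crawl_bloat(8000, 500))
```
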

Should I noindex Shopify tag pages?

Usually yes. Collection-tag URLs (e.g., /collections/dresses/red) often have thin content and duplicate the parent collection. Exception: if a tag combination has distinct search demand (e.g., 'red wedding dresses'), keep it indexable and add unique content.

Does Shopify automatically canonical variant URLs?

Yes. /products/SLUG?variant=X canonicalises to /products/SLUG. Verify in view-source. This handles most variant duplication. Exception: if you've intentionally created a separate product page per variant (each with its own handle), Shopify does not canonicalise those pages to one another automatically.
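
If you would rather not eyeball view-source for every variant, the same check can be scripted against saved page HTML. This is a minimal sketch using only the Python standard library; the class and function names and the example.com markup are assumptions, and it treats the canonical as correct only when it points at the same host and path with no query string.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def variant_canonicals_to_parent(variant_url: str, html: str) -> bool:
    """True if the page's canonical is the variant URL minus its query string."""
    canonical = find_canonical(html)
    if canonical is None:
        return False
    variant, target = urlsplit(variant_url), urlsplit(canonical)
    same_page = (target.netloc, target.path.rstrip("/")) == (
        variant.netloc,
        variant.path.rstrip("/"),
    )
    return same_page and target.query == ""


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<link rel="canonical" href="https://example.com/products/tee">'
        "</head><body></body></html>"
    )
    print(variant_canonicals_to_parent("https://example.com/products/tee?variant=12345", page))
```
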

Why does Search Console show thousands of 'Discovered - currently not indexed' on my Shopify store?

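Google has found the URLs but has not crawled them yet, so they are not indexed. On a large site this usually means Google has given them low priority. Common reasons: the URLs look like thin or duplicate content (for example where canonicals are missing or ignored), or the site as a whole has not met Google's quality threshold for crawling everything. For Shopify the usual culprits are collection-tag URLs, low-engagement old products and blog tag pages. Audit, prune, and noindex where appropriate.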

Can I increase Shopify crawl budget?

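Not directly. Google sets crawl budget from how much crawling your server can handle and how much demand it sees for your URLs (popularity and freshness); you cannot request more. Indirectly: improve site quality (better content, faster pages, fewer redirects), prune or noindex low-value URLs, and keep the sitemap current. Sites that respond quickly and expose fewer wasteful URLs tend to have their important pages crawled more often.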
