Question

I have been going through different forums and was wondering whether this is correct. I want to stop bots from crawling query-string URLs under one specific subpage only (e.g. www.website.com/subpage/?query=sample), while making sure /subpage/ itself does not get disallowed. Please correct me if I am wrong.

File: robots.txt

User-agent: *
Disallow: /subpage/*?

Solution

According to what I see here, you are very close:

User-agent: *
Disallow: /subpage/*?*
Allow: /subpage$

You can test this from the comfort of your own browser with a robots.txt testing add-on or extension.
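
If you would rather sanity-check the matching offline, here is a minimal Python sketch that emulates Googlebot-style "*" and "$" wildcard matching for the Disallow rule. It ignores Allow precedence and is not a full robots.txt parser; the function names are only illustrative.

import re

def pattern_to_regex(pattern):
    # Translate a robots.txt pattern with "*" and "$" wildcards into an
    # anchored regular expression (prefix match unless "$" is present).
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.compile("^" + regex + ("$" if anchored_end else ""))

def is_blocked(path, disallow_patterns):
    # A path is considered blocked if any Disallow pattern matches it.
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)

disallow = ["/subpage/*?*"]
print(is_blocked("/subpage/?query=sample", disallow))  # True: query URLs are blocked
print(is_blocked("/subpage/", disallow))               # False: /subpage/ stays crawlable
print(is_blocked("/subpage/page.html", disallow))      # False: other subpage URLs stay crawlable

Because /subpage/ contains no "?", it never matches the Disallow pattern, which is why it stays crawlable even without the extra Allow line.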

OTHER TIPS

I do not think you can specify a query string in Disallow. The value you set for Disallow is referred to as a directory in the documentation (not as a URI or URL).

You can, however, work toward your objective with a sitemap.xml: exclude the URLs you do not want indexed from the sitemap (see the sketch below).
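
If you generate the sitemap yourself, filtering out query-string URLs is straightforward. Below is a minimal Python sketch; the URL list and output filename are hypothetical, purely for illustration.

from urllib.parse import urlparse
from xml.sax.saxutils import escape

# Hypothetical list of URLs collected while building the site.
all_urls = [
    "https://www.website.com/subpage/",
    "https://www.website.com/subpage/?query=sample",
    "https://www.website.com/about/",
]

# Keep only URLs without a query string.
clean_urls = [u for u in all_urls if not urlparse(u).query]

entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in clean_urls)
sitemap = ('<?xml version="1.0" encoding="UTF-8"?>\n'
           '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
           f"{entries}\n</urlset>\n")

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write(sitemap)

Keep in mind that leaving a URL out of the sitemap does not guarantee it will not be crawled or indexed; it only stops you from advertising it.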

Google Webmaster Tools also gives some granular control over how query-string parameters should be interpreted. Not sure if that serves your purpose.

Licensed under: CC-BY-SA with attribution