Question

I have been going through different forums and was wondering if this is correct. I am trying to stop bots from crawling query-string URLs only under specific subpages (e.g. www.website.com/subpage/?query=sample), while making sure /subpage/ itself does not get disallowed. Please correct me if I am wrong.

File: robots.txt

User-agent: *
Disallow: /subpage/*?

Solution

From what I can see here, you are very close:

User-agent: *
Disallow: /subpage/*?*
Allow: /subpage$

You can test this from the comfort of your own browser with a robots.txt testing add-on or extension.
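
If you do not have an extension handy, here is a minimal Python sketch of how Google-style wildcard matching works (the '*' wildcard and the '$' end anchor). It is only an approximation for sanity-checking the patterns and ignores the longest-match precedence a real parser applies:

import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a robots.txt path pattern ('*' wildcard, optional
    '$' end anchor) into an anchored regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape regex metacharacters, then turn the escaped '*' back into '.*'
    body = re.escape(pattern).replace(r"\*", ".*")
    return re.compile("^" + body + ("$" if anchored else ""))

# The rules proposed above
disallow = robots_pattern_to_regex("/subpage/*?*")
allow = robots_pattern_to_regex("/subpage$")

for path in ["/subpage", "/subpage/", "/subpage/?query=sample", "/subpage/page2"]:
    # Simplified decision: blocked if the Disallow matches and the Allow does not.
    blocked = bool(disallow.match(path)) and not allow.match(path)
    print(f"{path:25} blocked={blocked}")

With these rules only /subpage/?query=sample comes out blocked; /subpage, /subpage/ and /subpage/page2 remain crawlable.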

Other tips

I do not think you can specify a query string in Disallow. The value you set for Disallow is referred to as a directory in the documentation (not as a URI or URL).

You can, however, achieve your objective by using sitemap.xml: exclude the URLs that you do not want indexed from the sitemap.
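
As a rough sketch of that approach, assuming you already have your page URLs in a list (the urls list and the sitemap.xml file name below are placeholders for the example), you could drop every URL that carries a query string before writing the sitemap:

from urllib.parse import urlparse
from xml.sax.saxutils import escape

# Placeholder list of site URLs; in practice these would come from your CMS or a crawl.
urls = [
    "https://www.website.com/subpage/",
    "https://www.website.com/subpage/?query=sample",
    "https://www.website.com/about/",
]

# Keep only URLs without a query string.
clean_urls = [u for u in urls if not urlparse(u).query]

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
    for u in clean_urls:
        f.write(f"  <url><loc>{escape(u)}</loc></url>\n")
    f.write("</urlset>\n")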

Google Webmaster Tools also gives some granular control over how query-string parameters should be interpreted. Not sure if that serves your purpose.
