Your rule Disallow: /classifieds/search*/ does not do what you want it to do.
First, note that the * character has no special meaning in the original robots.txt specification. But some parsers, like Google’s, use it as a wildcard for pattern matching. Assuming that you have this rule for those parsers only:
From your example, this rule would only block http://example.com/classifieds/search/. The three other URLs don’t have a / after search.
Disallow: /classifieds/search
→ blocks all URLs whose paths start with /classifieds/search

Disallow: /classifieds/search/
→ blocks all URLs whose paths start with /classifieds/search/

Disallow: /classifieds/search*/
→ for parsers following the original spec: blocks all URLs whose paths start with the literal string /classifieds/search*/
→ for parsers that use * as a wildcard: blocks all URLs whose paths start with /classifieds/search, followed by anything, followed by / (see the sketch after this list)
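
To make the wildcard interpretation concrete, here is a minimal sketch of that matching logic in Python. It assumes Google-style semantics (* matches any sequence of characters, and a rule is a prefix match against the URL path); wildcard_rule_matches is a hypothetical helper for illustration, not part of any robots.txt library, and the $ end-anchor extension is ignored:

    import re

    def wildcard_rule_matches(rule: str, path: str) -> bool:
        # Hypothetical helper emulating Google-style matching:
        # each * in the rule matches any sequence of characters,
        # and the whole rule is a prefix match against the path.
        pattern = ".*".join(re.escape(part) for part in rule.split("*"))
        return re.match(pattern, path) is not None

    for path in ["/classifieds/search/",          # matched: * matches ""
                 "/classifieds/search/foo",       # matched: prefix match
                 "/classifieds/search?filter=4",  # not matched: no / after search
                 "/classifieds/search"]:          # not matched: no trailing /
        print(path, wildcard_rule_matches("/classifieds/search*/", path))
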
For blocking the four example URLs, simply use the following:
User-agent: *
Disallow: /classifieds/search
This will block, for example:
http://example.com/classifieds/search?filter=4
http://example.com/classifieds/search/
http://example.com/classifieds/search/foo
http://example.com/classifieds/search
http://example.com/classifieds/search.html
http://example.com/classifieds/searching
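
If you want to verify this yourself, Python’s standard urllib.robotparser can check URLs against a robots.txt. It follows the original prefix-matching spec, which is all a wildcard-free rule like this one needs; example.com stands in for your own host:

    import urllib.robotparser

    # Parse the suggested rule and test the example URLs against it;
    # urllib.robotparser does plain prefix matching on the path.
    rp = urllib.robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /classifieds/search",
    ])

    for url in [
        "http://example.com/classifieds/search?filter=4",
        "http://example.com/classifieds/search/",
        "http://example.com/classifieds/search/foo",
        "http://example.com/classifieds/search",
        "http://example.com/classifieds/search.html",
        "http://example.com/classifieds/searching",
    ]:
        print(url, "blocked" if not rp.can_fetch("*", url) else "allowed")

Each of these URLs should come back as blocked, matching the list above.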