Question

I'm trying to find out how to block crawlers from accessing links like this:

site.com/something-search.html

I want to block all /something-*

Can someone help me?


Solution

User-agent: *
Disallow: /something-

For a robots.txt accessible at http://example.com/robots.txt, this blocks all URLs whose path starts with /something-, for example:

  • http://example.com/something-
  • http://example.com/something-foo
  • http://example.com/something-foo.html
  • http://example.com/something-foo/bar

The following URLs would still be allowed:

  • http://example.com/something
  • http://example.com/something.html
  • http://example.com/something/
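You can check this matching behavior yourself with Python's standard-library robots.txt parser. A minimal sketch, using the rule from the answer and the example host above:

```python
from urllib import robotparser

# The rule proposed in the answer
rules = [
    "User-agent: *",
    "Disallow: /something-",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Paths starting with /something- are blocked for all user agents
print(rp.can_fetch("*", "http://example.com/something-foo.html"))  # False

# /something without the trailing hyphen is still allowed
print(rp.can_fetch("*", "http://example.com/something.html"))      # True
```

Note that different crawlers may interpret edge cases differently; this parser follows the original robots.txt convention of simple prefix matching.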

Other tips

In your robots.txt (note that Disallow rules take URL paths, not full URLs):

User-agent: *
Disallow: /something-(1st link)
.
.
.
Disallow: /something-(last link)

Add an entry for each page that you don't want to be seen!

Though regex patterns are not allowed in robots.txt, some intelligent crawlers can understand them!

have a look here

License: CC-BY-SA with attribution
Not affiliated with StackOverflow