Question

Via Google Analytics I noticed that there is a website that is scraping my content automatically. Its content matches mine 100%. Is there any way I could block that website's host from accessing my server at all? Is there anything else I could do about this? I'm running a LAMP web host on CentOS.

Solution

If the scraping host's IP address is static, you can block that address in .htaccess, for example:

# Allow everyone, then deny the scraper's address (Deny wins under "Order Allow,Deny")
Order Allow,Deny
Allow from all
Deny from 111.111.111.111
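
The Order/Allow/Deny directives are the Apache 2.2 style; on Apache 2.4 they only work through the mod_access_compat module and are deprecated. If your CentOS release ships Apache 2.4 (you can check with `httpd -v`), a rough equivalent might look like the sketch below. It assumes the server's AllowOverride setting permits authorization directives in .htaccess, and 111.111.111.111 is still just the placeholder address from above.

```apache
# Apache 2.4 sketch: grant access to everyone except the scraper's address
<RequireAll>
    Require all granted
    Require not ip 111.111.111.111
</RequireAll>
```

A negated `Require not` cannot grant access on its own, which is why it is paired with `Require all granted` inside `<RequireAll>`.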

If the IP address changes but the User-Agent string stays the same, you can block by user agent instead:

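# Tag requests whose User-Agent contains these strings (case-insensitive)
BrowserMatchNoCase SpammerRobot bad_bot
BrowserMatchNoCase SecurityHoleRobot bad_bot
# Deny tagged requests; everything else is allowed by default
Order Deny,Allow
Deny from env=bad_bot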
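
On Apache 2.4 the same idea can probably be written with `Require` instead of the deprecated Order/Deny pair. This is only a sketch: SpammerRobot and SecurityHoleRobot are the example agent names from above, not real scrapers, and it assumes mod_setenvif is loaded and that AllowOverride permits these directives in .htaccess.

```apache
# Tag requests whose User-Agent contains the example strings (case-insensitive)
BrowserMatchNoCase SpammerRobot bad_bot
BrowserMatchNoCase SecurityHoleRobot bad_bot

# Apache 2.4 sketch: grant access to everyone except tagged requests
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Keep in mind that a scraper can change its User-Agent string at any time, so this only helps as long as the string stays constant.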
Licensed under: CC-BY-SA with attribution