Question

My requirement is to know whether a request to my web page is a genuine request (through a browser) or an automated request generated by some Java program. How can I differentiate between the two kinds of request?

Actually, I need to block all requests that are generated by a program, hence I am searching for the difference.

Solution

There is no foolproof way of doing this. The most effective approach for me was:

  1. Implement a User-Agent check at the web server level (yes, this is not foolproof). Aim to block the known/common programs that people use to hit URLs, like libwww-perl, HttpClient, etc. You should be able to build such a list from your access logs; see the sketch below.

  2. Depending on your situation, you may or may not want search engine spiders to crawl your site. Add a robots.txt to your server accordingly (a minimal example follows below). Not all spiders/crawlers follow the instructions in robots.txt, but most do.

  3. Use a specialized tool to detect abnormal access to your site, something like https://www.cloudflare.com/, which can track all access to your site and match it against an ever-growing database of known and suspected bots.

Note: I am in no way affiliated with Cloudflare :)
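For step 1, here is a minimal sketch of such a check as a servlet filter, assuming a javax.servlet environment (Servlet 4.0+, so the Filter interface's init/destroy defaults apply). The class name and the blocklist entries are illustrative; build the actual list from your own access logs:

import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class UserAgentFilter implements Filter {

    // Illustrative substrings of User-Agent values sent by common HTTP libraries.
    // "Java/" is what java.net.HttpURLConnection sends by default.
    private static final List<String> BLOCKED = Arrays.asList(
            "libwww-perl", "Apache-HttpClient", "Java/", "curl", "python-requests");

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        String userAgent = request.getHeader("User-Agent");

        // Many programmatic clients send no User-Agent at all, or one from the list.
        if (userAgent == null || BLOCKED.stream().anyMatch(userAgent::contains)) {
            ((HttpServletResponse) res).sendError(HttpServletResponse.SC_FORBIDDEN);
            return;
        }
        chain.doFilter(req, res);
    }
}

Register the filter in web.xml or with @WebFilter so it runs before your application code. Remember this only catches honest clients: any program can set its User-Agent to a browser string.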
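For step 2, the standard robots.txt syntax to ask all compliant crawlers to stay away from the entire site is:

User-agent: *
Disallow: /

Cooperative bots honor this; hostile ones simply ignore it, which is why it only complements the other measures.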

Other Tips

There is no 100% foolproof solution to this. Many suggest using the User-Agent header, but it can very easily be faked. You can also add an IP filter once you (probably manually) detect fake clients, but it will just be a cat-and-mouse game. If you want to restrict access to your website, maybe you are better off building in some real authorization?
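As a sketch of that IP-filter idea, assuming a servlet container and a hand-maintained blocklist (the addresses below are placeholders from the documentation ranges):

import java.util.Set;
import javax.servlet.http.HttpServletRequest;

public class IpBlocklist {

    // Hypothetical, manually maintained list of addresses you have identified as bots.
    private static final Set<String> BLOCKED = Set.of("203.0.113.7", "198.51.100.23");

    public static boolean isBlocked(HttpServletRequest request) {
        // Note: getRemoteAddr() returns the proxy's address if you are behind one;
        // in that case you would have to inspect the X-Forwarded-For header instead.
        return BLOCKED.contains(request.getRemoteAddr());
    }
}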

Just check the "User-Agent" header and compare it to the most common ones (http://www.user-agents.org/)! Something like this (note that the header can be absent, so guard against null):

String userAgent = request.getHeader("User-Agent"); // may be null for programmatic clients
boolean recognized = userAgent != null && userAgent.contains(...);

You can check for the user agent. Search for "java user agent detection" to find existing libraries and lists.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow