Question

I have been thinking a lot over the last few days about how to protect a web form that bots are abusing. The usage is clearly abuse: around 800k bot queries in roughly 8 hours.

Here is a quick overview of the situation; if any information is missing, please ask.

The bot:

  1. The bot uses many different IPs.
  2. The bot rotates its user agent among real, existing ones.
  3. It is unknown whether the bot loads JS or accepts cookies.

The problems:

  1. The form cannot use a hidden token field because it may be submitted from outside resources: other websites that know nothing about CSRF tokens and cannot generate them. This makes CSRF protection impossible.
  2. The website MUST be cached in the browser, and the cache may be reset only in exceptional situations, such as suspicious behavior.
  3. The database cannot be used intensively(!).

The way it is now:

  1. A cookie counter with an expiration time, hashed together with additional characters that only the system knows (see the sketch after this list).
  2. If the browser cannot handle cookies, database logging is used. This runs into difficulty with the browser cache: when the user never reaches the server, verification does not run and the counter is not incremented.
  3. reCAPTCHA is applied to any user who exceeds the attempt limit within X time.
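
A minimal sketch of what such a signed counter cookie could look like, assuming an HMAC over a counter, an expiry timestamp, and a server-side secret. The payload format and names are illustrative, not the exact scheme in use:

```typescript
// Sketch of the hashed cookie counter (names and format are assumptions).
// The value carries a counter and an expiry, signed with an HMAC over a
// server-side secret so the client cannot forge or rewind it.
import { createHmac, timingSafeEqual } from "crypto";

const SECRET = process.env.COUNTER_SECRET ?? "change-me"; // the "chars only the system knows"

function signCounter(count: number, expiresAt: number): string {
  const payload = `${count}.${expiresAt}`;
  const mac = createHmac("sha256", SECRET).update(payload).digest("hex");
  return `${payload}.${mac}`;
}

function verifyCounter(value: string): { count: number; expiresAt: number } | null {
  const [count, expiresAt, mac] = value.split(".");
  if (!count || !expiresAt || !mac) return null;
  const expected = createHmac("sha256", SECRET)
    .update(`${count}.${expiresAt}`)
    .digest("hex");
  // Constant-time comparison to avoid timing leaks.
  const a = Buffer.from(mac);
  const b = Buffer.from(expected);
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  if (Date.now() > Number(expiresAt)) return null; // expired counter
  return { count: Number(count), expiresAt: Number(expiresAt) };
}
```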

The ideas that came up:

  1. Serve an iframe with some content and Expires: 0; the iframe performs simple cookie logic.
  2. Iframe: if the cookie is not set, set it; if it is set, verify it. If the user has not exceeded the limit, increment the counter; if they have, redirect them to a dedicated page that shows a warning and resets the cache (a sketch of such a handler follows this list).
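
A sketch of how that iframe endpoint might look, here as a hypothetical Express handler. The route, cookie name, and limit are assumptions; in practice the cookie value would be the signed payload from the earlier sketch rather than a bare number:

```typescript
// Hypothetical Express handler for the iframe idea: served uncached,
// it sets/verifies the counter cookie and redirects over-limit users.
import express from "express";

const app = express();
const LIMIT = 20; // illustrative attempt limit

app.get("/beacon-frame", (req, res) => {
  // Expires: 0 / no-store so this frame always hits the server,
  // even when the parent page is served from the browser cache.
  res.set("Cache-Control", "no-store");
  res.set("Expires", "0");

  // If the cookie is set, read and increment; otherwise start at 1.
  const match = /(?:^|;\s*)visits=(\d+)/.exec(req.headers.cookie ?? "");
  const count = match ? Number(match[1]) + 1 : 1;

  if (count > LIMIT) {
    // Over the limit: send the user to the warning page that
    // also resets the cache, as described in the idea above.
    res.redirect("/rate-limit-warning");
    return;
  }
  res.cookie("visits", String(count), { httpOnly: true, maxAge: 60 * 60 * 1000 });
  res.type("html").send("<!doctype html><title>.</title>");
});

app.listen(3000);
```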

The difficulty here: what if the bot does not support cookies and the content is served from cache? The database records nothing, because the user never reaches the server. However, if the user changes the keyword, that bypasses the cache and the logic behind it will work.

The second difficulty: what if the bot does not support JS? It will be thrown out when it switches keywords, but it cannot be redirected while content is served from cache.

The third difficulty: what if the bot solves reCAPTCHA? :)

The Questions:

What would you do in this situation? Please describe the steps you have in mind. I really appreciate your point of view on this. Each idea may be refined with the others, and we can come up with a great protection scheme together! Thank you, guys!

Solution 2

So my idea to fight the user's cache was:

Use a 1x1 iframe.

Its content is sent with Expires: 0, so the iframe is requested from the server every time, even when the page itself is loaded from cache.

Another idea I just came up with is to record input events, onmousemove and onkeydown; the latter even catches an F5 keydown. Report to the server and set a flag (a client-side sketch follows).
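
A browser-side sketch of that one-shot report; the /human-flag endpoint is a hypothetical name:

```typescript
// Report the first human input event to the server once, then stop tracking.
function reportOnce(): void {
  const events = ["mousemove", "keydown", "click"] as const;
  const handler = () => {
    // Remove all listeners so the server is pinged exactly once.
    events.forEach((e) => document.removeEventListener(e, handler));
    // sendBeacon is fire-and-forget and survives page unloads.
    navigator.sendBeacon("/human-flag");
  };
  events.forEach((e) => document.addEventListener(e, handler));
}

reportOnce();
```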

FINAL RESULT: it was decided to use cloaked CSS that sets a system flag indicating the user is loading content normally (sketched below). If the "user" does load content normally, the extra protection is JavaScript event tracking (onmousemove, onkeydown, onclick) that reports to the server to flag the session. The request is sent to the server only once, when the first event occurs; after that, tracking stops.
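
One way to read "cloaked CSS" is a stylesheet rule whose url() only CSS-rendering clients will actually fetch; the fetch itself flips the server-side flag. A sketch under that assumption (endpoint names are illustrative, and session lookup is elided):

```typescript
// Hypothetical Express routes: the stylesheet references an image URL
// that only clients which parse and apply CSS will request, which marks
// the session as "loading content normally".
import express from "express";

const app = express();

app.get("/site.css", (_req, res) => {
  res.type("css").send(
    // An unremarkable rule; the url() is the actual beacon.
    "body { background-image: url('/css-flag.gif'); }"
  );
});

app.get("/css-flag.gif", (req, res) => {
  // A client fetching this applied the CSS, so set the flag.
  // (A real version would key the flag on a session cookie.)
  console.log("normal-load flag for", req.ip);
  res.set("Cache-Control", "no-store");
  // 1x1 transparent GIF
  res.type("gif").send(
    Buffer.from("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7", "base64")
  );
});

app.listen(3000);
```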

Other Tips

Facts: if you have to accept any form submission from anywhere, at any time, by anyone, you have basically no recourse at all against bots, because that's all bots do: they submit data to your server from anywhere, at any time.

CSRF tokens have the job of requiring the user to get something from the server before submitting data to it. That gives the server something to distinguish random submissions from "real" submissions, and it can also throttle the rate at which it gives out these tokens. This really only protects against JavaScript-based, in-browser cross-site attacks; it doesn't do much against bots that can fetch such a token at any time.
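
A minimal sketch of throttled, single-use token issuance; the in-memory maps, interval, and TTL are illustrative assumptions:

```typescript
// Issue tokens at a limited rate per IP; each token is single-use and expires.
import { randomBytes } from "crypto";

const issued = new Map<string, number>();        // token -> issue time
const lastIssueByIp = new Map<string, number>(); // ip -> last issue time
const MIN_INTERVAL_MS = 2000;
const TOKEN_TTL_MS = 10 * 60 * 1000;

function issueToken(ip: string): string | null {
  const now = Date.now();
  const last = lastIssueByIp.get(ip) ?? 0;
  if (now - last < MIN_INTERVAL_MS) return null; // throttle issuance per IP
  lastIssueByIp.set(ip, now);
  const token = randomBytes(16).toString("hex");
  issued.set(token, now);
  return token;
}

function consumeToken(token: string): boolean {
  const at = issued.get(token);
  if (at === undefined) return false;
  issued.delete(token);                          // single use
  return Date.now() - at <= TOKEN_TTL_MS;        // reject expired tokens
}
```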

If you tie the tokens to a user and require form submissions to come from authenticated users, it gives you a much better handle on limiting submissions. You can control the rate at which a user can submit data, and you can control who is allowed to sign up and how. So it gives you a handle on who is allowed to submit data and how often.
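
Per-user rate limiting could then look like this sketch (window size and limit are assumptions):

```typescript
// Allow at most MAX_PER_WINDOW submissions per user per sliding window.
const WINDOW_MS = 60 * 60 * 1000; // 1 hour
const MAX_PER_WINDOW = 10;

const submissions = new Map<string, number[]>(); // userId -> timestamps

function allowSubmission(userId: string): boolean {
  const now = Date.now();
  // Keep only timestamps still inside the window.
  const recent = (submissions.get(userId) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_PER_WINDOW) return false;
  recent.push(now);
  submissions.set(userId, recent);
  return true;
}
```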

Without all this, you don't really have a handle on anything regarding the validity or frequency of submissions. You have mentioned tracking a user's mouse movements... I'm not sure how you want to implement this, but if all that's required of a bot is to submit some extra data that "looks like mouse movements", that's easily circumvented too. "Mouse movements" is just data submitted to the server, after all; you have no idea whether that data was generated by a mouse or not.

In short: protecting a web form against bots is possible through various techniques, including hidden honeypot fields, authentication tokens, and captchas. If you require an open-for-all API that anybody can submit to, though, all of this is pretty pointless. (A minimal honeypot sketch follows.)
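
For completeness, a minimal honeypot sketch as a hypothetical Express form handler; the field and route names are illustrative:

```typescript
// The "website" input is hidden from humans via inline style, so real users
// leave it empty while naive bots tend to fill every field they see.
import express from "express";

const app = express();
app.use(express.urlencoded({ extended: false }));

app.get("/contact", (_req, res) => {
  res.type("html").send(`
    <form method="post" action="/contact">
      <input name="email">
      <!-- hidden from humans; bots tend to fill it -->
      <input name="website" style="display:none" tabindex="-1" autocomplete="off">
      <button>Send</button>
    </form>`);
});

app.post("/contact", (req, res) => {
  if (req.body.website) {
    res.status(400).send("rejected"); // honeypot tripped: treat as a bot
    return;
  }
  res.send("ok"); // process the real submission here
});

app.listen(3000);
```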

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow