Question

I have a web form that users fill in; the information is sent to the server and stored in a database. I am worried that robots might simply fill in the form and leave me with a database full of useless records. How can I prevent robots from filling in my forms? I am thinking of something like Stack Overflow's robot detection, where if it suspects you are a robot, it asks you to verify that you are not. Is there a server-side API for this in Perl, Java, or PHP?

Solution

There are several solutions.

  1. Use a CAPTCHA. SO uses reCAPTCHA as far as I know.

  2. Add an extra field to your form and hide it with CSS (display:none). A normal user will not see this field and therefore will not fill it in. At submission, check whether this field is empty; if it is not, you are dealing with a robot that has dutifully filled out every form field. This technique is usually referred to as a "honeypot" (see the sketch after this list).

  3. Add a JavaScript timer. When the page loads, it starts a counter at zero and increases it as time passes. A normal user will read and fill out your form for a while before submitting it; a robot will fill it out and submit it immediately upon receiving it. At submission, check how far the value has moved from zero: a value of many seconds suggests a real user, while only a couple of seconds (or no value at all, because robots typically do not execute JavaScript) suggests a robot. This only works, however, if you decide to require your users to have JavaScript enabled in order to perform "write" operations. A server-side variant of this timing check appears in the sketch after this list.
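
Below is a minimal PHP sketch of the checks from items 2 and 3. The field name `website`, the session key, and the three-second threshold are arbitrary choices for illustration, and the timing check here uses a server-side session timestamp recorded when the form is rendered instead of a JavaScript counter.

```php
<?php
// render_form.php: store a server-side timestamp when the form is served
session_start();
$_SESSION['form_rendered_at'] = time();
?>
<form action="submit.php" method="post">
  <!-- Honeypot: hidden from humans via CSS, so a real user never fills it in -->
  <div style="display:none">
    <label for="website">Leave this field empty</label>
    <input type="text" name="website" id="website" value="">
  </div>
  <input type="text" name="email">
  <button type="submit">Send</button>
</form>
```

```php
<?php
// submit.php: reject submissions that fail the honeypot or timing checks
session_start();

// Honeypot check (item 2): the hidden field must still be empty.
if (!empty($_POST['website'])) {
    http_response_code(400);
    exit('Submission rejected.');
}

// Timing check (variant of item 3): a human needs at least a few seconds.
$renderedAt = $_SESSION['form_rendered_at'] ?? 0;
if ($renderedAt === 0 || (time() - $renderedAt) < 3) {  // 3 seconds is an arbitrary threshold
    http_response_code(400);
    exit('Submission rejected.');
}

// ...validate and store the record as usual...
```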

There are certainly other techniques, but these are quite simple and effective.

OTHER TIPS

You can use reCAPTCHA (the same service Stack Overflow uses) - they provide libraries for a number of programming languages.
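
As a rough illustration, server-side verification of a reCAPTCHA (v2) response in PHP comes down to one POST to Google's siteverify endpoint; the secret key and the error handling below are placeholders, and the sketch assumes the widget's standard `g-recaptcha-response` field.

```php
<?php
// verify_recaptcha.php: minimal server-side reCAPTCHA check (sketch)
function isHuman(string $recaptchaResponse, string $secretKey): bool {
    $postData = http_build_query([
        'secret'   => $secretKey,          // private key from the reCAPTCHA admin console
        'response' => $recaptchaResponse,  // value of $_POST['g-recaptcha-response']
        'remoteip' => $_SERVER['REMOTE_ADDR'],
    ]);

    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => $postData,
        ],
    ]);

    $result = file_get_contents('https://www.google.com/recaptcha/api/siteverify', false, $context);
    if ($result === false) {
        return false;  // treat a network failure as "not verified"
    }

    $json = json_decode($result, true);
    return !empty($json['success']);
}

// Usage:
// if (!isHuman($_POST['g-recaptcha-response'] ?? '', 'YOUR_SECRET_KEY')) { exit('CAPTCHA failed.'); }
```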

I've always preferred the honeypot CAPTCHA (see the article by Phil Haack), as it's less invasive to the user.

CAPTCHAs bring accessibility problems and will ultimately be defeated by software recognition.

I recommend reading this short article about bot traps, which include hidden fields, as Matthew Vines and New in town already suggested.

In any case, you are still free to use both CAPTCHAs and bot traps.

CAPTCHA is great. Another thing you can do that will stop 99% of your robot traffic without annoying your users is to validate fields.

On my site, I check for plausible text in fields like ZIP code and phone number. That has removed all of the non-targeted robot spam.
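
For example, such a check might look like the sketch below; the patterns (a US five-digit ZIP, a loose phone format) and field names are assumptions and would need to match your own form and audience.

```php
<?php
// validate_fields.php: reject submissions whose structured fields don't look plausible (sketch)
$zip   = trim($_POST['zip'] ?? '');
$phone = trim($_POST['phone'] ?? '');

$errors = [];

// US-style 5-digit ZIP, optionally ZIP+4; adjust for your locale
if (!preg_match('/^\d{5}(-\d{4})?$/', $zip)) {
    $errors[] = 'Invalid ZIP code.';
}

// Loose phone check: digits plus common separators only, and at least 7 digits overall
if (!preg_match('/^[0-9+\-\s().]{7,20}$/', $phone) || preg_match_all('/\d/', $phone) < 7) {
    $errors[] = 'Invalid phone number.';
}

if ($errors) {
    http_response_code(400);
    exit(implode(' ', $errors));
}

// ...continue with the normal insert...
```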

You could create a two-step system in which a user fills in the form but then must reply to an e-mail to "activate" the record within a set period of time - say, 24 hours.

On the back end, instead of populating your main table with every form submission, you could put submissions into a temporary table that automatically deletes any row older than your time allotment. Unless you have a serious bot problem, that table shouldn't get very big, especially if the first form has just a few fields.
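
A minimal sketch of that two-step flow in PHP, assuming MySQL-style SQL, placeholder table names (`pending_submissions`, `submissions`), placeholder credentials, and an example.com activation URL:

```php
<?php
// two_step.php: hold submissions until an e-mailed token comes back (sketch)
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');

// 1. On form submission: store the record in a holding table with a random token,
//    then e-mail a link containing that token to the supplied address.
function storePending(PDO $pdo, string $email, string $payload): string {
    $token = bin2hex(random_bytes(16));
    $stmt  = $pdo->prepare(
        'INSERT INTO pending_submissions (email, payload, token, created_at)
         VALUES (?, ?, ?, NOW())'
    );
    $stmt->execute([$email, $payload, $token]);
    mail($email, 'Confirm your submission',
         "Visit https://example.com/activate.php?token=$token within 24 hours.");
    return $token;
}

// 2. When the user follows the link: move the row into the real table,
//    but only if it is still within the 24-hour window.
function activate(PDO $pdo, string $token): bool {
    $pdo->beginTransaction();
    $moved = $pdo->prepare(
        'INSERT INTO submissions (email, payload)
         SELECT email, payload FROM pending_submissions
         WHERE token = ? AND created_at > NOW() - INTERVAL 24 HOUR'
    );
    $moved->execute([$token]);
    $pdo->prepare('DELETE FROM pending_submissions WHERE token = ?')->execute([$token]);
    $pdo->commit();
    return $moved->rowCount() === 1;
}

// 3. A scheduled job (cron) purges anything that was never activated:
//    DELETE FROM pending_submissions WHERE created_at < NOW() - INTERVAL 24 HOUR;
```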

A benefit of this approach is that you don't have to use a CAPTCHA or a similar technology that might create accessibility problems.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow