Question

I am working on a new captcha script and it is almost complete. The remaining piece is the word list: say I have a list of 300 five-letter words that I would like to use as the text in the captcha image.

What would be the best way for performance on a high traffic site to deal with this list for it?

Read the words from a text file on every load
Store in an array
other?


Solution

Using a fixed list of words could make your captcha weak, since it restricts the number of variations to n! / (n - k)! options. With n = 300 words and k = 2 different words per captcha, that is just 300 * 299 = 89700 options, no matter how long the words are.

If you used a sequence of four random letters (a-z) instead, you would get more options: exactly n^k = 26^4 = 456976.
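To check the arithmetic above, here is a minimal sketch. The function names permutations() and randomLetters() are made up for illustration, and random_int() assumes PHP 7 or later.

```php
<?php
// Number of ordered picks of $k distinct items out of $n: n! / (n - k)!
function permutations(int $n, int $k): int
{
    $total = 1;
    for ($i = 0; $i < $k; $i++) {
        $total *= $n - $i;
    }
    return $total;
}

// Build a captcha string of $length random lowercase letters (a-z).
function randomLetters(int $length = 4): string
{
    $text = '';
    for ($i = 0; $i < $length; $i++) {
        $text .= chr(random_int(ord('a'), ord('z')));
    }
    return $text;
}

echo permutations(300, 2), "\n"; // 89700
echo 26 ** 4, "\n";              // 456976
echo randomLetters(), "\n";      // four random letters, different on each run
```

Each extra letter multiplies the number of possible strings by 26, so the space grows much faster than it does by adding words to a fixed list.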

OTHER TIPS

If you just want 300 words to choose from, I'd put them all in an array in straight PHP code and pull one out at random. That would give the best performance.

Best option for performance

  1. For the best performance, keep the list of words in memory (APC or Memcache; search Google or Stack Overflow for either), because disk I/O is what slows a site down most of the time. This requires a box with enough memory (>= 128MB) on which you can install software such as APC or Memcache. If you want good performance on a high-traffic site, you should be willing to pay for it.

  2. If you are on a shared hosting provider (where you won't get the best performance), put the words in an array in the same file, because every require statement fetches another file from disk.
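The in-memory option could look roughly like the sketch below. It uses APCu, the user-cache successor to the APC extension named above, which is a substitution on my part. The helper name loadWords(), the cache key and the file path are all hypothetical. When APCu is not installed or not enabled, the function simply reads the file on every call.

```php
<?php
// Hypothetical helper: return the word list, cached in shared memory when APCu is usable.
function loadWords(string $path): array
{
    $key = 'captcha_words'; // hypothetical cache key
    $useApcu = function_exists('apcu_fetch');

    if ($useApcu) {
        $cached = apcu_fetch($key, $hit);
        if ($hit) {
            return $cached; // served from memory, no disk access
        }
    }

    // Cache miss, or no APCu: read the file, one word per line.
    $words = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($words === false) {
        throw new RuntimeException("Cannot read word list: $path");
    }

    if ($useApcu) {
        apcu_store($key, $words); // no TTL: kept until the cache is cleared
    }
    return $words;
}
```

Usage would be something like $words = loadWords('wordlist.txt'); after the first request the list should come from memory for every later request handled by the same server.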

Return a random word

As lucky said, you can fetch a random word with a simple rand() call:

return $words[rand(0, count($words) - 1)];

Where $words is the array with all the words.
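A self-contained version of that idea might look like this. The sample list and the function name randomWord() are placeholders, and random_int() (PHP 7+) is swapped in for rand() because it draws from a cryptographically secure source, which seems a reasonable choice for a captcha.

```php
<?php
// Placeholder list; a real list would hold all 300 words.
$words = ['apple', 'bread', 'chair', 'dream', 'eagle'];

// Pick one word uniformly at random from a non-empty list.
function randomWord(array $words): string
{
    if ($words === []) {
        throw new InvalidArgumentException('Word list is empty');
    }
    return $words[random_int(0, count($words) - 1)];
}

echo randomWord($words), "\n"; // one of the five words above
```

The array is assumed to have consecutive integer keys starting at 0, which is what an array literal or file() produces.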

VPS hosting

I found some cheap VPS hosts using Google, but I think you should do more research to find the best VPS hosting for your high-performance site.

Instead of 300 words, you could simply generate a random number and display that. There is no need for a list, or for loading or managing one.
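A short sketch of the random-number approach, assuming a five-digit code is acceptable. The function name randomDigits() is made up for illustration, and random_int() requires PHP 7 or later.

```php
<?php
// Return a zero-padded numeric code with $digits digits, e.g. "04217".
function randomDigits(int $digits = 5): string
{
    $max = 10 ** $digits - 1; // 99999 for five digits
    return str_pad((string) random_int(0, $max), $digits, '0', STR_PAD_LEFT);
}

echo randomDigits(), "\n";
```

Five digits give 10^5 = 100000 possible codes.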

Just how many logons per second do you need to handle? This doesn't seem like the right place to spend time on optimization. Just about any way of picking the random word should be fine, especially if your word list is only 300 words.

I'd start with a simple text file, one word per line, and just do something simple like

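// FILE_IGNORE_NEW_LINES strips the trailing newline from each word
$words = file("wordlist.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
return $words[rand(0, count($words) - 1)];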

and only if it really proved to be a bottleneck would I change it to do a random fseek() or some other "high performance" trick.
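For reference, the random fseek() idea could be sketched as follows. It relies on an assumption not stated above: every line in the file is exactly five letters followed by a single "\n", including the last line, so each record is six bytes. The function name randomWordBySeek() is hypothetical.

```php
<?php
// Assumption: fixed-width records, $wordLength letters plus one "\n" per line.
function randomWordBySeek(string $path, int $wordLength = 5): string
{
    $recordSize = $wordLength + 1; // word plus newline
    $count = intdiv((int) filesize($path), $recordSize);
    if ($count < 1) {
        throw new RuntimeException("Word list is empty: $path");
    }

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open word list: $path");
    }

    // Jump straight to a random record instead of reading the whole file.
    fseek($handle, random_int(0, $count - 1) * $recordSize);
    $word = fread($handle, $wordLength);
    fclose($handle);

    return $word;
}
```

This reads only a few bytes per request, but for a 300-word file the saving over file() is likely too small to measure, which is the point made above.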

Licensed under: CC-BY-SA with attribution