Question

This is a two-part question:

Part 1

The first part deals with calculating the entropy of a password in PHP. I have been unable to find any code examples that are empirically sound, and I would really like some help in finding the 'right' way to calculate a final number. A lot of folks on the net have their own home-baked weighting algorithm, but I am really looking for the scientific answer to the equation.

I will be using the password entropy as just one part of a larger security system and as a way to analyze our overall data security based on information accessible if a user's password is compromised and how easily a password may be broken by brute force.

Part 2

The second part of this question is: how useful will this number really be? My end goal is to generate a 'score' for each password in the system that we can use to monitor our overall system security as a dynamic entity. I will probably have to work in another algorithm or two for dictionary attacks, l33t replacement passwords, etc., but I do feel that entropy will play an important role in such an 'overall' system rating. I do welcome suggestions for other approaches, though.

What I Know

I have seen some mention of logarithmic equations to calculate said entropy, but I have yet to see a good example that isn't actually written as a mathematical equation. I could really use a code example (even if not strictly in PHP) to get me going.

Extension

In making a comment, I realized that I can better explain the usefulness of this calculation. When I am working on legacy systems where users have extremely weak passwords, I need concrete evidence of that weakness before I can make a case for forcing all users to change to a new (enforced) strong password. By storing a password strength score for each user account in the system, I can build several different metrics to show overall system weakness and make a case for stronger passwords.

TIA


Solution

Entropy of a string has a formal definition specified here: http://en.wikipedia.org/wiki/Entropy_(information_theory)

How useful is that value going to be? It depends. Here's a method (in Java) that I wrote for an assignment to calculate entropy:

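// Shannon entropy in bits per character.
// Reads the fields count (Map<Character, Integer>) and totalChars.
public static double entropy() {
   double h = 0, p;
   // Iterate over the counts; count.get(i) with an int index does not work on a Map keyed by char.
   for (int c : count.values()) {
      p = c / (totalChars * 1.0);          // relative frequency of this character
      h -= p * Math.log(p) / Math.log(2);  // log base 2 of p
   }
   return h;
}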

count is a Map where (key, value) corresponds to (char, countForChar), and totalChars is the length of the string. This means you have to process the string before you call this method.
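The preprocessing step could look like the following. This is a minimal sketch, not the original assignment code: the class name `EntropyDemo` and the helper `countChars` are hypothetical, and `entropy` takes the string as a parameter instead of reading the `count` and `totalChars` fields.

```java
import java.util.HashMap;
import java.util.Map;

class EntropyDemo {
    // Hypothetical helper: builds the (char, countForChar) map from the string.
    public static Map<Character, Integer> countChars(String s) {
        Map<Character, Integer> count = new HashMap<>();
        for (char c : s.toCharArray()) {
            count.merge(c, 1, Integer::sum); // add 1 to the count for this char
        }
        return count;
    }

    // Shannon entropy in bits per character; returns 0 for an empty string.
    public static double entropy(String s) {
        double h = 0;
        for (int n : countChars(s).values()) {
            double p = n / (double) s.length(); // relative frequency
            h -= p * Math.log(p) / Math.log(2); // p * log2(p)
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(entropy("password"));
        System.out.println(entropy("akj@!0aj"));
    }
}
```

Passing the string in keeps the method free of shared state, which makes it easier to call once per password when scoring many accounts.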

EDIT 2: Here's the same method, rewritten in PHP

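// Shannon entropy of $string, in bits per character.
// Note: strlen() and count_chars() work on bytes, not multibyte characters.
function entropy($string) {
   $h = 0;
   $size = strlen($string);
   // Mode 1 returns only the byte values that occur, with their counts.
   foreach (count_chars($string, 1) as $v) {
      $p = $v / $size;         // relative frequency of this byte
      $h -= $p * log($p, 2);   // log base 2 of $p
   }
   return $h;
}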

EDIT 3: There's a lot more to password strength than entropy. Entropy measures uncertainty, and more uncertainty doesn't necessarily translate to more security. For example:

The entropy of "akj@!0aj" is 2.5 bits per character, while the entropy of "password" is 2.75 bits per character.
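Both figures can be checked by hand from the definition. In the worked calculation below, n_c (the number of times character c occurs) and n (the length of the string) are notation introduced here for illustration:

```latex
\begin{align*}
H &= -\sum_{c} \frac{n_c}{n} \log_2 \frac{n_c}{n} \\[4pt]
\text{``password''}\ (n = 8,\ \text{s twice, six characters once}):\quad
H &= \tfrac{2}{8}\log_2 4 + 6 \cdot \tfrac{1}{8}\log_2 8 = 0.5 + 2.25 = 2.75 \\[4pt]
\text{``akj@!0aj''}\ (n = 8,\ \text{a and j twice, four characters once}):\quad
H &= 2 \cdot \tfrac{2}{8}\log_2 4 + 4 \cdot \tfrac{1}{8}\log_2 8 = 1.0 + 1.5 = 2.5
\end{align*}
```

The measure only sees how evenly the characters repeat within the string. A dictionary word with one repeated letter therefore scores higher than a random-looking string that happens to repeat two letters.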

OTHER TIPS

Enforcing a minimum level of password complexity is the mitigation recommended for CWE-521 (Weak Password Requirements). The policy attributes it lists are:

(1) Minimum and maximum length;
(2) Require mixed character sets (alpha,numeric, special, mixed case);
(3) Do not contain user name;
(4) Expiration;
(5) No password reuse.
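Rules (1) to (3) can be checked from the password and user name alone. The following is a rough sketch under stated assumptions: the class `PasswordPolicy`, the method `violations`, and the length limits of 8 and 64 are all made up for illustration. Rules (4) and (5) need stored state (password age and history) and are left out.

```java
import java.util.ArrayList;
import java.util.List;

class PasswordPolicy {
    // Assumed limits for illustration only; the list above gives no numbers.
    static final int MIN_LENGTH = 8;
    static final int MAX_LENGTH = 64;

    // Returns the violated rules; an empty list means rules (1)-(3) pass.
    public static List<String> violations(String password, String userName) {
        List<String> problems = new ArrayList<>();

        // (1) Minimum and maximum length
        if (password.length() < MIN_LENGTH || password.length() > MAX_LENGTH) {
            problems.add("length");
        }

        // (2) Mixed character sets: lower case, upper case, digit, special
        boolean lower = false, upper = false, digit = false, special = false;
        for (char c : password.toCharArray()) {
            if (Character.isLowerCase(c)) lower = true;
            else if (Character.isUpperCase(c)) upper = true;
            else if (Character.isDigit(c)) digit = true;
            else special = true; // anything else counts as special
        }
        if (!(lower && upper && digit && special)) {
            problems.add("character sets");
        }

        // (3) Must not contain the user name (case-insensitive)
        if (!userName.isEmpty()
                && password.toLowerCase().contains(userName.toLowerCase())) {
            problems.add("contains user name");
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(violations("password", "alice"));
        System.out.println(violations("Tr0ub4dor&3x", "alice"));
    }
}
```

Returning the list of violated rules, rather than a single pass/fail flag, makes it possible to report which rules fail most often across all accounts.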

To use entropy, you need to compute more than the Shannon entropy of a single password in isolation; you need to evaluate the password as an element in a list of common passwords. If a password is very much like other passwords, its entropy will be low compared to them. If it is unusual, its entropy will be higher.
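The answer does not say how to measure a password against such a list. One possible reading, sketched here, is to build a character-frequency model from the common passwords and score a candidate by its average surprisal under that model, in bits per character. The class `RelativeEntropy`, the five-entry sample list, the add-one smoothing, and the assumed alphabet of 95 printable ASCII characters are all choices made for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RelativeEntropy {
    // Assumed alphabet size: the 95 printable ASCII characters.
    static final int ALPHABET = 95;

    private final Map<Character, Integer> counts = new HashMap<>();
    private int total = 0;

    // Builds a character-frequency model from a list of common passwords.
    public RelativeEntropy(List<String> commonPasswords) {
        for (String pw : commonPasswords) {
            for (char c : pw.toCharArray()) {
                counts.merge(c, 1, Integer::sum);
                total++;
            }
        }
    }

    // Average surprisal of the password under the model, in bits per character.
    // Add-one smoothing keeps unseen characters from having zero probability.
    public double bitsPerChar(String password) {
        if (password.isEmpty()) return 0;
        double bits = 0;
        for (char c : password.toCharArray()) {
            double q = (counts.getOrDefault(c, 0) + 1.0) / (total + ALPHABET);
            bits -= Math.log(q) / Math.log(2);
        }
        return bits / password.length();
    }

    public static void main(String[] args) {
        // Tiny made-up sample; a real list would be far larger.
        List<String> common = Arrays.asList(
                "password", "123456", "qwerty", "letmein", "dragon");
        RelativeEntropy model = new RelativeEntropy(common);
        System.out.println(model.bitsPerChar("password1")); // resembles the list
        System.out.println(model.bitsPerChar("akj@!0aj"));  // mostly unseen characters
    }
}
```

With this sample list, "password1" scores lower than "akj@!0aj", which is the reverse of the single-string entropy ranking. The model ignores character order, so it is only a first approximation of "similar to other passwords".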

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow