Mnemonic Password Generation Algorithm for QWERTY Keyboards

https://stackoverflow.com/questions/2209084

18-09-2019

Question

I've a "mnemonic" password generation function that goes something like this:

function Mnemonic($mnemonic)
{
    $result = null;
    $charset = array(str_split('aeiou', 1), str_split('bcdfghjklmnpqrstvwxyz', 1));

    for ($i = 1; $i <= $mnemonic; $i++)
    {
        $result .= $charset[$i % 2][array_rand($charset[$i % 2])];
    }

    return $result;
}

Basically, this generates a string of length $mnemonic in which every odd character is a consonant and every even character is a vowel. While I understand this reduces the password complexity, the result is usually much easier to remember. Now I want to improve it by generating strings that are also easy to type.

For instance, as a *nix newbie I always preferred RHEL-based distributions over Debian ones, mainly because yum is so much easier to type than apt[-get]. Just try it for yourself.

How should I implement the logic to generate strings that are easy to type on QWERTY keyboards?

Solution

You may wanna take a look at the principles used in the Dvorak keyboard.

Those principles applied in a password-generating algorithm would be:

Letters should be typed by alternating hands.
Use easy to type combinations. Take a look at the Dvorak layout and see the common digraphs and the positions of their letters.
Use only one letter from the bottom row, or not. Make it random!
You can make the ratio 2 to 1 (2 letters typed by the right hand to 1 letter typed by the left hand).
Since the ratio is 2 to 1, you're gonna have 2 consecutive letters typed by the same hand so you're gonna have to make sure they are typed from the outside of the keyboard to the inside. This principle is applied to the digraphs.
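A minimal sketch of how the principles above might look in PHP, applied to QWERTY. The column numbering, the fixed "right, right, left" rhythm and the function names are all assumptions made for illustration, not part of the original answer. It also does not check for same-finger pairs or limit bottom-row letters.

```php
<?php
// Sketch only: column numbers and the right/right/left rhythm are assumptions.
// Letter => column index, counted left to right across the QWERTY letter rows.
const LEFT_KEYS = [
    'q' => 0, 'w' => 1, 'e' => 2, 'r' => 3, 't' => 4,
    'a' => 0, 's' => 1, 'd' => 2, 'f' => 3, 'g' => 4,
    'z' => 0, 'x' => 1, 'c' => 2, 'v' => 3, 'b' => 4,
];
const RIGHT_KEYS = [
    'y' => 5, 'u' => 6, 'i' => 7, 'o' => 8, 'p' => 9,
    'h' => 5, 'j' => 6, 'k' => 7, 'l' => 8,
    'n' => 5, 'm' => 6,
];

// Pick one random letter from a letter => column table.
function pickKey(array $keys): string
{
    $letters = array_keys($keys);
    return $letters[random_int(0, count($letters) - 1)];
}

// Two right-hand letters rolling inward: the first sits in a higher column
// (closer to the edge of the keyboard) than the second.
function inwardRightPair(): array
{
    do {
        $first = pickKey(RIGHT_KEYS);
        $second = pickKey(RIGHT_KEYS);
    } while (RIGHT_KEYS[$first] <= RIGHT_KEYS[$second]);
    return [$first, $second];
}

// Repeat "right, right, left" until the requested length is reached.
function dvorakStylePassword(int $length): string
{
    $out = '';
    while (strlen($out) < $length) {
        [$first, $second] = inwardRightPair();
        $out .= $first . $second . pickKey(LEFT_KEYS);
    }
    return substr($out, 0, $length);
}

echo dvorakStylePassword(9), "\n";
```

random_int() is used instead of array_rand() because it draws from a cryptographically secure source, which seems the safer default for passwords.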

I know you said it's a QWERTY keyboard but using these principles on a QWERTY keyboard can give you some very good results, like:

ktrd,ogkdo ("typewriter" in dvorak)

kjg;g;akd;k (using only the home row)

pjedoildupsk (just a random password following the principles)

All Dvorak haters, shush it!

I hope this helps.

OTHER TIPS

Carpalx has a lot of research on calculating typing effort, which incorporates:

finger travel distance

hand, finger and row penalties

stroke path

Carpalx uses this model to score existing layouts and to search for better ones. Under its model, the Colemak keyboard layout scores better than Dvorak.

However, it's written backward from what you want - their goal is to find a better keyboard layout based on input, but you're trying to find easy input based on the keyboard layout.

So - even though you might not be able to use it directly, I thought you might find it interesting (and who knows, if your Perl-fu is strong, you might be able to extract and reverse the algorithm, since it's GPL'd).

You could eliminate the characters typed with the ring and pinky fingers (q, w, x, z, p), then split the remaining characters into those typed by the left hand and those typed by the right, and alternate between the two groups.
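A rough sketch of that idea in PHP. The left/right split follows standard touch typing, and the function names are made up for this example.

```php
<?php
// Letters the answer suggests dropping, plus a touch-typing split by hand.
const EXCLUDED = ['q', 'w', 'x', 'z', 'p'];
const LEFT_HAND = 'qwertasdfgzxcvb';
const RIGHT_HAND = 'yuiophjklnm';

// Letters of one hand that survive the exclusion list.
function usableLetters(string $hand): array
{
    return array_values(array_diff(str_split($hand), EXCLUDED));
}

// Alternate between the two hands, starting with a randomly chosen one.
function alternatingPassword(int $length): string
{
    $hands = [usableLetters(LEFT_HAND), usableLetters(RIGHT_HAND)];
    $start = random_int(0, 1);
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $pool = $hands[($start + $i) % 2];
        $out .= $pool[random_int(0, count($pool) - 1)];
    }
    return $out;
}

echo alternatingPassword(10), "\n";
```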

Perhaps you can use some heuristic to measure the 'ease of typing'.

For instance, consider the cost of moving a finger when going to the next character. This can be a function of how far the finger needs to move, which direction etc.

You could also add extra costs, when it is required to switch fingers, or hands.

After playing around with the costs a bit, you will probably hit upon a satisfactory solution.

Hope that helps.

Great question - taking the above suggestions, here's a formula for the cost of going from key i to key j:

Weight = distance * a + switch * b + same * c + shift * d + weird * e + start * f

Distance is a value, the others are 0/1 values.

Distance - superimpose a fine grid over a QWERTY keyboard, look up the x,y of each key and calculate the distance. Distance has a positive weight. If the two letters are typed with different hands (e.g. aj, sk, wu...), the distance is zero.

Switch - negative weight; switching is good

Same - aq, qa, az, za use the same finger. Same is positive

Shift - anything with a shift is positive and real bad

Weird - I dunno $ or ~ is bad because you have to look at the keyboard.

Start - asdfjkl starting or ending. Probably negative & good since your fingers are there at rest.

Coefficients - just make 'em up to start as long as the relative values seem reasonable. If you REALLY want to get fancy - get someone to type in several dozen sets of numbers, use a stop watch and fit a regression model.
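Here is one way the weight formula could be sketched in PHP. The coefficients are made up, as suggested above. The row stagger, the finger numbering, reading "start" as "the target key is on the home row", and treating every non-letter as "weird" are assumptions of this sketch.

```php
<?php
// Made-up starting coefficients a..f; only their relative sizes matter.
const COEFFICIENTS = [
    'distance' => 1.0,   // a: farther is worse
    'switch'   => -0.5,  // b: switching hands is good
    'same'     => 1.0,   // c: same finger twice is bad
    'shift'    => 3.0,   // d: needing Shift is really bad
    'weird'    => 2.0,   // e: characters you have to look for
    'start'    => -0.25, // f: landing on the home row is good
];
const LETTER_ROWS = ['qwertyuiop', 'asdfghjkl', 'zxcvbnm'];
const ROW_STAGGER = [0.0, 0.25, 0.75]; // assumed horizontal offset per row, in key widths

// Position, hand and finger of a letter, or null for anything that is not a letter.
function keyInfo(string $char): ?array
{
    foreach (LETTER_ROWS as $row => $letters) {
        $col = strpos($letters, strtolower($char));
        if ($col !== false) {
            return [
                'x'      => $col + ROW_STAGGER[$row],
                'y'      => $row,
                'left'   => $col <= 4,
                'finger' => $col <= 4 ? min($col, 3) : max($col, 6) - 2,
            ];
        }
    }
    return null;
}

// Weight of typing $to right after $from (both single characters).
function weight(string $from, string $to): float
{
    $a = keyInfo($from);
    $b = keyInfo($to);
    $distance = 0.0;
    $switch = 0;
    $same = 0;
    if ($a !== null && $b !== null) {
        $switch = $a['left'] !== $b['left'] ? 1 : 0;
        $same = ($a['finger'] === $b['finger'] && $a !== $b) ? 1 : 0;
        // Different hands count as zero distance.
        $distance = $switch ? 0.0 : hypot($b['x'] - $a['x'], $b['y'] - $a['y']);
    }
    $shift = ($b !== null && $to !== strtolower($to)) ? 1 : 0;
    $weird = $b === null ? 1 : 0;
    $start = strpos('asdfjkl', strtolower($to)) !== false ? 1 : 0;

    return $distance * COEFFICIENTS['distance']
        + $switch * COEFFICIENTS['switch']
        + $same * COEFFICIENTS['same']
        + $shift * COEFFICIENTS['shift']
        + $weird * COEFFICIENTS['weird']
        + $start * COEFFICIENTS['start'];
}

printf("a->j: %.2f, a->q: %.2f\n", weight('a', 'j'), weight('a', 'q'));
```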

Implementation - say we have a six character password.

Now I need the lowest value for six characters starting with each letter. Imagine an array of your N keys in a column. Now imagine six such columns. Your cheapest password is the shortest path through the six columns (with cycles allowed). You might need to add some logic to eliminate cycles, but this should be a good first pass. (I'm getting lazy here - there's probably a graph theoretic formulation that handles this problem.)
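The column picture above maps onto a small dynamic program: for each column, keep the cheapest path ending at each key. Below is a sketch of that, assuming a cost callable is passed in. cheapestPassword() and the toy $homeRowCost are hypothetical names, and the toy cost is only a stand-in for a real weight function. Immediate repeats of the same key are skipped, which is one simple way to avoid the most trivial cycles.

```php
<?php
// Cheapest path through $length columns, each containing every key in $keys.
// $cost($from, $to) must return the weight of typing $to after $from.
function cheapestPassword(array $keys, int $length, callable $cost): string
{
    if ($length < 1 || count($keys) < 2) {
        return '';
    }
    // $best[$key] = [total cost, path] of the cheapest path ending at $key.
    $best = [];
    foreach ($keys as $key) {
        $best[$key] = [0.0, $key];
    }
    for ($step = 1; $step < $length; $step++) {
        $next = [];
        foreach ($keys as $to) {
            $next[$to] = null;
            foreach ($keys as $from) {
                if ($from === $to) {
                    continue; // skip immediate repeats such as "aa"
                }
                $total = $best[$from][0] + $cost($from, $to);
                if ($next[$to] === null || $total < $next[$to][0]) {
                    $next[$to] = [$total, $best[$from][1] . $to];
                }
            }
        }
        $best = $next;
    }
    // Pick the cheapest path over all possible final keys.
    $winner = null;
    foreach ($best as $entry) {
        if ($winner === null || $entry[0] < $winner[0]) {
            $winner = $entry;
        }
    }
    return $winner[1];
}

// Toy cost on the home row: free when hands alternate, otherwise the key gap.
$homeRowCost = function (string $from, string $to): float {
    $row = 'asdfghjkl';
    $a = strpos($row, $from);
    $b = strpos($row, $to);
    $differentHands = ($a <= 4) !== ($b <= 4);
    return $differentHands ? 0.0 : (float) abs($a - $b);
};

echo cheapestPassword(str_split('asdfjkl'), 6, $homeRowCost), "\n";
```

Note that the single cheapest path is the same on every run, so a real generator would presumably pick at random among the low-cost paths rather than always returning the minimum.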

I'll bet someone has done this before - especially the keystroke part.

I smashed together the following. It's a hack job but it seems to work pretty well.

<?php
function Mnemonic($mnemonic)
{
    $result = null;
    $charset = array(str_split('@a3e!1i0ou', 1), str_split('#$*bcdfghjklmnpqrstvwxyz', 1));

    $lastchar = ' ';
    for ($i = 1; $i <= $mnemonic; $i++)
    {
      do {
        $char = $charset[$i % 2][array_rand($charset[$i % 2])];
      } while (!nextkey($lastchar, $char));
      $result .= $char;
      $lastchar = $char; // remember the previous key so nextkey() has something to compare against
    }

    return $result;
}

function nextkey($lastchar, $requestchar)
{
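  // Each entry groups keys that should not follow one another.
  $map = array();
  $map[] = '!qaz'; // left little finger
  $map[] = '@#wsx1'; // left ring finger
  $map[] = 'ed23'; // left middle finger
  $map[] = '$%^rtfgcvb456'; // left index finger
  $map[] = '&yhnujm7'; // right index finger
  $map[] = '*()ik89'; // right middle finger
  $map[] = 'olp,.'; // right ring finger
  $map[] = ';[]'; // right little finger
  $map[] = '!@#$%^&*()[]'; // special chars, don't follow
  $map[] = 'pbvcnmq'; // consonant clusters, don't follow

  if($lastchar == $requestchar) return true;
  foreach($map as $string) {
    // strpos() returns 0 for a match at the first position, so compare against false
    if(strpos($string, $requestchar) !== false && strpos($string, $lastchar) !== false) return false;
  }
  return true;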
}

printf("%s\n", Mnemonic(8));
?>

Build a data structure that represents a keyboard and codes the row, column, hand, and finger used to type each character. Write a function that, when presented with a character, provides a list of "easy to type next" characters, based on flexible rules that you develop. It could rely on another function that calculates the distance between keys.
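A sketch of such a data structure in PHP, limited to the QWERTY letter rows. The finger numbering (0 = left pinky through 7 = right pinky) and the placeholder rule "any key on the other hand is easy" are assumptions. The rule is passed in as a callable so it can be swapped for something smarter, for example one based on key distance.

```php
<?php
// Build letter => [row, col, hand, finger] for the QWERTY letter rows.
// Fingers are numbered 0 (left pinky) to 7 (right pinky); this numbering is an assumption.
function buildKeyboard(): array
{
    $keyboard = [];
    foreach (['qwertyuiop', 'asdfghjkl', 'zxcvbnm'] as $row => $letters) {
        foreach (str_split($letters) as $col => $letter) {
            $keyboard[$letter] = [
                'row'    => $row,
                'col'    => $col,
                'hand'   => $col <= 4 ? 'left' : 'right',
                'finger' => $col <= 4 ? min($col, 3) : max($col, 6) - 2,
            ];
        }
    }
    return $keyboard;
}

// List every letter that the supplied rule accepts as an easy follow-up to $letter.
function easyNext(array $keyboard, string $letter, callable $isEasy): array
{
    $easy = [];
    foreach ($keyboard as $candidate => $key) {
        if ($candidate !== $letter && $isEasy($keyboard[$letter], $key)) {
            $easy[] = $candidate;
        }
    }
    return $easy;
}

// Placeholder rule: only keys typed by the other hand count as easy.
$otherHand = function (array $from, array $to): bool {
    return $from['hand'] !== $to['hand'];
};

$keyboard = buildKeyboard();
echo implode('', easyNext($keyboard, 'a', $otherHand)), "\n";
```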

Personally, I don't find typing letters with the same hand twice to be slow: only if a previous letter used a finger that was too close is it difficult. For example, XQ is hard to type because my hand has to move upward to handle the adjacent fingers required to type them. But I don't find BQ hard to type at all, because while my forefinger is still working on the B, my pinkie finger can head for the Q.

It is also much easier to type AW than QS, because the ring finger is longer and so naturally fits on the W while the pinkie is on A, in a near-resting position, while QS requires a stretch of the pinkie and a simultaneous, conflicting muscular crunch of the ring finger.

If you start to build up a map of each letter against each other letter, you will soon find a reasonable way to represent various aspects of ease or difficulty. Generalizing my XQ/BQ example, you could make one-row changes require a distance of 2 or more fingers, 2-row changes require a distance of 3 fingers, and 3-row changes (numbers, perhaps) require alternate hands.
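The generalized rule could be coded roughly like this. The finger assignments follow standard touch typing, and the same-row case (any different finger is fine) is my assumption, since the rule above does not cover it. locateKey() and isEasyPair() are hypothetical names.

```php
<?php
const KEY_ROWS = ['1234567890', 'qwertyuiop', 'asdfghjkl', 'zxcvbnm'];

// Locate a key: row index, hand, and finger on that hand (0 = pinky ... 3 = index).
function locateKey(string $key): array
{
    foreach (KEY_ROWS as $row => $keys) {
        $col = strpos($keys, strtolower($key));
        if ($col !== false) {
            $left = $col <= 4;
            // Columns 3-4 and 5-6 each belong to an index finger.
            $finger = $left ? min($col, 3) : min(9 - $col, 3);
            return ['row' => $row, 'left' => $left, 'finger' => $finger];
        }
    }
    throw new InvalidArgumentException("Unknown key: $key");
}

// Apply the row-change / finger-distance rule to two consecutive keys.
function isEasyPair(string $first, string $second): bool
{
    $a = locateKey($first);
    $b = locateKey($second);
    if ($a['left'] !== $b['left']) {
        return true; // alternate hands are always allowed
    }
    $rowChange = abs($a['row'] - $b['row']);
    $fingerGap = abs($a['finger'] - $b['finger']);
    switch ($rowChange) {
        case 0:
            return $fingerGap >= 1; // assumption: same row only needs a different finger
        case 1:
            return $fingerGap >= 2;
        case 2:
            return $fingerGap >= 3;
        default:
            return false; // 3-row changes require alternate hands
    }
}

var_dump(isEasyPair('x', 'q')); // bool(false)
var_dump(isEasyPair('b', 'q')); // bool(true)
```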

I'm also noticing that the distance between W and D (or I and L) is slightly longer than between S and E (or K and O), because of the slightly jagged placement of keys, and this also alters difficulty.

With some analysis (I recommend using Excel to "map out" typing difficulty) I'm sure you can come up with an algorithm that helps you construct easy-to-type words.

If possible, try throwing in at least one number, and consider using spaces as well.

If you implement this, please take the user's locale into account when determining the "cost" of moving from one character to another. An easy-to-type password might become rather cumbersome if the user is using a different keyboard layout. Some keys that might be easy to access on one language's keyboard might not be available on another language's keyboard without requiring extra modifier keys (shift, meta, etc).

To keep this idea universal, I would recommend ignoring what character belongs to what key and instead treating the keys as an array with rows and columns. Each row is typically offset from the previous by roughly 1/3 of a key width. With this in mind, it shouldn't be difficult to calculate the distance between any two arbitrary keys:

# Key at top left corner is {0, 0}; x is the column, y is the row
key1 @ {x1, y1}
key2 @ {x2, y2}

# Each row is shifted right by about 1/3 of a key width relative to the row above,
# so apply the shift before taking the difference (the sign matters here)
xdistance = (x2 + 1/3 * y2) - (x1 + 1/3 * y1)
ydistance = y2 - y1

total_distance = square_root(xdistance^2 + ydistance^2)

Generate a series of key positions meeting your length and "ease of typing" requirements, then use the user's current keymap to re-map those indices into characters.
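A small PHP sketch of both steps: the staggered-row distance and the re-mapping of positions through a keymap. The two keymaps are simplified letter-only stand-ins that I made up for the example; a real implementation would presumably read the user's actual layout.

```php
<?php
const ROW_SHIFT = 1 / 3; // assumed stagger between consecutive rows, in key widths

// Distance between two key positions given as [column, row].
function keyDistance(array $key1, array $key2): float
{
    [$x1, $y1] = $key1;
    [$x2, $y2] = $key2;
    // Shift each column by its row's stagger before taking the difference.
    $dx = ($x2 + ROW_SHIFT * $y2) - ($x1 + ROW_SHIFT * $y1);
    return hypot($dx, $y2 - $y1);
}

// Turn a list of [column, row] positions into characters using a keymap,
// where $keymap[row] is the string of characters on that row.
function positionsToString(array $positions, array $keymap): string
{
    $out = '';
    foreach ($positions as [$col, $row]) {
        $out .= $keymap[$row][$col];
    }
    return $out;
}

// Simplified, letter-only keymaps used purely as an example.
$qwerty = ['qwertyuiop', 'asdfghjkl', 'zxcvbnm'];
$azerty = ['azertyuiop', 'qsdfghjklm', 'wxcvbn'];

$positions = [[0, 0], [5, 1], [1, 0]];
echo positionsToString($positions, $qwerty), "\n"; // qhw
echo positionsToString($positions, $azerty), "\n"; // ahz
```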

Licensed under: CC-BY-SA with attribution
