Question

I'm looking for a reasonably simple algorithm to determine how difficult it is to type a word on the QWERTY layout.

The words would not necessarily be dictionary words, so a list of commonly mistyped words or the like is not an option. I'm sure there must be an existing, well-tested algorithm, but I can't find anything.

Can anyone offer any help or advice? I'm coding the algorithm in Python, but any other language or pseudo-code is welcome.

Solution

There is a comparison of the QWERTY, Colemak and Dvorak layouts that calculates the distance between the keys typed, the percentage of keys typed with the same hand, and so on, with source code in Java. In combination, these metrics should give a reasonable estimate of the 'typeability' of a word.
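
To make the two metrics named above concrete, here is a minimal Python sketch. It is not the Java code from that comparison; the key coordinates, the row stagger values and the left/right hand split are my own assumptions, and `travel_distance` and `same_hand_ratio` are hypothetical names.

```python
import math

# Assumed layout model: the three letter rows of QWERTY, with a rough
# horizontal stagger per row measured in key widths (an approximation).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
STAGGER = [0.0, 0.25, 0.75]

# Conventional touch-typing split; every other letter is treated as right hand.
LEFT_HAND = set("qwertasdfgzxcvb")

# Map each letter to an (x, y) position on the assumed grid.
POS = {
    ch: (col + STAGGER[r], float(r))
    for r, row in enumerate(ROWS)
    for col, ch in enumerate(row)
}


def travel_distance(word):
    """Sum of straight-line distances between consecutive keys."""
    letters = [c for c in word.lower() if c in POS]
    return sum(math.dist(POS[a], POS[b]) for a, b in zip(letters, letters[1:]))


def same_hand_ratio(word):
    """Fraction of consecutive key pairs typed with the same hand."""
    letters = [c for c in word.lower() if c in POS]
    pairs = list(zip(letters, letters[1:]))
    if not pairs:
        return 0.0
    same = sum((a in LEFT_HAND) == (b in LEFT_HAND) for a, b in pairs)
    return same / len(pairs)
```

How the two numbers should be weighted against each other is left open; that would need tuning against real typing data.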

OTHER TIPS

Take out your Scrabble set, note down the score for each letter, and total the scores for a word: hey presto, you have your algorithm. I'm not sure it entirely satisfies your requirements, but it might point you in a useful direction. You might, for instance, want to assign scores not only to individual letters but also to digrams and trigrams.
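
As a starting point, the Scrabble idea is only a few lines of Python. The letter values below are the standard English-language tile values; `scrabble_score` is a hypothetical name. Keep in mind that these values roughly reflect how rare a letter is, not where it sits on the keyboard, so treat this as a baseline only.

```python
# Standard English Scrabble tile values, grouped by score.
SCRABBLE_GROUPS = {
    1: "aeioulnstr",
    2: "dg",
    3: "bcmp",
    4: "fhvwy",
    5: "k",
    8: "jx",
    10: "qz",
}
LETTER_SCORE = {
    ch: points for points, letters in SCRABBLE_GROUPS.items() for ch in letters
}


def scrabble_score(word):
    """Total the per-letter scores; characters without a score count as 0."""
    return sum(LETTER_SCORE.get(c, 0) for c in word.lower())
```

For example, `scrabble_score("quiz")` gives 22 and `scrabble_score("the")` gives 6.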

I'm not aware of any existing source of the information you need. Perhaps you could come up with your own letter scores by examining the keyboard and assigning higher scores to the more difficult letters: say 1 for 'a', 8 for 'q', 2 for 'm', and so on.

EDIT: I seem to have confused people more than I usually do when I reply on SO. Here are the bare bones of my proposal:

a) List all trigrams and digrams which occur in English (or your language). To each of them assign a difficulty-of-typing score. Do the same for individual letters (after all a 4 letter word might be composed of a trigram and a letter rather than two digrams).

b) Score the difficulty of typing a word as the sum of the difficulty of typing its components.

As for the difficulty scores, I haven't a clue, but you could start from 1 for a letter on the home keys, 2 for a letter which uses the index fingers but is not a home key, 3 for a letter which uses the 2nd or 3rd fingers, and so on. Then for digrams, score low for easy letters typed left then right (or right then left) in sequence, and high for difficult letters typed with one hand in sequence (e.g. qz, though that's perhaps not a valid digram in English). And on you go.
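
Here is a rough sketch of steps a) and b) with the starting scores suggested above. Everything numeric beyond the 1/2/3 scheme is an assumption: I give the little-finger keys a 4 (my reading of "and so on"), charge nothing for a digram that alternates hands, and charge the harder letter's score again for a same-hand digram. The finger assignment follows conventional touch typing, trigrams are left out, and all function names are hypothetical.

```python
# Finger groups under conventional touch typing (letters only).
HOME_KEYS = set("asdfjkl")
INDEX_KEYS = set("rfvtgbyhnujm")
MIDDLE_RING_KEYS = set("edcikwsxol")
PINKY_KEYS = set("qazp")
ALL_KEYS = INDEX_KEYS | MIDDLE_RING_KEYS | PINKY_KEYS
LEFT_HAND = set("qwertasdfgzxcvb")


def letter_score(ch):
    """1 = home key, 2 = index finger, 3 = middle/ring finger, 4 = little finger."""
    if ch in HOME_KEYS:
        return 1
    if ch in INDEX_KEYS:
        return 2
    if ch in MIDDLE_RING_KEYS:
        return 3
    return 4  # assumed value for little-finger keys off the home row


def digram_penalty(a, b):
    """Assumed rule: alternating hands is free, same hand costs the harder letter."""
    if (a in LEFT_HAND) != (b in LEFT_HAND):
        return 0
    return max(letter_score(a), letter_score(b))


def word_difficulty(word):
    """Sum of letter scores plus digram penalties; non-letters are ignored."""
    letters = [c for c in word.lower() if c in ALL_KEYS]
    singles = sum(letter_score(c) for c in letters)
    pairs = sum(digram_penalty(a, b) for a, b in zip(letters, letters[1:]))
    return singles + pairs
```

With these numbers "the" scores 7 and "qz" scores 12, which at least matches the intuition that "qz" is the nastier of the two.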

I don't have any algorithms to propose, but a few hints:

  • I type with both hands, so the keyboard is roughly split into two halves. I often have coordination issues between the two hands: each hand types its letters in the right order, but the interleaving between the hands is wrong. This is especially true when one hand has more letters to type than the other. A typical example is "the", where the left hand types t and e and the right hand types h.

  • "Slips" are frequent: one often misses the intended key and hits another key instead. "Additions" and "deletions" are frequent too, i.e. typing an extra key or not pressing a key hard enough. This means that (obviously) the more letters there are, the harder it is to get the word right.

  • Mixed case makes it harder: it requires synchronizing Shift (or Caps Lock) with the letter keys, so the neighbouring letters are likely to come out in the wrong case.
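
If you want to turn these hints into numbers, one option is to compute a few simple features per word and combine them however you see fit. This is only a sketch: the feature set, the hand split and the name `error_risk_features` are my own assumptions, and no weighting is implied.

```python
LEFT_HAND = set("qwertasdfgzxcvb")
RIGHT_HAND = set("yuiophjklnm")


def error_risk_features(word):
    """Return rough indicators for the three hints: hands, length, case."""
    typed = [c for c in word.lower() if c in LEFT_HAND or c in RIGHT_HAND]
    left = sum(c in LEFT_HAND for c in typed)
    right = len(typed) - left

    # Every change of hand is a chance to interleave the two hands wrongly.
    hand_switches = sum(
        (a in LEFT_HAND) != (b in LEFT_HAND) for a, b in zip(typed, typed[1:])
    )

    # Every change of case needs Shift to be pressed or released in time.
    alpha = [c for c in word if c.isalpha()]
    case_switches = sum(a.isupper() != b.isupper() for a, b in zip(alpha, alpha[1:]))

    return {
        "length": len(word),                 # more keys, more chances to slip
        "hand_imbalance": abs(left - right),
        "hand_switches": hand_switches,
        "case_switches": case_switches,
    }
```

For "the" this gives a length of 3, a hand imbalance of 1 and 2 hand switches.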

Hope this helps...

I think the Manhattan distance could be the closest thing to what you are looking for. It measures the distance from the source to the target along a grid, i.e. the number of horizontal steps plus the number of vertical steps.

As for the implementation in Python, for your specific need (difficulty on QWERTY) you will have to write one yourself. Otherwise, a few Manhattan distance implementations can be found if you google for "n puzzle solver in python".
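
For what it's worth, a QWERTY version is short enough to write by hand. The sketch below assumes each letter sits at (row, column) in the three letter rows and ignores the physical stagger between rows; `manhattan` and `manhattan_path_length` are hypothetical names.

```python
# Assumed grid: (row, column) of each letter in the three QWERTY letter rows.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
GRID = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}


def manhattan(a, b):
    """Row steps plus column steps between two keys."""
    (r1, c1), (r2, c2) = GRID[a], GRID[b]
    return abs(r1 - r2) + abs(c1 - c2)


def manhattan_path_length(word):
    """Total Manhattan distance travelled from key to key across the word."""
    letters = [c for c in word.lower() if c in GRID]
    return sum(manhattan(a, b) for a, b in zip(letters, letters[1:]))
```

Note that this treats the word as if it were typed with a single finger, so it says nothing about which hand or finger is used.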

Licensed under: CC-BY-SA with attribution