Question

I'm developing a spam detection system and have been alerted to find that it can't detect strings like this - "asdfsdf".

My solution to this involves detecting if the previous keys were near the other keys on the keyboard. I am not getting the input (to detect spam from) from the keyboard, I'm getting it in the form of a string.

All I want to know is whether a character is one key, two keys or more than two keys away from another character.

For example, on a modern QWERTY keyboard, the characters 'q' and 'w' would be 1 key away. Same would the chars 'q' and 's'. Humans can figure this out logically, how could I do this in code?

Was it helpful?

Solution

You could simply create a two-dimensional map for the standard qwerty keyboard. Basically it could look something like this:

map[0][0] = 'q';
map[0][1] = 'a';
map[1][0] = 'w';
map[1][1] = 's';

and so on.

When you get two characters, you simply need to find their x, and y in the array 'map' above, and can simply calculate the distance using pythagoras. It would not fill the requirement you had as 'q' and 's' being 1 distance away. But rather it would be sqrt(1^2 + 1^2) approx 1.4

The formula would be:

  • Characters are c1 and c2
  • Find coordinates for c1 and c2: (x1,y1) and (x2,y2)
  • Calculate the distance using pythagoras: dist = sqrt((x2-x1)^2 + (y2-y1)^2).
  • If necessary, ceil or floor the result.

For example:

Say you get the characters c1='q', and c2='w'. Examine the map and find that 'q' has coordinates (x1,y1) = (0, 0) and 'w' has coordinates (x2,y2) = (1, 0). The distance is

sqrt((1-0)^2 + (0-0)^2) = sqrt(1) = 1

OTHER TIPS

Build a map from keys to positions on an idealized keyboard. Something like:

'q' => {0,0},
'w' => {0,1},
'a' => {1,0},
's' => {1,1}, ...

Then you can take the "distance" as the mathematical distance between the two points.

Well, let's see. That's a tough one. I always take the brute-force method and I stay away from advanced concepts like that guy Pythagoras tried to foist on us, so how about a two-dimensional table? Something like this. maybe:

+---+---+---+---+---+---+---
|   | a | b | c | d | f | s ...
+---+---+---+---+---+---+---
| a | 0 | 5 | 4 | 2 | 4 | 1 ...
| b | 5 | 0 | 3 | 3 | 2 | 4 ...
| c | 4 | 3 | 0 | 1 | 2 | 2 ...
| d | 2 | 3 | 1 | 0 | 1 | 1 ...
| f | 3 | 2 | 2 | 1 | 0 | 2 ...
| s | 1 | 4 | 2 | 1 | 2 | 0 ...
+---+---+---+---+---+---+---

Could that work for ya'? You could even have negative numbers to show that one key is to the left of the other. PLUS you could put a 2-integer struct in each cell where the second int is positive or negative to show that the second letter is up or down from the first. Get my patent attorney on the phone, quick!

The basic idea is to create a map of characters and their positions on the keyboard. You can then use a simple distance formula to determine how close they are together.

For example, consider the left side of the keyboard:

  1 2 3 4 5 6
  q w e r t
  a s d f g
  z x c v b

Character a has the position [2, 0] and character b has the position [3, 4]. The formula for their distance apart is:

sqrt((x2-x1)^2 + (y2-y1)^2);

So the distance between a and b is sqrt((4 - 0)^2 + (3 - 2)^2)

It'll take you a little bit of effort to map the keys into a rectangular grid (my example isn't perfect, but it gives you the idea). But after that you can build a map (or dictionary), and lookup is simple and fast.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top