If the set of all possible characters is known in advance, let's say their number is n
, with n
being not too high (e.g. 10, if you're doing only digits), you can do this by creating m
boolean arrays of length n
, with m
being the number of positions, or digits in input strings and n
. The n-th position in m-th array will be true
, if the n-th character is present in m-th position in any of input strings. False
will denote, that no such character was in m-th position before.
Then you can iterate over each string, and when you encounter character n
in position m
you mark down true
in n-th position of m-th array. At the end, you will have m
arrays, each describing the content of m-th group
pos[0] = {true, true, false, false, false, true, true, false, true, false}
pos[1] = {true, false, false, false, false, false, false, false, false, false}
pos[2] = {false, false, true, true, false, false, false, false, false, true}
translates to
[0,1,5,6,8] [0] [2,3,9]
As all the structures are direct access arrays, there is no lookup involved, all access is in constant time and you only need to visit every character once, with no comparison involved. Hope this helps.