Question

I am looking for an efficient word search algorithm. If you can help me come up with it it would be even better

a h c k 
x r j i
b v t l
c y q s

and I want to find 'arts' . If 'stra' was also a valid word I want that to be found as well. (vertical, horizontal, diagonal and reverse). I came up with a few algorithms but thy don't seem efficient and consume long coding. First one included using find() to get the first letter and look at that column or rows.

Was it helpful?

Solution

Here's one way:

%// Example word grid
C = [
    'a' 'h' 'c' 'k' 'r'
    'x' 'r' 'j' 'i' 'p'
    'b' 'v' 't' 'l' 'q'
    'a' 'y' 'q' 's' 'o'];

%// Your search term
target = 'arts';

%// Optionally, do some obvious checks here. 
%//  - length of the search string may exceeds the grid's dimensions
%//  - there are no matches for the first letter
%//  - and similar

%// Form single cellstring containing all search directions
allDirections = [
    %{
    // horizontal, horizontal-reversed
    %}
    cellstr([C ; C(:,end:-1:1)])
    %{
    // vertical, vertical-reversed
    %}
    cellstr([C'; C(:,end:-1:1)']) 
    %{
    // Diagonal, top-left to bottom-right, and reversed
    %}
    arrayfun(@(ii){diag(C,ii)'}, -size(C,1)+2:size(C,2)-2)';
    arrayfun(@(ii){diag(C(end:-1:1,end:-1:1),ii)'}, -size(C,1)+2:size(C,2)-2)';
    %{
    // Diagonal, top-right to bottom-left, and reversed
    %}
    arrayfun(@(ii){diag(C(:,end:-1:1),ii)'}, -size(C,1)+2:size(C,2)-2)';
    arrayfun(@(ii){diag(C(end:-1:1,:),ii)'}, -size(C,1)+2:size(C,2)-2)';
];

%// And now just find the string
strfind(allDirections , target)

Of course, you could to improve (memory) efficiency by

  • doing the strfind on all the directions separately
  • doing 2 × strfind on the same direction, but with the target inverted
  • etc.

But aside from these relatively minor optimizations, I don't think you can do much better in MATLAB in practice.

A theoretically more efficient recursive, branch-and-bound type search roughly goes like this:

  • find all occurrences of the first letter
  • eliminate all of those occurrences that cannot satisfy the length of the target based on the dimensions of the grid
  • Search the neighborhoods of all hits for occurrences of the second letter
  • Eliminate occurrences based on length, etc.

(Don't forget to filter on direction after the second letter, as the hits for the second letter fixes the search directions)

As far as I can see, this would need a lot less reads and comparisons than my version. However, my guess would be that using canned routines (as I did) is going to be faster and less complicated than using (nested) loops. But I could be wrong.

Try. Profile. Learn. Smile. Correct me :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top