Question

String 1: quick brown fox jumps over a lazy dog

String 2: jumps over a lazy

I will pass these strings to a subroutine that returns a boolean value.

My present solution is to remove the spaces from both strings and then pattern match.

Is there a better solution?


Solution 2

Removing all the whitespace would also match

ju mps over a laz y

in

quick brown fox jumps over a lazy dog

But I assume that this is not what you want. So it would be a good idea to convert the whitespace in the substring into whitespace patterns before you try to match:

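sub substr_ignore_whitespace
{
    my $string      = shift;
    my $sub_string  = shift;

    # Quote each word so regex metacharacters match literally,
    # then allow any run of whitespace between the words.
    my $pattern = join '\s+', map { quotemeta($_) } split ' ', $sub_string;

    return $string =~ /$pattern/;
}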
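To see how the two approaches differ, both can be run against the example strings. This is only a sketch: the helper names match_stripped and match_flexible are made up for illustration, and the empty-needle guard is an assumption about the desired behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: strip all whitespace from both strings, then compare.
sub match_stripped {
    my ($string, $sub_string) = @_;
    s/\s+//g for $string, $sub_string;
    return index($string, $sub_string) >= 0 ? 1 : 0;
}

# Hypothetical helper: words must stay intact, whitespace between them may vary.
sub match_flexible {
    my ($string, $sub_string) = @_;
    my $pattern = join '\s+', map { quotemeta($_) } split ' ', $sub_string;
    return 1 unless length $pattern;   # assumption: an empty needle always matches
    return $string =~ /$pattern/ ? 1 : 0;
}

my $text = "quick brown fox jumps over a lazy dog";

for my $needle ("jumps over a lazy", "jumps   over a\tlazy", "ju mps over a laz y") {
    printf "%-24s stripped=%d flexible=%d\n",
        "'$needle'", match_stripped($text, $needle), match_flexible($text, $needle);
}
```

The stripped variant accepts all three needles, while the flexible variant rejects the last one because its words are broken up.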

OTHER TIPS

The trick is to normalize the input so that a direct comparison can take place. Here, we can replace every sequence of whitespace characters with a single space.

sub fuzzy_contains {
    my ($haystack, $needle) = @_;
    # fold the spaces
    s/\s+/ /g for $haystack, $needle;
    return -1 < index $haystack, $needle;
}

You may want to apply further normalizations, such as case-folding the strings for case-insensitive matching (use fc from Unicode::CaseFold, or from feature 'fc', which is available in Perl 5.16 and later).
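As a rough sketch of that extra normalization, the whitespace folding and the case folding can be done in one pass. The name fuzzy_contains_ci is hypothetical, and this assumes Perl 5.16 or newer for feature 'fc'.

```perl
use strict;
use warnings;
use feature 'fc';   # built-in fc, Perl 5.16 or newer

# Hypothetical case-insensitive variant of the whitespace-folding comparison.
sub fuzzy_contains_ci {
    my ($haystack, $needle) = @_;
    for ($haystack, $needle) {
        s/\s+/ /g;      # fold runs of whitespace into a single space
        $_ = fc($_);    # fold case
    }
    return index($haystack, $needle) > -1 ? 1 : 0;
}

# prints "match"
print fuzzy_contains_ci("Quick brown fox JUMPS   over a lazy dog", "jumps over A lazy")
    ? "match\n" : "no match\n";
```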

I would suggest this approach:

Apply this search and replace to the needle string:

's/ +/ .*?/g'

This replaces each run of spaces with a space followed by .*? (zero or more of any character, matched non-greedily), which gives you:

jumps .*?over .*?a .*?lazy

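Then you can do a regex match of the needle against your string data. Note that .*? also matches non-whitespace characters, so extra words between the words of the needle are accepted as well.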
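A minimal sketch of this idea as a subroutine might look as follows. The name loose_contains is hypothetical, and the sketch assumes the needle contains no regex metacharacters, since it is interpolated into the pattern unquoted.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each run of spaces in the needle into " .*?"
# and use the result as a regex against the haystack.
sub loose_contains {
    my ($haystack, $needle) = @_;
    (my $pattern = $needle) =~ s/ +/ .*?/g;   # "jumps over" becomes "jumps .*?over"
    return $haystack =~ /$pattern/ ? 1 : 0;
}

my $needle = "jumps over a lazy";

for my $text ("quick brown fox jumps  over a lazy dog",
              "the fox jumps high over a very lazy dog",
              "foxjumpsover a lazy dog") {
    printf "%-42s %s\n", "'$text'", loose_contains($text, $needle) ? "match" : "no match";
}
```

The first two strings match and the third does not: the pattern tolerates extra spaces and even extra words after each space, but it still requires a space after every word of the needle.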

...
my $str1 = "quick brown fox jumps over a lazy dog";
$str1 =~ s|\s+||g;    # remove all whitespace

my $substr = "jumps over a lazy";
$substr =~ s|\s+||g;

# index returns -1 when $substr does not occur in $str1
my $result = index($str1, $substr);
...
Licensed under: CC-BY-SA with attribution