Question

My login.txt file contains the following entries:

abc def
abc 123
def abc
abc de
tha ewe

When I use a positive lookahead in Perl, I get the following result:

cat login.txt | perl -ne 'print if /(?)abc\s(?=def)/'
abc def

When I use grep, I get the following result:

cat login.txt | grep -P '(?<=abc)\s(?=def)'
abc def

The negative lookahead gives the following results from Perl and grep.

cat login.txt | perl -ne 'print if /(?)abc\s(?!def)/'
abc 123
def abc
abc de

grep result

cat login.txt | grep -P '(?<=abc)\s(?!def)'
abc 123
abc de

Perl matched def abc for the negative lookahead, but it shouldn't have, since I'm checking for abc followed by def. grep returns the correct result.

Is something missing in my Perl pattern?

Solution

grep does not include the newline in the string it checks against the regex, so abc\s does not match when abc is at the end of the line. With perl -n, each line in $_ still carries its trailing newline, and \s matches it. chomp in Perl, or use the -l command-line option, and you will see the same results as grep.

I'm not sure why you were making other changes between the perl and grep regexes; what was the (?) supposed to accomplish?

OTHER TIPS

I would try anchoring your regex like so:

/(^abc\s+(?!def).+)/

This would capture:

abc 123
abc de

The (?) at the beginning of your negative lookahead regex is redundant.

In your perl -ne 'print if /(?)abc\s(?!def)/' you are asking Perl to find abc, then a whitespace character, then anything that is not def. This successfully matches def abc, because there is no def after abc there and \s matches the newline.

perl -ne 'print if /(?)abc\s(?!def)/'

To begin, as fugi stated, the (?) is effectively an empty group: it matches the empty string at any position, so it does nothing.

Therefore, as written, this regex matches the literal string abc followed by a single [:space:OR:tab:OR:newline], not followed by the literal string def.

Because \s matches a newline character and you did not chomp the trailing newline as you processed each line, def abc matches: (?)abc\s in the regex matches abc[:newline:], which is followed by the end of the string ($), not by def.

The corrected regex (accounting for the redundant (?)) would be:

perl -ne 'print if /(?<=abc)\s(?!def)/'

...which matches a single [:space:OR:tab:OR:newline] which is preceded by abc and not followed by def.

This will still match def abc, because once again \s matches the [:newline:], which is preceded by abc and followed by the end of the string ($), not by def.

Either chomp the [:newline:] before evaluating the regex in Perl, or use the character class [ \t] (if you need to account for tab characters) instead of \s:

perl -ne 'print if /(?<=abc)[ \t](?!def)/'

Or simply

perl -ne 'print if /(?<=abc) (?!def)/'
Licensed under: CC-BY-SA with attribution