Question

I am trying to match a substring with a regex, but the ^ anchor changes the logic in a way I did not expect. The regex ...

    ^(?!My Words).*$

rejects a sentence that starts with My Words. In this case the negative lookahead checks the whole phrase. However, if I drop the ^ then

    (?!My Words).*$

the regex just returns y Words lskjdf for the sentence My Words lskjdf. Why is the negative lookahead not treated as a whole in this case? Why does it drop only the M from the match? How do ^ and (?! work together?


Solution

^(?!My Words).*$

^(?! means: look for the start of the string (or the start of a line, when the multiline flag is set) that is not followed by...

This is why any sentence starting with "My Words" does not match when you use the ^ anchor.
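As a quick check of the anchored behaviour, here is a minimal sketch using Python's re module. The question does not say which regex engine is in use, so Python is an assumption, and the sample strings other than "My Words lskjdf" are made up for illustration.

```python
import re

# Anchored pattern from the question; without flags, ^ only matches at offset 0.
anchored = re.compile(r"^(?!My Words).*$")

# The lookahead is tested only at the start, so this string is rejected.
rejected = anchored.search("My Words lskjdf")
print(rejected)  # None

# A string that does not start with "My Words" is matched in full.
accepted = anchored.search("Other Words lskjdf")
print(accepted.group())

# With the multiline flag, ^ also matches right after each newline,
# so the second line of this (hypothetical) text can match on its own.
multiline = re.compile(r"^(?!My Words).*$", re.MULTILINE)
second = multiline.search("My Words lskjdf\nsecond line")
print(second.group())  # second line
```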

The reason y Words is matched when you remove the ^ anchor is that you are now looking for any position in the string that is not followed by My Words, and the first such position is the one right after M.
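The same effect can be reproduced with a short sketch, again assuming Python's re module as the engine (the question does not name one). It shows that the lookahead itself consumes no characters and that the first position where it succeeds is offset 1, right after the M.

```python
import re

text = "My Words lskjdf"

# Unanchored pattern from the question: the engine may start at any position.
unanchored = re.compile(r"(?!My Words).*$")
match = unanchored.search(text)
print(match.start(), repr(match.group()))  # 1 'y Words lskjdf'

# The lookahead alone is zero-width: it succeeds at offset 1 and matches nothing.
lookahead_only = re.search(r"(?!My Words)", text)
print(lookahead_only.span())  # (1, 1)
```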

Let's look at the string My Words and how (?!My Words).*$ is applied to it.

Keep in mind that ^ is the start of the string; even though you did not put it in your regex, the regex engine still starts at that position. I'll simplify how it works a little.

First regex engine step:

^My Words
|
Regex engine starts here and checks whether the negative lookahead
(?!My Words) succeeds from the current position onwards, which it does not.

Second step:

^My Words
 |
 Regex engine looks at the 'M' and finds that (?!My Words) fails from this
 position as well, because the text ahead is exactly 'My Words'.

Third step:

^My Words
  |
  Standing at 'y', the text ahead is 'y Words', not 'My Words', so the
  negative lookahead succeeds. This allows the rest of the pattern '.*$'
  to be applied, which matches from 'y' till end of string.
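The walk-through above can be approximated in code by trying each start offset in turn. This is only a rough sketch in Python (an assumed engine), and first_match_start is a hypothetical helper written for this illustration, not how a real engine is implemented. Note that in Python the start of the string and the position in front of 'M' are the same offset 0, so the simplified three-step picture shows up as two attempts.

```python
import re

lookahead = re.compile(r"(?!My Words)")
rest = re.compile(r".*$")

def first_match_start(text):
    """Try each offset from left to right, as a backtracking engine would."""
    for pos in range(len(text) + 1):
        # pattern.match(text, pos) tests the pattern exactly at offset pos.
        ok = lookahead.match(text, pos) is not None
        print(pos, repr(text[pos:]), "lookahead succeeds" if ok else "lookahead fails")
        if ok:
            # The lookahead consumed nothing, so '.*$' starts at the same offset.
            return pos, rest.match(text, pos).group()
    return None

result = first_match_start("My Words")
print(result)  # (1, 'y Words')
```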

OTHER TIPS

This is because with ^ you force the match to start at the beginning of the string.

Without ^ the match can start anywhere in the string, so it matches y Words lskjdf.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow