You can use a negative lookbehind in a REGEX to check that something does NOT come immediately before what you want to match.
Say you had this expression for matching URLs:
((?:http(?:s)?://)(?:www\.)?[-A-Z0-9.]+(?:\.com)[-A-Z0-9_./]?(?:[-A-Z0-9#?/]+)?)
With the case-insensitive flag turned on (the character classes only list uppercase letters), this expression would match the following types of URLs:
http://www.example.com
http://example.com/
https://www.example.com/seconday/somepage#hashes?parameters
https://www.example.com/seconday/
http://www.example.com/seconday
http://example.com/seconday
http://example.com/seconday/
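If you want to check that list yourself, here is a minimal sketch in Python (my choice for illustration, not the language of the original demo). The names URL_PATTERN, SAMPLE_URLS and matches_whole_url are made up for this example, and it assumes the case-insensitive flag is what makes the lowercase URLs match.

```python
import re

# The URL expression from above. The character classes only list A-Z,
# so it is compiled with IGNORECASE to accept lowercase URLs.
URL_PATTERN = re.compile(
    r"((?:http(?:s)?://)(?:www\.)?[-A-Z0-9.]+(?:\.com)[-A-Z0-9_./]?(?:[-A-Z0-9#?/]+)?)",
    re.IGNORECASE,
)

# The sample URLs listed above, copied as-is.
SAMPLE_URLS = [
    "http://www.example.com",
    "http://example.com/",
    "https://www.example.com/seconday/somepage#hashes?parameters",
    "https://www.example.com/seconday/",
    "http://www.example.com/seconday",
    "http://example.com/seconday",
    "http://example.com/seconday/",
]


def matches_whole_url(url):
    """Return True when the pattern matches the entire URL string."""
    return URL_PATTERN.fullmatch(url) is not None


for url in SAMPLE_URLS:
    print(url, matches_whole_url(url))
```

Every URL in the list should print True. If you drop re.IGNORECASE, the same lowercase URLs stop matching, which is easy to miss when copying the expression between tools.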
So then you could just add a negative lookbehind to the front of it to check for a quote, tick (single quote) or equal sign. If one of those comes right before the URL, then it won't make a match.
Here is what the negative lookbehind would look like:
(?<!(?:"|'|=))
And you can just put that in front of the other REGEX. Here is what this means:
(?<!  (?:  "|'|=  )  )
 1     2     3    4  5
1. (?<!  Negative Lookbehind - This says make sure that whatever is coming up next cannot be present in front of the string.
2. (?:  Non-Capturing Parenthesis - We are going to be putting a group consisting of a quote ", tick ' or equal sign =, but we don't want to capture it. We just want to check for any one of those. By default, REGEX remembers anything inside of a parenthesis (, so we add the ?: to tell it not to.
3. "|'|=  Look for either a quote ", a tick ' or an equal sign =.
4. )  Closing parenthesis for the "or" grouping of "|'|=
5. )  Closing parenthesis for the negative lookbehind.
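To see the lookbehind working on its own, here is a small Python sketch that puts it in front of a literal http instead of the full URL expression. VERBOSE_FORM, CLASS_FORM and is_allowed are names invented for this example. The character-class version is my own shorthand, not part of the expression above, and I'm assuming it behaves the same since all three alternatives are single characters.

```python
import re

# The lookbehind exactly as broken down above, followed by a literal "http".
VERBOSE_FORM = re.compile(r"""(?<!(?:"|'|=))http""")

# Shorthand (an assumption of this sketch): a character class in place of
# the non-capturing "or" group.
CLASS_FORM = re.compile(r"""(?<!["'=])http""")

SAMPLES = [
    "see http://example.com",        # preceded by a space
    'href="http://example.com"',     # preceded by a quote
    "href='http://example.com'",     # preceded by a tick
    "url=http://example.com",        # preceded by an equal sign
]


def is_allowed(pattern, text):
    """Return True when "http" is found without a quote, tick or = before it."""
    return pattern.search(text) is not None


for text in SAMPLES:
    print(text, is_allowed(VERBOSE_FORM, text), is_allowed(CLASS_FORM, text))
```

Only the first sample should come back True for both forms. Note that the lookbehind only looks at the single character right before the match position, so something like = http (with a space in between) would still match.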
Okay, putting it all together, the REGEX would look like this:
(?<!(?:"|'|=))((?:http(?:s)?://)(?:www\.)?[-A-Z0-9.]+(?:\.com)[-A-Z0-9_./]?(?:[-A-Z0-9#?/]+)?)
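As a rough end-to-end check, here is a Python sketch that runs the combined expression over a made-up snippet of HTML. BARE_URL, SAMPLE_TEXT and find_bare_urls are hypothetical names for this example, and the case-insensitive flag is again an assumption based on the uppercase-only character classes.

```python
import re

# Combined expression from above: negative lookbehind + URL pattern.
# A raw triple-quoted string avoids having to escape the quote and tick.
BARE_URL = re.compile(
    r"""(?<!(?:"|'|=))((?:http(?:s)?://)(?:www\.)?[-A-Z0-9.]+(?:\.com)[-A-Z0-9_./]?(?:[-A-Z0-9#?/]+)?)""",
    re.IGNORECASE,
)

# Made-up input: two bare URLs and two URLs that sit inside attributes.
SAMPLE_TEXT = (
    "Visit http://example.com/about or "
    '<a href="http://www.example.com/linked">a link</a> '
    "and <img src='https://example.com/pic'> "
    "plus https://www.example.com/seconday/"
)


def find_bare_urls(text):
    """Return URLs that are not directly preceded by a quote, tick or =."""
    return BARE_URL.findall(text)


print(find_bare_urls(SAMPLE_TEXT))
# ['http://example.com/about', 'https://www.example.com/seconday/']
```

The two URLs inside href="..." and src='...' are skipped because the character right before http is a quote or a tick, which is exactly what the lookbehind rejects.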
Here is a link to a demo of the REGEX
Here is a link to a demo of the REGEX in a PHP script
Really, the only thing I had to do to the REGEX to get this to work in PHP was to escape the tick (\'), since I was using ticks to enclose my expression.