Question

I have a text and a regex pattern.

The text is something like:

foo https://www.google.hu <img ... src="http://a-page.com/foobar.jpg" ...> bar

The regex:

/(http|https|ftp)\:\/\/(www\.)?([a-zA-Z0-9\-\_\.]+)\.([a-z]{1,5}+)\/([a-zA-Z0-9\.\?\=\&\-\_\~\/\%\+\;]+)?(\#([a-zA-Z0-9\_]+))?/i

and I'd like to update it to handle a special case:

if a URL is directly preceded by src=", the regex should not match it. In other words, the matches should not contain the image URL, only the other URLs.

I tried this:

/(?!src\=\")(http|https|ftp)\:\/\/(www\.)?([a-zA-Z0-9\-\_\.]+)\.([a-z]{1,5}+)\/([a-zA-Z0-9\.\?\=\&\-\_\~\/\%\+\;]+)?(\#([a-zA-Z0-9\_]+))?/

but it doesn't work.

Could you help me, please?

I know I could add (^|\s) to the pattern, but that won't work when I want to hide URLs: a user can write any character before the URL, and then the URL is no longer hidden. There are also other regexes in the source, one of which handles an img BB tag, and I don't want to hide (replace) its URL.

(Sorry for my English.)


Solution

To be honest, I had difficulty understanding exactly what you want, but I guess you mean that you have a text with various URLs in it and you don't want to match those that appear inside an HTML img tag. If so, try this:

/(?<!src\=\")(https?|ftp):\/\/(www\.)?([\w\-\.]+)\.([a-z]{1,5}+)\/?([\w\.\?\=\&\-\~\/\%\+\;]+)?(\#(\w+))?/

Notes:

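  • You can replace [A-Za-z0-9_] with the character class \w (read more in perlre).
  • The (?!pattern) assertion you tried is a negative look-ahead assertion. Placed at the start of the pattern, (?!src\=\") only checks that the text that follows does not begin with src=", which is always true when the next characters are http, https or ftp, so it excludes nothing. In your case you want a negative look-behind, (?<!pattern), which checks the text before the current position (again, see perlre for more info).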
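
To make the difference between the two assertions concrete, here is a minimal sketch in Python. It is not the asker's original environment (the delimited patterns suggest PCRE), and the names URL_BODY, LOOKAHEAD and LOOKBEHIND are made up for this illustration. The pattern is a simplified form of the one above: groups are non-capturing, and the possessive {1,5}+ is written as {1,5} so the sketch also runs on Python versions older than 3.11.

```python
import re

# Simplified form of the URL pattern from the answer (assumption: the
# possessive "{1,5}+" is relaxed to "{1,5}" for older Python versions).
URL_BODY = (
    r'(?:https?|ftp)://(?:www\.)?[\w\-.]+\.[a-z]{1,5}/?'
    r'(?:[\w.?=&\-~/%+;]+)?(?:#\w+)?'
)

# Negative look-ahead: inspects the text AFTER the current position.
LOOKAHEAD = re.compile(r'(?!src=")' + URL_BODY)
# Negative look-behind: inspects the text BEFORE the current position.
LOOKBEHIND = re.compile(r'(?<!src=")' + URL_BODY)

text = 'foo https://www.google.hu <img src="http://a-page.com/foobar.jpg"> bar'

# The look-ahead version still matches the image URL.
ahead = [m.group(0) for m in LOOKAHEAD.finditer(text)]
# The look-behind version skips the URL that directly follows src=".
behind = [m.group(0) for m in LOOKBEHIND.finditer(text)]

# Hiding (replacing) only the URLs that are not image sources.
hidden = LOOKBEHIND.sub('[hidden]', text)

print(ahead)
print(behind)
print(hidden)
```

Note that the look-behind only rejects a URL that comes immediately after src=" with a double quote; an attribute written as src='...' or with spaces around the = would need its own handling.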
Licensed under: CC-BY-SA with attribution