Matching dashes in a URL regex
I have used the following regex to get the urls from text (e.g.
"this is text http://url.com/blabla possibly some more text").
This works for most URLs, but I just found out it doesn't work for shortened URLs like this one:
"blabla bla http://ff.im/-bEnA blabla" becomes
http://ff.im/ after the match.
I suspect it has to do with the dash (-) after the slash.
No correct solution
[\w/_\.] doesn't match a hyphen (-), so add one to the character class, e.g. [\w/_\.-]
@ - delimiter
( - start of group
https?:// - http:// or https://
([-\w.]+)+ - capture 1 or more hyphens, word characters or dots, 1 or more times (this seems odd; I don't know what the second + is for)
(:\d+)? - optionally capture a : and some digits (the port)
( - start of group
/ - leading slash
( - start of group
[\w/_\.] - any word character, slash, underscore or dot (the path + filename); you need to add a hyphen to this list, or just make it [^?\s], i.e. any char except ? or whitespace
(\?\S+)? - optionally capture a ? followed by anything except whitespace (the querystring)
)? - close group, make it optional
)? - close group, make it optional
) - close group
@ - delimiter
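
To see the difference the hyphen makes, here is a minimal sketch in Python (chosen only for illustration; Python's re module takes no delimiters, so the @ ... @ are dropped). The pattern is reassembled from the pieces listed above, and the * after the path character class is my assumption, since the breakdown doesn't show a quantifier there. first_url is a hypothetical helper name, not something from the question.

```python
import re

# Pattern reassembled from the breakdown above, without the @ delimiters.
# Assumption: the path character class is followed by *.
ORIGINAL = re.compile(r"(https?://([-\w.]+)+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?)")

# Same pattern with a hyphen added to the path character class.
# Placing the hyphen first in the class makes it a literal, not a range.
FIXED = re.compile(r"(https?://([-\w.]+)+(:\d+)?(/([-\w/_.]*(\?\S+)?)?)?)")


def first_url(pattern, text):
    """Return the first URL the pattern finds in text, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


text = "blabla bla http://ff.im/-bEnA blabla"
print(first_url(ORIGINAL, text))  # http://ff.im/
print(first_url(FIXED, text))     # http://ff.im/-bEnA
```

The original class stops at the first character it doesn't list, which here is the hyphen right after the slash, so the match ends at http://ff.im/. With the hyphen in the class, the path runs up to the next whitespace.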