Question

Let's say I have a url such as...

http://www.example.com/random-garbage-here-i-dont-want-12392/video2983439

Is there a program where I can paste in this test string, highlight/select the parts I want to keep, discard the rest, and get a regular expression to use? I just can't figure out regex for the life of me.

I am trying to scrape URLs on a website, but they are all unique except for a few consistent characteristics. The consistent parts, which I want to keep, are highlighted in bold above; everything that is not bold should be ignored. That way, when I'm crawling the website, it will follow URLs that match the bolded parts.


Solution

The following worked for me in Tcl:

% regexp -- {http://www.example.com/[a-zA-Z0-9-]*/video[0-9]*} http://www.example.com/random-garbage-here-i-dont-want-12392/video2983439
1
%
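
The pattern can probably be tightened a little for use in a crawler. The following is only a sketch, assuming the fixed parts are the host and a final "video" segment followed by digits. The proc name is_video_url and the second and third sample URLs are made up for illustration.

```tcl
# Hypothetical helper: returns 1 if the URL has the fixed host, one arbitrary
# path segment, and a final "video<digits>" segment; otherwise returns 0.
proc is_video_url {url} {
    # ^ and $ anchor the match to the whole string; \. matches a literal dot;
    # [^/]+ matches one non-empty path segment; [0-9]+ needs at least one digit.
    return [regexp -- {^http://www\.example\.com/[^/]+/video[0-9]+$} $url]
}

# Sample URLs (the second and third are invented for this example).
set urls {
    http://www.example.com/random-garbage-here-i-dont-want-12392/video2983439
    http://www.example.com/some-other-page/about
    http://www.example.com/another-slug-777/video42
}

# Print only the URLs the crawler should follow.
foreach url $urls {
    if {[is_video_url $url]} {
        puts $url
    }
}
```

Run as a script, this prints the first and third URLs. Compared with the one-liner above, the dots are escaped so they no longer match any character, the + quantifiers reject an empty segment or a bare "video", and the anchors stop the pattern from matching in the middle of a longer URL.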
Licensed under: CC-BY-SA with attribution