Question

I have a regular expression to match filenames which look like this:

name - subname goes here v4 03.txt
name - subname long 03.txt
name - subname v4 #03.txt

I want to extract the name and subname, without any additional data. The extraction itself works; the problem is the v4 part. It is a version marker (a v followed by a digit) and it is not present in every filename. I want to exclude it, but it gets captured along with the subname.

My regex looks like this:

^([\w \.]+)(?:-)?([\w \.-]+)? #?\d+

I tried something like this, but it only works without the ? at the end of "(?:v\d+ )?", and then it can't match filenames that have no version:

^([\w \.]+)(?:-)?([\w \.-]+)? (?:v\d+ )?#?\d+

How do I make it work?


Solution

Try this:

/^([\w \.]+?) - ([\w \.-]+?)(?: v\d+)? #?\d+/
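As a rough check, the pattern can be run against the three filenames from the question. This is only a sketch: the constant FILENAME_RE and the helper extract_name_parts are hypothetical names introduced for illustration, and Ruby is assumed because the other answer on this page uses it.

```ruby
# Pattern from the answer. The lazy +? quantifiers make each capture group
# stop as early as possible, so the optional " v<digits>" group can take the
# version marker instead of the subname group.
FILENAME_RE = /^([\w \.]+?) - ([\w \.-]+?)(?: v\d+)? #?\d+/

# Hypothetical helper: returns [name, subname], or nil when nothing matches.
def extract_name_parts(filename)
  match = FILENAME_RE.match(filename)
  match && [match[1], match[2]]
end

[
  "name - subname goes here v4 03.txt",
  "name - subname long 03.txt",
  "name - subname v4 #03.txt",
].each do |filename|
  p extract_name_parts(filename)
end
# Prints:
# ["name", "subname goes here"]
# ["name", "subname long"]
# ["name", "subname"]
```

Note that a subname which itself contains a standalone number (for example "part 2 final") would be cut short at that number, because the lazy group stops at the first place the rest of the pattern can match.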

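I think you need to understand the difference between (\w+?) and (\w+)?. In the first, +? is a lazy quantifier: the group matches as few characters as the rest of the pattern allows. In the second, the trailing ? makes the whole group optional, but the + inside it is still greedy. That is why, in your pattern, the subname group swallows the v4 before the optional version group gets a chance to match it.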
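A small sketch of that difference, using simplified patterns rather than the full filename regex. The constant names and the first_group helper are made up for this illustration.

```ruby
# Greedy group: [\w ]+ runs as far as it can, so it takes " v4" as well and
# the optional version group ends up matching nothing.
GREEDY = /^([\w ]+)(?: v\d+)? \d+/

# Lazy group: [\w ]+? stops as early as the rest of the pattern allows,
# which leaves " v4" for the optional version group.
LAZY = /^([\w ]+?)(?: v\d+)? \d+/

# Optional group: (\w+)? may match nothing at all, but when it does match,
# the + inside it is still greedy.
OPTIONAL = /^(\w+)?-/

# Hypothetical helper: first capture group, or nil if there is no match.
def first_group(regex, text)
  match = regex.match(text)
  match && match[1]
end

p first_group(GREEDY, "subname v4 03")  # "subname v4"
p first_group(LAZY, "subname v4 03")    # "subname"

# The optional group lets the pattern match even when the group is empty.
p !OPTIONAL.match("-x").nil?            # true
p OPTIONAL.match("-x")[1]               # nil
p first_group(OPTIONAL, "abc-x")        # "abc"
```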

OTHER TIPS

I would do this in two stages. First, remove the parts that you don't want:

a = str.sub(/\s* (?: v\d+)? \s* \#? \d+ \.[^.]*? $/x, '')

Then split the string on the ' - ' separator, limiting the result to two parts so that hyphens inside the subname are kept:

a.split(/\s*-\s*/, 2)
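Put together, the two stages might look like the sketch below. TRAILER_RE and split_name are hypothetical names. The sketch assumes that the optional # before the number should be stripped too (written as \#? because # starts a comment in extended mode), and that the split should be limited to two parts.

```ruby
# Stage 1 pattern in extended mode (/x), where literal spaces are ignored.
# It removes an optional " v<digits>" marker, an optional "#", the number,
# and the file extension from the end of the string.
TRAILER_RE = /\s* (?: v\d+)? \s* \#? \d+ \.[^.]*? $/x

# Hypothetical helper: returns [name, subname] for one filename.
def split_name(filename)
  stem = filename.sub(TRAILER_RE, '')  # stage 1: drop the trailing parts
  stem.split(/\s*-\s*/, 2)             # stage 2: split on the first hyphen only
end

p split_name("name - subname goes here v4 03.txt")  # ["name", "subname goes here"]
p split_name("name - subname long 03.txt")          # ["name", "subname long"]
p split_name("name - subname v4 #03.txt")           # ["name", "subname"]
```

Limiting the split to two parts is a design choice: the character class in the question allows hyphens in the subname, and an unlimited split would break such a subname apart.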
Licensed under: CC-BY-SA with attribution