Question

I have three different types of strings (the dots indicate any alphanumeric character):

  1. Won 1, 2, 3, 4, Lost 5, 6, 7, ...
  2. 5 Wins, ...
  3. Winner

How would I create regexes to match only the win numbers? I tried something like Won (?:(\d)[, ]?)+, but it only matched the first number. If I take out "Won", it matches all the numbers.

Thanks.


Solution

You don't need a regex for this:

>>> foo="Won 1, 2, 3, 4, Lost 5, 6, 7, 8"
>>> [x for x in foo if x.isdigit()]
['1', '2', '3', '4', '5', '6', '7', '8']
>>>

That won't work if you want to capture multi-digit numbers, and as the output shows, it also picks up the digits after "Lost". But for the examples you cite, and given that your title references digits, not numbers, it would work.

This gets multi-digit numbers that don't have punctuation attached. Modify the call to split() as necessary to get the results you want for your input:

>>> foo="This 23 is not a string with 32 numbers"
>>> [x for x in foo.split() if x.isdigit()]
['23', '32']
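
As a rough sketch of how the split() approach could be adapted to the question's first string type, the snippet below keeps only the text before "Lost" and strips attached punctuation from each token. The helper name win_numbers is hypothetical, and the use of "Lost" as the cut-off is an assumption based on the example strings in the question.

```python
import string


def win_numbers(text):
    """Hypothetical helper: return digit tokens that appear before 'Lost'."""
    # Keep only the part of the string before "Lost" (the whole string if absent).
    wins_part, _, _ = text.partition("Lost")
    # Strip punctuation such as trailing commas so "1," becomes "1".
    tokens = (tok.strip(string.punctuation) for tok in wins_part.split())
    return [tok for tok in tokens if tok.isdigit()]


print(win_numbers("Won 1, 2, 3, 4, Lost 5, 6, 7, 8"))  # ['1', '2', '3', '4']
print(win_numbers("Won 10, 23, Lost 5"))               # ['10', '23']
print(win_numbers("Winner"))                           # []
```

Because it filters whole tokens rather than single characters, this version also copes with multi-digit numbers.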

OTHER TIPS

Do you have to use a single regex? It would be easier to split the string and then get the numbers.

This is a .NET example:

// replace everything after Lost with a blank string (would be bad if Lost came before Won)
string text = Regex.Replace( inputString, @"Lost.+", "" );

gives: "Won 1, 2, 3, 4, "

and then

// match against the trimmed text, not the original inputString
Regex.Matches( text, @"\d+" );
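
Since the question is tagged python, here is a minimal sketch of the same two-step idea using Python's re module. The function name win_numbers_two_step is made up for illustration, and the same caveat applies: it would misbehave if "Lost" came before "Won".

```python
import re


def win_numbers_two_step(input_string):
    """Hypothetical helper mirroring the two-step .NET example."""
    # Step 1: drop "Lost" and everything after it on the same line.
    text = re.sub(r"Lost.+", "", input_string)
    # Step 2: collect every run of digits from what remains.
    return re.findall(r"\d+", text)


print(win_numbers_two_step("Won 1, 2, 3, 4, Lost 5, 6, 7, 8"))  # ['1', '2', '3', '4']
print(win_numbers_two_step("5 Wins"))                           # ['5']
```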

This will do the trick:

(?<=Won).*(?=Lost)|\d.*(?=Wins)

I tested it in Pythex, since you tagged the question with python. Fortunately, Python supports lookahead and lookbehind in regexes.
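
For illustration, this is roughly how that pattern behaves in Python. Note that it matches the whole span between "Won" and "Lost" (or the text leading up to "Wins"), not the individual numbers, so this sketch adds a second pass with \d+ to pull the numbers out. The name win_numbers_lookaround and the second pass are additions for the example, not part of the pattern above.

```python
import re

# The pattern from the answer, compiled once.
PATTERN = re.compile(r"(?<=Won).*(?=Lost)|\d.*(?=Wins)")


def win_numbers_lookaround(text):
    """Hypothetical helper: find the 'wins' span, then extract its numbers."""
    match = PATTERN.search(text)
    if match is None:
        return []
    # match.group(0) is e.g. " 1, 2, 3, 4, " for the first string type.
    return re.findall(r"\d+", match.group(0))


print(win_numbers_lookaround("Won 1, 2, 3, 4, Lost 5, 6, 7, 8"))  # ['1', '2', '3', '4']
print(win_numbers_lookaround("5 Wins"))                           # ['5']
print(win_numbers_lookaround("Winner"))                           # []
```

One limitation: a "Won ..." string that never mentions "Lost" produces no match at all, because the first alternative requires the lookahead to succeed.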

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow