Question

I would like to extract a monetary value from the data that IMDbPY retrieves from IMDb.

My problem is that IMDbPY returns the result as a Unicode string in the following format:

In : movie['business']['gross'][0]
Out: u'$134,966,411 (USA) (11 May 1997)'

Also, the date is sometimes present, sometimes not.

How can I extract the number from this string without accidentally picking up the date or year?

The currency symbol and the country code are not important.

Solution

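Use re.match with this pattern. Because re.match only matches at the start of the string, the date and year are never considered; the amount is captured in group 1: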

r"\$([1-9][0-9,]+)"
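
A minimal sketch of how the pattern might be applied. The helper name parse_gross is hypothetical (it is not part of IMDbPY), and the conversion to int by stripping commas is an assumption about what "extract the number" should produce. Note that the pattern as written needs at least two characters after the dollar sign, so a single-digit amount such as "$5" would not match.

```python
import re

# Pattern from the answer: a dollar sign, then a non-zero digit
# followed by one or more digits or thousands separators.
GROSS_PATTERN = re.compile(r"\$([1-9][0-9,]+)")


def parse_gross(text):
    """Return the leading dollar amount as an int, or None if there is no match.

    parse_gross is a hypothetical helper, not an IMDbPY function.
    """
    # re.match anchors at the start of the string, so the trailing
    # "(11 May 1997)" part is never examined.
    match = GROSS_PATTERN.match(text)
    if match is None:
        return None
    # Drop the thousands separators before converting (assumed desired output).
    return int(match.group(1).replace(",", ""))


# Works whether or not the date is present.
print(parse_gross(u"$134,966,411 (USA) (11 May 1997)"))
print(parse_gross(u"$134,966,411 (USA)"))
```

Returning None for non-matching input lets the caller decide how to handle entries in other currencies or unexpected formats.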
Licensed under: CC-BY-SA with attribution