Extracting number from unicode string with regex

https://stackoverflow.com/questions/20873869

23-09-2022

Question

I have the following dictionary which contains some product data:

dictionary = {'price': [u'3\xa0590 EUR'],
              'name': [u'Product name with unicode chars']}

All values are Unicode strings. As you can see, I'm using lists as dictionary values because I sometimes need to concatenate information from several different sources.

I'm looking for a way to extract the digits from the price value with a regex, dropping the non-breaking space (\xa0) and the currency at the end (EUR).

In this case I would like to see the following as a result:

3590

Can you please suggest a solution?

[SOLUTION]

Adding the solution here because the comments field wrapped my code unexpectedly:

I used the .sub() method from Python's re module, which replaces every match of a pattern with a given string. Here is the final code that gives me the expected result:

import re
# Remove the non-breaking space and the trailing ' EUR' from the price.
p = re.compile(u'\xa0| EUR')
result = p.sub(u'', dictionary['price'][0])

Solution

Not sure about Python, but here's a regex (in JavaScript syntax) that removes every non-digit character:

p = /\D/g;
s.replace(p, '');
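
For reference, the same \D idea can be written with Python's re module. This is only a minimal sketch and not part of the original answer; the helper name extract_digits is made up for illustration.

```python
import re

def extract_digits(text):
    # \D matches any character that is not a digit, so replacing every
    # match with an empty string leaves only the digits.
    # (extract_digits is a hypothetical helper, not from the question.)
    return re.sub(r'\D', u'', text)

dictionary = {'price': [u'3\xa0590 EUR']}
digits = extract_digits(dictionary['price'][0])
print(digits)  # 3590
```

Unlike a pattern tied to \xa0 and EUR, this does not depend on the currency code or the kind of space used. It also drops decimal separators, though, so a price such as 3 590,50 would come out as 359050.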

Licensed under: CC-BY-SA with attribution
