Regex match multibyte numbers

https://stackoverflow.com/questions/18134489

24-06-2022
Question

I need to match the multi-byte (full-width) digits ０１２３４５６７８９ used in Japanese text with a regular expression.

[0-9] does not work in this case. How can I go about writing this regex? This is my first foray into matching multi-byte strings.

UPDATE

Matching a 4-digit string, such as a birth year, succeeded for both single-byte (ASCII) and multi-byte (full-width) digits with the following regex:

^([0-9]{4}|[\uFF10-\uFF19]{4})$
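
As a rough sketch of how such a four-digit check might behave in JavaScript, the pattern can be tried against ASCII, full-width, and mixed input. The names `YEAR_PATTERN` and `isFourDigitYear` are hypothetical and not part of the question; the full-width strings are written as escapes so the result does not depend on file encoding.

```javascript
// Hypothetical helper: true when the input is exactly four ASCII digits
// or exactly four full-width digits (U+FF10 to U+FF19), never a mix.
// Note: a doubled "||" would add an empty alternative, so the empty
// string would also pass; a single "|" avoids that.
const YEAR_PATTERN = /^([0-9]{4}|[\uFF10-\uFF19]{4})$/;

function isFourDigitYear(input) {
  return YEAR_PATTERN.test(input);
}

console.log(isFourDigitYear('1985'));                     // true
console.log(isFourDigitYear('\uFF11\uFF19\uFF18\uFF15')); // true  (１９８５)
console.log(isFourDigitYear('19\uFF18\uFF15'));           // false (mixed widths)
console.log(isFourDigitYear(''));                         // false
```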

Solution

The JavaScript equivalent of /[0-9]/ for these multi-byte (full-width) digits is

/[\uff10-\uff19]/
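
If ASCII and full-width digits should both be accepted by one class, the two ranges can be combined. The sketch below also shows a broader alternative, `\p{Nd}`, which assumes an engine with ES2018 Unicode property escapes and matches decimal digits from every script, not only these two ranges. The variable names are illustrative only.

```javascript
// Either ASCII 0-9 or full-width digits (U+FF10 to U+FF19).
const eitherWidthDigit = /[0-9\uff10-\uff19]/g;

// Broader alternative (assumes ES2018 Unicode property escapes):
// matches any Unicode decimal digit, whatever the script.
const anyDecimalDigit = /\p{Nd}/gu;

// 'a', ASCII 1, full-width 2, 'b', Arabic-Indic digit three (U+0663).
const sample = 'a1\uff12b\u0663';

const narrowMatches = sample.match(eitherWidthDigit);
const broadMatches = sample.match(anyDecimalDigit);

console.log(narrowMatches.length); // 2 (ASCII 1 and full-width 2)
console.log(broadMatches.length);  // 3 (also the Arabic-Indic digit)
```

The explicit range is the safer choice when only Japanese full-width digits are expected.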

OTHER TIPS

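var str = '０１２３４５６７８９';
console.log(
    // Unescaped range: relies on the source file being read as UTF-8
    str.match(new RegExp('[０-９]', 'g')),
    // Escaped range: independent of the source file's encoding
    str.match(/[\uff10-\uff19]/g)
);
// logs ["０", "１", "２", "３", "４", "５", "６", "７", "８", "９"] for both calls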

Make sure to save the .js file with the proper encoding (UTF-8) if using the unescaped version.
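
One way to check that the unescaped characters survived the file's encoding is to compare their code points against the escaped form. This is only a minimal sketch; the variable names are made up for illustration.

```javascript
// If the file were decoded with the wrong encoding, these literals
// would not have the code points U+FF10 and U+FF19.
const literalZero = '０';
const literalNine = '９';

console.log(literalZero.codePointAt(0).toString(16)); // "ff10"
console.log(literalNine.codePointAt(0).toString(16)); // "ff19"

// true when the literal and escaped spellings are the same characters
const literalsIntact = literalZero === '\uff10' && literalNine === '\uff19';
console.log(literalsIntact);
```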

Licensed under: CC-BY-SA with attribution
