Question

I need to match the multi-byte (full-width) digits ０１２３４５６７８９ used in Japanese text with a regular expression.

[0-9] does not work in this case. How can I go about writing this regex? This is my first foray into matching multi-byte strings.

UPDATE

Matching a 4-digit string, such as a birth year, worked for both UTF-8 and non-UTF-8 input with the following regex:

^([0-9]{4}|[\uFF10-\uFF19]{4})$
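
A minimal sketch of that year check, assuming a single | between the two alternatives so the pattern accepts either four ASCII digits or four full-width digits. The names YEAR_PATTERN and isBirthYear are hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: accepts exactly four ASCII digits
// or exactly four full-width digits (U+FF10 to U+FF19).
// A doubled "||" would add an empty alternative, so '' would also match.
var YEAR_PATTERN = /^([0-9]{4}|[\uFF10-\uFF19]{4})$/;

function isBirthYear(input) {
    return YEAR_PATTERN.test(input);
}

console.log(isBirthYear('1985'));                      // true  (ASCII digits)
console.log(isBirthYear('\uFF11\uFF19\uFF18\uFF15'));  // true  (full-width １９８５)
console.log(isBirthYear('19\uFF18\uFF15'));            // false (mixed widths)
console.log(isBirthYear(''));                          // false (empty string)
```

Mixed-width input is rejected because each alternative requires all four characters to come from the same range.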


Solution

In JavaScript, the equivalent of /[0-9]/ for these full-width digits (U+FF10 to U+FF19) is:

/[\uff10-\uff19]/
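
If both widths can appear in the same input, the two ranges can share one character class, and matched full-width digits can be mapped back to ASCII. The sketch below relies on U+FF10 to U+FF19 sitting at a fixed offset of 0xFEE0 from U+0030 to U+0039. ANY_WIDTH_DIGIT and toAsciiDigits are hypothetical names, not part of any library.

```javascript
// Matches a digit of either width: ASCII 0-9 or full-width U+FF10 to U+FF19.
var ANY_WIDTH_DIGIT = /[0-9\uff10-\uff19]/g;

// Hypothetical helper: rewrites full-width digits as their ASCII counterparts.
// Each full-width digit is 0xFEE0 code units above the matching ASCII digit.
function toAsciiDigits(text) {
    return text.replace(/[\uff10-\uff19]/g, function (ch) {
        return String.fromCharCode(ch.charCodeAt(0) - 0xFEE0);
    });
}

var sample = 'tel \uff10\uff13-1234';               // "tel ０３-1234"
console.log(sample.match(ANY_WIDTH_DIGIT).length);  // 6
console.log(toAsciiDigits(sample));                 // "tel 03-1234"
```

Normalizing first means later steps such as parseInt only need to handle ASCII digits.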

OTHER TIPS

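var str = '０１２３４５６７８９';
console.log(
    str.match(new RegExp('[０-９]', 'g')),  // literal full-width characters in the class
    str.match(/[\uff10-\uff19]/g)          // escaped code points, independent of file encoding
);
// both calls return ["０", "１", "２", "３", "４", "５", "６", "７", "８", "９"]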

Make sure to save the .js file with the proper encoding (UTF-8) if using the unescaped version.
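
One way to sanity-check the encoding is to compare a literal full-width zero against its escaped code point. This is only a sketch, and isFullWidthZero is a hypothetical name. If the file was saved or served in a different encoding, the literal would typically decode to other characters and the check would print false.

```javascript
// Hypothetical check: a correctly decoded literal '０' is one UTF-16 code unit, U+FF10.
function isFullWidthZero(ch) {
    return ch.length === 1 && ch.charCodeAt(0) === 0xFF10;
}

var literalZero = '０';       // unescaped: depends on how the file is saved and read
var escapedZero = '\uff10';   // escaped: plain ASCII in the source file

console.log(isFullWidthZero(escapedZero));  // true
console.log(isFullWidthZero(literalZero));  // true only if the file is decoded as intended
```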

Licensed under: CC-BY-SA with attribution