Question

In my project, users can register with a publicly viewable nickname. I would like to allow that name to contain characters from any script (Arabic, Latin, Cyrillic, Japanese, etc.) but prevent control characters, punctuation, and non-alphabetic characters such as ✇ or ✈.

I've found a lot of examples for filtering alphanumeric characters from various individual scripts, but I don't want to have to spend days digging through encoding tables to try and allow every script through manually.

Any recommendations?

Solution

In JavaScript, when you want to deal with Unicode in regular expressions, the usual solution is to give up.

The next most common solution is to use XRegExp, which happens to have the classes you seem to need:

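// Requires the XRegExp library with Unicode support loaded.
// \p{L} matches any letter, in any script.
var unicodeWord = XRegExp('^\\p{L}+$');
unicodeWord.test('Русский'); // -> true
unicodeWord.test('日本語'); // -> true
unicodeWord.test('العربية'); // -> true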
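
For engines that support ES2018 Unicode property escapes (the `u` flag), the same check can probably be done without a library. The following is only a sketch: `isValidNickname` is a hypothetical helper name, and allowing combining marks (`\p{M}`) alongside letters is an assumption, on the grounds that some scripts write vowels and accents as separate code points.

```javascript
// Hypothetical helper, not part of XRegExp or any library.
// Assumes an engine with ES2018 Unicode property escapes.
// \p{L} = any letter, \p{M} = combining marks (assumed to be acceptable).
const nicknamePattern = /^[\p{L}\p{M}]+$/u;

function isValidNickname(name) {
  // Normalize to NFC so composed and decomposed forms are treated alike.
  return nicknamePattern.test(name.normalize('NFC'));
}

console.log(isValidNickname('Русский')); // true
console.log(isValidNickname('日本語'));   // true
console.log(isValidNickname('العربية')); // true
console.log(isValidNickname('✈'));       // false
console.log(isValidNickname('a.b'));     // false
```

If digits should be allowed too, adding `\p{N}` to the character class would be one way to do it.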

Other tips

I've used \p{Latin} before in Perl to select all Latin characters. There is a whole list of options about half-way down on this page: http://www.regular-expressions.info/unicode.html.

It seems that this could carry over to JavaScript, since XRegExp supports the same \p{...} syntax.

Edit 2: Alternatively, make up a list of disallowed characters to check against; \p{common} would then be a starting point.
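
A rough sketch of that deny-list idea in JavaScript, assuming an engine with ES2018 Unicode property escapes. Note that it swaps in general categories (control/other, punctuation, symbols) rather than the Common script, since the Common script also covers digits and spaces; `hasForbiddenCharacter` is a hypothetical name.

```javascript
// Hypothetical deny-list check, assuming ES2018 property escapes.
// \p{C} = control, format, and other "Other" characters
// \p{P} = punctuation, \p{S} = symbols (includes ✇ and ✈)
const forbidden = /[\p{C}\p{P}\p{S}]/u;

function hasForbiddenCharacter(name) {
  // True if at least one character falls in a denied category.
  return forbidden.test(name);
}

console.log(hasForbiddenCharacter('Русский')); // false
console.log(hasForbiddenCharacter('✈'));       // true
console.log(hasForbiddenCharacter('a.b'));     // true
```

Unlike an allow-list, this lets through anything not explicitly denied (digits and spaces, for example), so it may need further categories depending on the rules wanted.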

Edit: apparently my memory of doing this is from many eons ago. I cannot get it to work with my current Perl build (which is a special case). So - it may be completely off-base.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow