Question

I tried:

mb_strlen('普通话');
strlen('普通话');

Both of them output 9, while in fact there are only 3 characters.

What's the right way to count characters?


Solution

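Make sure to specify the encoding in the second parameter. Without it, mb_strlen() falls back to the internal encoding, which may not be UTF-8 on your setup.

For example:

mb_strlen('普通话', 'UTF-8'); // 3

See the manual.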
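To make the difference concrete, here is a minimal sketch comparing the byte count with the character count. It assumes the source file itself is saved as UTF-8 and that the mbstring extension is loaded; the variable names are just for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is 9 bytes.
$text = '普通话';

// strlen() counts bytes; each of these characters takes 3 bytes in UTF-8.
$bytes = strlen($text);

// mb_strlen() with an explicit encoding counts characters.
$chars = mb_strlen($text, 'UTF-8');

// Alternatively, set the internal encoding once and omit the second parameter.
mb_internal_encoding('UTF-8');
$charsDefault = mb_strlen($text);

echo $bytes, "\n";        // 9
echo $chars, "\n";        // 3
echo $charsDefault, "\n"; // 3
```

Passing the encoding on every call is the safer habit if you don't control the server configuration.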

OTHER TIPS

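If you don't have access to the mbstring extension, this also works (and I believe it's faster). utf8_decode() converts the string to ISO-8859-1, where every character is a single byte (characters it cannot represent become '?'), so strlen() then returns the character count. Note that utf8_decode() is deprecated as of PHP 8.2: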

strlen(utf8_decode('普通话')); // 3
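If neither mbstring nor utf8_decode() is an option, another possibility is to count the bytes that start a character yourself. This is only a sketch: utf8_char_count is a hypothetical helper name, and it assumes the input is well-formed UTF-8 (it does no validation).

```php
<?php
// Hypothetical helper: counts UTF-8 characters without mbstring.
// Assumes $s is well-formed UTF-8; invalid input is not detected.
function utf8_char_count(string $s): int
{
    $count = 0;
    $len = strlen($s); // number of bytes
    for ($i = 0; $i < $len; $i++) {
        // Continuation bytes look like 10xxxxxx; every other byte starts a character.
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

echo utf8_char_count('普通话'), "\n"; // 3
echo utf8_char_count('abc'), "\n";    // 3
```

It will almost certainly be slower than the built-in functions, so treat it as a fallback rather than a replacement.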

One Chinese character is not the same size as one ASCII character: in UTF-8, each of the characters above takes 3 bytes, which is why strlen returns 9. mb_strlen is the right way to count multi-byte characters if the string is UTF-8 encoded.

see here: http://www.herongyang.com/PHP-Chinese/Multibyte-UTF-8-mb_strlen.html

Licensed under: CC-BY-SA with attribution