How to get the exact number of multibyte characters?

Question

I tried:

mb_strlen('普通话');
strlen('普通话');

Both of them output 9, while in fact there are only 3 characters.

What's the right way to count characters?

Solution

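Make sure to specify the encoding in the second parameter. When it is omitted, mb_strlen falls back to the internal encoding (see mb_internal_encoding), and if that is a single-byte encoding rather than UTF-8, every byte is counted as one character.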

mb_strlen('普通话', 'UTF-8');

See the manual entry for mb_strlen.
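To make the difference concrete, here is a minimal sketch comparing the byte count with the character count. It assumes the mbstring extension is loaded and that the source file itself is saved as UTF-8; the variable names are only for illustration.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
$text = '普通话';

// strlen() counts bytes, not characters.
$bytes = strlen($text);

// mb_strlen() with an explicit encoding counts characters (code points).
$chars = mb_strlen($text, 'UTF-8');

echo $bytes, "\n"; // 9
echo $chars, "\n"; // 3
```

Passing 'UTF-8' explicitly keeps the result independent of whatever internal encoding the server happens to be configured with.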

OTHER TIPS

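If you don't have access to the mbstring extension, this also works (and I believe it's faster). utf8_decode converts the string to ISO-8859-1, where every character occupies exactly one byte (characters that cannot be represented become '?'), so strlen then returns the character count. Note that utf8_decode is deprecated as of PHP 8.2: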

strlen(utf8_decode('普通话')); // 3
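A version that needs neither mbstring nor utf8_decode can be sketched by counting every byte that is not a UTF-8 continuation byte (continuation bytes have the bit pattern 10xxxxxx). The function name utf8_char_count is hypothetical, not part of PHP, and the sketch assumes the input is valid UTF-8 and PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper (not a PHP built-in). Assumes $s is valid UTF-8.
function utf8_char_count(string $s): int
{
    $count = 0;
    $len = strlen($s); // length in bytes

    for ($i = 0; $i < $len; $i++) {
        // Each character starts with exactly one byte that is NOT of the
        // form 10xxxxxx, so counting those bytes counts characters.
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }

    return $count;
}

echo utf8_char_count('普通话'), "\n"; // 3
```

This counts code points, the same thing mb_strlen reports for UTF-8 input; it does no validation, so malformed input will give a meaningless number rather than an error.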

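One Chinese character does not occupy the same number of bytes as one ASCII character: in UTF-8, an ASCII character takes 1 byte, while each of the characters above takes 3. mb_strlen is the right way to count multibyte characters if the string is UTF-8 encoded.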
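The per-character byte widths can be inspected directly. This is a small sketch assuming PHP 7.4 or later (for mb_str_split) and a UTF-8 source file; the sample string and variable names are illustrative only.

```php
<?php
// Assumption: PHP >= 7.4 with mbstring, file saved as UTF-8.
$text = 'a普通话';

// Split into individual characters, then measure each one in bytes.
$characters = mb_str_split($text, 1, 'UTF-8');
$widths = array_map('strlen', $characters);

foreach ($characters as $i => $ch) {
    echo $ch, ' => ', $widths[$i], " byte(s)\n";
}
// a => 1 byte(s)
// 普 => 3 byte(s)
// 通 => 3 byte(s)
// 话 => 3 byte(s)
```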

Licensed under: CC-BY-SA with attribution