How to get the exact number of multibyte characters?
Question
I tried:
mb_strlen('普通话');
strlen('普通话');
Both of them output 9, while in fact there are only 3 characters.
What's the right way to count characters?
Solution
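Make sure to specify the encoding in the second parameter. If it is omitted, mb_strlen falls back to the internal encoding, which may not be UTF-8 on your setup.
For example: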
mb_strlen('普通话', 'UTF-8');
See the mb_strlen entry in the PHP manual.
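To illustrate why the encoding argument matters, here is a minimal sketch. It assumes the mbstring extension is loaded and that the script file itself is saved as UTF-8. The ISO-8859-1 call is only there to reproduce the result from the question, on the assumption that a single-byte internal encoding was in effect.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$text = '普通话';

// strlen() counts bytes; each of these characters takes 3 bytes in UTF-8.
$byteCount = strlen($text);

// With an explicit UTF-8 encoding, mb_strlen() counts characters.
$charCount = mb_strlen($text, 'UTF-8');

// With a single-byte encoding, every byte is treated as one character.
$singleByteCount = mb_strlen($text, 'ISO-8859-1');

echo $byteCount, "\n";       // 9
echo $charCount, "\n";       // 3
echo $singleByteCount, "\n"; // 9
```

If you would rather not pass the encoding on every call, calling mb_internal_encoding('UTF-8') once at startup should have the same effect for later mb_* calls.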
OTHER TIPS
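If you don't have access to the mbstring extension, this also works (and I believe it's faster). utf8_decode converts the string to ISO-8859-1, so every character becomes a single byte. Note that utf8_decode is deprecated as of PHP 8.2: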
strlen(utf8_decode('普通话')); // 3
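If neither mbstring nor utf8_decode is an option, one possible fallback is to count every byte that is not a UTF-8 continuation byte. This is a sketch only: the function name utf8_char_count is hypothetical, and it assumes the input is valid UTF-8 and that the script file is saved as UTF-8.

```php
<?php
// Hypothetical helper, not part of PHP: counts UTF-8 characters by bytes.
function utf8_char_count(string $s): int
{
    // Continuation bytes are in the range 0x80-0xBF. Every other byte
    // starts a new character, so counting those gives the character count.
    // No /u modifier here: the pattern is matched byte by byte.
    return (int) preg_match_all('/[^\x80-\xBF]/', $s);
}

echo utf8_char_count('普通话'), "\n"; // 3
echo utf8_char_count('abc'), "\n";    // 3
```

On invalid UTF-8 the result is not meaningful, so validate the input first if it comes from an untrusted source.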
A Chinese character does not take the same number of bytes as an ASCII character: in UTF-8, each character in '普通话' takes 3 bytes, while an ASCII character takes 1. mb_strlen is the right way to count multibyte characters if the string is UTF-8 encoded.
See here: http://www.herongyang.com/PHP-Chinese/Multibyte-UTF-8-mb_strlen.html
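The difference between bytes and characters can be seen by splitting a mixed string and measuring each piece. This is a rough sketch, assuming mbstring, PHP 7.4 or later (for mb_str_split), and a script file saved as UTF-8. The sample string is made up for illustration.

```php
<?php
// Assumption: mbstring is loaded and this file is saved as UTF-8.
$text = 'PHP普通话';

// Split into single characters, then report the byte width of each one.
foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
    echo $char, ' => ', strlen($char), " byte(s)\n";
}

$totalBytes = strlen($text);            // 3 ASCII bytes + 3 * 3 bytes
$totalChars = mb_strlen($text, 'UTF-8');

echo $totalBytes, "\n"; // 12
echo $totalChars, "\n"; // 6
```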