Please have a look at the following code:

<?php
function unicode_decode($str) {
    return preg_replace("/\\\u([0-9A-F]{4})/ie", "iconv('utf-16', 'utf-8',hex2str(\"$1\"))", $str);
}

function hex2str($hex) {
    $r = '';
    for ($i = 0; $i < strlen($hex) - 1; $i += 2) {
        $r .= chr(hexdec($hex[$i] . $hex[$i + 1]));
    }
    return $r;
}

$var = "\u092e\u0941\u0930\u0932\u0940 \u0938\u093e\u0930";
$var = unicode_decode($var);
echo $var;
?>

This code works perfectly on Windows hosting, where the output is "मुरली सार". On Linux hosting, however, the output is garbled and looks like Chinese characters: "⸉䄉〉㈉䀉 㠉㸉". It seems that PHP's iconv function does not behave the same way on Linux hosting.

How can I solve this problem on Linux hosting? Thanks in advance.


Solution

UTF-16 has two variations: big-endian and little-endian. They differ in the order of the bytes within each 16-bit code unit: the character U+1234 is encoded as '\x12\x34' in big-endian, but as '\x34\x12' in little-endian.
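To make the byte-order difference concrete, here is a minimal sketch using the first character of the string in the question, U+092E. It assumes the iconv extension is available; the variable names are only for illustration.

```php
<?php
// U+092E (the first character of the sample string) as two raw bytes
// in each byte order.
$bigEndian    = "\x09\x2E";
$littleEndian = "\x2E\x09";

// Naming the byte order explicitly makes the conversion unambiguous.
$fromBe = iconv('UTF-16BE', 'UTF-8', $bigEndian);
$fromLe = iconv('UTF-16LE', 'UTF-8', $littleEndian);

// Reading the big-endian bytes as little-endian yields U+2E09 instead,
// which is the first garbled character shown in the question.
$swapped = iconv('UTF-16LE', 'UTF-8', $bigEndian);

echo bin2hex($fromBe), "\n";   // e0a4ae (UTF-8 for U+092E)
echo bin2hex($fromLe), "\n";   // e0a4ae
echo bin2hex($swapped), "\n";  // e2b889 (UTF-8 for U+2E09)
```

The swapped result matches the first character of the broken output in the question, which is consistent with the bytes being read in the wrong order.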

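It looks like iconv assumes a different byte order on each system when it is given plain 'utf-16' and the input has no byte order mark. Your hex2str() produces the bytes in big-endian order, so you can make iconv read them as big-endian on all systems by using utf-16be: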

return preg_replace("/\\\u([0-9A-F]{4})/ie", "iconv('utf-16be', 'utf-8',hex2str(\"$1\"))", $str);    
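
As a side note, the /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0, so on newer hosts the line will not run as written. Below is a rough sketch of the same idea with preg_replace_callback, assuming PHP 5.4 or later (for hex2bin) and the iconv extension. The name unicode_decode_cb is hypothetical, and matching whole runs of escapes is my own choice, not something from the question.

```php
<?php
// Sketch: decode \uXXXX escapes without the /e modifier.
// unicode_decode_cb is a hypothetical name used only for this example.
function unicode_decode_cb($str)
{
    return preg_replace_callback(
        // Match a run of consecutive \uXXXX escapes so that a surrogate
        // pair is converted together rather than one half at a time.
        '/(?:\\\\u[0-9a-f]{4})+/i',
        function ($m) {
            // Drop the "\u" prefixes, leaving only the hex digits.
            $hex = str_ireplace('\u', '', $m[0]);
            // hex2bin() yields the bytes in big-endian order, so the
            // source encoding is named explicitly as UTF-16BE.
            return iconv('UTF-16BE', 'UTF-8', hex2bin($hex));
        },
        $str
    );
}

// Single quotes keep the backslashes literal.
$var = '\u092e\u0941\u0930\u0932\u0940 \u0938\u093e\u0930';
echo unicode_decode_cb($var), "\n";  // मुरली सार
```

If the input is actually JSON, json_decode may be a simpler option, but that depends on where the string comes from.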
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow