Question

I'm sending over a gzipped string from C# (using SharpZipLib) to PHP where I decompress with readgzfile. This works, however each character in the string is followed by two strange characters (using vim in the console those are displayed as ^@). I also tried with gzopen/gzread but with the same results.

When I strip the non-ASCII characters from the string with $clean = preg_replace('/[^(\x20-\x7F)]*/', '', $string); the $clean string is identical to the one in C#.

While this works, I would like to know what is happening and why so I can make sure this will always work or come up with a better solution.


Solution

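Given that the string is created on Windows, it's likely that a multibyte encoding is being used, most probably UTF-16. .NET strings are UTF-16 internally, and in UTF-16 every ASCII character takes two bytes, one of which is a NUL byte (0x00). vim displays a NUL byte as ^@.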

You can verify this yourself by running bin2hex($string) and checking the hexadecimal representation instead of relying on vim.
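As a minimal sketch of that check: the sample bytes below are hypothetical and assume the C# side produced UTF-16LE without a byte-order mark (the question does not show the encoding call, so this is an assumption). If the real data looks similar, a 00 byte will follow every ASCII character in the dump.

```php
<?php
// Hypothetical sample: "Hi!" as UTF-16LE bytes, standing in for the
// decompressed data received from C#.
$string = "H\x00i\x00!\x00";

// Dump the raw bytes as hexadecimal.
$hex = bin2hex($string);
echo $hex, "\n";                             // 480069002100

// Split into byte pairs so the pattern is easier to read.
$pairs = implode(' ', str_split($hex, 2));
echo $pairs, "\n";                           // 48 00 69 00 21 00
```

A dump that starts with fffe or feff would indicate a byte-order mark, which also tells you the byte order of the rest of the data.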

If UTF-16 or UCS-2 is being used, you can convert the string like so:

// iconv($from, $to, $str)
// If the data has no byte-order mark, name the byte order explicitly (e.g. 'UTF-16LE').
$clean = iconv('UTF-16', 'UTF-8', $string);
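Without a byte-order mark, the byte order that iconv assumes for plain 'UTF-16' can depend on the underlying implementation, so it may be safer to be explicit. The following is a sketch only: utf16_to_utf8 is a hypothetical helper, not a PHP built-in, and the little-endian fallback is an assumption based on the data coming from .NET on Windows.

```php
<?php
// Hypothetical helper (not part of PHP): convert UTF-16 bytes to UTF-8.
// Uses the byte-order mark when present; otherwise ASSUMES little-endian.
function utf16_to_utf8(string $bytes): string
{
    $bom = substr($bytes, 0, 2);
    if ($bom === "\xFF\xFE") {
        // Little-endian BOM: strip it and decode as UTF-16LE.
        $from  = 'UTF-16LE';
        $bytes = substr($bytes, 2);
    } elseif ($bom === "\xFE\xFF") {
        // Big-endian BOM: strip it and decode as UTF-16BE.
        $from  = 'UTF-16BE';
        $bytes = substr($bytes, 2);
    } else {
        // No BOM: assumption, not a certainty.
        $from = 'UTF-16LE';
    }

    $out = iconv($from, 'UTF-8', $bytes);
    if ($out === false) {
        throw new RuntimeException('Input is not valid ' . $from);
    }
    return $out;
}

// Hypothetical sample input: "Hi!" as UTF-16LE without a BOM.
$clean = utf16_to_utf8("H\x00i\x00!\x00");
var_dump($clean === 'Hi!');
```

Unlike the preg_replace approach, this keeps non-ASCII characters intact instead of discarding them.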
Licensed under: CC-BY-SA with attribution