Question

I'm trying to import CSV or Unicode Text files containing Thai characters into MySQL. MySQL itself stores Thai characters without any problem. The problem is that when I read the file with fgetcsv or fgets, the Thai characters come back as garbage. For example, ตู้เซฟเหล็ก becomes 9I@ @+%G.

Is there another way to read CSV files, perhaps a function that reads these characters correctly?

Solution

fgetcsv relies on the system locale to make assumptions about character encoding, and fgets simply returns the raw bytes of each line, so neither function converts the file's encoding for you. In my opinion, changing the locale settings just for this purpose isn't a clean solution. There is another way: work in UTF-8 only, and explicitly convert the Unicode content to UTF-8 before parsing it:

Example code (PHP >= 5.3):

<?php
// Set the mbstring internal encoding to UTF-8
mb_internal_encoding('utf8');

$fileContent = file_get_contents('thai_unicode.csv');

// Convert the file content from Unicode to UTF-8
$fileContentUtf = mb_convert_encoding($fileContent, 'utf8', 'unicode');

echo "parse utf8 string:\n";
var_dump(str_getcsv($fileContentUtf, ';'));

and the result is:

php load.php
parse utf8 string:
array(2) {
  [0]=>
  string(36) "ตู้เซฟเหล็ก"
  [1]=>
  string(1) "1"
}
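
If the file has more than one row, the converted string can be split into lines and each line passed to str_getcsv. The following is only a minimal sketch: the function name parseUtf8Csv is made up for illustration, it assumes PHP 7 or later, a ';' delimiter as in the example above, and that no quoted field contains a line break.

```php
<?php
// Hypothetical helper (not part of PHP): parse an already-converted
// UTF-8 CSV string into an array of rows.
function parseUtf8Csv(string $utf8, string $delimiter = ';'): array
{
    // Drop a leading UTF-8 byte order mark if the conversion kept one.
    if (strncmp($utf8, "\xEF\xBB\xBF", 3) === 0) {
        $utf8 = substr($utf8, 3);
    }

    $rows = [];
    // Assumption: rows are separated by \r\n, \r or \n, and no quoted
    // field contains an embedded line break.
    foreach (preg_split('/\r\n|\r|\n/', $utf8) as $line) {
        if ($line === '') {
            continue; // skip blank lines, including the one after the last row
        }
        // Enclosure and escape are passed explicitly rather than relying on defaults.
        $rows[] = str_getcsv($line, $delimiter, '"', '\\');
    }

    return $rows;
}
```

With the example above, parseUtf8Csv($fileContentUtf) would return one array per row, and each row can then be inserted into MySQL.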
Licensed under: CC-BY-SA with attribution