Question

We are running a cron script which takes new users from a CSV and inserts them into our database. It fails whenever it comes across a user with a special character in their name, but I can't see why; as far as I can tell, everything is set up so that it should work.

Here is an example of a name it's failing on:

Siobhán

Error message:

!! Incorrect string value: '\xE1n' for column 'firstname' at row 1

The var_dump of the data it's trying to insert shows the name as Siobh, so it has been cut off at the special character.

Here is the output of show variables like 'char%' on our database:

character_set_client utf8
character_set_connection utf8
character_set_database utf8
character_set_filesystem binary
character_set_results utf8
character_set_server utf8
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/

The collation of the users table is utf8_unicode_ci

The collation of the firstname and lastname columns is utf8_unicode_ci

The header of the php script sets:

content-type: text/html; charset=utf-8

And when I run mb_detect_encoding() on the variable before it's passed into the query, it comes back as UTF-8

So I am out of ideas here as to why it's failing...

Does anyone have any ideas as to where we are going wrong?

Thanks

Solution

As we already found the problem in the comments, here is the solution:

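The program creating your CSV file writes it in ANSI format (a single-byte encoding, typically Windows-1252), in which á is the single byte 0xE1. That byte followed by n is not a valid UTF-8 sequence, which is why MySQL rejects '\xE1n'. The file needs to be written in UTF-8.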
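To confirm this on your side, it may help to validate the raw bytes rather than rely on mb_detect_encoding(), which only guesses. The following is a minimal sketch, assuming the mbstring extension is loaded; isValidUtf8() is a hypothetical helper name, and the two sample strings are hand-written byte sequences, not data from your file.

```php
<?php
// Hypothetical helper: true only if $value is a well-formed UTF-8 byte sequence.
function isValidUtf8(string $value): bool
{
    return mb_check_encoding($value, 'UTF-8');
}

// Sample bytes written by hand for illustration (assumption, not real CSV data).
$fromAnsiFile = "Siobh\xE1n";      // "á" stored as the single byte 0xE1
$fromUtf8File = "Siobh\xC3\xA1n";  // "á" stored as the two bytes 0xC3 0xA1

var_dump(isValidUtf8($fromAnsiFile)); // bool(false)
var_dump(isValidUtf8($fromUtf8File)); // bool(true)

// Dumping the hex makes the offending byte visible.
echo bin2hex($fromAnsiFile), PHP_EOL; // 53696f6268e16e
```

If the check returns false for values read from the CSV, the file is not UTF-8, whatever the database and HTTP header settings say.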

This implies that the source of the data itself needs to be UTF-8 too, and all your PHP files should be saved as UTF-8 as well.

See this for help: http://www.php.net/manual/de/function.fopen.php#104325

Wherever your scripts read the data from or write it into the CSV, the data must either already be in UTF-8 or be converted to it.
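If the program that writes the CSV cannot be changed, one option is to convert each value in the import script before it goes into the query. This is only a sketch under assumptions: it presumes the "ANSI" file is really Windows-1252 and that mbstring is available; toUtf8() and rowToUtf8() are hypothetical names, not part of your script.

```php
<?php
// Hypothetical helper: return $value as UTF-8.
// Assumption: anything that is not already valid UTF-8 is Windows-1252.
function toUtf8(string $value, string $sourceEncoding = 'Windows-1252'): string
{
    // Leave values that are already well-formed UTF-8 untouched,
    // so UTF-8 input is not converted a second time.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', $sourceEncoding);
}

// Hypothetical helper: convert every field of one CSV row.
function rowToUtf8(array $row): array
{
    return array_map('toUtf8', $row);
}

// Example row with hand-written bytes ("á" as 0xE1), standing in for a CSV line.
$row = rowToUtf8(["Siobh\xE1n", "Murphy"]);
echo bin2hex($row[0]), PHP_EOL; // 53696f6268c3a16e
```

In the import loop, rowToUtf8() would be applied to each row returned by fgetcsv() before building the insert. Fixing the encoding where the file is produced remains the cleaner solution, since guessing the source encoding can go wrong for files in other code pages.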

Licensed under: CC-BY-SA with attribution