Page with UTF-8 encoding sends data to MySQL with UTF-8 encoding but entry is scrambled

StackOverflow https://stackoverflow.com/questions/17552986

  •  02-06-2022

Question

I realize there are a dozen similar questions, but none of the solutions suggested there work in this case.

I have a PHP variable on a page, initialized as:

$hometeam="Крылья Советов";    //Cyrillic string

When I print it out on the page, it prints out correctly. So echo $hometeam displays the string Крылья Советов, as it should.

The content meta tag in the header is set as follows:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

And, at the very beginning of the page, I have the following (as suggested in one of the solutions found in my search):

ini_set('default_charset', 'utf-8');

So that should be all good.

The MySQL table I'm trying to save this to, and the column in question, have utf8_bin as their encoding. When I go to phpMyAdmin and manually enter Крылья Советов, it saves properly in the field.

However, when I try to save it through a query on the page, using the following basic query:

mysql_query("insert into tablename (round,hometeam) values ('1','$hometeam') ");

The MySQL entry looks like this:

c390c5a1c391e282acc391e280b9c390c2bbc391c592c391c28f20c390c2a1c390c2bec390c2b2c390c2b5c391e2809ac390c2bec390c2b2

So what's going on here? If everything is ok on the page, and everything is ok with MySQL itself, where is the issue? Is there something I should add to the query itself to make it keep the string UTF-8 encoded?

Note that I have set mysql_set_charset('utf8'); after connecting to the database (at the top of the page).

EDIT: Running the query SHOW VARIABLES LIKE "%character_set%" gives the following:

Variable_name   Value
character_set_client    utf8
character_set_connection    utf8
character_set_database  latin1
character_set_filesystem    binary
character_set_results   utf8
character_set_server    latin1
character_set_system    utf8
character_sets_dir  /usr/share/mysql/charsets/

Seems like there could be something here, since there are 2 latin1's in that list. What do you think?

Also, when I type a Cyrillic string directly into phpMyAdmin, it appears fine at first (it displays correctly after I save it). But reloading the table, it displays in HEX like the inserted ones. I apologize for the misinformation regarding this in the question. As it turns out, this should mean the problem is with phpMyAdmin or the database itself.

EDIT #2: this is what show create table tablename returns:

CREATE TABLE `tablename` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `round` int(11),
  `hometeam` varchar(32) COLLATE utf8_bin NOT NULL,
  `competition` varchar(32) CHARACTER SET latin1 NOT NULL DEFAULT 'Russia',
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=119 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

Solution 2

Also, when I type a Cyrillic string directly into phpMyAdmin, it appears fine at first (it displays correctly after I save it). But reloading the table, it displays in HEX like the inserted ones.

This almost certainly points to a problem with your table. Run SHOW CREATE TABLE tablename. I bet it says latin1 instead of utf8, because latin1 is the default set in your character_set_database variable.

To change this, run the following command:

ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;

With charset_name set to utf8, this converts all your varchar fields to utf8. Be careful with the records already in the table, though: they are already malformed, and converting them to utf8 will leave them malformed. It may be simplest to create the database again, adding the following options at the end of the table definition:

CREATE TABLE `tablename` (
    ....
) ENGINE=<whatever you use> DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci

Other tips

Do you get this hex string in phpMyAdmin? I suppose when you SELECT the inserted value by e.g. PHP or the MySQL console client, you would be given the expected cyrillic UTF8 string.

If so, it's a configuration issue with phpMyAdmin, see e.g. here: http://theyouri.blogspot.ch/2010/12/phpmyadmin-collated-db-in-utf8bin-shows.html

phpMyAdmin collated db in utf8_bin shows hex data instead of UTF8 text

$cfg['DisplayBinaryAsHex'] = false;

Moreover, please don't use mysql_query that way, since you're totally open to SQL injections. I'm also not sure if you really want to use utf8_bin, see e.g. this discussion: utf8_bin vs. utf_unicode_ci or this: UTF-8: General? Bin? Unicode?
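As a rough sketch of what a safer insert could look like, the snippet below uses PDO with a prepared statement instead of interpolating $hometeam into the SQL string. The mysql_* extension used in the question was deprecated in PHP 5.5 and removed in PHP 7.0, so this also avoids it. The helper names buildMysqlDsn and insertMatch are made up for illustration, and the host, database name, and credentials in the usage comment are placeholders, not values from the question.

```php
<?php
// Hypothetical helper: builds a PDO DSN that fixes the connection charset
// up front, playing the role mysql_set_charset('utf8') has in the question.
function buildMysqlDsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: inserts one row through a prepared statement, so the
// team name is sent as a bound parameter and never spliced into the SQL text.
function insertMatch(PDO $pdo, int $round, string $hometeam): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO tablename (round, hometeam) VALUES (:round, :hometeam)'
    );
    $stmt->execute([':round' => $round, ':hometeam' => $hometeam]);
}

// Usage (placeholders for host, database, user, and password):
// $pdo = new PDO(buildMysqlDsn('localhost', 'mydb'), 'user', 'password',
//                [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// insertMatch($pdo, 1, 'Крылья Советов');
```

Binding parameters addresses the injection problem only; it does not by itself change how the bytes are stored, so the charset questions discussed in this thread still apply.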

EDIT There's something weird going on. If you translate the given hex string to UTF8 characters, you get this: "ÐšÑ€Ñ‹Ð»ÑŒÑ Ð¡Ð¾Ð²ÐµÑ‚Ð¾Ð²" (see e.g. http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder). If you utf8_decode this, you get the desired "Крылья Советов". So, it seems that it's at least utf8 encoded twice (besides the problem that it somewhere shows up as hex characters).
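The double encoding can be reproduced in a few lines. The sketch below takes "Сов" (the first three letters of "Советов"), treats its UTF-8 bytes as if each were a single Latin-1 character, and encodes the result as UTF-8 again. It assumes the mbstring extension is available, and it uses ISO-8859-1 as a stand-in for MySQL's latin1; the two differ for bytes 0x80 to 0x9F, so only letters whose bytes fall outside that range (like these three) reproduce the stored hex exactly. The function names are hypothetical.

```php
<?php
// Misreads UTF-8 bytes as ISO-8859-1 characters and encodes them as UTF-8 again.
function doubleEncode(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

// Reverses one round of the mistake above.
function undoDoubleEncode(string $doubleEncoded): string
{
    return mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');
}

$original = "Сов";
$broken   = doubleEncode($original);

echo bin2hex($original), "\n"; // d0a1d0bed0b2 (correct UTF-8, 2 bytes per letter)
echo bin2hex($broken), "\n";   // c390c2a1c390c2bec390c2b2 (4 bytes per letter)
echo undoDoubleEncode($broken), "\n"; // Сов
```

The second hex string appears verbatim in the value from the question (right after the 20 that encodes the space), which supports the reading that the text was UTF-8 encoded twice on its way into the table.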

Could you please provide the complete script? Do you utf8_encode your string anywhere? If your script is this and only this (besides a valid, opened MySQL connection):

<?php
$hometeam="Крылья Советов";    //Cyrillic string
// open mysql connection here
mysql_set_charset('utf8');
mysql_query("INSERT INTO tablename (round, hometeam) VALUES ('1', '$hometeam')");
$result = mysql_query("SELECT * FROM tablename WHERE round = '1'");
$row = mysql_fetch_assoc($result);
echo $row['hometeam'];
?>

And you call the page, what is the result (in the page source of the browser, not what is displayed in the browser)?

Also, please check what happens if you change the collation to utf8_unicode_ci, as suggested in another answer here. That at least rules out phpMyAdmin issues when displaying binary data, and it is probably what you want anyway (since you probably want ORDER BY clauses to perform as expected, see the discussions in the SO questions I linked above).

EDIT2 Perhaps you could also provide some snippets like SHOW CREATE TABLE tablename or SHOW VARIABLES LIKE "%character_set%". Might help.

1) Try saving the entry through phpMyAdmin and then look at the result in phpMyAdmin as well. Does it look OK? If so, the database is created and set up properly.

2) Try to use utf8_general_ci instead. This shouldn't matter, but give it a try.

3) Tune all necessary settings on the PHP side - follow this post: http://blog.loftdigital.com/blog/php-utf-8-cheatsheet . Especially try this trick:

echo htmlentities($hometeam, ENT_QUOTES, 'UTF-8');

As I saw in the comments, you don't seem to be able to change your database configuration, is that right?

I guess the encoding is misconfigured somewhere, based on what I read in the official MySQL documentation.

I can propose a PHP-side workaround. Since encoding problems are so common, you can transform the string before inserting it into the database. You need a common representation that PHP and the database both handle the same way.

The one I tried in another project consists of transforming the string with urlencode($string) before saving it and urldecode($string) after reading it back.
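PHP's built-in functions for this are urlencode() and urldecode(). A minimal sketch of the round trip with the string from the question follows; the variable names $stored and $restored are illustrative only.

```php
<?php
$hometeam = "Крылья Советов";

// Percent-encode every non-ASCII byte so only plain ASCII reaches the database.
$stored = urlencode($hometeam);

// Decode again after reading the value back.
$restored = urldecode($stored);

echo $stored, "\n";
echo strlen($stored), "\n";               // 79
var_dump($restored === $hometeam);        // bool(true)
```

Each 2-byte Cyrillic letter becomes six ASCII characters, so the encoded value is 79 characters long and would not fit the varchar(32) hometeam column shown in the question. The stored text also can no longer be searched or sorted meaningfully in SQL, which makes this a last-resort workaround rather than a fix for the charset configuration.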

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow