Character Set Encoding in a Table
-
16-10-2019 - |
문제
I have the following data:Trollhättan
If I print out and setting up the header to
utf-8
, then this is the output in the browser.If I'm not setting up the
utf-8
in the header,Trollhättan
. However,when I'm storing the data in the database and checking throughphpmyadmin
, I am getting the following string : Trollhättan.When I am setting up the header to
utf-8
, I'm getting this string :Trollhättan
beside thisTrollhättan
.
Since the table is latin1_swedish_ci
, do I have to use in the table UTF-8
?
해결책
Here is the list of collations under UTF-8
mysql> select * from information_schema.collations where CHARACTER_SET_NAME = 'utf8';
+--------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN |
+--------------------+--------------------+-----+------------+-------------+---------+
| utf8_general_ci | utf8 | 33 | Yes | Yes | 1 |
| utf8_bin | utf8 | 83 | | Yes | 1 |
| utf8_unicode_ci | utf8 | 192 | | Yes | 8 |
| utf8_icelandic_ci | utf8 | 193 | | Yes | 8 |
| utf8_latvian_ci | utf8 | 194 | | Yes | 8 |
| utf8_romanian_ci | utf8 | 195 | | Yes | 8 |
| utf8_slovenian_ci | utf8 | 196 | | Yes | 8 |
| utf8_polish_ci | utf8 | 197 | | Yes | 8 |
| utf8_estonian_ci | utf8 | 198 | | Yes | 8 |
| utf8_spanish_ci | utf8 | 199 | | Yes | 8 |
| utf8_swedish_ci | utf8 | 200 | | Yes | 8 |
| utf8_turkish_ci | utf8 | 201 | | Yes | 8 |
| utf8_czech_ci | utf8 | 202 | | Yes | 8 |
| utf8_danish_ci | utf8 | 203 | | Yes | 8 |
| utf8_lithuanian_ci | utf8 | 204 | | Yes | 8 |
| utf8_slovak_ci | utf8 | 205 | | Yes | 8 |
| utf8_spanish2_ci | utf8 | 206 | | Yes | 8 |
| utf8_roman_ci | utf8 | 207 | | Yes | 8 |
| utf8_persian_ci | utf8 | 208 | | Yes | 8 |
| utf8_esperanto_ci | utf8 | 209 | | Yes | 8 |
| utf8_hungarian_ci | utf8 | 210 | | Yes | 8 |
| utf8_sinhala_ci | utf8 | 211 | | Yes | 8 |
+--------------------+--------------------+-----+------------+-------------+---------+
22 rows in set (0.03 sec)
latin1_swedish_ci
belongs to latin1
mysql> select * from information_schema.collations where COLLATION_NAME = 'latin1_swedish_ci';
+-------------------+--------------------+----+------------+-------------+---------+
| COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN |
+-------------------+--------------------+----+------------+-------------+---------+
| latin1_swedish_ci | latin1 | 8 | Yes | Yes | 1 |
+-------------------+--------------------+----+------------+-------------+---------+
1 row in set (0.03 sec)
The closest utf8 for your collation is ID 200
mysql> select * from information_schema.collations where ID = 200;
+-----------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN |
+-----------------+--------------------+-----+------------+-------------+---------+
| utf8_swedish_ci | utf8 | 200 | | Yes | 8 |
+-----------------+--------------------+-----+------------+-------------+---------+
1 row in set (0.00 sec)
CAVEAT
You may have to experiment with latin1, change the character set and collation of the individual column in the table, or possibly both. At the very least, use latin1 for display in phpmyadmin.
You could tweek and experiment with making the entire database a specific character set and collation using ALTER DATABASE.
ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;
Make sure you make a backup of the database and load it into a dev/staging DB and then run ALTER DATABASE against dev/staging.
다른 팁
You need to make sure that when you are "storing the data in the database", the client you use is using utf8. Then setting the http header to utf8 completes the round trip.
The surprise is that it doesn't matter much what character set is used in the database (although it might well be convenient for that to be utf8 too).