Question

I have two databases, each with a utf8 table, and I wanted to change the character set of one column. I tried two approaches and ran into a problem with one of them. Initially the table looked like this:

CREATE TABLE `spool` (
  `username` varchar(250) NOT NULL,
  `xml` text  NOT NULL,
  `seq` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY `seq` (`seq`),
  KEY `i_despool` (`username`) USING BTREE,
  KEY `i_spool_created_at` (`created_at`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8

First scenario

In the first one I changed the column's character set with the command below:

ALTER TABLE spool MODIFY xml TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL;

and then SHOW CREATE TABLE spool; shows this:

CREATE TABLE `spool` (
  `username` varchar(250) NOT NULL,
  `xml` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `seq` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY `seq` (`seq`),
  KEY `i_despool` (`username`) USING BTREE,
  KEY `i_spool_created_at` (`created_at`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Second scenario

In the second table I first changed the table's default character set and then changed the xml column's character set, like this:

ALTER TABLE spool 
CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

and then I changed the column's character set as below:

ALTER TABLE spool MODIFY xml TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL;

and then the table looks like this:

CREATE TABLE `spool` (
  `username` varchar(250) COLLATE utf8mb4_unicode_ci NOT NULL,
  `xml` text COLLATE utf8mb4_unicode_ci NOT NULL,
  `seq` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY `seq` (`seq`),
  KEY `i_despool` (`username`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=30849368 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

As you can see, in the second scenario the "`xml` text" column does not show CHARACTER SET utf8mb4. I think this is what causes my error, because when Ejabberd runs an INSERT against the second table I see the error below, while the other table does not produce it:

HY000Incorrect string value: '\\xF0\\x9F\\x98\\x8F\\xF0\\x9F...' for column 'xml' at row 1"
** Stacktrace: [{ejabberd_odbc,sql_query_t,1,[{file,"src/ejabberd_odbc.erl"},{line,173}]},{lists,foreach,2,[{file,"lists.erl"},{line,1336}]},{ejabberd_odbc,outer_transaction,3,[{file,"src/ejabberd_odbc.erl"},{line,443}]},{ejabberd_odbc,run_sql_cmd,4,[{file,"src/ejabberd_odbc.erl"},{line,380}]},{p1_fsm,handle_msg,10,[{file,"src/p1_fsm.erl"},{line,582}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]

In the first scenario we don't see this error, but in the second one we do. How can I solve this problem?

Solution

Interesting question. It looks like in the second scenario you first changed only the default character set of the table. The MySQL manual says:

To change only the default character set for a table, use this statement: ALTER TABLE tbl_name DEFAULT CHARACTER SET charset_name;

Then you tried to MODIFY the character set of your xml column, but the database may treat this column as already being in the desired character set (remember, you had already changed the table default).

Try a single statement like this instead (see the same page of the MySQL manual):

ALTER TABLE spool 
CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

To change the table default character set and all character columns (CHAR, VARCHAR, TEXT) to a new character set, use a statement like this: ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;

The statement also changes the collation of all character columns. If you specify no COLLATE clause to indicate which collation to use, the statement uses default collation for the character set. If this collation is inappropriate for the intended table use (for example, if it would change from a case-sensitive collation to a case-insensitive collation), specify a collation explicitly.
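One way to confirm what actually happened after either ALTER is to read the column metadata directly rather than relying on SHOW CREATE TABLE, which as far as I know leaves out a column's CHARACTER SET clause when it matches the table default. A minimal sketch, assuming the currently selected database is the one that contains spool:

```sql
-- Actual character set and collation of each character column in spool.
-- Assumption: the current database is the one holding the table.
SELECT COLUMN_NAME, COLUMN_TYPE, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'spool'
  AND CHARACTER_SET_NAME IS NOT NULL;

-- Table-level default collation, for comparison with the columns.
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'spool';
```

If xml reports utf8mb4 here, the column itself is probably fine and the cause is more likely somewhere else, for example in the connection.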

OTHER TIPS

You are saying (in both cases) that an INSERT of xml containing 😏 was performed. And that it worked in the first case, but gave an error in the second case?

Describe the client -- its connection parameters, etc.

In the first case all went well; I suspect the client was correctly issuing SET NAMES utf8mb4, or the equivalent in its connection string.

But the connection in the second case seems to have been made with utf8.

Or are you saying that 😏 was in the table, and an ALTER elicited the error message?
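To check the connection side, something like the following can be run from the same kind of session the client uses. This is only a sketch: how Ejabberd sets its connection character set depends on its version and configuration, which I am not assuming anything about here.

```sql
-- Character sets in effect for the current session.
SHOW SESSION VARIABLES LIKE 'character_set_%';
SHOW SESSION VARIABLES LIKE 'collation_connection';

-- Switch this session to utf8mb4. SET NAMES sets character_set_client,
-- character_set_connection and character_set_results together.
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;
```

If character_set_client or character_set_connection shows utf8 for the failing client, 4-byte characters can be rejected before the column definition even matters.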

Answer:

Others have already answered this, but I'll add some points here too:

As they said, when we set the default character set of the table like below:

ALTER TABLE spool
CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

and afterwards set the xml column's character set like this:

ALTER TABLE spool MODIFY xml TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL;

MySQL seems to think that the xml column's character set has already been set, so no change happens, and the column is shown like this:

  `xml` text COLLATE utf8mb4_unicode_ci NOT NULL,

This causes errors on insertion into the table, because the xml column's character set has not really been set; I think this may be a bug. I rolled back the spool table's default character set:

ALTER TABLE spool CHARACTER SET utf8;

After this, the xml column's character set is shown correctly:

show create table spool

CREATE TABLE `spool` (
  `username` varchar(250) NOT NULL,
  `xml` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `seq` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY `seq` (`seq`),
  KEY `i_despool` (`username`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=32533032 DEFAULT CHARSET=utf8

and I no longer get the database error from Ejabberd either.
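A quick way to test whether a column definition plus a given connection can hold 4-byte characters, without touching spool, is a scratch table. A minimal sketch; spool_charset_test is a hypothetical name used only for this check:

```sql
SET NAMES utf8mb4;

-- Hypothetical scratch table with the same definition as spool.xml.
CREATE TEMPORARY TABLE spool_charset_test (
  xml TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL
);

-- F0 9F 98 8F is the 4-byte UTF-8 sequence quoted in the error message.
INSERT INTO spool_charset_test (xml)
VALUES (CONVERT(UNHEX('F09F988F') USING utf8mb4));

-- Compare the stored bytes with what was inserted.
SELECT HEX(xml) AS stored_hex,
       CHAR_LENGTH(xml) AS n_chars,
       LENGTH(xml) AS n_bytes
FROM spool_charset_test;

DROP TEMPORARY TABLE spool_charset_test;
```

Repeating the test with the column declared as CHARACTER SET utf8 should, I would expect, reproduce the "Incorrect string value" error when strict SQL mode is on.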

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange