How to force mysqldump to produce /*!40101 SET character_set_client = utf8mb4 */;
22-02-2021
Question
I am attempting to dump my database, which is set to, and only contains, utf8mb4
and appropriate collations. I use the following line:
mysqldump --add-drop-table --add-locks --allow-keywords --complete-insert=TRUE --compress --create-options --databases [database] --default-character-set=utf8mb4 --disable-keys --dump-date --extended-insert=FALSE --hex-blob --lock-tables=FALSE --no-autocommit --no-data --set-charset --order-by-primary --quote-names --routines --single-transaction --skip-quick --triggers -u [user] -h [server] > dump.sql
This works well and produces the output I want, with one exception. My dump is littered with this:
/*!40101 SET character_set_client = utf8 */;
/*!50001 SET character_set_results = utf8 */;
/*!50001 SET collation_connection = utf8_general_ci */;
I have read up on conditional comments and, as I understand it, these statements will always be executed on our clients and servers (Server version: 5.6.46 - MySQL Community Server (GPL)). I don't understand why the character set is utf8 instead of utf8mb4, and I would like the dump to reflect our actual charset.
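To make the conditional-comment rule concrete: the five digits after `/*!` encode a minimum server version as one digit for the major version and two digits each for minor and release, so `40101` means 4.1.1 and the statement runs on any server at or above that version. The following is a minimal sketch of that comparison; the function names `version_to_id` and `comment_runs_on` are hypothetical helpers made up for illustration, and only the five-digit form is handled.

```shell
# Hypothetical helper: convert a dotted server version such as 5.6.46
# (optionally with a suffix like "-log") to the numeric Mmmrr form.
version_to_id() {
  v=${1%%-*}          # drop any suffix after the first dash
  major=${v%%.*}
  rest=${v#*.}
  minor=${rest%%.*}
  patch=${rest#*.}
  printf '%d%02d%02d\n' "$major" "$minor" "$patch"
}

# Hypothetical helper: succeed if the conditional comment in $1 would be
# executed by a server of version $2.
comment_runs_on() {
  # Extract the five-digit version that follows "/*!".
  id=$(printf '%s\n' "$1" | sed -n 's|^/\*!\([0-9]\{5\}\).*|\1|p')
  [ -n "$id" ] && [ "$(version_to_id "$2")" -ge "$id" ]
}
```

For example, `comment_runs_on '/*!40101 SET character_set_client = utf8 */;' 5.6.46` succeeds, which matches the observation that the statement is executed on a 5.6.46 server.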
None of the existing answers to this problem have helped me.
Edit: I originally had an incorrect version in my post
Solution
I checked out mysql-5.6.46 and searched client/mysqldump.c for "SET character_set_client".
The search finds:
[:~/Source/mysql-server/client]↥ 9c3a49ec84b* ± grep 'SET character_set_client' mysqldump.c
"/*!50003 SET character_set_client = %s */ %s\n"
"/*!50003 SET character_set_client = @saved_cs_client */ %s\n"
"SET character_set_client = utf8;\n"
"SET character_set_client = @saved_cs_client;\n");
"/*!40101 SET character_set_client = utf8 */;\n"
"/*!40101 SET character_set_client = @saved_cs_client */;\n",
"/*!50001 SET character_set_client = %s */;\n"
"/*!50001 SET character_set_client = @saved_cs_client */;\n"
From this, it seems that some instances of "SET character_set_client" are hard-coded to utf8 in this version rather than parameterized.
One of them is the /*!40101 SET character_set_client = utf8 */; line you quote as being faulty.
After checking out the current 8.0 HEAD, I get:
[:~/Source/mysql-server/client]↥ 8.0* ± grep 'SET character_set_client' mysqldump.cc
"/*!50003 SET character_set_client = %s */ %s\n"
"/*!50003 SET character_set_client = @saved_cs_client */ %s\n"
"/*!50503 SET character_set_client = utf8mb4 */;\n"
"SET character_set_client = @saved_cs_client;\n");
"/*!50503 SET character_set_client = utf8mb4 */;\n"
"/*!40101 SET character_set_client = @saved_cs_client */;\n",
"/*!50001 SET character_set_client = %s */;\n"
"/*!50001 SET character_set_client = @saved_cs_client */;\n"
So while there are still hard-coded SET character_set_client statements, they now default to utf8mb4. Note that the remaining statements use a %s placeholder, and further checking shows they take whatever value you pass via --default-character-set (utf8mb4 in your command line).
My suggestion is to run the dump, possibly remotely, with a newer MySQL 8.0 mysqldump binary. In general, upgrading is advisable, as 5.6 is an outdated, unsupported version of the database (yes, I know, it's complicated. It always is.)
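Whichever client you end up using, it can help to check the resulting file for leftover hard-coded utf8 statements. The following is a minimal sketch, not part of mysqldump: `find_utf8_pins` is a hypothetical helper name, and the pattern only covers the three variables shown in the question.

```shell
# Hypothetical helper: read a dump on stdin and print, with line numbers,
# every SET statement that still pins the connection to utf8 (or a
# utf8_* collation). utf8mb4 is not matched, because the character after
# "utf8" must be a space, ";", "_" or end of line.
# Exit status follows grep: 0 if matches were found, 1 if none.
find_utf8_pins() {
  grep -nE 'SET (character_set_(client|results)|collation_connection) *= *utf8([ ;_]|$)'
}
```

Usage would be `find_utf8_pins < dump.sql`; an empty result with exit status 1 means no such statements remain.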
OTHER TIPS
It looks like there has been a bug report on this issue:
----- 2019-07-22 8.0.17 General Availability -- -- -----
mysqldump failed to wrap SET NAMES utf8mb4 and SET character_set_client = utf8mb4 statements within version-specific comments, which could cause compatibility problems. (Bug #29007506, Bug #93450)
----- 2018-11-20 MariaDB 10.3.11 -- Release Note -- -----
mysqldump now uses utf8mb4 as a default character set, instead of utf8.