Question

I am attempting to dump my database, which is set to, and only contains, utf8mb4 and appropriate collations. I use the following line:

mysqldump --add-drop-table --add-locks --allow-keywords --complete-insert=TRUE --compress --create-options --databases [database] --default-character-set=utf8mb4 --disable-keys --dump-date --extended-insert=FALSE --hex-blob --lock-tables=FALSE --no-autocommit --no-data --set-charset --order-by-primary --quote-names --routines --single-transaction --skip-quick --triggers -u [user] -h [server] > dump.sql

This works well and produces the output I want, with one exception. My dump is littered with this:

/*!40101 SET character_set_client  = utf8 */;
/*!50001 SET character_set_results = utf8 */;
/*!50001 SET collation_connection  = utf8_general_ci */;

I have read up on conditional comments, and as I understand it, this will always get executed on our clients/servers (Server Version: 5.6.46 - MySQL Community Server (GPL)). I don't understand why this is set to utf8 instead of utf8mb4, and would like it to reflect our proper charset.

None of the existing answers to this problem have helped me.

Edit: I originally had an incorrect version in my post

Solution

I checked out mysql-5.6.46 and searched client/mysqldump.c for "SET character_set_client".

The search finds:

[:~/Source/mysql-server/client]↥ 9c3a49ec84b* ± grep 'SET character_set_client' mysqldump.c
          "/*!50003 SET character_set_client  = %s */ %s\n"
          "/*!50003 SET character_set_client  = @saved_cs_client */ %s\n"
                  "SET character_set_client = utf8;\n"
                  "SET character_set_client = @saved_cs_client;\n");
                "/*!40101 SET character_set_client = utf8 */;\n"
                "/*!40101 SET character_set_client = @saved_cs_client */;\n",
            "/*!50001 SET character_set_client      = %s */;\n"
            "/*!50001 SET character_set_client      = @saved_cs_client */;\n"

From this, it seems that some instances of "SET character_set_client" are hard-coded to utf8 in this version rather than filled in from a variable.

One of them is the /*!40101 SET character_set_client = utf8 */; line you quote as being faulty.

After checking out current 8.0 HEAD, I get

[:~/Source/mysql-server/client]↥ 8.0* ± grep 'SET character_set_client' mysqldump.cc
          "/*!50003 SET character_set_client  = %s */ %s\n"
          "/*!50003 SET character_set_client  = @saved_cs_client */ %s\n"
                  "/*!50503 SET character_set_client = utf8mb4 */;\n"
                  "SET character_set_client = @saved_cs_client;\n");
              "/*!50503 SET character_set_client = utf8mb4 */;\n"
              "/*!40101 SET character_set_client = @saved_cs_client */;\n",
        "/*!50001 SET character_set_client      = %s */;\n"
        "/*!50001 SET character_set_client      = @saved_cs_client */;\n"

So while there are still hard-coded SET character_set_client statements, they now use utf8mb4. Note that all the other statements are parameterized with %s, and further checking shows they follow whatever you pass as --default-character-set (utf8mb4 on your command line).

My suggestion is to dump, possibly remotely, with a newer MySQL 8.0 binary. In general, upgrading is advisable, as 5.6 is an outdated, unsupported version of the database (yes, I know, it's complicated. It always is.)
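
Whichever binary produces the dump, a quick check can confirm whether the hard-coded utf8 statements are still present. The following is a minimal sketch, not part of mysqldump itself: the function name count_hardcoded_utf8 is hypothetical, and the pattern only targets the three conditional-comment lines quoted in the question.

```shell
# Hypothetical helper (not a mysqldump feature): count conditional-comment
# SET statements in a dump file that still pin the 3-byte utf8 charset
# or its utf8_general_ci collation. Lines using utf8mb4 or
# @saved_cs_client are not counted.
count_hardcoded_utf8() {
    # $1: path to the dump file
    grep -c -E 'SET (character_set_client|character_set_results|collation_connection) *= *utf8(_general_ci)? \*/' "$1"
}
```

Running count_hardcoded_utf8 dump.sql prints the number of matching lines; a result of 0 suggests the dump no longer contains the statements in question. Note that grep exits with a non-zero status when nothing matches, which matters if the check runs under set -e.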

OTHER TIPS

It looks like this was reported as a bug and addressed in later releases:

----- 2019-07-22 8.0.17 General Availability -- -- -----

mysqldump failed to wrap SET NAMES utf8mb4 and SET character_set_client = utf8mb4 statements within version-specific comments, which could cause compatibility problems. (Bug #29007506, Bug #93450)

----- 2018-11-20 MariaDB 10.3.11 -- Release Note -- -----

mysqldump now uses utf8mb4 as a default character set, instead of utf8.

Licensed under: CC-BY-SA with attribution