Question

Is there a SQL or PHP script that I can run that will change the default collation in all tables and fields in a database?

I can write one myself, but I think that this should be something that readily available at a site like this. If I can come up with one myself before somebody posts one, I will post it myself.

Was it helpful?

Solution

Be careful! If you actually have utf stored as another encoding, you could have a real mess on your hands. Back up first. Then try some of the standard methods:

for instance http://www.cesspit.net/drupal/node/898 http://www.hackszine.com/blog/archive/2007/05/mysql_database_migration_latin.html

I've had to resort to converting all text fields to binary, then back to varchar/text. This has saved my ass.

I had data is UTF8, stored as latin1. What I did:

Drop indexes. Convert fields to binary. Convert to utf8-general ci

If your on LAMP, don’t forget to add set NAMES command before interacting with the db, and make sure you set character encoding headers.

OTHER TIPS

Can be done in a single command (rather than 148 of PHP):

mysql --database=dbname -B -N -e "SHOW TABLES" \
| awk '{print "SET foreign_key_checks = 0; ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; SET foreign_key_checks = 1; "}' \
| mysql --database=dbname &

You've got to love the commandline... (You might need to employ the --user and --password options for mysql).

EDIT: to avoid foreign key problems, added SET foreign_key_checks = 0; and SET foreign_key_checks = 1;

I think it's easy to do this in two steps runin PhpMyAdmin.
Step 1:

SELECT CONCAT('ALTER TABLE `', t.`TABLE_SCHEMA`, '`.`', t.`TABLE_NAME`,
 '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;') as stmt 
FROM `information_schema`.`TABLES` t
WHERE 1
AND t.`TABLE_SCHEMA` = 'database_name'
ORDER BY 1

Step 2:
This query will output a list of queries, one for each table. You have to copy the list of queries, and paste them to the command line or to PhpMyAdmin's SQL tab for the changes to be made.

OK, I wrote this up taking into account what was said in this thread. Thanks for the help, and I hope this script will help out others. I don't have any warranty for its use, so PLEASE BACKUP before running it. It should work with all databases; and it worked great on my own.

EDIT: Added vars at the top for which charset/collate to convert to. EDIT2: Changes the database's and tables' default charset/collate

<?php

function MysqlError()
{
    if (mysql_errno())
    {
        echo "<b>Mysql Error: " . mysql_error() . "</b>\n";
    }
}

$username = "root";
$password = "";
$db = "database";
$host = "localhost";

$target_charset = "utf8";
$target_collate = "utf8_general_ci";

echo "<pre>";

$conn = mysql_connect($host, $username, $password);
mysql_select_db($db, $conn);

$tabs = array();
$res = mysql_query("SHOW TABLES");
MysqlError();
while (($row = mysql_fetch_row($res)) != null)
{
    $tabs[] = $row[0];
}

// now, fix tables
foreach ($tabs as $tab)
{
    $res = mysql_query("show index from {$tab}");
    MysqlError();
    $indicies = array();

    while (($row = mysql_fetch_array($res)) != null)
    {
        if ($row[2] != "PRIMARY")
        {
            $indicies[] = array("name" => $row[2], "unique" => !($row[1] == "1"), "col" => $row[4]);
            mysql_query("ALTER TABLE {$tab} DROP INDEX {$row[2]}");
            MysqlError();
            echo "Dropped index {$row[2]}. Unique: {$row[1]}\n";
        }
    }

    $res = mysql_query("DESCRIBE {$tab}");
    MysqlError();
    while (($row = mysql_fetch_array($res)) != null)
    {
        $name = $row[0];
        $type = $row[1];
        $set = false;
        if (preg_match("/^varchar\((\d+)\)$/i", $type, $mat))
        {
            $size = $mat[1];
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} VARBINARY({$size})");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} VARCHAR({$size}) CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "CHAR"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} BINARY(1)");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} VARCHAR(1) CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "TINYTEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} TINYBLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} TINYTEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "MEDIUMTEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} MEDIUMBLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} MEDIUMTEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "LONGTEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} LONGBLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} LONGTEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "TEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} BLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} TEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }

        if ($set)
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} COLLATE {$target_collate}");
    }

    // re-build indicies..
    foreach ($indicies as $index)
    {
        if ($index["unique"])
        {
            mysql_query("CREATE UNIQUE INDEX {$index["name"]} ON {$tab} ({$index["col"]})");
            MysqlError();
        }
        else
        {
            mysql_query("CREATE INDEX {$index["name"]} ON {$tab} ({$index["col"]})");
            MysqlError();
        }

        echo "Created index {$index["name"]} on {$tab}. Unique: {$index["unique"]}\n";
    }

    // set default collate
    mysql_query("ALTER TABLE {$tab}  DEFAULT CHARACTER SET {$target_charset} COLLATE {$target_collate}");
}

// set database charset
mysql_query("ALTER DATABASE {$db} DEFAULT CHARACTER SET {$target_charset} COLLATE {$target_collate}");

mysql_close($conn);
echo "</pre>";

?>

This PHP snippet will change the collation on all tables in a db. (It's taken from this site.)

<?php
// your connection
mysql_connect("localhost","root","***");
mysql_select_db("db1");

// convert code
$res = mysql_query("SHOW TABLES");
while ($row = mysql_fetch_array($res))
{
    foreach ($row as $key => $table)
    {
        mysql_query("ALTER TABLE " . $table . " CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci");
        echo $key . " =&gt; " . $table . " CONVERTED<br />";
    }
}
?> 

Another approach using command line, based on @david's without the awk

for t in $(mysql --user=root --password=admin  --database=DBNAME -e "show tables";);do echo "Altering" $t;mysql --user=root --password=admin --database=DBNAME -e "ALTER TABLE $t CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;";done

prettified

  for t in $(mysql --user=root --password=admin  --database=DBNAME -e "show tables";);
    do 
       echo "Altering" $t;
       mysql --user=root --password=admin --database=DBNAME -e "ALTER TABLE $t CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;";
    done

A more complete version of the script above can be found here:

http://www.zen-cart.com/index.php?main_page=product_contrib_info&products_id=1937

Please leave any feedback about this contribution here:http://www.zen-cart.com/forum/showthread.php?p=1034214

Charset and collation are not the same thing. A collation is a set of rules about how to sort strings. A charset is a set of rules about how to represent characters. A collation depends on the charset.

In scripts above all tables selected for convertation (with SHOW TABLES), but a more convenient and portable way to check the table collation before converting a table. This query does it:

SELECT table_name
     , table_collation 
FROM information_schema.tables

Use my custom shell collatedb, it should work :

collatedb <username> <password> <database> <collation>

Example :

collatedb root 0000 myDatabase utf8_bin

Thanks @nlaq for the code, that got me started on the below solution.

I released a WordPress plugin without realising that WordPress doesn't set the collate automatically. So a lot of people using the plugin ended up with latin1_swedish_ci when it should have been utf8_general_ci.

Here's the code I added to the plugin to detect the latin1_swedish_ci collate and change it to utf8_general_ci.

Test this code before using it in your own plugin!

// list the names of your wordpress plugin database tables (without db prefix)
$tables_to_check = array(
    'social_message',
    'social_facebook',
    'social_facebook_message',
    'social_facebook_page',
    'social_google',
    'social_google_mesage',
    'social_twitter',
    'social_twitter_message',
);
// choose the collate to search for and replace:
$convert_fields_collate_from = 'latin1_swedish_ci';
$convert_fields_collate_to = 'utf8_general_ci';
$convert_tables_character_set_to = 'utf8';
$show_debug_messages = false;
global $wpdb;
$wpdb->show_errors();
foreach($tables_to_check as $table) {
    $table = $wpdb->prefix . $table;
    $indicies = $wpdb->get_results(  "SHOW INDEX FROM `$table`", ARRAY_A );
    $results = $wpdb->get_results( "SHOW FULL COLUMNS FROM `$table`" , ARRAY_A );
    foreach($results as $result){
        if($show_debug_messages)echo "Checking field ".$result['Field'] ." with collat: ".$result['Collation']."\n";
        if(isset($result['Field']) && $result['Field'] && isset($result['Collation']) && $result['Collation'] == $convert_fields_collate_from){
            if($show_debug_messages)echo "Table: $table - Converting field " .$result['Field'] ." - " .$result['Type']." - from $convert_fields_collate_from to $convert_fields_collate_to \n";
            // found a field to convert. check if there's an index on this field.
            // we have to remove index before converting field to binary.
            $is_there_an_index = false;
            foreach($indicies as $index){
                if ( isset($index['Column_name']) && $index['Column_name'] == $result['Field']){
                    // there's an index on this column! store it for adding later on.
                    $is_there_an_index = $index;
                    $wpdb->query( $wpdb->prepare( "ALTER TABLE `%s` DROP INDEX %s", $table, $index['Key_name']) );
                    if($show_debug_messages)echo "Dropped index ".$index['Key_name']." before converting field.. \n";
                    break;
                }
            }
            $set = false;

            if ( preg_match( "/^varchar\((\d+)\)$/i", $result['Type'], $mat ) ) {
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` VARBINARY({$mat[1]})" );
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` VARCHAR({$mat[1]}) CHARACTER SET {$convert_tables_character_set_to} COLLATE {$convert_fields_collate_to}" );
                $set = true;
            } else if ( !strcasecmp( $result['Type'], "CHAR" ) ) {
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` BINARY(1)" );
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` VARCHAR(1) CHARACTER SET {$convert_tables_character_set_to} COLLATE {$convert_fields_collate_to}" );
                $set = true;
            } else if ( !strcasecmp( $result['Type'], "TINYTEXT" ) ) {
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` TINYBLOB" );
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` TINYTEXT CHARACTER SET {$convert_tables_character_set_to} COLLATE {$convert_fields_collate_to}" );
                $set = true;
            } else if ( !strcasecmp( $result['Type'], "MEDIUMTEXT" ) ) {
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` MEDIUMBLOB" );
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` MEDIUMTEXT CHARACTER SET {$convert_tables_character_set_to} COLLATE {$convert_fields_collate_to}" );
                $set = true;
            } else if ( !strcasecmp( $result['Type'], "LONGTEXT" ) ) {
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` LONGBLOB" );
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` LONGTEXT CHARACTER SET {$convert_tables_character_set_to} COLLATE {$convert_fields_collate_to}" );
                $set = true;
            } else if ( !strcasecmp( $result['Type'], "TEXT" ) ) {
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` BLOB" );
                $wpdb->query( "ALTER TABLE `{$table}` MODIFY `{$result['Field']}` TEXT CHARACTER SET {$convert_tables_character_set_to} COLLATE {$convert_fields_collate_to}" );
                $set = true;
            }else{
                if($show_debug_messages)echo "Failed to change field - unsupported type: ".$result['Type']."\n";
            }
            if($set){
                if($show_debug_messages)echo "Altered field success! \n";
                $wpdb->query( "ALTER TABLE `$table` MODIFY {$result['Field']} COLLATE $convert_fields_collate_to" );
            }
            if($is_there_an_index !== false){
                // add the index back.
                if ( !$is_there_an_index["Non_unique"] ) {
                    $wpdb->query( "CREATE UNIQUE INDEX `{$is_there_an_index['Key_name']}` ON `{$table}` ({$is_there_an_index['Column_name']})", $is_there_an_index['Key_name'], $table, $is_there_an_index['Column_name'] );
                } else {
                    $wpdb->query( "CREATE UNIQUE INDEX `{$is_there_an_index['Key_name']}` ON `{$table}` ({$is_there_an_index['Column_name']})", $is_there_an_index['Key_name'], $table, $is_there_an_index['Column_name'] );
                }
            }
        }
    }
    // set default collate
    $wpdb->query( "ALTER TABLE `{$table}` DEFAULT CHARACTER SET {$convert_tables_character_set_to} COLLATE {$convert_fields_collate_to}" );
    if($show_debug_messages)echo "Finished with table $table \n";
}
$wpdb->hide_errors();

A simple (dumb? :) solution, using multi-select feature of Your IDE:

  1. run "SHOW TABLES;" query and copy results column (table names).
  2. multi-select beginnings and add "ALTER TABLE ".
  3. multi-select endings and add " CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;"
  4. run created queries.

Here's an easy way to do this with just phpmyadmin if you don't have command line access or access to edit INFORMATION_SCHEMA.

First, listen to the advice of many of the other answers here - you can really screw things up here, so make a backup. Now make a backup of your backup. Also this is unlikely to work if your data is encoded differently than what you are changing it to.

Note that you will need to find the exact names of the offending schema and character encoding that you need to change from before starting.

  1. Export the database as SQL; Make a copy; Open it in a text editor of your choice
  2. Find and Replace the schema first, for example - find: latin1_swedish_ci, replace: utf8_general_ci
  3. Find and Replace the character encodings if you need to, for example - find: latin1, replace: utf8
  4. Create a new test database and upload your new SQL file into phpmyadmin

This is a super easy way to do it, but again, this will not change the encoding of your data, so it will only work in certain circumstances.

I think the fastest way is with phpmyadmin and some jQuery on console.

Go to table's structure and open chrome/firefox developer console (normally F12 on keyboard):

  1. run this code to select all fields with incorrect charset and start modify:

    var elems = $('dfn'); var lastID = elems.length - 1;
    elems.each(function(i) {
        if ($(this).html() != 'utf8_general_ci') { 
           $('input:checkbox', $('td', $(this).parent().parent()).first()).attr('checked','checked');
        }       
    
        if (i == lastID) {
            $("button[name='submit_mult'][value='change']").click();
        }
    });
    
  2. when page is loaded use this code on console to select correct encoding:

    $("select[name*='field_collation']" ).val('utf8_general_ci');
    
  3. save

  4. change the table's charset on "Collation" field on "Operation" tab

Tested on phpmyadmin 4.0 and 4.4, but I think work on all 4.x versions

I updated nlaq's answer to work with PHP7 and to correctly handle multicolumn indices, binary collated data (e.g. latin1_bin), etc., and cleaned up the code a bit. This is the only code I found/tried that successfully migrated my database from latin1 to utf8.

<?php

/////////// BEGIN CONFIG ////////////////////

$username = "";
$password = "";
$db = "";
$host = "";

$target_charset = "utf8";
$target_collation = "utf8_unicode_ci";
$target_bin_collation = "utf8_bin";

///////////  END CONFIG  ////////////////////

function MySQLSafeQuery($conn, $query) {
    $res = mysqli_query($conn, $query);
    if (mysqli_errno($conn)) {
        echo "<b>Mysql Error: " . mysqli_error($conn) . "</b>\n";
        echo "<span>This query caused the above error: <i>" . $query . "</i></span>\n";
    }
    return $res;
}

function binary_typename($type) {
    $mysql_type_to_binary_type_map = array(
        "VARCHAR" => "VARBINARY",
        "CHAR" => "BINARY(1)",
        "TINYTEXT" => "TINYBLOB",
        "MEDIUMTEXT" => "MEDIUMBLOB",
        "LONGTEXT" => "LONGBLOB",
        "TEXT" => "BLOB"
    );

    $typename = "";
    if (preg_match("/^varchar\((\d+)\)$/i", $type, $mat))
        $typename = $mysql_type_to_binary_type_map["VARCHAR"] . "(" . (2*$mat[1]) . ")";
    else if (!strcasecmp($type, "CHAR"))
        $typename = $mysql_type_to_binary_type_map["CHAR"] . "(1)";
    else if (array_key_exists(strtoupper($type), $mysql_type_to_binary_type_map))
        $typename = $mysql_type_to_binary_type_map[strtoupper($type)];
    return $typename;
}

echo "<pre>";

// Connect to database
$conn = mysqli_connect($host, $username, $password);
mysqli_select_db($conn, $db);

// Get list of tables
$tabs = array();
$query = "SHOW TABLES";
$res = MySQLSafeQuery($conn, $query);
while (($row = mysqli_fetch_row($res)) != null)
    $tabs[] = $row[0];

// Now fix tables
foreach ($tabs as $tab) {
    $res = MySQLSafeQuery($conn, "SHOW INDEX FROM `{$tab}`");
    $indicies = array();

    while (($row = mysqli_fetch_array($res)) != null) {
        if ($row[2] != "PRIMARY") {
            $append = true;
            foreach ($indicies as $index) {
                if ($index["name"] == $row[2]) {
                    $index["col"][] = $row[4];
                    $append = false;
                }
            }
            if($append)
                $indicies[] = array("name" => $row[2], "unique" => !($row[1] == "1"), "col" => array($row[4]));
        }
    }

    foreach ($indicies as $index) {
        MySQLSafeQuery($conn, "ALTER TABLE `{$tab}` DROP INDEX `{$index["name"]}`");
        echo "Dropped index {$index["name"]}. Unique: {$index["unique"]}\n";
    }

    $res = MySQLSafeQuery($conn, "SHOW FULL COLUMNS FROM `{$tab}`");
    while (($row = mysqli_fetch_array($res)) != null) {
        $name = $row[0];
        $type = $row[1];
        $current_collation = $row[2];
        $target_collation_bak = $target_collation;
        if(!strcasecmp($current_collation, "latin1_bin"))
            $target_collation = $target_bin_collation;
        $set = false;
        $binary_typename = binary_typename($type);
        if ($binary_typename != "") {
            MySQLSafeQuery($conn, "ALTER TABLE `{$tab}` MODIFY `{$name}` {$binary_typename}");
            MySQLSafeQuery($conn, "ALTER TABLE `{$tab}` MODIFY `{$name}` {$type} CHARACTER SET '{$target_charset}' COLLATE '{$target_collation}'");
            $set = true;
            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        $target_collation = $target_collation_bak;
    }

    // Rebuild indicies
    foreach ($indicies as $index) {
         // Handle multi-column indices
         $joined_col_str = "";
         foreach ($index["col"] as $col)
             $joined_col_str = $joined_col_str . ", `" . $col . "`";
         $joined_col_str = substr($joined_col_str, 2);

         $query = "";
         if ($index["unique"])
             $query = "CREATE UNIQUE INDEX `{$index["name"]}` ON `{$tab}` ({$joined_col_str})";
         else
             $query = "CREATE INDEX `{$index["name"]}` ON `{$tab}` ({$joined_col_str})";
         MySQLSafeQuery($conn, $query);

        echo "Created index {$index["name"]} on {$tab}. Unique: {$index["unique"]}\n";
    }

    // Set default character set and collation for table
    MySQLSafeQuery($conn, "ALTER TABLE `{$tab}`  DEFAULT CHARACTER SET '{$target_charset}' COLLATE '{$target_collation}'");
}

// Set default character set and collation for database
MySQLSafeQuery($conn, "ALTER DATABASE `{$db}` DEFAULT CHARACTER SET '{$target_charset}' COLLATE '{$target_collation}'");

mysqli_close($conn);
echo "</pre>";

?>

For Windows Users

In addition to @davidwinterbottom answer, windows users can use command below:

mysql.exe --database=[database] -u [user] -p[password] -B -N -e "SHOW TABLES" \
| awk.exe '{print "SET foreign_key_checks = 0; ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; SET foreign_key_checks = 1; "}' \
| mysql.exe -u [user] -p[password] --database=[database] &

Replace [database], [user] and [password] placeholders with actual values.

Git-bash users can download this bash script and run it easily.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top