Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

A script to change all tables and fields to the utf-8-bin collation in MYSQL

People also ask

How do I change MySQL encoding to UTF-8?

To change the character set encoding to UTF-8 for the database itself, type the following command at the mysql> prompt. Replace dbname with the database name: Copy ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci; To exit the mysql program, type \q at the mysql> prompt.

How do I change the column collation in MySQL?

Goto PhpMyAdmin->Operations->Collation. No need of select collation while creating new columns. Show activity on this post. The following query will generate ALTER queries that change the collation for all appropriate columns in all tables to a certain type (utf8_general_ci in my example below).


Can be done in a single command (rather than 148 of PHP):

mysql --database=dbname -B -N -e "SHOW TABLES" \
| awk '{print "SET foreign_key_checks = 0; ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; SET foreign_key_checks = 1; "}' \
| mysql --database=dbname &

You've got to love the commandline... (You might need to employ the --user and --password options for mysql).

EDIT: to avoid foreign key problems, added SET foreign_key_checks = 0; and SET foreign_key_checks = 1;


I think it's easy to do this in two steps runin PhpMyAdmin.
Step 1:

SELECT CONCAT('ALTER TABLE `', t.`TABLE_SCHEMA`, '`.`', t.`TABLE_NAME`,
 '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;') as stmt 
FROM `information_schema`.`TABLES` t
WHERE 1
AND t.`TABLE_SCHEMA` = 'database_name'
ORDER BY 1

Step 2:
This query will output a list of queries, one for each table. You have to copy the list of queries, and paste them to the command line or to PhpMyAdmin's SQL tab for the changes to be made.


OK, I wrote this up taking into account what was said in this thread. Thanks for the help, and I hope this script will help out others. I don't have any warranty for its use, so PLEASE BACKUP before running it. It should work with all databases; and it worked great on my own.

EDIT: Added vars at the top for which charset/collate to convert to. EDIT2: Changes the database's and tables' default charset/collate

<?php

function MysqlError()
{
    if (mysql_errno())
    {
        echo "<b>Mysql Error: " . mysql_error() . "</b>\n";
    }
}

$username = "root";
$password = "";
$db = "database";
$host = "localhost";

$target_charset = "utf8";
$target_collate = "utf8_general_ci";

echo "<pre>";

$conn = mysql_connect($host, $username, $password);
mysql_select_db($db, $conn);

$tabs = array();
$res = mysql_query("SHOW TABLES");
MysqlError();
while (($row = mysql_fetch_row($res)) != null)
{
    $tabs[] = $row[0];
}

// now, fix tables
foreach ($tabs as $tab)
{
    $res = mysql_query("show index from {$tab}");
    MysqlError();
    $indicies = array();

    while (($row = mysql_fetch_array($res)) != null)
    {
        if ($row[2] != "PRIMARY")
        {
            $indicies[] = array("name" => $row[2], "unique" => !($row[1] == "1"), "col" => $row[4]);
            mysql_query("ALTER TABLE {$tab} DROP INDEX {$row[2]}");
            MysqlError();
            echo "Dropped index {$row[2]}. Unique: {$row[1]}\n";
        }
    }

    $res = mysql_query("DESCRIBE {$tab}");
    MysqlError();
    while (($row = mysql_fetch_array($res)) != null)
    {
        $name = $row[0];
        $type = $row[1];
        $set = false;
        if (preg_match("/^varchar\((\d+)\)$/i", $type, $mat))
        {
            $size = $mat[1];
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} VARBINARY({$size})");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} VARCHAR({$size}) CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "CHAR"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} BINARY(1)");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} VARCHAR(1) CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "TINYTEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} TINYBLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} TINYTEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "MEDIUMTEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} MEDIUMBLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} MEDIUMTEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "LONGTEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} LONGBLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} LONGTEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }
        else if (!strcasecmp($type, "TEXT"))
        {
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} BLOB");
            MysqlError();
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} TEXT CHARACTER SET {$target_charset}");
            MysqlError();
            $set = true;

            echo "Altered field {$name} on {$tab} from type {$type}\n";
        }

        if ($set)
            mysql_query("ALTER TABLE {$tab} MODIFY {$name} COLLATE {$target_collate}");
    }

    // re-build indicies..
    foreach ($indicies as $index)
    {
        if ($index["unique"])
        {
            mysql_query("CREATE UNIQUE INDEX {$index["name"]} ON {$tab} ({$index["col"]})");
            MysqlError();
        }
        else
        {
            mysql_query("CREATE INDEX {$index["name"]} ON {$tab} ({$index["col"]})");
            MysqlError();
        }

        echo "Created index {$index["name"]} on {$tab}. Unique: {$index["unique"]}\n";
    }

    // set default collate
    mysql_query("ALTER TABLE {$tab}  DEFAULT CHARACTER SET {$target_charset} COLLATE {$target_collate}");
}

// set database charset
mysql_query("ALTER DATABASE {$db} DEFAULT CHARACTER SET {$target_charset} COLLATE {$target_collate}");

mysql_close($conn);
echo "</pre>";

?>

Be careful! If you actually have utf stored as another encoding, you could have a real mess on your hands. Back up first. Then try some of the standard methods:

for instance http://www.cesspit.net/drupal/node/898 http://www.hackszine.com/blog/archive/2007/05/mysql_database_migration_latin.html

I've had to resort to converting all text fields to binary, then back to varchar/text. This has saved my ass.

I had data is UTF8, stored as latin1. What I did:

Drop indexes. Convert fields to binary. Convert to utf8-general ci

If your on LAMP, don’t forget to add set NAMES command before interacting with the db, and make sure you set character encoding headers.


This PHP snippet will change the collation on all tables in a db. (It's taken from this site.)

<?php
// your connection
mysql_connect("localhost","root","***");
mysql_select_db("db1");

// convert code
$res = mysql_query("SHOW TABLES");
while ($row = mysql_fetch_array($res))
{
    foreach ($row as $key => $table)
    {
        mysql_query("ALTER TABLE " . $table . " CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci");
        echo $key . " =&gt; " . $table . " CONVERTED<br />";
    }
}
?>