I'm getting an error when trying to connect to a remote MySQL database from a Windows 7 client via Python 2.7 + MySQLdb 1.2.5 + SQLAlchemy 1.0.9. The error started after I recently changed the server's default character set to utf8mb4. The server is running MySQL 5.5.50.
I connect like this:
DB_ENGINE = sqlalchemy.create_engine("mysql+mysqldb://{user}:{pass}@{host}:{port}/{database}?charset=utf8mb4".format(**DB_SETTINGS))
Session = sqlalchemy.orm.sessionmaker(bind=DB_ENGINE)
The error is:
File "C:\Applications\Python27\lib\site-packages\sqlalchemy\engine\default.py", line 385, in connect
return self.dbapi.connect(*cargs, **cparams)
File "C:\Applications\Python27\lib\site-packages\MySQLdb\__init__.py", line 81, in Connect
return Connection(*args, **kwargs)
File "C:\Applications\Python27\lib\site-packages\MySQLdb\connections.py", line 221, in __init__
self.set_character_set(charset)
File "C:\Applications\Python27\lib\site-packages\MySQLdb\connections.py", line 312, in set_character_set
super(Connection, self).set_character_set(charset)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2019, "Can't initialize character set utf8mb4 (path: C:\\mysql\\\\share\\charsets\\)")
The server's my.cnf contains the following:
init_connect = 'SET collation_connection = utf8mb4_unicode_ci'
init_connect = 'SET NAMES utf8mb4'
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
skip-character-set-client-handshake
I have no problem connecting to the database from an Ubuntu client, so I suspect the problem is with the Windows client and not the server's configuration.
The MySQL documentation suggests the error message could be due to the client being compiled without multibyte character set support:
http://dev.mysql.com/doc/refman/5.7/en/cannot-initialize-character-set.html
However, as this is Windows I'm simply downloading the client and don't have control over its compilation flags.
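One way to test that suspicion is to look at the version of the MySQL client library that MySQLdb was built against. The sketch below assumes that utf8mb4 was introduced in MySQL 5.5.3, so an older bundled client library would not know the character set. `parse_version` and `client_supports_utf8mb4` are hypothetical helper names, and the version comparison is only a heuristic (it says nothing about how a particular binary was compiled).

```python
import re

# Assumption: utf8mb4 was introduced in MySQL 5.5.3, so an older client
# library cannot initialize it.
UTF8MB4_MIN_VERSION = (5, 5, 3)


def parse_version(version_string):
    """Extract the leading 'major.minor.patch' numbers as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_string)
    return tuple(int(part) for part in match.groups())


def client_supports_utf8mb4(version_string):
    """Heuristic: True if the client library version is 5.5.3 or newer."""
    return parse_version(version_string) >= UTF8MB4_MIN_VERSION


if __name__ == "__main__":
    try:
        import MySQLdb
    except ImportError:
        print("MySQLdb is not installed; nothing to check.")
    else:
        # get_client_info() reports the version of the linked client library.
        info = MySQLdb.get_client_info()
        if isinstance(info, bytes):
            info = info.decode("ascii", "replace")
        print("client library %s, utf8mb4 expected to work: %s"
              % (info, client_supports_utf8mb4(info)))
```

If this reports a client library older than 5.5.3, the problem would lie in the bundled library rather than in the server configuration.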
I've tried installing MySQLdb in a variety of ways:
Each of these results in a MySQLdb library that can't seem to handle the utf8mb4 character set.
Any help would be much appreciated!
MySQL supports multiple Unicode character sets:
utf8mb4: A UTF-8 encoding of the Unicode character set using one to four bytes per character.
utf8mb3: A UTF-8 encoding of the Unicode character set using one to three bytes per character.
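In practice, the difference shows up for characters outside the Basic Multilingual Plane (code points above U+FFFF, such as most emoji), which take four bytes in UTF-8 and therefore need utf8mb4. A small Python 3 illustration follows; `utf8_length` and `needs_utf8mb4` are hypothetical helper names.

```python
def utf8_length(char):
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


def needs_utf8mb4(text):
    """True if any character in text lies above U+FFFF (four bytes in UTF-8)."""
    return any(ord(char) > 0xFFFF for char in text)


print(utf8_length("a"))            # 1
print(utf8_length("\u00e9"))       # 2 (e with acute accent)
print(utf8_length("\u20ac"))       # 3 (euro sign)
print(utf8_length("\U0001F600"))   # 4 (an emoji)
print(needs_utf8mb4("caf\u00e9"))      # False
print(needs_utf8mb4("hi \U0001F600"))  # True
```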
If you need a database, don't use MySQL or MariaDB. Use PostgreSQL. If you need to use MySQL or MariaDB, never use “utf8”. Always use “utf8mb4” when you want UTF-8.
Consider the following checklist:
Did you check your MySQL configuration file (/etc/my.cnf)? It should be:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
And you can verify them via:
mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)
-thanks to Mathias's blog post
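The same check can be scripted. The sketch below takes rows shaped like the output above, as (name, value) pairs such as a DB-API cursor returns from `fetchall()`, and reports the settings that are not utf8mb4. The helper name `find_non_utf8mb4` and the sample rows are made up for illustration; `character_set_filesystem` and `character_set_system` are skipped because the listing above shows them as `binary` and `utf8` in a correctly configured setup.

```python
# Variables that do not need to be utf8mb4, per the listing above.
IGNORED = {"character_set_filesystem", "character_set_system"}


def find_non_utf8mb4(rows):
    """Return {name: value} for settings whose value does not start with utf8mb4."""
    return {
        name: value
        for name, value in rows
        if name not in IGNORED and not value.startswith("utf8mb4")
    }


# Sample rows, e.g. from cursor.fetchall() after running the SHOW VARIABLES query.
rows = [
    ("character_set_client", "utf8mb4"),
    ("character_set_connection", "utf8"),
    ("character_set_system", "utf8"),
    ("collation_connection", "utf8_general_ci"),
    ("collation_server", "utf8mb4_unicode_ci"),
]
print(find_non_utf8mb4(rows))
```

An empty result would mean every relevant variable is already utf8mb4.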
Enforce UTF-8 between Python and MySQL:
import MySQLdb

# Connect to MySQL (replace the '###' placeholders with real credentials).
dbc = MySQLdb.connect(host='###', user='###', passwd='###', db='###', use_unicode=True)
# Create a cursor.
cursor = dbc.cursor()
# Enforce UTF-8 for the connection.
cursor.execute('SET NAMES utf8mb4')
cursor.execute("SET CHARACTER SET utf8mb4")
cursor.execute("SET character_set_connection=utf8mb4")
# Do database stuff.
# Commit data.
dbc.commit()
# Close cursor and connection.
cursor.close()
dbc.close()
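For the SQLAlchemy setup in the question, the same idea can be sketched with a pool `connect` event listener, so that every new connection issues `SET NAMES utf8mb4` as SQL instead of asking the client library to initialize the character set. This is an untested assumption for the Windows build in question: it presumes the client library accepts `charset=utf8` and that the failure is limited to client-side initialization of utf8mb4. The names `set_names_utf8mb4` and `make_engine` and the URL in the comment are illustrative only.

```python
def set_names_utf8mb4(dbapi_connection, connection_record):
    """Pool 'connect' listener: switch each new connection to utf8mb4 via SQL."""
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SET NAMES utf8mb4")
    finally:
        cursor.close()


def make_engine(url):
    """Create an engine whose connections run SET NAMES utf8mb4 on connect."""
    # Imported here so the listener above can be exercised without SQLAlchemy.
    import sqlalchemy
    from sqlalchemy import event

    engine = sqlalchemy.create_engine(url)
    event.listen(engine, "connect", set_names_utf8mb4)
    return engine


# Hypothetical usage; note charset=utf8 (not utf8mb4) in the URL:
# engine = make_engine("mysql+mysqldb://user:password@host:3306/database?charset=utf8")
```

Registering the listener on the engine keeps the workaround in one place, so sessions created from `sessionmaker(bind=engine)` pick it up without further changes.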
Official tip from MySQL regarding "Can't initialize character set":
This error can have any of the following causes:
The character set is a multibyte character set and you have no support for the character set in the client. In this case, you need to recompile the client by running CMake with the -DDEFAULT_CHARSET=charset_name or -DWITH_EXTRA_CHARSETS=charset_name option. See Section 2.9.4, “MySQL Source-Configuration Options”.
All standard MySQL binaries are compiled with -DWITH_EXTRA_CHARSETS=complex, which enables support for all multibyte character sets. See Section 2.9.4, “MySQL Source-Configuration Options”.
The character set is a simple character set that is not compiled into mysqld, and the character set definition files are not in the place where the client expects to find them.
In this case, you need to use one of the following methods to solve the problem:
Recompile the client with support for the character set. See Section 2.9.4, “MySQL Source-Configuration Options”.
Specify to the client the directory where the character set definition files are located. For many clients, you can do this with the --character-sets-dir option.
Copy the character definition files to the path where the client expects them to be.