
"Can't initialize character set utf8mb4" with Windows mysql-python

I'm getting an error when trying to connect to a remote MySQL database from a Windows 7 client via Python 2.7 + MySQLdb 1.2.5 + SQLAlchemy 1.0.9. This is a result of recently changing the server's default character set to utf8mb4. The server is running MySQL 5.5.50.

I connect like this:

DB_ENGINE = sqlalchemy.create_engine("mysql+mysqldb://{user}:{pass}@{host}:{port}/{database}?charset=utf8mb4".format(**DB_SETTINGS))
Session = sqlalchemy.orm.sessionmaker(bind=DB_ENGINE)

The error is:

  File "C:\Applications\Python27\lib\site-packages\sqlalchemy\engine\default.py", line 385, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "C:\Applications\Python27\lib\site-packages\MySQLdb\__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "C:\Applications\Python27\lib\site-packages\MySQLdb\connections.py", line 221, in __init__
    self.set_character_set(charset)
  File "C:\Applications\Python27\lib\site-packages\MySQLdb\connections.py", line 312, in set_character_set
    super(Connection, self).set_character_set(charset)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2019, "Can't initialize character set utf8mb4 (path: C:\\mysql\\\\share\\charsets\\)")

The server's my.cnf contains the following:

init_connect                   = 'SET collation_connection = utf8mb4_unicode_ci'
init_connect                   = 'SET NAMES utf8mb4'
character-set-server           = utf8mb4
collation-server               = utf8mb4_unicode_ci
skip-character-set-client-handshake

I have no problem connecting to the database from an Ubuntu client, so I suspect the problem is with the Windows client and not the server's configuration.

The MySQL documentation suggests the error message could be due to the client being compiled without multibyte character set support:

http://dev.mysql.com/doc/refman/5.7/en/cannot-initialize-character-set.html

However, as this is Windows I'm simply downloading the client and don't have control over its compilation flags.

I've tried installing MySQLdb in a variety of ways:

  • Downloading and installing the MySQL Connector/Python .msi from dev.mysql.com
  • Downloading and installing the MySQLdb 1.2.5 .exe from pypi
  • Running "pip install mysql-python" from the Windows command prompt

Each of these results in a MySQLdb library that can't seem to handle the utf8mb4 character set.

Any help would be much appreciated!

Jugdish asked Aug 07 '16 11:08




1 Answer

Consider the following checklist:

  1. Did you check your MySQL configuration file (/etc/my.cnf)? It should contain:

    [client]
    default-character-set = utf8mb4
    
    [mysql]
    default-character-set = utf8mb4
    
    [mysqld]
    character-set-client-handshake = FALSE
    character-set-server = utf8mb4
    collation-server = utf8mb4_unicode_ci
    

    And you can verify them via:

    mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
    +--------------------------+--------------------+
    | Variable_name            | Value              |
    +--------------------------+--------------------+
    | character_set_client     | utf8mb4            |
    | character_set_connection | utf8mb4            |
    | character_set_database   | utf8mb4            |
    | character_set_filesystem | binary             |
    | character_set_results    | utf8mb4            |
    | character_set_server     | utf8mb4            |
    | character_set_system     | utf8               |
    | collation_connection     | utf8mb4_unicode_ci |
    | collation_database       | utf8mb4_unicode_ci |
    | collation_server         | utf8mb4_unicode_ci |
    +--------------------------+--------------------+
    10 rows in set (0.00 sec)
    

    • thanks to Mathias's blog post
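
The check in step 1 can also be scripted. What follows is a minimal sketch, assuming the rows come from the SHOW VARIABLES query above as (name, value) pairs, for example from cursor.fetchall(). The helper name find_non_utf8mb4 and the set of ignored variables are illustrative choices, not part of MySQL or MySQLdb.

```python
# Variables that legitimately differ from utf8mb4 (see the table above).
IGNORED = {"character_set_filesystem", "character_set_system"}


def find_non_utf8mb4(rows):
    """Return {name: value} for variables whose value is not utf8mb4-based.

    `rows` is an iterable of (Variable_name, Value) pairs, e.g. the result
    of cursor.fetchall() after running the SHOW VARIABLES query.
    """
    mismatches = {}
    for name, value in rows:
        if name in IGNORED:
            continue
        # Collations such as utf8mb4_unicode_ci also start with "utf8mb4".
        if not value.startswith("utf8mb4"):
            mismatches[name] = value
    return mismatches


if __name__ == "__main__":
    sample = [
        ("character_set_client", "utf8mb4"),
        ("character_set_results", "utf8"),
        ("character_set_system", "utf8"),
        ("collation_connection", "utf8mb4_unicode_ci"),
    ]
    print(find_non_utf8mb4(sample))
```

An empty result suggests the session is configured as in the table above; any entry that is returned names a variable still using another character set.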

  2. Enforce UTF-8 between Python and MySQL:

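    import MySQLdb

    # Connect to MySQL without passing charset, so the client library is
    # not asked to initialize utf8mb4 itself.
    dbc = MySQLdb.connect(host='###', user='###', passwd='###', db='###', use_unicode=True)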
    
    # Create a cursor.
    cursor = dbc.cursor()
    
    # Enforce UTF-8 for the connection.
    cursor.execute('SET NAMES utf8mb4')
    cursor.execute("SET CHARACTER SET utf8mb4")
    cursor.execute("SET character_set_connection=utf8mb4")
    
    # Do database stuff.
    
    # Commit data.
    dbc.commit()
    
    # Close cursor and connection.
    cursor.close()
    dbc.close()
    
    • thanks to Tomasz Nguyen's answer on stackoverflow
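
Since the question connects through SQLAlchemy rather than calling MySQLdb directly, the same idea could be applied with a "connect" event listener. This is only a sketch under assumptions: make_engine and set_names_utf8mb4 are hypothetical names, and the URL passed in must not contain charset=utf8mb4, because that is the parameter that triggers error 2019 in the question.

```python
def set_names_utf8mb4(dbapi_connection, connection_record):
    """Run SET NAMES utf8mb4 on every new DBAPI connection."""
    cursor = dbapi_connection.cursor()
    try:
        cursor.execute("SET NAMES utf8mb4")
    finally:
        cursor.close()


def make_engine(url):
    """Create an engine and register the listener above.

    `url` is a SQLAlchemy database URL without charset=utf8mb4.
    """
    # Imported here so the listener can be exercised without SQLAlchemy.
    import sqlalchemy
    from sqlalchemy import event

    engine = sqlalchemy.create_engine(url)
    # The listener fires once for each new connection the pool opens.
    event.listen(engine, "connect", set_names_utf8mb4)
    return engine
```

A call might look like make_engine("mysql+mysqldb://user:password@host:3306/database?charset=utf8"), where charset=utf8 is an assumption: it is a name an older client library may accept, but whether the driver then round-trips four-byte characters correctly is not confirmed here and should be tested against real data.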

  3. Official tip from MySQL regarding "Can't initialize character set":

    This error can have any of the following causes:

    • The character set is a multibyte character set and you have no support for the character set in the client. In this case, you need to recompile the client by running CMake with the -DDEFAULT_CHARSET=charset_name or -DWITH_EXTRA_CHARSETS=charset_name option. See Section 2.9.4, “MySQL Source-Configuration Options”.

      All standard MySQL binaries are compiled with -DWITH_EXTRA_CHARSETS=complex, which enables support for all multibyte character sets. See Section 2.9.4, “MySQL Source-Configuration Options”.

    • The character set is a simple character set that is not compiled into mysqld, and the character set definition files are not in the place where the client expects to find them.

      In this case, you need to use one of the following methods to solve the problem:

      • Recompile the client with support for the character set. See Section 2.9.4, “MySQL Source-Configuration Options”.

      • Specify to the client the directory where the character set definition files are located. For many clients, you can do this with the --character-sets-dir option.

      • Copy the character definition files to the path where the client expects them to be.
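
To see which directory the client is looking in, the path can be pulled out of the error text and checked locally. This is a rough sketch, assuming the message has the same shape as the one in the question; parse_charset_error and charsets_dir_exists are hypothetical helper names. A missing directory matters mainly for the simple character sets described in the last cause, so for utf8mb4 the result is a hint rather than a diagnosis.

```python
import os
import re

# Matches messages like the one shown in the question's traceback.
ERROR_RE = re.compile(r"Can't initialize character set (\S+) \(path: (.*)\)")


def parse_charset_error(message):
    """Return (charset, path) from an error 2019 message, or None."""
    match = ERROR_RE.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)


def charsets_dir_exists(message):
    """Return True if the directory named in the message exists locally."""
    parsed = parse_charset_error(message)
    return parsed is not None and os.path.isdir(parsed[1])


if __name__ == "__main__":
    msg = ("Can't initialize character set utf8mb4 "
           "(path: C:\\mysql\\\\share\\charsets\\)")
    print(parse_charset_error(msg))
    print(charsets_dir_exists(msg))
```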

Dulguun answered Oct 15 '22 18:10