
Django MySQL 'utf8' is currently an alias for the character set UTF8MB3, which will be replaced by UTF8MB4

I am using Django 2.0.4, MySQL 8.0.11, mysqlclient-1.3.12 and Python 3.6.5 on macOS Sierra. I am receiving the following warning:

/lib/python3.6/site-packages/django/db/backends/mysql/base.py:71: Warning: (3719, "'utf8' is currently an alias for the character set UTF8MB3, which will be replaced by UTF8MB4 in a future release. Please consider using UTF8MB4 in order to be unambiguous.")

I know it's just a warning, but I still don't like seeing it and have been searching for a way to get rid of it. I have tried a number of things, including dropping and recreating my schema with several combinations of character set and collation (utf8 with utf8_bin, utf8mb4 with utf8mb4_bin), but nothing seems to work. The warning is reported from Django's mysql/base.py, but I don't know what is making the call with 'utf8' that MySQL is objecting to.

Anyone have any ideas?

ADDITIONAL INFO

After the answer below I thought about this some more and realized that, so far, I have only received this warning from the migrate command, during what appears to be the initial setup of the auth app. I looked at all the SQL with the sqlmigrate command and didn't see any mention of utf8, so I still don't know why it is happening.

(CL) Mac-mini:mysite Lehrian$ python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, polls, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
/Users/Lehrian/Documents/Davelopment/CL/lib/python3.6/site-packages/django/db/backends/mysql/base.py:71: Warning: (3719, "'utf8' is currently an alias for the character set UTF8MB3, which will be replaced by UTF8MB4 in a future release. Please consider using UTF8MB4 in order to be unambiguous.")
  return self.cursor.execute(query, args)
  Applying auth.0008_alter_user_username_max_length... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying polls.0001_initial... OK
  Applying polls.0002_auto_20180425_1458... OK
  Applying sessions.0001_initial... OK
(CL) Mac-mini:mysite Lehrian$

I also get it when running tests, but I have concluded that this is the same warning as above: the test runner creates its own database (also with character set utf8mb4; I preserved the test_polls database and looked at it) and runs the same migrations as above.

Lehrian asked Apr 25 '18 22:04


5 Answers

UTF-8 is what the world outside MySQL calls the Unicode encoding that uses one to four bytes per character.

utf8 (no dash) is a CHARACTER SET in MySQL. It is (currently) limited to 3-byte characters, hence does not include some Chinese and Emoji characters.

utf8mb4 is the CHARACTER SET in MySQL that handles the 4-byte characters, too.
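
To make the 3-byte versus 4-byte distinction concrete, here is a small Python sketch that prints how many bytes a few characters take in UTF-8. The sample characters and the helper name `utf8_length` are my own choices for illustration, not anything from MySQL or Django.

```python
# Sample characters picked for illustration (written as escapes to avoid
# any ambiguity about how this page encodes them).
SAMPLES = {
    "a": "ASCII letter",
    "\u00e9": "accented Latin letter (e with acute)",
    "\u4e2d": "common Chinese character",
    "\U0001F600": "emoji (grinning face)",
}


def utf8_length(ch: str) -> int:
    """Return how many bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


for ch, label in SAMPLES.items():
    n = utf8_length(ch)
    # MySQL's utf8 (utf8mb3) stores at most 3 bytes per character.
    fits = "yes" if n <= 3 else "no"
    print(f"{label}: {n} byte(s), fits in utf8mb3: {fits}")
```

Only the emoji needs 4 bytes here, which is exactly the kind of character that utf8mb4 can store and utf8 (utf8mb3) cannot.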

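Although the original UTF-8 design allowed for 5- and 6-byte sequences, the current definition (RFC 3629) stops at 4 bytes, which is enough for every Unicode code point, so nothing longer is coming in the near future.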

Do not consider charsets utf16 or utf32 (UTF-16 or UTF-32).

https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-11.html says

The utf8 character set is currently an alias for utf8mb3, but will at that point become a reference to utf8mb4. To avoid ambiguity about the meaning of utf8, consider specifying utf8mb4 explicitly for character set references instead of utf8.

Since you are using MySQL 8.0, which nicely handles the differences between utf8mb3 and utf8mb4 (versions 5.5 and 5.6 had some annoying incompatibilities), I see the warning as not a very big deal.

MySQL 8.0 defaults to utf8mb4 and a newer collation than what 5.7 had. So, databases initially created in 8.0 should be better off than in older versions.

I recommend that all MySQL users use utf8mb4. This should work "best" for the foreseeable future, and it avoids the confusion that may ensue when utf8 changes from meaning utf8mb3 to meaning utf8mb4.

Rick James answered Oct 19 '22 00:10


I've had the same problem: even with my columns set to utf8mb4, saving things like certain emoji characters still failed. It turned out that Django was not using the same character set when connecting to the database. To solve this, you can add an OPTIONS entry to the Django DATABASES setting, telling it which charset to use:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'xxxxx',
        'PASSWORD': 'xxxxx',
        'HOST': 'localhost',
        'OPTIONS': {
            'charset': 'utf8mb4',  # <--- Use this
        }
    }
}
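
If you want to check which character set the connection actually ends up using, something along these lines may help. It is only a sketch: `connection_charsets` is a name I made up, and it assumes you pass it an open DB-API cursor on a MySQL connection. The `@@character_set_*` names are MySQL session variables.

```python
def connection_charsets(cursor):
    """Ask the server which character sets the current session is using.

    `cursor` is assumed to be an open DB-API cursor on a MySQL connection.
    """
    cursor.execute(
        "SELECT @@character_set_client, "
        "@@character_set_connection, "
        "@@character_set_results"
    )
    client, connection, results = cursor.fetchone()
    return {"client": client, "connection": connection, "results": results}
```

In a Django shell you could call it inside `with connection.cursor() as cursor:` after `from django.db import connection`. With the OPTIONS entry above I would expect all three values to be utf8mb4, but check it on your own setup.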

Dan Breen answered Oct 18 '22 22:10


I encountered the exact same issue lately. I raised a bug report with Django, but they did not accept it as a bug on their side.

MySQL 8 has switched from UTF8MB3 to UTF8MB4 as the default charset. As of 8.0.11, if you access a table that was created with a previous version, a warning is returned encouraging you to switch to UTF8MB4.

When you run inspectdb, the INFORMATION_SCHEMA tables are still in UTF8MB3, so the warning is returned to Django, which is currently unable to ignore it.
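
If the warning reaches you through Python's standard `warnings` module (the `base.py:71: Warning: ...` line in the question's output suggests it does), one generic option is to filter out just this message. This is a sketch of that general technique, not the workaround from the ticket; `ignore_utf8_alias_warning` is a hypothetical helper name, and the pattern text is copied from the warning quoted in the question.

```python
import warnings

# Leading ".*" is needed because the filter matches from the start of the
# text, and the warning text starts with the error number: (3719, "...").
UTF8_ALIAS_PATTERN = r".*'utf8' is currently an alias for the character set UTF8MB3"


def ignore_utf8_alias_warning():
    """Install a filter that hides only the utf8/UTF8MB3 alias warning."""
    warnings.filterwarnings("ignore", message=UTF8_ALIAS_PATTERN)
```

You would need to call this before the queries run, for example near the top of manage.py or settings.py. Filters installed later, or code that turns warnings into errors, can still take precedence, so treat this as something to try rather than a guaranteed fix.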

I have a fully worked example of how to get around this error on the Django bug ticket: https://code.djangoproject.com/ticket/29678

I have been able to fully use MySQL 8.0.12 as a backend for a robust Django application so once you get past this issue you should hopefully be okay.

I copied this text from another answer I added here. Apologies if that is bad etiquette.

Ciaran O'S answered Oct 18 '22 23:10


It tells you that your database uses a type (UTF8) whose meaning will change in the future.

So change the table settings to specify the exact type.

The reason in short: MySQL's utf8 currently reserves at most 3 bytes per character (UTF8MB3), but you can make it reserve up to 4 bytes (still encoded as UTF-8) by using UTF8MB4. Since Unicode characters can require 4 bytes in UTF-8 (and, by the way, also in UTF-16 and UTF-32), the future default for 'utf8' will be UTF8MB4. Hence the change and the warning.

Collation is used to compare values for equality and to order columns, but it is not the character set. People (and so answers) often confuse the two, because the collation is displayed most prominently. (OTOH you should use a collation compatible with your character set.)

This answer explains how to change the character set and collation:

How to convert an entire MySQL database characterset and collation to UTF-8?
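
I am not reproducing the linked answer here, but as a rough sketch, the conversion usually comes down to one ALTER DATABASE statement plus one ALTER TABLE ... CONVERT TO statement per table. The Python helper below only builds those statements as strings. The function name, the schema name `mysite`, the table names and the `utf8mb4_unicode_ci` collation are placeholders I picked for illustration.

```python
def conversion_statements(database, tables,
                          charset="utf8mb4",
                          collation="utf8mb4_unicode_ci"):
    """Build ALTER statements that move a schema and its tables to one charset.

    Identifiers are only wrapped in backticks, not validated, so pass
    trusted names only.
    """
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


# Placeholder names; substitute your own schema and tables.
for sql in conversion_statements("mysite", ["polls_question", "polls_choice"]):
    print(sql)
```

Review the generated SQL and take a backup before running any of it, since CONVERT TO rewrites the table data.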

Giacomo Catenazzi answered Oct 19 '22 00:10


Not sure if I am late, but in case anyone else gets stuck on this, here is what worked for me.

Where InnoDB's 767-byte index key limit applies (for example, tables using the COMPACT or REDUNDANT row format), an indexed column can hold up to 255 characters with utf8 but only 191 characters with utf8mb4. This means that the default indexes Django creates for CharField(max_length=255) are too long.

In that case you will need to reduce the VARCHAR length to 191 or less if it is set to 255 now.
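
The 255 and 191 figures fall out of simple arithmetic, sketched below in Python. The 767-byte figure is the index key limit for InnoDB's COMPACT and REDUNDANT row formats; tables using other row formats may allow longer keys, so take this as an illustration of where the numbers come from rather than a rule for every setup. The helper name is hypothetical.

```python
# Index key limit in bytes for InnoDB COMPACT/REDUNDANT row formats.
INDEX_KEY_LIMIT_BYTES = 767


def max_indexed_chars(bytes_per_char, limit=INDEX_KEY_LIMIT_BYTES):
    """Largest number of characters whose worst-case size fits in the limit."""
    return limit // bytes_per_char


print(max_indexed_chars(3))  # utf8 (utf8mb3): prints 255
print(max_indexed_chars(4))  # utf8mb4: prints 191
```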

Also set the charset option to 'utf8mb4' explicitly:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'xxxxx',
        'PASSWORD': 'xxxxx',
        'HOST': 'localhost',
        'OPTIONS': {
            'charset': 'utf8mb4',  # The character set you need
        }
    }
}

Ajay Bisht answered Oct 18 '22 22:10