I am trying to fix a character encoding issue. Previously the collation for this column was set to utf8_general_ci, which caused problems because it is accent-insensitive.
I'm trying to find all the entries in the database that could have been affected.
set names utf8; select * from table1 t1 join table2 t2 on (t1.pid=t2.pid and t1.id != t2.id) collate utf8_general_ci;
However, this generates the error:
ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'
DEFAULT CHARACTER SET utf8
CHARSET=utf8
CHARACTER SET utf8 COLLATE utf8_bin NOT NULL
Question: Why am I getting an error about latin1 when latin1 doesn't seem to be present anywhere in the table / schema definition?
MariaDB [(none)]> SHOW VARIABLES LIKE '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

MariaDB [(none)]> SHOW VARIABLES LIKE '%collation%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | utf8_general_ci   |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
The MySQL latin1 character set is a one-byte character set for Western Europe, and it is the default character set of MySQL up to and including 5.7. Despite the name, it is actually Windows-1252 compliant; Windows-1252 is a superset of ISO-8859-1, which is also known as Latin-1.
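One way to observe the Windows-1252 behavior is to convert a euro sign, which exists in Windows-1252 but not in ISO-8859-1. This is a minimal sketch that assumes the connection character set is utf8 (as after SET NAMES utf8); the column alias is only illustrative.

```sql
SET NAMES utf8;

-- The euro sign is part of Windows-1252 but not of ISO-8859-1.
-- A result of 3F would mean it was replaced by '?' during conversion;
-- any other single byte means latin1 can represent it.
SELECT HEX(CONVERT('€' USING latin1)) AS euro_in_latin1;
```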
If you're using MySQL 8.0, the default charset is utf8mb4. If you elect to use UTF-8 as your character set, always use utf8mb4 (with a collation such as utf8mb4_unicode_ci).
utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters.
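To illustrate what one-to-one comparison means in practice, the sketch below compares the same literals under different collations. It assumes a utf8 connection (SET NAMES utf8) and a server where the utf8_* collation names are available, as on the MariaDB version in this thread; the column aliases are only illustrative.

```sql
SET NAMES utf8;

-- Accent-insensitive: utf8_general_ci treats 'e' and 'é' as equal.
SELECT 'e' = 'é' COLLATE utf8_general_ci AS general_ci;   -- 1
-- A binary collation compares code points, so they differ.
SELECT 'e' = 'é' COLLATE utf8_bin AS bin;                 -- 0

-- No expansions: utf8_general_ci compares 'ß' with a single 's' ...
SELECT 'ß' = 's'  COLLATE utf8_general_ci AS general_ci;  -- 1
-- ... while utf8_unicode_ci expands 'ß' to 'ss'.
SELECT 'ß' = 'ss' COLLATE utf8_unicode_ci AS unicode_ci;  -- 1
```

COLLATE binds more tightly than =, so in each statement it applies to the right-hand string literal, and that explicit collation then governs the comparison.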
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set. Suppose that we have an alphabet with four letters: A, B, a, b.
First, run this query:
SHOW VARIABLES LIKE '%char%';
Check whether you have character_set_server='latin1'.
If so, go into your config file (my.cnf) and add or uncomment these lines:
character-set-server = utf8
collation-server = utf8_unicode_ci
Restart the server.
Yes, I'm late to the party, but I just encountered the same issue.
The same error is produced in MariaDB (10.1.36-MariaDB) when parentheses are combined with the COLLATE clause. My SQL was different, but the error was the same. I had:
SELECT * FROM table1 WHERE (field = 'STRING') COLLATE utf8_bin;
Omitting the parentheses solved it for me:
SELECT * FROM table1 WHERE field = 'STRING' COLLATE utf8_bin;
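Applied to a join like the one in the original question, the same idea would be to attach COLLATE to a string operand rather than to the parenthesized condition as a whole. This is a sketch only: it reuses the table and column names from the question and assumes pid is the utf8 string column, which the question does not state. A plausible explanation for the error is that a parenthesized comparison yields a number rather than a string, so the collation is checked against whatever character set the server assigns to that result instead of the column's character set.

```sql
SET NAMES utf8;

-- COLLATE is attached to the string operand t2.pid, not to the whole
-- parenthesized condition. Assumption: pid is a utf8 string column.
SELECT *
FROM table1 t1
JOIN table2 t2
  ON t1.pid = t2.pid COLLATE utf8_general_ci
 AND t1.id != t2.id;
```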