I have tried creating my mysql tables with both UTF-8 and Latin1 Char sets. When I use Latin1 my indexes are used, when I use UTF-8 indexes are not used when selecting/limiting records. Is there something I am missing with the Char sets that causes this to happen?
Cheers
Ke
The Chinese alphabet and others do not use Latin1 at all. These characters and symbols are part of a much larger encoding system called UTF8, which also includes Latin1.
what is the difference between utf8 and latin1? They are different encodings (with some characters mapped to common byte sequences, e.g. the ASCII characters and many accented letters). UTF-8 is one encoding of Unicode with all its codepoints; Latin1 encodes less than 256 characters.
Similarly, here's the command to change character set of MySQL table from latin1 to UTF8. Replace table_name with your database table name. mysql> ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci; Hopefully, the above tutorial will help you change database character set to utf8mb4 (UTF-8).
The MySQL latin1 character set is such a one-byte character set for Western Europe, and it is the default character set of MySQL up to and including 5.7. In spite of the name, the character set is actually Windows-1252 compliant, which is a superset of ISO-8859-1, also known as Latin-1.
Indexes can be used only when the expression's collation matches that on the indexed column.
If the expression's COERCIBILITY
is lower than that of the column (that is 2
), the column's collation is casted to that of the expression, and the index is not used.
Normally, literals have COERCIBILITY
of 4
and user variables that of 3
, so this should be not a problem.
However, if you mix different collations in a JOIN
or UNION
, the cast order is not guaranteed.
In this case you should provide explicit collation to the column your are casting (most probably, you want to cast latin1
to UTF8
), and this should be the collation of the column you are casting to:
SELECT *
FROM utf_table
JOIN latin_table
ON utf_column = latin_column COLLATE UTF8_GENERAL_CI
I understand this now, the tables i was joining were not of the same char set
DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Once I changed these the indexes worked fine.
The indexes themselves are stored with the same encoding as the columns they are indexing. Comparing a UTF-8 character to a latin1 character cannot make use of the index, because it would need to convert both to the same encoding, since the index optimizations are performed on a byte level (and ß
in latin1 has a different byte sequence than in UTF-8).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With