I have read that MySQL 5.6 can only index the first 767 bytes of a varchar (or other text-based type) column. My schema character set is utf8, so each character can take up to 3 bytes. Since 767/3 = 255.66, this would indicate that the maximum length for a text column that needs to be indexed is 255 characters. Experience seems to confirm this, as the following goes through:
create table gaga (
val varchar(255),
index(val)
) engine = InnoDB;
But changing the definition of val to varchar(256) yields "Error Code: 1071. Specified key was too long; max key length is 767 bytes".
In this day and age, a limit of 255 characters seems awfully low, so: is this correct? If it is, what is the best way to get larger pieces of text indexed with MySQL? (Should I avoid it? Store a SHA? Use another sort of index? Use another database character encoding?)
Though the limitation might seem ridiculous, it makes you consider whether you really need an index on such a long varchar field. Even at 767 bytes per entry the index grows very fast, and for a large table (where an index is most useful) it most probably won't fit into memory.
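To get a rough feel for whether an index would fit into memory, the reported index size can be compared with the InnoDB buffer pool size. This is only a sketch: `mydb` is a placeholder schema name, `gaga` is the table from the question, and the figures in `information_schema` are approximate.

```sql
-- Approximate size of the secondary indexes of one table, in MiB.
-- 'mydb' is a placeholder schema name; replace it with your own.
SELECT table_name,
       ROUND(index_length / 1024 / 1024, 1) AS index_mib
FROM information_schema.tables
WHERE table_schema = 'mydb'
  AND table_name = 'gaga';

-- Size of the InnoDB buffer pool, in MiB, for comparison.
SELECT ROUND(@@innodb_buffer_pool_size / 1024 / 1024, 1) AS buffer_pool_mib;
```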
On the other hand, the only frequent case, at least in my experience, where I needed to index a long varchar field was a unique constraint. In all those cases a composite index on some group id and an MD5 hash of the varchar field was sufficient. The only problem is mimicking a case-insensitive collation (one that treats accented and unaccented characters as equal), though in all my cases I used a binary collation anyway, so it was not a problem.
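A minimal sketch of the group id plus MD5 approach might look like the following. The table and column names (`gaga_unique`, `group_id`, `val_md5`) are made up for illustration, and the hash is computed by the application in the INSERT statement, since MySQL 5.6 has no generated columns. It assumes exact (binary-style) matching, as described above.

```sql
-- Hypothetical table: names are illustrative, not from the question.
CREATE TABLE gaga_unique (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    group_id INT UNSIGNED NOT NULL,
    val      VARCHAR(2000) NOT NULL,
    val_md5  BINARY(16) NOT NULL,           -- raw 16-byte MD5 of val
    UNIQUE KEY uq_group_val (group_id, val_md5)
) ENGINE = InnoDB;

-- MD5() returns 32 hex characters; UNHEX() packs them into 16 bytes.
INSERT INTO gaga_unique (group_id, val, val_md5)
VALUES (1, 'some long text', UNHEX(MD5('some long text')));

-- Lookup through the compact index; the extra comparison on val
-- guards against the (unlikely) case of a hash collision.
SELECT id
FROM gaga_unique
WHERE group_id = 1
  AND val_md5 = UNHEX(MD5('some long text'))
  AND val = 'some long text';
```

Each index entry here takes 4 + 16 bytes regardless of how long `val` is, which is what keeps the unique index small.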
Update: Another frequent case for indexing a long varchar is ordering. For this case I usually define a separate indexed sorter field that holds a prefix of 5-15 characters, depending on data distribution. For me, a compact index is preferable even at the cost of occasionally inaccurate ordering.
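The sorter field idea could be sketched as follows. The names (`gaga_sorted`, `sorter`) and the prefix length of 10 are assumptions for the example; the right length depends on how early the values in your data start to differ.

```sql
-- Hypothetical table: names and the prefix length are illustrative.
CREATE TABLE gaga_sorted (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    val    VARCHAR(2000) NOT NULL,
    sorter VARCHAR(10) NOT NULL,            -- first 10 characters of val
    INDEX idx_sorter (sorter)
) ENGINE = InnoDB;

-- LEFT() counts characters, not bytes, so multi-byte text is cut safely.
INSERT INTO gaga_sorted (val, sorter)
VALUES ('some long text', LEFT('some long text', 10));

-- Ordering uses the small index; rows sharing the same 10-character
-- prefix may come back in arbitrary relative order.
SELECT id, val
FROM gaga_sorted
ORDER BY sorter
LIMIT 20;
```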