I'm using a django application which does some 'startswith' ORM operations comparing longtext
columns with a unicode string. This results in a LIKE BINARY
comparison operation with a u'mystring'
unicode string. Is a LIKE BINARY likely to be any slower than a plain LIKE?
I know the general answer is benchmarking, but I would like to get a general idea for databases in general rather than just my application as I'd never seen a LIKE BINARY query before.
I happen to be using MySQL but I'm interested in the answer for SQL databases in general.
If performance seems to become a problem, it might be a good idea to create a copy of the first eg. 255 characters of the longtext, add an index on that and use the startswith
with that.
BTW, this page says: "if you need to do case-sensitive matching, declare your column as BINARY; don't use LIKE BINARY in your queries to cast a non-binary column. If you do, MySQL won't use any indexes on that column." It's an old tip but I think this is still valid.
For the next person who runs across this - in our relatively small database the query:
SELECT * FROM table_name WHERE field LIKE 'some-field-search-value';
... Result row
Returns 1 row in set (0.00 sec)
Compared to:
SELECT * FROM table_name WHERE field LIKE BINARY 'some-field-search-value';
... Result row
Returns 1 row in set (0.32 sec)
Long story short, at least for our database (MySQL 5.5 / InnoDB) there is a very significant difference in performance between the two lookups.
Apparently though this is a bug in MySQL 5.5: http://bugs.mysql.com/bug.php?id=63563 and in my testing against the same database in MySQL 5.1 the LIKE BINARY query still uses the index (while in 5.5 it does a full table scan.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With