
Which index should I use on a binary datatype column in MySQL?

I am writing a simple tool to find duplicate files (i.e. files with identical contents). The mechanism is to generate a hash for each file using the SHA-512 algorithm and store these hashes in a MySQL database. I store the hashes in a binary(64) unique not null column. Each row holds one unique binary hash, which is used to check whether a file is a duplicate.
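
For illustration, the hashing step described above might look like the following minimal sketch. The function name `sha512_of_file` and the chunk size are assumptions made for this example, not details from the question.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the raw SHA-512 digest (64 bytes) of the file at `path`.

    `sha512_of_file` and `chunk_size` are hypothetical names for this sketch.
    """
    h = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large file is never held fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    # digest() yields raw bytes (not hex), which is what a binary(64) column stores.
    return h.digest()
```

The raw digest is always 64 bytes long, which is why binary(64) fits it exactly; the hex form from `hexdigest()` would need 128 characters instead.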

-- My questions are --

  1. Can I index a binary column, given that my table uses the default latin1 collation?

  2. Which indexing mechanism should I use for high performance, B-tree or hash? I need to update or add hundreds of rows per second.

  3. What other things should I take care of to get the best performance?

Yogesh R.L asked Dec 12 '22 14:12



1 Answer

  1. Can I index a binary column, given that my table uses the default latin1 collation?

    Yes, you can. Collation only applies to character datatypes, not binary ones: it defines how characters are compared and ordered, whereas binary values are compared byte by byte. Also be aware that latin1 is a character encoding, not a collation.

  2. Which indexing mechanism should I use for high performance, B-tree or hash? I need to update or add hundreds of rows per second.

    Note that hash indexes are only available with the MEMORY and NDB storage engines, so you may not even have a choice.

    In any event, either would typically be able to meet your performance criteria. For this particular application, though, I see no benefit from a B-Tree's ordering, because a duplicate check only ever looks up an exact value, whereas Hash would give better performance for exactly that kind of lookup. Therefore, if you have the choice, you may as well use Hash.

    See Comparison of B-Tree and Hash Indexes for more information.

  3. What other things should I take care of to get the best performance?

    Depends on your definition of "best performance" and your environment. In general, remember Knuth's maxim "premature optimisation is the root of all evil": that is, only optimise when you know that there will be a problem with the simplest approach.
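
As a starting point for "the simplest approach", the sketch below lets the unique column itself reject duplicates, so no separate lookup query is needed. It is only an illustration under stated assumptions: Python's built-in `sqlite3` stands in for MySQL so that the code runs without a server, and the table name `file_hashes` and the function `register` are hypothetical. In MySQL the column would be the binary(64) unique not null column from the question, and the unique constraint already creates an index on it.

```python
import hashlib
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
# BLOB NOT NULL UNIQUE plays the role of BINARY(64) UNIQUE NOT NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_hashes (hash BLOB NOT NULL UNIQUE)")


def register(data: bytes) -> bool:
    """Store the SHA-512 digest of `data`; return False if it is already present."""
    digest = hashlib.sha512(data).digest()  # 64 raw bytes
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO file_hashes (hash) VALUES (?)", (digest,))
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the row, so this content was seen before.
        return False


first = register(b"same contents")    # new content
second = register(b"same contents")   # identical content, rejected as duplicate
third = register(b"other contents")   # different content
```

Measuring this kind of baseline under a realistic load is one way to find out whether the stated insert rate is a problem at all before tuning anything.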

eggyal answered Dec 21 '22 08:12
