
MySQL performance gain by reducing index size?

Tags: indexing, mysql

I have a table with ~1.2 million rows. Six of its columns are indexed, including one varchar(255) field that contains URLs.

I need to be able to check whether a URL already exists in the table, hence the index, but I'm wondering whether I would see a performance gain by reducing the indexed prefix to around 50 characters.

Of course this would mean the query may have to examine more rows when searching for a URL, but I only run this query about once every 30 seconds, so I'm wondering whether the smaller index would be worth it. Thoughts?
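For reference, the change being considered is a MySQL column prefix index, which can be declared like this (the table name `pages` and index names are assumptions, not from the question):

```sql
-- Hypothetical schema: replace the full-column index on url
-- with a prefix index covering only its first 50 characters.
ALTER TABLE pages DROP INDEX idx_url;
ALTER TABLE pages ADD INDEX idx_url_prefix (url(50));
```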

asked Jan 21 '23 by makeee

2 Answers

Two reasons why a shorter index may be better (assuming your index is useful):

1) Indexes are loaded into memory too, so there is a (rare) possibility that your index grows to the point where it no longer fits entirely in memory. That's when you will see a performance hit. With modern hardware this is hardly a possibility at 1.2M rows, but it is still worth noting.

2) Often, just the first 'n' characters are enough to quickly identify each record. You may not need to index all 255 characters.

Two reasons why you may not care:

1) As stated, your indexes may never outgrow your key buffer, so why worry?

2) You will need to determine the right 'n', and even then the performance will be less than or equal to that of a full index, never better. Do you really need to spend time on that? Is it worth the possible loss of accuracy?
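One way to pick 'n' is to compare the selectivity of a candidate prefix against the full column: if the two ratios are close, little is lost by indexing only the prefix. A sketch, assuming a table named `pages` with the URL column called `url`:

```sql
-- A ratio near 1.0 means almost every row has a distinct value,
-- i.e. the prefix discriminates nearly as well as the full column.
SELECT COUNT(DISTINCT LEFT(url, 50)) / COUNT(*) AS prefix_selectivity,
       COUNT(DISTINCT url) / COUNT(*)           AS full_selectivity
FROM pages;
```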

answered Jan 22 '23 by Jai


From my SQL indexing tutorial (covers MySQL as well):

Tip: Always aim to index the original data. That is often the most useful information you can put into an index.

This is a general rule I suggest until there is a very strong reason to do something different.

Space is not the issue, in most cases.

Performance-wise, the index tree depth grows logarithmically with the number of index leaf nodes. That means cutting the index size in half probably does not reduce the tree depth at all. Hence, the performance gain might be limited to an improved cache hit rate. But you mentioned that you execute the query only once every 30 seconds; on a moderately loaded machine, that means your index will not be cached at all (unless, perhaps, you search for the same URL every 30 seconds).

All in all, I don't see any reason to act against the general advice mentioned above.

If you really want to save index space, try to find redundant indexes first (e.g., those starting with the same columns). These are typically the low-hanging fruit.
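Redundant indexes can be spotted by listing each index's column sequence; an index whose columns form a leading prefix of another index's columns is a candidate for removal. A sketch against the hypothetical `pages` table:

```sql
-- One row per index, with its columns in index order.
-- Compare the column lists: e.g. an index on (url) is redundant
-- next to an index on (url, created_at).
SELECT index_name,
       GROUP_CONCAT(column_name ORDER BY seq_in_index) AS columns
FROM information_schema.statistics
WHERE table_schema = DATABASE()
  AND table_name = 'pages'
GROUP BY index_name;
```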

answered Jan 22 '23 by Markus Winand