
Should "duplicate" indices in MySQL be deleted?

I am aware that in MySQL an index on (A,B,C) benefits ANDed WHERE clauses that filter on A, on A and B, or on A, B and C. This makes it seem that, given an index on (A,B,C), there is no point in also having a single-column index on (A) or a composite index on (A,B).

1. Is this true?

2. Is it just a waste maintaining an index on (A) when you already have an index on (A,B,C)?
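The leftmost-prefix behavior the question describes can be sketched roughly as follows. The table name `Demo`, the index name `idx_abc`, and the column types are assumptions made for illustration; only the column names A, B and C come from the question.

```sql
-- Hypothetical table: names and types are illustrative, not from the question.
CREATE TABLE Demo (
    Id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    A  VARCHAR(32) NOT NULL,
    B  INT NOT NULL,
    C  INT NOT NULL,
    KEY idx_abc (A, B, C)
) ENGINE=InnoDB;

-- Each of these filters on a leftmost prefix of (A, B, C),
-- so idx_abc is a candidate for the lookup.
EXPLAIN SELECT * FROM Demo WHERE A = 'x';
EXPLAIN SELECT * FROM Demo WHERE A = 'x' AND B = 1;
EXPLAIN SELECT * FROM Demo WHERE A = 'x' AND B = 1 AND C = 2;

-- No condition on A: this is not a leftmost prefix, so idx_abc cannot
-- serve as a direct lookup (the optimizer may still choose to scan it).
EXPLAIN SELECT * FROM Demo WHERE B = 1 AND C = 2;
```

The plans the optimizer actually picks depend on table contents and server version, so it is worth running these against representative data.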

Rage asked Oct 16 '22 02:10


2 Answers

I believe the answer to both your questions is the same: it's almost entirely true; it's almost always wasteful to have indexes on both (A, B, C) and (A).

As Danblack mentioned, the size could make a minor difference, although that's probably negligible.

More importantly, in my experience, note that in InnoDB a secondary index on (A) is actually (A, Primary), where Primary is those primary key columns that are not already explicitly included in the index. In practice, that often means (A, Id). The other index, then, is actually (A, B, C, Id). Note how this affects the order in which rows are encountered in the index.

Imagine doing this:

SELECT *
FROM MyTable
WHERE A = 'Whatever'
ORDER BY Id

Index (A), AKA (A, Id), is perfect for this. For any fixed value of A, corresponding rows are then ordered by Id. No sorting is needed - the results are in our desired order.

However, for index (A, B, C), AKA (A, B, C, Id), it's different. For any fixed value of A, corresponding rows are then ordered by B! This means that the above query will require sorting of the results.

EXPLAIN should confirm what I have described. A filesort ("Using filesort" in the Extra column) will take place if only the (A, B, C) index is available, but not if (A) is available.
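One way to compare the two indexes side by side is to force each in turn and read the Extra column. This is a sketch under assumptions: the column types and the index names `idx_abc` and `idx_a` are invented for illustration, and the table should hold representative data before the plans mean much.

```sql
-- Hypothetical schema: types and index names are assumptions.
CREATE TABLE MyTable (
    Id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    A  VARCHAR(32) NOT NULL,
    B  INT NOT NULL,
    C  INT NOT NULL,
    KEY idx_abc (A, B, C),  -- stored by InnoDB as (A, B, C, Id)
    KEY idx_a   (A)         -- stored by InnoDB as (A, Id)
) ENGINE=InnoDB;

-- Forced onto (A, B, C): rows for one value of A arrive ordered by B,
-- so a sort on Id is expected.
EXPLAIN SELECT * FROM MyTable FORCE INDEX (idx_abc)
WHERE A = 'Whatever' ORDER BY Id;

-- Forced onto (A): rows for one value of A already arrive ordered by Id.
EXPLAIN SELECT * FROM MyTable FORCE INDEX (idx_a)
WHERE A = 'Whatever' ORDER BY Id;
```

FORCE INDEX is used here only to make the comparison explicit; without it, the optimizer chooses between the two indexes on its own.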

It should be easy to see that this matters very little if there are generally very few rows for a particular value of A. However, if there could be 100,000 rows for such a value, then the filesort starts to have a noticeable cost. In such a case, you might choose to have index (A) to optimize for this scenario.

Generally speaking, such prefix indexes (indexes whose columns are a leftmost prefix of another index) are superfluous. It's good to analyze your indexes and queries to identify these scenarios, though. In a rare case, one may be worth adding. In the more common case, at least you'll be able to weigh such effects into your overall index choices.
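As a starting point for that analysis, the sys schema can list indexes it considers redundant. This is a sketch assuming MySQL 5.7 or later with the sys schema installed; the schema name `mydb` is a placeholder.

```sql
-- List indexes that are a leftmost prefix (or duplicate) of another index.
-- 'mydb' is a hypothetical schema name.
SELECT table_schema,
       table_name,
       redundant_index_name,
       redundant_index_columns,
       dominant_index_name,
       dominant_index_columns
FROM sys.schema_redundant_indexes
WHERE table_schema = 'mydb';
```

The view compares index column lists only. It would report (A) as redundant next to (A, B, C) without considering an ORDER BY Id query like the one above, so each hit still needs to be checked against the actual workload.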

Timo answered Oct 19 '22 22:10


  1. true

  2. almost always

There is a very rare case where, if:

  • A as a standalone index is used most frequently, and
  • queries that use A,B or A,B,C are very rare, and
  • sizeof(A) is significantly less than sizeof(A,B,C), and
  • you are memory constrained, such that normal use of the index on A,B,C takes up a significant share of the buffer pool/key cache to the detriment of other queries;

then there may be a small benefit in having a small duplicate subset index on A.

Note: other conditions may also apply.
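To judge whether the size difference in the conditions above is significant, the approximate on-disk size of each index can be read from InnoDB's persistent statistics. This is a sketch assuming InnoDB with persistent statistics enabled; `mydb` and `MyTable` are placeholder names.

```sql
-- Approximate size of each index on one table.
-- stat_value for stat_name = 'size' is a page count, so multiply by the page size.
-- 'mydb' and 'MyTable' are hypothetical names.
SELECT index_name,
       stat_value AS pages,
       stat_value * @@innodb_page_size AS approx_bytes
FROM mysql.innodb_index_stats
WHERE database_name = 'mydb'
  AND table_name = 'MyTable'
  AND stat_name = 'size';
```

These figures are estimates refreshed by ANALYZE TABLE or automatic recalculation, so they are better suited to rough comparison than to exact accounting.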

danblack answered Oct 19 '22 22:10