Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

what is the best way to add multiple indexes to polymorphic table

let say I have polymorphic similar to this

| document_id | owner_type | owner_id |
| 1           | Client     | 1        |
| 1           | Client     | 2        |
| 2           | User       | 1        |

I know I'll be calling queries looking for owner_type and owner_type + owner_id

SELECT * FROM document_name_ownerships WHERE owner_type = 'Client`
SELECT * FROM document_name_ownerships WHERE owner_type = 'Client` and owner_id = 1

Lets ignore how to index document_id I would like to know what is the best way(performance) to index owner columns for this SQL scenarios

Solution 1:

CREATE INDEX do_type_id_ix ON document_ownerships (owner_type, owner_id)

this way I would have just one index that works for both scenarios

Solution 2:

CREATE INDEX do_id_type_ix ON document_ownerships (owner_id, owner_type)
CREATE INDEX do_type_ix ON document_ownerships (owner_type)

this way I would have indexes that totally match the way how I will use database. The only thing is that I have 2 indexes when I can have just one

Solution 3:

CREATE INDEX do_id_ix ON document_ownerships (owner_id)
CREATE INDEX do_type_ix ON document_ownerships (owner_type)

individual column indexes


From what I was exploring in MySQL console with explain I get really similar results and because Its a new project I don't have enought data to properly explore this so that I'll be 100% sure (even when I populated databese with several hundred records). So can anyone give me piece of advise from their experience ?

like image 474
equivalent8 Avatar asked Oct 21 '22 22:10

equivalent8


1 Answers

This is going to depend a lot on the distribution of your data - indexes only make sense if there is good selectivity in the indexed columns.

e.g. if there are only 2 possible values for owner_type, viz Client and User, and assuming they are distributed evenly, then any index only on owner_type will be pointless. In this case, a query like

SELECT * FROM document_name_ownerships WHERE owner_type = 'Client`;

would likely return a large percentage of the records in the table, and a scan is the best that is possible (Although I'm assuming your real queries will join to the derived tables and filter on derived table-specific columns, which would be a very different query plan to this one.)

Thus I would consider indexing

  1. Only on owner_id, assuming this gives a good degree of selectivity by itself,
  2. Or, on the combination (owner_id, owner_type) only if there is evidence that index #1 isn't selective, AND if the the combination of the 2 fields gives sufficient selectivity to warrant this the index.
like image 65
StuartLC Avatar answered Oct 23 '22 18:10

StuartLC