I have a table with a polymorphic reference that I generated via the following migration:
def change
add_reference :table_name, :thing, polymorphic: true, index: true
end
When I ran the migration it generated the following:
add_index "workflow_engine_task_bases", ["thing_type", "thing_id"], name: "index_workflow_engine_task_bases_on_thing_type_and_thing_id", using: :btree
Why is the left-most column the thing_type? To me it seems that this is suboptimal as it is less specific.
This is the commit in Rails by Derek Prior that updates add_reference
to use type
before id
when generating an index for a polymorphic association. The justification for the change is reproduced below:
Use type column first in multi-column indexes
add_reference
can very helpfully add a multi-column index when you use it to add a polymorphic reference. However, the first column in the index is theid
column, which is less than ideal.The [PostgreSQL docs][1] say:
A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns.
The [MySQL docs][2] say:
MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.
In a polymorphic relationship, the type column is much more likely to be useful as the first column in an index than the id column. That is, I'm more likely to query on type without an id than I am to query on id without a type.
[1]: http://www.postgresql.org/docs/9.3/static/indexes-multicolumn.html
[2]: http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
I think in most scenarios, this order makes sense. It allows you to have a single index that performs well on queries including type
and id
or just type
.
Having said that, your use case may vary depending on the database you use, your dataset, and what queries you plan to run. Your best bet is to profile your most common use cases on a production dump and choose your indexing strategy accordingly.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With