I'm trying to optimize some of the database queries in my Rails app and I have several that have got me stumped. They are all using an IN in the WHERE clause and are all doing full table scans even though an appropriate index appears to be in place.
For example:
SELECT `user_metrics`.* FROM `user_metrics` WHERE (`user_metrics`.user_id IN (N,N,N,N,N,N,N,N,N,N,N,N))
performs a full table scan, and EXPLAIN says:
select_type: SIMPLE
type: ALL
possible_keys: index_user_metrics_on_user_id (which is an index on the user_id column)
key: (none)
key_len: (none)
ref: (none)
rows: 208
Extra: Using where
Are indexes not used when an IN statement is used, or do I need to do something differently? The queries here are being generated by Rails, so I could revisit how my relationships are defined, but I thought I'd start with potential fixes at the DB level first.
You should always add an index on any field to be used in a WHERE clause (whether for SELECT, UPDATE, or DELETE). The type of index depends on the type of data in the field and whether you need each row to have a unique value.
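For instance, assuming a table like the user_metrics one from the question (the users/email index below is purely illustrative), the two kinds of index would be added like this:

-- Non-unique index: suitable for a foreign key such as user_id,
-- where many rows may share the same value
CREATE INDEX index_user_metrics_on_user_id ON user_metrics (user_id);

-- Unique index: use only when every row must carry a distinct value
CREATE UNIQUE INDEX index_users_on_email ON users (email);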
Keep in mind the drawbacks of indexes in MySQL: indexes consume disk space, and they degrade the performance of INSERT, UPDATE, and DELETE queries, since every index must be updated along with the data. MySQL also will not stop you from defining multiple, possibly redundant, indexes on the same columns.
The IN clause is treated as an equality condition for each value in the list, and it will use an index where appropriate.
The USE INDEX ( index_list ) hint tells MySQL to use only one of the named indexes to find rows in the table. The alternative syntax IGNORE INDEX ( index_list ) tells MySQL to not use some particular index or indexes.
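Applied to the query from the question, the hints would look roughly like this (a sketch only; the index name comes from the EXPLAIN output above):

SELECT `user_metrics`.*
FROM `user_metrics` USE INDEX (index_user_metrics_on_user_id)
WHERE `user_metrics`.user_id IN (N,N,N,N);

-- or, to rule the index out instead:
SELECT `user_metrics`.*
FROM `user_metrics` IGNORE INDEX (index_user_metrics_on_user_id)
WHERE `user_metrics`.user_id IN (N,N,N,N);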
See How MySQL Uses Indexes.
Also validate whether MySQL still performs a full table scan after you add an additional 2000 or so rows to your user_metrics table. In small tables, access by index is actually more expensive (I/O-wise) than a table scan, and MySQL's optimizer might take this into account.
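To re-check once the table has grown, simply rerun EXPLAIN on the same statement and see whether the access type changes from ALL to something like range:

EXPLAIN SELECT `user_metrics`.* FROM `user_metrics` WHERE `user_metrics`.user_id IN (N,N,N,N);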
Contrary to my previous post, it turns out that MySQL also uses a cost-based optimizer, which is very good news, provided you run ANALYZE at least once when you believe that the volume of data in your database is representative of future day-to-day usage.
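In MySQL, that means running something like:

-- Refreshes the key distribution statistics the optimizer uses for this table
ANALYZE TABLE user_metrics;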
When dealing with cost-based optimizers (Oracle, Postgres, etc.), you need to make sure to periodically run ANALYZE on your various tables as their size increases by more than 10-15%. (Postgres will do this automatically for you by default, whereas other RDBMSs will leave this responsibility to a DBA, i.e. you.) Through statistical analysis, ANALYZE will help the optimizer get a better idea of how much I/O (and other associated resources, such as the CPU needed for sorting) will be involved when choosing between various candidate execution plans. Failure to run ANALYZE may result in very poor, sometimes disastrous planning decisions (e.g. millisecond queries sometimes taking hours because of bad nested loops on JOINs).
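You can inspect the statistics MySQL keeps with SHOW INDEX; the Cardinality column is the optimizer's estimate of the number of distinct values in the index, which is what ANALYZE refreshes:

SHOW INDEX FROM user_metrics;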
If performance is still unsatisfactory after running ANALYZE, then you will typically be able to work around the issue by using hints, e.g. FORCE INDEX, whereas in other cases you might have stumbled over a MySQL bug (e.g. this older one, which could have bitten you were you to use Rails' nested_set).
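A FORCE INDEX version of the query from the question would look roughly like this; unlike USE INDEX, FORCE INDEX tells the optimizer to treat a table scan as very expensive, so the scan happens only if the named index cannot be used at all:

SELECT `user_metrics`.*
FROM `user_metrics` FORCE INDEX (index_user_metrics_on_user_id)
WHERE `user_metrics`.user_id IN (N,N,N,N);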
Now, since you are in a Rails app, it will be cumbersome (and defeat the purpose of ActiveRecord) to issue your custom queries with hints instead of continuing to use the ActiveRecord-generated ones.
I had mentioned that in our Rails application all SELECT queries dropped below 100ms after switching to Postgres, whereas some of the complex joins generated by ActiveRecord would occasionally take as much as 15s or more with MySQL 5.1 because of nested loops with inner table scans, even when indices were available. No optimizer is perfect, and you should be aware of the options. Another potential performance issue to be aware of, besides query plan optimization, is locking, but that is outside the scope of your problem.