SELECT IF(priority_date, priority_date, created_at) as created_at
FROM table
WHERE IF(priority_date , priority_date , created_at)
BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59';
What is the best way to execute this query, performance-wise?
I have a fairly large table that has two datetimes. created_at and priority_date.
priority_date doesn't always exist, but if it does, it should be what is queried upon, else it falls back to created_at. created_at is always generated upon creation of the row. The above query causes a (nearly) full table scan.
The explain plan for initial query:
+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | table | ALL | NULL | NULL | NULL | NULL | 444877 | Using where |
+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+
I should also note that priority_date or created_at may not necessarily both be within the time frame in question on a single row. So doing something like:
WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
Could give bad results if priority_date was 2017-10-04 23:10:43 and created_at was 2017-10-10 01:23:45
My current rows for said table: 582739
Count of WHERE priority_date BETWEEN... : 3908
Count of WHERE created_at BETWEEN...: 3437
Example explain of just one of the columns queried in WHERE BETWEEN:
+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+
| 1 | SIMPLE | table | range | table_created_at_index | table_created_at_index | 5 | NULL | 3436 | Using index condition |
+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+
Clearly the IF is not the most efficient. The columns are indexed and the explains of individual rows are matching to their counts for rows on the explain plan. How can I leverage having a priority/fallback query without the wild performance loss?
EDIT
The best I've been able to figure (But WOW, is that verbose and copy/paste-y feeling)
SELECT IF(priority_date, priority_date, created_at) as created_at, priority_date
FROM table
WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
HAVING ((priority_date AND priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59')
OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59');
And its explain plan:
+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+
| 1 | SIMPLE | table | index_merge | table_priority_date_index,table_created_at_index | table_priority_date_index,table_created_at_index | 6,5 | NULL | 7343 | Using sort_union(table_priority_date_index,table_created_at_index); Using where |
+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+
First you need a compound index on (priority_date, created_at), then you can use a query like this:
SELECT IF(priority_date, priority_date, created_at) as created_at, priority_date
FROM table
WHERE priority_date BETWEEN '2017-10-10' AND '2017-10-10 23:59:59'
OR (priority_date IS NULL AND created_at BETWEEN '2017-10-10' AND '2017-10-10 23:59:59');
Having priority_date first in the compound index makes a big difference. No union is required.
Explain results on 400k rows with 2000 results:
Extra: Using where; Using index
key: priority_created_compound
rows: 2000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With