I have a table with at least a couple million rows and a schema of all integers that looks roughly like this:
start
stop
first_user_id
second_user_id
The rows get pulled using the following queries:
SELECT *
FROM tbl_name
WHERE stop >= M
AND first_user_id=N
AND second_user_id=N
ORDER BY start ASC
SELECT *
FROM tbl_name
WHERE stop >= M
AND first_user_id=N
ORDER BY start ASC
I cannot figure out the best indexes to speed up these queries. The problem seems to be the ORDER BY because when I take that out the queries are fast.
I've tried all different types of indexes using the standard index format:
ALTER TABLE tbl_name ADD INDEX index_name (index_col_1,index_col_2,...)
And none of them seem to speed up the queries. Does anyone have any idea what index would work? Also, should I be trying a different type of index? I can't guarantee the uniqueness of each row so I've avoided UNIQUE indexes.
Any guidance/help would be appreciated. Thanks!
Update: here are a list of the indexes, I didn't include this originally since I've taken a shotgun approach and added a ton of indexes looking for one that works:
start_index: [start, first_user_id, second_user_id]
stop_index: [stop, first_user_id, second_user_id]
F1_index: [first_user_id]
F2_index: [second_user_id]
F3_index: [another_id]
test_1_index: [first_user_id,stop,start]
test_2_index: [first_user_id,start,stop]
test_3_index: [start,stop,first_user_id,second_user_id]
test_4_index: [stop,first_user_id,second_user_id,start]
test_5_index: [stop,start]
And here is the EXPLAIN output.
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: listing
type: index_merge
possible_keys: stop_index,F1_index,F3_index,test_1_index,test_2_index,test_4_index,test_5_index
key: F1_index,F3_index
key_len: 5,5
ref: NULL
rows: 238
Extra: Using intersect(F1_index,F3_index); Using where; Using filesort
Update for posterity
We ended up completely re-evaluating how we were querying the table and chose these indexes:
index_select_1: [first_user_id,start,stop]
index_select_2: [first_user_id,second_user_id,start,stop]
and then we select on the table with queries like these:
SELECT *
FROM tbl_name
WHERE first_user_id=N
AND start >= M
ORDER BY start ASC
SELECT *
FROM tbl_name
WHERE first_user_id=N
AND second_user_id=N
AND start >= M
ORDER BY start ASC
Thanks to everyone that answered, you really helped me think through the problem.
Yes, index will help you, when using ORDER BY. Because INDEX is a sorted data structure, so the request will be executed faster. Look at this example: table test2 with 3 rows.
Use of Indexes to Satisfy ORDER BY. In some cases, MySQL may use an index to satisfy an ORDER BY clause and avoid the extra sorting involved in performing a filesort operation.
Order of the columns on the index matter but not in not the order of anded values on the where clause.
Order by columns are used for ordering the result set, not filtering. An index on the columns mentioned in the order by clause is unlikely to change anything, especially if it's not used to filter the data.
Could you make your sample table and EXPLAIN results match? Because, obviously it is not a same situation and we don't know if you maybe made a mistake in abstracting your real query only by looking at the provided EXPLAIN results. If you don't want to show too much of a structure then reverse it and create the quoted table structure and provide EXPLAIN result on that (maybe you will catch the problem that way).
Now one thing is certain - sorting is using filesort, which is bad.
To simplify (we'll come back to it) - compound indexes useful for sorting need to have the sort field in front.
Example idx(ID, Start)
ID Start
1
5
8
8
10
25
2
3
9
10
40
41
42
42
...
In the above example the index is not of much help for sorting if you don't have where condition in which ID is limited to only one value.
But, this exception is important since you have single row selectivity on one or both id fields.
So from your indexes the only indexes that have start at the beginning are
start_index: [start, first_user_id, second_user_id]
test_3_index: [start,stop,first_user_id,second_user_id]
Mysql ignores the index
start_index: [start, first_user_id, second_user_id]
because it has better choices in terms of selectivity - it would need to do an index scan with this index and it has indexes that will allow it to do index intersect jumping directly to (unsorted) results. It expects better selectivity from the intersect and selectivity drives the planer.
Once the result is obtained mysql should realize that it could use another index to sort the results, but it seems that it can not see how cheap that would be.
So to help the planer you could create an index that will capitalize on your single value selectivity with index such as:
two_ids_with_sort: [first_user_id, second_user_id, start]
I assume that above would work very well on your second query where you have conditions on both id's giving you access to presorted start record pointers. The following query should do the same for the first query:
one_id_with_sort: [first_user_id, start]
Only if you end up with a lot of records in the result sets I would look into indexing it further.
There are two paths there a) adding the field stop to the end of the index b) creating two more similar indexes with stop instead of start (index intersect could be used there and wider range of queries could benefit from it)
But do test all of the above theories.
Couple of general suggestions
USE INDEX (index1) FOR ORDER BY
in your select to gauge a benefit for a certain index over planer, see more here (esp FORCE option; also - aim to leave only the useful indexes and see if planer will be able to utilize them then, if not, as a last resort only, force the indexes in your queries for which performance is crucial. keep in mind that this is a bad practice in terms of administration and design).If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With