Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maximum number of IN clauses to use index in MySQL

Tags:

sql

mysql

I'm currently trying to query a table with several (3) in clauses like:

 SELECT *
 FROM table
 WHERE
     a IN (2884,5320)
   AND 
     b IN ('a', 'b', 'c')
   AND 
     c IN (1, 2, 3)
   AND d='abcd'
   AND date BETWEEN 0 AND 1383177599

The table is indexed like index(a, b, c, d, date)

However, when I run an explain on the query the explainer shows that there is no appropriate index to use. This remains the case even if I FORCE INDEX.

If I change one of the above INs to an = such as

SELECT *
FROM table
WHERE
    a = 2884
  AND 
    b IN ('a', 'b', 'c')
  AND 
    c IN (1, 2, 3)
  AND d = 'abcd'
  AND date BETWEEN 0 AND 1383177599

MySQL will allow me to force it to use the index, but will choose another non-covering index otherwise. This is the case regardless of which of the INs are changed to =.

My Question:

Is there a limit to the number of in clauses you can use for an indexed query? Is there something obvious I'm missing here?

A couple things to know about the table:
9 GB, ~8,000,000 rows. It contains one text column that can be quite large (a JSON field), but this column is not any of the queried columns above. The in clauses shown above could be drastically bigger (200-300 items)

Thanks!

EDIT:
This is the output of explain on the query (with FORCE INDEX) 1,"SIMPLE","table","ALL","correct_index",NULL,NULL,NULL,6977553,"Using where" Where correct index is the one explained above (index(a, b, c, d, date))

like image 380
cmwright Avatar asked Oct 30 '13 21:10

cmwright


People also ask

What is the limit of in clause in MySQL?

In mysql, you should be able to put as many values in the IN clause as you want, only constrained by the value of "max_allowed_packet". Save this answer.

What is the maximum number of index per table in MySQL?

A table can contain a maximum of 64 secondary indexes. If innodb_large_prefix is enabled (the default), the index key prefix limit is 3072 bytes for InnoDB tables that use the DYNAMIC or COMPRESSED row format.

What is the maximum number of indexes on my table?

A table can contain a maximum of 64 secondary indexes. The index key prefix length limit is 3072 bytes for InnoDB tables that use DYNAMIC or COMPRESSED row format.

How many type of indexes can be use in MySQL?

MySQL has three types of indexes: INDEX, UNIQUE (which requires each row to have a unique value), and PRIMARY KEY (which is just a particular UNIQUE index).

How many number of index can be used per table?

Index usage information. SQL Server allows us to create up to 999 Non-clustered indexes and one Clustered indexes per each table. This huge number of allowed, but not recommended, indexes help us in covering and enhancing the performance of a large number of queries that try to retrieve data from the database table.


1 Answers

You can't expect more than one column to be searched via index for range predicates like IN.

Even if you have a multi-column index (a, b, c, d, date), the leftmost columns should be for equality predicates (=) and at most one column can be for a range predicate. Any subsequent columns in the index don't help.

Example:

WHERE a = 2884 AND b = 'b' AND c IN (1, 2, 3) AND d = 'abcd'

So a and b are equality predicates, c is a range predicate, and d is another equality predicate.

Run EXPLAIN on the query, and note the len and ref columns indicate that you're only using the first two columns of the index. The condition for d is done the hard way, by searching all the rows found by the index on the first three columns.

           id: 1
  select_type: SIMPLE
        table: t
         type: ref
possible_keys: a
          key: a
      key_len: 7              <--- two columns' length
          ref: const,const    <--- only two values for index columns `a` and `b`
         rows: 4
        Extra: Using where; Using index

Whereas changing c to an equality predicate allows all four columns to be used for index lookups:

WHERE a = 2884 AND b = 'b' AND c = 2 AND d = 'abcd'

           id: 1
  select_type: SIMPLE
        table: t
         type: ref
possible_keys: a
          key: a
      key_len: 25                      <--- four columns' length
          ref: const,const,const,const <--- four values
         rows: 2
        Extra: Using where; Using index

I talk about this more in my presentation How to Design Indexes, Really.


Re your comment:

Is there are any way around this without having to rewrite the code?

You've got the point that you can only have one range predicate that benefits from an index. You can still have other range predicates in your WHERE clause, but they don't get any benefit from an index.

But that's not a deal-breaker, because if the one range expression you index can help to narrow down the search by 99%, that's a win. Then applying the other expressions to the matched rows is a cost we can live with.

The optimizer will try to choose the most effective index if it can, and this is largely influenced by how selective the index is. Then the query uses the index to narrow down the search, and only the subset of rows that pass that search are tested against the other conditions.

Take another look at your query:

... WHERE
 a IN (2884,5320)
AND 
 b IN ('a', 'b', 'c')
AND 
 c IN (1, 2, 3)
AND d='abcd'
AND date BETWEEN 0 AND 1383177599

Suppose we know that only 1% of the rows match c IN (1,2,3), but the other terms match more like 20-40% of the rows on average.

We can index for the equality predicate, that's okay. Then we get to choose one other column for the index because all the other terms are range predicates. We choose the column that is most selective: c. Therefore the best index is on (d, c), and must be in that order.

You may have other queries in your app that have different choices for which columns are referenced in the WHERE clause, and what specific values we're searching for. So we might need another index with a different set of columns, or even the same columns in a different order. It's not uncommon to need multiple indexes, because as I mentioned in the presentation, the indexes you need to create depend on the queries you want to optimize.

like image 139
Bill Karwin Avatar answered Oct 20 '22 09:10

Bill Karwin