Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to speed up this SQL index query?

Tags:

Given the following SQL table :

Employee(ssn, name, dept, manager, salary)

You discover that the following query is significantly slower than expected. There is an index on salary, and you have verified that the query plan is using it.

SELECT * 
FROM Employee
WHERE salary = 48000

Please give a possible reason why this query is slower than expected, and provide a tuning solution that addresses that reason.

I have two ideas for why this query is slower than expected. One is that we are trying to SELECT * instead of SELECT Employee.salary which would slow down the query as we must search across all columns instead of one. Another idea is that the index on salary is non-clustered, and we want to use a clustered index, as the company could be very large and it would make sense to organize the table by the salary field.

Would either of those two solutions speed up this query? I.e. either change SELECT * to SELECT Employee.salary or explicitly set the index on salary to be clustered?

like image 505
ABlueCrayon Avatar asked Apr 22 '17 20:04

ABlueCrayon


People also ask

How do you index is faster in SQL?

Indexing makes columns faster to query by creating pointers to where data is stored within a database. Imagine you want to find a piece of information that is within a large database. To get this information out of the database the computer will look through every row until it finds it.

How do I make indexes faster?

- Use a partitioned index: You should always partition a very large table, and partition the index. This improves the performance of both index queries (partition pruning) and index creation in parallel.


1 Answers

What indexes do you have now?

Is it really "slow"? What evidence do you have?

Comments on "SELECT * instead of SELECT Employee.salary" --

  • * is bad form because tomorrow you might add a column, thereby breaking any code that is expecting a certain number of columns in a certain order.
  • Dealing with * versus salary does not happen until after the row(s) is located.
  • Locating the row(s) is the costly part.
  • On the other hand, if you have INDEX(salary) and only look at salary then the index is "covering". That means that the "data" (the other columns) does not need to be fetched. Hence, faster. But this is probably beyond what your teacher has told you about yet.

Comments on "the index on salary is non-clustered, and we want to use a clustered index" --

  • In MySQL (not necessarily in other RDBMSs), InnoDB has exactly one PRIMARY KEY and it is always UNIQUE and "clustered".
  • That is, "clustered" implies "unique", which seems inappropriate for "salary".
  • In InnoDB a "secondary key" implicitly includes the column(s) of the PK (ssn?), with which it can reach over into the data.

"verified that the query plan" -- Have you learned about EXPLAIN SELECT ...?

More Tips on creating the optimal index for a given SELECT.

like image 104
Rick James Avatar answered Sep 25 '22 10:09

Rick James