Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why spike in query time despite similar number of rows examined?

I am going through the slow query log to try to determine why some of the queries behave erratically. For the sake of consistency, the queries are not cached and flushing was done to clear system cache before running the test. The query goes something like this:

SELECT P.id, P.name, P.lat, P.lng, P.price * E.rate AS 'ask' FROM Property P
 INNER JOIN Exchange E ON E.currency = P.currency
 WHERE P.floor_area >= k?
  AND P.closing_date >= CURDATE() // this and key_buffer_size=0 prevents caching
  AND P.type ='c'
  AND P.lat BETWEEN v? AND v?
  AND P.lng BETWEEN v? AND v?
  AND P.price * E.rate BETWEEN k? AND k?
 ORDER BY P.floor_area DESC LIMIT 100;

The k? are user defined constant values; v? are variables that change as user drag or zoom on a map. 100 results are pulled out from the table and sorted according to floor area in descending order.

A PRIMARY key on id and an INDEX on floor_area is set up only. No other index is created so that MySQL would consistently use floor_area as the only key. The query times and rows examined are recorded as follows:

query number              1    2    3    4    5    6    7    8    9    10
user action on map     start   >    +    +    <    ^    +    >    v    +
time in seconds          138  0.21 0.43 32.3 0.12 0.12 36.3 4.33 0.33 2.00
rows examined ('000)      43    43   43   60   43   43  111  139  133  176

The query execution plan is as follows:

+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                | rows    | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+
|  1 | SIMPLE      | P     | range  | id_flA        | id_flA  | 3       | NULL               | 4223660 | Using where |
|  1 | SIMPLE      | E     | eq_ref | PRIMARY       | PRIMARY | 3       | BuySell.P.currency |       1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+

The test is being performed a few times and the results are quite consistent with the above. What could be the reason(s) for the spike in query times in query number 4 and number 7 and how do I bring it down?

UPDATE:

Results of removing ORDER BY as suggested by Digital Precision:

query number              1    2    3    4    5    6    7    8    9    10
user action on map     start   >    +    +    <    ^    +    >    v    +
time in seconds          255  3.10 3.16 3.08 3.18 3.21 3.32 3.18 3.17 3.80
rows examined ('000)     131  131  131  131  136  136  136  136  136  157

The query execution plan is the same as above though it seems more like a table scan. Note that I am using MyISAM engine, version 5.5.14.

AS requested, below is schema:

| Property | CREATE TABLE `Property` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `type` char(1) NOT NULL DEFAULT '',
  `lat` decimal(6,4) NOT NULL DEFAULT '0.0000',
  `lng` decimal(7,4) NOT NULL DEFAULT '0.0000',
  `floor_area` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `currency` char(3) NOT NULL DEFAULT '',
  `price` int(10) unsigned NOT NULL DEFAULT '0',
  `closing_date` date NOT NULL DEFAULT '0000-00-00',
  `name` char(25) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `id_flA` (`floor_area`)
) ENGINE=MyISAM AUTO_INCREMENT=5000000 DEFAULT CHARSET=latin1

| Exchange | CREATE TABLE `Exchange` (
  `currency` char(3) NOT NULL,
  `rate` decimal(11,10) NOT NULL DEFAULT '0.0000000000',
  PRIMARY KEY (`currency`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

2ND UPDATE:

I thought it would be appropriate to post the non-default parameters in the my.cnf configuration file since two of the answerers are mentioning about the parameters:

max_heap_table_size = 1300M
key_buffer_size = 0
read_buffer_size = 1300M
read_rnd_buffer_size = 1024M
sort_buffer_size = 1300M

I have 2GB of RAM on my test server.

like image 653
Question Overflow Avatar asked Jan 04 '12 05:01

Question Overflow


People also ask

Which shows how many rows were returned and how long the query took to execute?

mysql shows how many rows were returned and how long the query took to execute, which gives you a rough idea of server performance.

Why is MySQL query so slow?

Queries can become slow for various reasons ranging from improper index usage to bugs in the storage engine itself. However, in most cases, queries become slow because developers or MySQL database administrators neglect to monitor them and keep an eye on their performance.

Do Joins slow down query?

Joins: If your query joins two tables in a way that substantially increases the row count of the result set, your query is likely to be slow. There's an example of this in the subqueries lesson. Aggregations: Combining multiple rows to produce a result requires more computation than simply retrieving those rows.


2 Answers

I guess I figure out the reason of spikes. Here is how it goes :

First I created the tables and load some randomly generated data on it:

Here is my query:

SELECT SQL_NO_CACHE P.id, P.name, P.lat, P.lng, P.price * E.rate AS 'ask' 
FROM Property P
 INNER JOIN Exchange E ON E.currency = P.currency
 WHERE P.floor_area >= 2000
  AND P.closing_date >= CURDATE()
  AND P.type ='c'
  AND P.lat BETWEEN 12.00 AND 22.00
  AND P.lng BETWEEN 10.00 AND 20.00
  AND P.price BETWEEN 100 / E.rate AND 10000 / E.rate
 ORDER BY P.floor_area DESC LIMIT 100;

And here is the describe :

+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key    | key_len | ref  | rows    | Extra                                        |
+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+
|  1 | SIMPLE      | P     | range | id_flA        | id_flA | 3       | NULL | 4559537 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | E     | ALL   | PRIMARY       | NULL   | NULL    | NULL |       6 | Using where; Using join buffer               |
+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+

it took between 3.5 ~ 3.9 sec every time I query the data (didn't make any difference which parameters I use). It didn't make sense so I researched Using join buffer

Then I wanted to try this query without "join buffer" so I inserted 1 more random data to Exchange table.

INSERT INTO Exchange(currency, rate) VALUES('JJ', 1);

Now I use the same sql and the it took 0.3 ~ 0.5 seconds for response. And here is the describe :

+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref             | rows    | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+
|  1 | SIMPLE      | P     | range  | id_flA        | id_flA  | 3       | NULL            | 4559537 | Using where |
|  1 | SIMPLE      | E     | eq_ref | PRIMARY       | PRIMARY | 3       | test.P.currency |       1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+

So the problem (as far as I see), the optimizer trying to use "join buffer". The optimum solution of this problem would be to force optimizer not to use "join buffer". (which I couldn't find how to) or change the "join_buffer_size" value. I solve it by adding "dummy" values to Exchange table (so the optimizer wouldn't use join buffer) but it's not a exact solution, its just a stupid trick to fool mysql.

Edit : I researched in mysql forums/bugs about this "join buffer" behavior; then asked about it in official forums. I am going to fill a bug report about this irrational behavior of optimizer.

like image 66
frail Avatar answered Nov 10 '22 06:11

frail


Couple of things:

  1. Why are you calculating the product of P.price and E.rate in the SELECT and aliasing as 'ask', then doing the calculation again in the where clause? Should be able to do AND ask BETWEEN k? and k? -- Edit: This won't work due to the way MySQL works. Apparently MySQL evaluates the WHERE clause before any aliases (sourced).

  2. What kind of index do you have on Exchange.currency and Property.currency? If exchange is a lookup table, maybe you would be better off adding a pivot (linking) table with Property.Id and Exchange.Id

  3. The order by floor_area forces MySQL to create a temp table in order to do the sorting correctly, any chance you can do the sorting at the app layer?

  4. Adding an index on type column will help as well.

-- Edit

Not sure what you mean by the comment // this and key_buffer_size=0 prevents caching on the CURDATE where conditional, you can force no sql caching using the 'SQL_NO_CACHE' flag on your select statement.

What I would recommend now that you have removed the ORDER BY, is to update your query statement as follows (Added P alias to columns to reduce any confusion):

WHERE P.type ='condominium'
    AND P.floor_area >= k?
    AND P.closing_date >= CURDATE() // No longer necessary with SQL_NO_CACHE
    AND P.lat BETWEEN v? AND v?
    AND P.lng BETWEEN v? AND v?
    AND P.price * E.rate BETWEEN k? AND k?

Then add an index to the 'type' column and a composite index on the 'type' and 'floor_area' columns. As you stated, the type column is a low-cardinality column, but the table is large and should help. And even though floor_area appears to be a high-cardinality column, the composite index will help speed up your query times.

You may also want to research if there is a penalty using BETWEEN rather than range operators ( >, <, <= etc.)

like image 3
Mike Purcell Avatar answered Nov 10 '22 07:11

Mike Purcell