I am going through the slow query log to try to determine why some of the queries behave erratically. For the sake of consistency, the queries are not cached and flushing was done to clear system cache before running the test. The query goes something like this: <pre class="prettyprint"><code>SELECT P.id, P.name, P.lat, P.lng, P.price * E.rate AS 'ask' FROM Property P INNER JOIN Exchange E ON E.currency = P.currency WHERE P.floor_area >= k? AND P.closing_date >= CURDATE() // this and key_buffer_size=0 prevents caching AND P.type ='c' AND P.lat BETWEEN v? AND v? AND P.lng BETWEEN v? AND v? AND P.price * E.rate BETWEEN k? AND k? ORDER BY P.floor_area DESC LIMIT 100; </code></pre> The <code>k?</code> are user defined constant values; <code>v?</code> are variables that change as user drag or zoom on a map. 100 results are pulled out from the table and sorted according to floor area in descending order. A PRIMARY key on <code>id</code> and an INDEX on <code>floor_area</code> is set up only. No other index is created so that MySQL would consistently use <code>floor_area</code> as the only key. The query times and rows examined are recorded as follows: <pre class="prettyprint"><code>query number 1 2 3 4 5 6 7 8 9 10 user action on map start > + + < ^ + > v + time in seconds 138 0.21 0.43 32.3 0.12 0.12 36.3 4.33 0.33 2.00 rows examined ('000) 43 43 43 60 43 43 111 139 133 176 </code></pre> The query execution plan is as follows: <pre class="prettyprint"><code>+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+ | 1 | SIMPLE | P | range | id_flA | id_flA | 3 | NULL | 4223660 | Using where | | 1 | SIMPLE | E | eq_ref | PRIMARY | PRIMARY | 3 | BuySell.P.currency | 1 | Using where | +----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+ </code></pre> The test is being performed a few times and the results are quite consistent with the above. What could be the reason(s) for the spike in query times in query number 4 and number 7 and how do I bring it down? UPDATE: Results of removing <code>ORDER BY</code> as suggested by Digital Precision: <pre class="prettyprint"><code>query number 1 2 3 4 5 6 7 8 9 10 user action on map start > + + < ^ + > v + time in seconds 255 3.10 3.16 3.08 3.18 3.21 3.32 3.18 3.17 3.80 rows examined ('000) 131 131 131 131 136 136 136 136 136 157 </code></pre> The query execution plan is the same as above though it seems more like a table scan. Note that I am using MyISAM engine, version 5.5.14. AS requested, below is schema: <pre class="prettyprint"><code>| Property | CREATE TABLE `Property` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `type` char(1) NOT NULL DEFAULT '', `lat` decimal(6,4) NOT NULL DEFAULT '0.0000', `lng` decimal(7,4) NOT NULL DEFAULT '0.0000', `floor_area` mediumint(8) unsigned NOT NULL DEFAULT '0', `currency` char(3) NOT NULL DEFAULT '', `price` int(10) unsigned NOT NULL DEFAULT '0', `closing_date` date NOT NULL DEFAULT '0000-00-00', `name` char(25) NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `id_flA` (`floor_area`) ) ENGINE=MyISAM AUTO_INCREMENT=5000000 DEFAULT CHARSET=latin1 | Exchange | CREATE TABLE `Exchange` ( `currency` char(3) NOT NULL, `rate` decimal(11,10) NOT NULL DEFAULT '0.0000000000', PRIMARY KEY (`currency`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 </code></pre> 2ND UPDATE: I thought it would be appropriate to post the non-default parameters in the <code>my.cnf</code> configuration file since two of the answerers are mentioning about the parameters: <pre class="prettyprint"><code>max_heap_table_size = 1300M key_buffer_size = 0 read_buffer_size = 1300M read_rnd_buffer_size = 1024M sort_buffer_size = 1300M </code></pre> I have 2GB of RAM on my test server.

Couple of things: <ol> <li>Why are you calculating the product of P.price and E.rate in the SELECT and aliasing as 'ask', then doing the calculation again in the where clause? Should be able to do <code>AND ask BETWEEN k? and k?</code> -- Edit: This won't work due to the way MySQL works. Apparently MySQL evaluates the WHERE clause before any aliases (sourced).</li> <li>What kind of index do you have on Exchange.currency and Property.currency? If exchange is a lookup table, maybe you would be better off adding a pivot (linking) table with Property.Id and Exchange.Id</li> <li>The order by floor_area forces MySQL to create a temp table in order to do the sorting correctly, any chance you can do the sorting at the app layer?</li> <li>Adding an index on type column will help as well.</li> </ol> -- Edit Not sure what you mean by the comment <code>// this and key_buffer_size=0 prevents caching</code> on the CURDATE where conditional, you can force no sql caching using the 'SQL_NO_CACHE' flag on your select statement. What I would recommend now that you have removed the ORDER BY, is to update your query statement as follows (Added P alias to columns to reduce any confusion): <pre class="prettyprint"><code>WHERE P.type ='condominium' AND P.floor_area >= k? AND P.closing_date >= CURDATE() // No longer necessary with SQL_NO_CACHE AND P.lat BETWEEN v? AND v? AND P.lng BETWEEN v? AND v? AND P.price * E.rate BETWEEN k? AND k? </code></pre> Then add an index to the 'type' column and a composite index on the 'type' and 'floor_area' columns. As you stated, the type column is a low-cardinality column, but the table is large and should help. And even though floor_area appears to be a high-cardinality column, the composite index will help speed up your query times. You may also want to research if there is a penalty using BETWEEN rather than range operators ( >, <, <= etc.)

Why spike in query time despite similar number of rows examined?

Tags:

performance

sql

indexing

mysql

testing

I am going through the slow query log to try to determine why some of the queries behave erratically. For the sake of consistency, the queries are not cached and flushing was done to clear system cache before running the test. The query goes something like this:

SELECT P.id, P.name, P.lat, P.lng, P.price * E.rate AS 'ask' FROM Property P
 INNER JOIN Exchange E ON E.currency = P.currency
 WHERE P.floor_area >= k?
  AND P.closing_date >= CURDATE() // this and key_buffer_size=0 prevents caching
  AND P.type ='c'
  AND P.lat BETWEEN v? AND v?
  AND P.lng BETWEEN v? AND v?
  AND P.price * E.rate BETWEEN k? AND k?
 ORDER BY P.floor_area DESC LIMIT 100;

The k? are user defined constant values; v? are variables that change as user drag or zoom on a map. 100 results are pulled out from the table and sorted according to floor area in descending order.

A PRIMARY key on id and an INDEX on floor_area is set up only. No other index is created so that MySQL would consistently use floor_area as the only key. The query times and rows examined are recorded as follows:

query number              1    2    3    4    5    6    7    8    9    10
user action on map     start   >    +    +    <    ^    +    >    v    +
time in seconds          138  0.21 0.43 32.3 0.12 0.12 36.3 4.33 0.33 2.00
rows examined ('000)      43    43   43   60   43   43  111  139  133  176

The query execution plan is as follows:

+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                | rows    | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+
|  1 | SIMPLE      | P     | range  | id_flA        | id_flA  | 3       | NULL               | 4223660 | Using where |
|  1 | SIMPLE      | E     | eq_ref | PRIMARY       | PRIMARY | 3       | BuySell.P.currency |       1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+

The test is being performed a few times and the results are quite consistent with the above. What could be the reason(s) for the spike in query times in query number 4 and number 7 and how do I bring it down?

UPDATE:

Results of removing ORDER BY as suggested by Digital Precision:

query number              1    2    3    4    5    6    7    8    9    10
user action on map     start   >    +    +    <    ^    +    >    v    +
time in seconds          255  3.10 3.16 3.08 3.18 3.21 3.32 3.18 3.17 3.80
rows examined ('000)     131  131  131  131  136  136  136  136  136  157

The query execution plan is the same as above though it seems more like a table scan. Note that I am using MyISAM engine, version 5.5.14.

AS requested, below is schema:

| Property | CREATE TABLE `Property` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `type` char(1) NOT NULL DEFAULT '',
  `lat` decimal(6,4) NOT NULL DEFAULT '0.0000',
  `lng` decimal(7,4) NOT NULL DEFAULT '0.0000',
  `floor_area` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `currency` char(3) NOT NULL DEFAULT '',
  `price` int(10) unsigned NOT NULL DEFAULT '0',
  `closing_date` date NOT NULL DEFAULT '0000-00-00',
  `name` char(25) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `id_flA` (`floor_area`)
) ENGINE=MyISAM AUTO_INCREMENT=5000000 DEFAULT CHARSET=latin1

| Exchange | CREATE TABLE `Exchange` (
  `currency` char(3) NOT NULL,
  `rate` decimal(11,10) NOT NULL DEFAULT '0.0000000000',
  PRIMARY KEY (`currency`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

2ND UPDATE:

I thought it would be appropriate to post the non-default parameters in the my.cnf configuration file since two of the answerers are mentioning about the parameters:

max_heap_table_size = 1300M
key_buffer_size = 0
read_buffer_size = 1300M
read_rnd_buffer_size = 1024M
sort_buffer_size = 1300M

I have 2GB of RAM on my test server.

653

asked Jan 04 '12 05:01

Question Overflow

2 Answers

I guess I figure out the reason of spikes. Here is how it goes :

First I created the tables and load some randomly generated data on it:

Here is my query:

SELECT SQL_NO_CACHE P.id, P.name, P.lat, P.lng, P.price * E.rate AS 'ask' 
FROM Property P
 INNER JOIN Exchange E ON E.currency = P.currency
 WHERE P.floor_area >= 2000
  AND P.closing_date >= CURDATE()
  AND P.type ='c'
  AND P.lat BETWEEN 12.00 AND 22.00
  AND P.lng BETWEEN 10.00 AND 20.00
  AND P.price BETWEEN 100 / E.rate AND 10000 / E.rate
 ORDER BY P.floor_area DESC LIMIT 100;

And here is the describe :

+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key    | key_len | ref  | rows    | Extra                                        |
+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+
|  1 | SIMPLE      | P     | range | id_flA        | id_flA | 3       | NULL | 4559537 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | E     | ALL   | PRIMARY       | NULL   | NULL    | NULL |       6 | Using where; Using join buffer               |
+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+

it took between 3.5 ~ 3.9 sec every time I query the data (didn't make any difference which parameters I use). It didn't make sense so I researched Using join buffer

Then I wanted to try this query without "join buffer" so I inserted 1 more random data to Exchange table.

INSERT INTO Exchange(currency, rate) VALUES('JJ', 1);

Now I use the same sql and the it took 0.3 ~ 0.5 seconds for response. And here is the describe :

+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref             | rows    | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+
|  1 | SIMPLE      | P     | range  | id_flA        | id_flA  | 3       | NULL            | 4559537 | Using where |
|  1 | SIMPLE      | E     | eq_ref | PRIMARY       | PRIMARY | 3       | test.P.currency |       1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+

So the problem (as far as I see), the optimizer trying to use "join buffer". The optimum solution of this problem would be to force optimizer not to use "join buffer". (which I couldn't find how to) or change the "join_buffer_size" value. I solve it by adding "dummy" values to Exchange table (so the optimizer wouldn't use join buffer) but it's not a exact solution, its just a stupid trick to fool mysql.

Edit : I researched in mysql forums/bugs about this "join buffer" behavior; then asked about it in official forums. I am going to fill a bug report about this irrational behavior of optimizer.

answered Nov 10 '22 06:11

frail

Couple of things:

Why are you calculating the product of P.price and E.rate in the SELECT and aliasing as 'ask', then doing the calculation again in the where clause? Should be able to do AND ask BETWEEN k? and k? -- Edit: This won't work due to the way MySQL works. Apparently MySQL evaluates the WHERE clause before any aliases (sourced).
What kind of index do you have on Exchange.currency and Property.currency? If exchange is a lookup table, maybe you would be better off adding a pivot (linking) table with Property.Id and Exchange.Id
The order by floor_area forces MySQL to create a temp table in order to do the sorting correctly, any chance you can do the sorting at the app layer?
Adding an index on type column will help as well.

-- Edit

Not sure what you mean by the comment // this and key_buffer_size=0 prevents caching on the CURDATE where conditional, you can force no sql caching using the 'SQL_NO_CACHE' flag on your select statement.

What I would recommend now that you have removed the ORDER BY, is to update your query statement as follows (Added P alias to columns to reduce any confusion):

WHERE P.type ='condominium'
    AND P.floor_area >= k?
    AND P.closing_date >= CURDATE() // No longer necessary with SQL_NO_CACHE
    AND P.lat BETWEEN v? AND v?
    AND P.lng BETWEEN v? AND v?
    AND P.price * E.rate BETWEEN k? AND k?

Then add an index to the 'type' column and a composite index on the 'type' and 'floor_area' columns. As you stated, the type column is a low-cardinality column, but the table is large and should help. And even though floor_area appears to be a high-cardinality column, the composite index will help speed up your query times.

You may also want to research if there is a penalty using BETWEEN rather than range operators ( >, <, <= etc.)

answered Nov 10 '22 07:11

Mike Purcell

Related questions
                            
                                Problems displaying Japanese characters using PHP and MySQL
                            
                                How to order mysql results by column priority?
                            
                                Fatal error encountered during command execution
                            
                                Load data infile with blob field
                            
                                How do I remove a uniqueness constraint from a MySQL table?
                            
                                sql - Mysql : Left join on multiple rows and retrieve 1 row
                            
                                Write once, read many mysql field
                            
                                How can I store the '€' symbol in MySQL using PHP? [duplicate]
                            
                                json column vs multiple columns
                            
                                Benefits of migrating from MySQL to Oracle for Amazon RDS. Worth it? [A: no]
                            
                                Mass update simple product "status" via MySQL query
                            
                                How should I escape characters inside this LIKE query?
                            
                                Where can I find the file my.cnf or my.ini file? [closed]
                            
                                Are identical primary keys bad practice?
                            
                                MySQL count all NULL columns
                            
                                How to select ORDER BY column and RAND() both?
                            
                                How to "subtract" privileges in MySQL
                            
                                MySQL select in function
                            
                                In SQLAlchemy, how do I create a unique pair?
                            
                                LOAD DATA INFILE does not work

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With