Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance: What's the difference between the two queries

I originally had written the following

SELECT t1.TransactionNumber
FROM t1
    JOIN
    (
          SELECT MAX(id) id
          FROM t1
          WHERE Period BETWEEN '01-11-2013' and '01-12-2014'
          GROUP BY AccountNumber
    ) t2
        on t1.id= t2.id

But it was too slow. It took around 20 seconds, so as a test, I changed it to the following

 SELECT MAX(id) AS id
 INTO #t2
 FROM t1
 WHERE Period BETWEEN '01-11-2013' and '01-12-2014'
 GROUP BY AccountNumber

 SELECT t1.id 
 FROM t1
    JOIN #t2 t2
        ON t1.id= t2.id

The second query took only 1 second to run. The second query does a index seek using the PK key, whereas the first key does a scan.

Note: id is the primary key clustered on the t1 table.

like image 399
user172839 Avatar asked Nov 10 '22 12:11

user172839


1 Answers

This is a guess, but it could be because the statistics on the primary key are not working for you. If the query optimizer thinks you are only going to return 10 records from your inner join, but instead you return 100, it overflows the memory buffer and then ends up having to write the results of the subquery to disk. If you post your query execution plan results, it should become pretty obvious.

like image 79
attila Avatar answered Nov 14 '22 22:11

attila