Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Query - Need to improve performance

I have a data load scenario where I create dynamic sql query to pull data and cache in our service. There is 1 table that contains all product data : ProductHistory (47 columns, 200,000 records + and will keep growing)

What I need: Get the latest products by using the maximum id, maximum version and maximum changeid.

First Attempt:

SELECT distinct Product.* FROM ProductHistory product 
WHERE  product.version = 
(SELECT max(version) from ProductHistory p2 where product.Id = p2.Id 
  and product.changeId = 
(SELECT max(changeid) from ProductHistory p3 where p2.changeId = p3.changeId))

This took more than 2.51 minutes.

Other Failed Attempt:

select distinct product.* from ProductHistory product 
where CAST(CAST(id as nvarchar)+'0'+CAST(Version as nvarchar)+'0'+CAST(changeid as nvarchar) as decimal) = 
(select MAX(CAST(CAST(id as nvarchar)+'0'+CAST(Version as nvarchar)+'0'+CAST(changeid as nvarchar) as decimal)) from ProductHistory p2 
where product.Id = p2.Id)

It basically uses the same principle as when you order dates, concatenating the numbers ordered by relevance.

For example 11 Jun 2007 = 20070711
And in our case: Id = 4 , version = 127, changeid = 32   => 40127032
The zeros are there not to mix up the 3 different ids

But this one takes 3.10 minutes !!! :(

So, I basically need a way to make my first attempt query better by any chance. I was also wondering with such amount of data, is this the best speed of retrieval that I should expect ?

  1. I ran sp_helpindex ProductHistory and found out the indexes as below :

    PK_ProductHistoryNew - clustered, unique, primary key located on PRIMARY- Id, Version

  2. I wrapped the first query in a SP but still no change.

So, wondering by what other means we can improve the performance of this operation ?

Thanks, Mani p.s : I am just running these queries in SQL management stuido to see the time.

like image 698
Everything Matters Avatar asked Jul 20 '11 09:07

Everything Matters


1 Answers

Run the query from Sql Server Management Studio and look at the query plan to see where the bottle neck is. Any place you see a "table scan" or "index scan" it has to go through all data to find what it is looking for. If you create appropriate indexes that can be used for these operations it should increase performance.

like image 53
Anders Abel Avatar answered Sep 21 '22 12:09

Anders Abel