I have a table called Products
.
This table contains over 3 million entries. Every day there are approximately 5000 new entries. which only happens during the night in 2 minutes.
But this table gets queried every night maybe over 20 000 times with this query.
SELECT Price
FROM Products
WHERE Code = @code
AND Company = @company
AND CreatedDate = @createdDate
Table structure:
Code nvarchar(50)
Company nvarchar(10)
CreatedDate datetime
I can see that this query takes about a second to return a result from Products
table.
There is no productId
column in the table as it is not needed. So there is no primary key in the table.
I would like to somehow improve this query to return the result faster.
I have never used indexes before. What would be the best way to use indexes on this table?
If I provide a primary key do you think it would speed up the query result? Keep in mind that I will still have to query the table by providing 3 parameters as
WHERE Code = @code
AND Company = @company
AND CreatedDate = @createdDate.
This is mandatory.
As I mentioned that the table gets new entries in 2 minutes every day during the night. How would this affect the indexes?
If I use indexes, which column would be the best to use and whether I should use clustered or non-clustered indexes?
Indexing makes columns faster to query by creating pointers to where data is stored within a database. Imagine you want to find a piece of information that is within a large database. To get this information out of the database the computer will look through every row until it finds it.
Add the index to the new empty table. copy the data from the old table to the new table in chunks. drop the old table. My theory is that it will be less expensive to index the data as it is added than to dig through the data that is already there and add the index after the fact.
A clustered index may be the fastest for one SELECT statement but it may not necessarily be correct choice. SQL Server indices are b-trees. A non-clustered index just contains the indexed columns, with the leaf nodes of the b-tree being pointers to the approprate data page.
The best thing to do would depend on what other fields the table has and what other queries run against that table.
Without more details, a non-clustered index on (code, company, createddate) that included the "price" column will certainly improve performance.
CREATE NONCLUSTERED INDEX IX_code_company_createddate
ON Products(code, company, createddate)
INCLUDE (price);
That's because if you have that index in place, then SQL will not access the actual table at all when running the query, as it can find all rows with a given "code, company, createddate" in the index and it will be able to do that really fast as the index allows precisely for fast access when using the fields that define the key, and it will also have the "price" value for each row.
Regarding the inserts, for each row added, SQL Server will have to add them to the index as well, so performance for inserts will be impacted. In think you should expect the gains on SELECT performance to outweigh the impact on the inserts, but you should test that.
Also, you will be using more space as the index will store all those fields for each row besides the space used by the original table.
As others have noted in the comments, adding a PK to your table (even if that means adding a ProductId column you don't actually need) might be a good idea as well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With