Let me explain by way of example. Consider the following table:
Customer Id | Food
------------+---------
1 | Pizza
1 | Burger
1 | Hot Dog
2 | Milkshake
2 | Burger
3 | Pizza
I want to delete all the records for customers who have NEVER ordered Pizza. So, I should be left with this (customer #2 deleted):
Customer Id | Food
------------+---------
1 | Pizza
1 | Burger
1 | Hot Dog
3 | Pizza
I know I can do a NOT IN
but the performance is horrible.
What is the most performant way to write this query to achieve this against 100,000+ records in SQL Server?
TRUNCATE TABLE is functionally identical to DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE.
If you need to remove 10 million rows and have 1 GB of log space available use Delete TOP(10000) From dbo. myTable (with your select clause) and keep running it till there are no more rows to delete.
A simple NOT EXISTS
should be efficient with proper indexes.
DELETE c1 FROM Customers c1
WHERE NOT EXISTS
(
SELECT 1 FROM Customers c2
WHERE c1.[Customer Id] = c2.[Customer Id]
AND c2.[Food] = 'Pizza'
);
Demo
Create an index on [Customer Id]
and a non-clustered index on Food
.
how about NOT EXISTS
DELETE a
FROM table1 a
WHERE NOT EXISTS
(
SELECT 1
FROM table1 b
WHERE a.customerID = b.customerID AND
b.Food = 'Pizza'
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With