SQL Server DELETE and SELECT Behave Differently with Same WHERE Clause

Tags: sql, sql-server

I have a table that is populated by a daily scheduled job. The job deletes the last 7 days of data and then repopulates the table with the 7 most recent days' worth of data from another source (a mainframe).

Recently, users reported duplicates going back to the beginning of October 2011, on the order of hundreds of thousands of rows.

I noticed strange behavior with the delete that runs for each job:

DELETE FROM fm104d 
 WHERE location = '18'
   AND (CONVERT(datetime,CASE WHEN ISDATE(pull_date)=0 THEN '19000101' 
                 ELSE pull_date END)) >  DATEADD(day, -7, getdate())

The above returns "(0 row(s) affected)".

When I run the same statement with DELETE replaced by SELECT *, I get more than 32,000 rows back.

Why would the SELECT and DELETE behave differently?
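One way to rule out timing differences between the two runs is to execute both statements against the same predicate inside a single transaction and then roll back. The following is only a sketch under assumptions: the table and column names are copied from the statement above, the variable names are illustrative, and it needs SQL Server 2008 or later for the inline DECLARE initialization.

```sql
-- Diagnostic sketch: compare how many rows SELECT and DELETE see
-- for the same WHERE clause. Variable names are illustrative.
DECLARE @cutoff datetime = DATEADD(day, -7, GETDATE());  -- evaluate the cutoff once
DECLARE @selected int, @deleted int;

BEGIN TRANSACTION;

SELECT @selected = COUNT(*)
  FROM fm104d
 WHERE location = '18'
   AND CONVERT(datetime, CASE WHEN ISDATE(pull_date) = 0 THEN '19000101'
                              ELSE pull_date END) > @cutoff;

DELETE FROM fm104d
 WHERE location = '18'
   AND CONVERT(datetime, CASE WHEN ISDATE(pull_date) = 0 THEN '19000101'
                              ELSE pull_date END) > @cutoff;

SET @deleted = @@ROWCOUNT;  -- must be read immediately after the DELETE

ROLLBACK TRANSACTION;       -- undo the delete; this is a diagnostic only

-- Local variables keep their values after the rollback.
SELECT @selected AS rows_selected, @deleted AS rows_deleted;
```

If the two counts differ, the discrepancy is not caused by the data changing between separate runs.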

UPDATE

Here is the Actual Execution Plan:

http://pastie.org/2869202

Mark Bowytz asked Nov 14 '11 21:11


1 Answer

You won't believe this. I didn't at first, because it makes almost no logical sense, but in the end the solution that worked was to add an index.

Credit for this goes to my local DBA, who said: "Did [you] think about adding an index? I just did to test and sure enough it works."

Here's the index as added:

CREATE  INDEX ixDBO_fir104d__SOURCE_LOCATION__Include
ON [dbo].[fir104d] ([SOURCE_LOCATION])
INCLUDE ([Transaction_Date],[PULL_DATE])
GO

I let the job run as scheduled and, sure enough, everything went back to the way it was.

My guess is that something in the execution plan shows it wasn't using an index, or was using the wrong one, but my developer mind can't make much sense of that level of detail.
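For anyone who wants to check which index or scan the optimizer picks for the delete, a text version of the estimated plan is often easier to read than the graphical one. This is a minimal sketch, assuming the DELETE statement from the question; with SHOWPLAN_TEXT turned on, SQL Server returns the plan and does not execute the statement.

```sql
-- Sketch only: print the estimated plan as text without running the DELETE.
-- SET SHOWPLAN_TEXT must be the only statement in its batch, hence the GO lines.
SET SHOWPLAN_TEXT ON;
GO
DELETE FROM fm104d
 WHERE location = '18'
   AND (CONVERT(datetime, CASE WHEN ISDATE(pull_date) = 0 THEN '19000101'
                               ELSE pull_date END)) > DATEADD(day, -7, GETDATE());
GO
SET SHOWPLAN_TEXT OFF;
GO
```

Operators such as Table Scan, Index Scan, or Index Seek in the output name the object being read, which shows whether the new index is actually used.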

Thanks to everybody for the time and effort you've all spent.

UPDATE

I heard from a different dev that the data in this table had also become corrupted, to the point where it took "several hours of DBA involvement to resolve", and the dev had to perform some other data fixes as well (read: data file reloads).

At the end of the day, while adding the index was probably a good thing considering the way the scheduled job runs, apparently, there was even more to the story!
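I don't know exactly what the DBAs found, but if a SELECT and a DELETE ever disagree like this again, a consistency check on the table seems like a reasonable early step. A minimal sketch, assuming the table name from the index definition above and that the problem is the kind of corruption DBCC can detect (which I can't confirm was the case here):

```sql
-- Sketch only: check the table and its indexes for allocation and
-- structural errors. Can be slow and I/O heavy on a large table.
DBCC CHECKTABLE ('dbo.fir104d') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
```

With NO_INFOMSGS, a clean table produces no output, so any rows returned are errors worth passing on to a DBA.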

Mark Bowytz answered Sep 28 '22 11:09