Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why DELETE TOP(X) statement is not deleting the same records that a previous SELECT TOP(X) returned?

Tags:

sql-server

As the title says, I've been running this command to delete some records from a table but I noticed that it wasn't deleting the same records listed if I ran a query selecting the TOP(X) records right before.

Shouldn't the DELETE command follow the same default ordering that SELECT does? If not, why?

I don't know if it matters, but here is the delete command I'm running:

DELETE TOP (100000) t
FROM Ticket AS t
    LEFT JOIN Series s on t.SeriesId = s.id
    LEFT JOIN Payment p on t.id = p.TicketId
WHERE s.EndDate <= DATEADD(MONTH, -1, GETDATE()) AND t.ExportDate is null AND p.TicketId is null

For the select, just replace the first line with SELECT TOP(100000) t.*

like image 337
Henrique Miranda Avatar asked Dec 03 '22 18:12

Henrique Miranda


2 Answers

Without an ORDER BY, the order of the data CANNOT be guaranteed.

  • No Seatbelt - Expecting Order without ORDER BY
  • Aaron Bertrand demo's No Default Order
like image 160
UnhandledExcepSean Avatar answered May 17 '23 14:05

UnhandledExcepSean


Nope. There is no "default ordering". If you don't specify an ORDER BY any ordering is valid.

The resulting order will be execution plan dependant and may not even be stable within a given execution plan (e.g. parallelism may distribute rows to different threads with the exact distribution depending on how busy each scheduler was at the time)

Moreover even ignoring the ORDER BY issue the semantics of the TOP are different for the SELECT and DELETE. If there are multiple matches in Series for a given Ticket (and matching the EndDate) the SELECT will count the joined rows whereas the DELETE will collapse these down to 1 before the delete and just count the rows deleted from Ticket.

To do the DELETE in a deterministic manner you can first construct a SELECT statement that is valid for this purpose (with all joins replaced with EXISTS/NON EXISTS) and then you can put that in a CTE and delete from that.

WITH CTE
     AS (SELECT TOP (100000) t.*
         FROM   Ticket AS t
         WHERE  t.ExportDate IS NULL
                AND EXISTS (SELECT *
                            FROM   Series s
                            WHERE  t.SeriesId = s.id
                                   AND s.EndDate <= DATEADD(MONTH, -1, GETDATE()))
                AND NOT EXISTS(SELECT *
                               FROM   Payment p
                               WHERE  t.id = p.TicketId)
         ORDER  BY t.id)
DELETE FROM CTE 
like image 23
Martin Smith Avatar answered May 17 '23 12:05

Martin Smith