Is it possible to build a single MySQL query (without variables) that removes all records from a table except the latest N (sorted by id desc)?
Something like this, only it doesn't work :)
delete from table order by id ASC limit ((select count(*) from table ) - N)
Thanks.
You cannot delete the records that way, the main issue being that you cannot use a subquery to specify the value of a LIMIT clause.
This works (tested in MySQL 5.0.67):
DELETE FROM `table`
WHERE id NOT IN (
  SELECT id
  FROM (
    SELECT id
    FROM `table`
    ORDER BY id DESC
    LIMIT 42 -- keep this many records
  ) foo
);
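The intermediate subquery is required. Without it we'd run into two errors:
Error 1093 (You can't specify target table 'table' for update in FROM clause): MySQL doesn't allow a direct subquery to refer to the table you are deleting from.
Error 1235 (This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'): you can't use a LIMIT clause within a direct subquery of a NOT IN operator.
Fortunately, using an intermediate subquery allows us to bypass both of these limitations.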
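For anyone who wants to try the NOT IN form without a MySQL server at hand, here is a minimal sketch that runs the same statement shape against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made only to keep the example self-contained; it does not enforce either of the two MySQL restrictions, so the derived table foo is kept purely to mirror the MySQL query. The table name log and the helpers keep_latest_not_in and demo_not_in are hypothetical.

```python
import sqlite3


def keep_latest_not_in(conn, n):
    """Delete every row of the hypothetical `log` table except the n highest ids."""
    conn.execute(
        """
        DELETE FROM log
        WHERE id NOT IN (
            SELECT id FROM (
                SELECT id FROM log
                ORDER BY id DESC
                LIMIT ?          -- keep this many records
            ) foo
        )
        """,
        (n,),
    )
    conn.commit()


def demo_not_in():
    # SQLite stands in for MySQL here (assumption, see the note above).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, msg TEXT)")
    conn.executemany(
        "INSERT INTO log (id, msg) VALUES (?, ?)",
        [(i, f"row {i}") for i in range(1, 101)],  # ids 1..100
    )
    keep_latest_not_in(conn, 42)
    remaining = [row[0] for row in conn.execute("SELECT id FROM log ORDER BY id")]
    conn.close()
    return remaining
```

With ids 1 to 100 and n = 42, the rows left are ids 59 through 100.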
Nicole has pointed out this query can be optimised significantly for certain use cases (such as this one). I recommend reading that answer as well to see if it fits yours.
I know I'm resurrecting quite an old question, but I recently ran into this issue and needed something that scales well to large numbers of rows. There wasn't any existing performance data, and since this question has had quite a bit of attention, I thought I'd post what I found.
The solutions that actually worked were Alex Barrett's double sub-query/NOT IN method (similar to Bill Karwin's) and Quassnoi's LEFT JOIN method.
Unfortunately both of the above methods create very large intermediate temporary tables and performance degrades quickly as the number of records not being deleted gets large.
What I settled on utilizes Alex Barrett's double sub-query (thanks!) but uses <= instead of NOT IN:
DELETE FROM `test_sandbox`
WHERE id <= (
  SELECT id
  FROM (
    SELECT id
    FROM `test_sandbox`
    ORDER BY id DESC
    LIMIT 1 OFFSET 42 -- keep this many records
  ) foo
);
It uses OFFSET to skip the N newest records and get the id of the next one, then deletes that record and all older records.
Since ordering is already an assumption of this problem (ORDER BY id DESC), <= is a perfect fit.
It is much faster, since the temporary table generated by the subquery contains just one record instead of N records.
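A minimal sketch of the <= / OFFSET form, again using Python's sqlite3 with an in-memory SQLite database as a stand-in for MySQL (an assumption for portability, not what the timings in this answer were measured on). The table name log and the helpers keep_latest_offset and demo_offset are hypothetical.

```python
import sqlite3


def keep_latest_offset(conn, n):
    """Delete the row just past the n newest ids, plus everything older."""
    conn.execute(
        """
        DELETE FROM log
        WHERE id <= (
            SELECT id FROM (
                SELECT id FROM log
                ORDER BY id DESC
                LIMIT 1 OFFSET ?   -- keep this many records
            ) foo
        )
        """,
        (n,),
    )
    conn.commit()


def demo_offset(total, keep):
    # SQLite stands in for MySQL here (assumption, see the note above).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO log (id) VALUES (?)",
        [(i,) for i in range(1, total + 1)],  # ids 1..total
    )
    keep_latest_offset(conn, keep)
    remaining = [row[0] for row in conn.execute("SELECT id FROM log ORDER BY id")]
    conn.close()
    return remaining
```

One property worth noting: if the table holds no more than N rows, the inner query returns no row, the comparison is against NULL, and nothing is deleted, which is the outcome you would want.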
I tested the three working methods and the new method above in two test cases.
Both test cases use 10000 existing rows, while the first test keeps 9000 (deletes the oldest 1000) and the second test keeps 50 (deletes the oldest 9950).
+-----------+------------------------+----------------------+
|           | 10000 TOTAL, KEEP 9000 | 10000 TOTAL, KEEP 50 |
+-----------+------------------------+----------------------+
| NOT IN    | 3.2542 seconds         | 0.1629 seconds       |
| NOT IN v2 | 4.5863 seconds         | 0.1650 seconds       |
| <=,OFFSET | 0.0204 seconds         | 0.1076 seconds       |
+-----------+------------------------+----------------------+
What's interesting is that the <= method sees better performance across the board, but actually gets better the more you keep, instead of worse.
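If you want to rerun the two scenarios yourself, a rough harness could look like the sketch below. It uses Python's sqlite3 and an in-memory SQLite database purely as an assumption to stay self-contained, so its timings will not match the MySQL figures above; its main use is confirming that both statements leave the same rows behind. The run_case helper is hypothetical.

```python
import sqlite3
import time

NOT_IN_SQL = """
    DELETE FROM test_sandbox
    WHERE id NOT IN (
        SELECT id FROM (
            SELECT id FROM test_sandbox ORDER BY id DESC LIMIT ?
        ) foo
    )
"""

OFFSET_SQL = """
    DELETE FROM test_sandbox
    WHERE id <= (
        SELECT id FROM (
            SELECT id FROM test_sandbox ORDER BY id DESC LIMIT 1 OFFSET ?
        ) foo
    )
"""


def run_case(delete_sql, total, keep):
    """Fill a fresh table with `total` rows, run one DELETE, report the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_sandbox (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO test_sandbox (id) VALUES (?)",
        [(i,) for i in range(1, total + 1)],
    )
    conn.commit()

    # Time only the DELETE itself, not the table setup.
    start = time.perf_counter()
    conn.execute(delete_sql, (keep,))
    conn.commit()
    elapsed = time.perf_counter() - start

    # Row count and oldest surviving id show which rows were kept.
    remaining = conn.execute(
        "SELECT COUNT(*), MIN(id) FROM test_sandbox"
    ).fetchone()
    conn.close()
    return elapsed, remaining


if __name__ == "__main__":
    for keep in (9000, 50):
        for label, sql in (("NOT IN", NOT_IN_SQL), ("<=,OFFSET", OFFSET_SQL)):
            elapsed, (count, oldest) = run_case(sql, 10000, keep)
            print(f"{label:>10} keep={keep}: {count} rows left, "
                  f"oldest id {oldest}, {elapsed:.4f}s")
```

To get numbers comparable to the table, point the same two statements at a real MySQL instance with your own schema and indexes.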