I have a table with a composite primary key (ID
, Date
) like below.
+------+------------+-------+ | ID | Date | Value | +------+------------+-------+ | 1 | 1433419200 | 15 | | 1 | 1433332800 | 23 | | 1 | 1433246400 | 41 | | 1 | 1433160000 | 55 | | 1 | 1432900800 | 24 | | 2 | 1433419200 | 52 | | 2 | 1433332800 | 23 | | 2 | 1433246400 | 39 | | 2 | 1433160000 | 22 | | 3 | 1433419200 | 11 | | 3 | 1433246400 | 58 | | ... | ... | ... | +------+------------+-------+
There is also a separate index on Date
column. The table is of moderate size, currently ~600k row and growing by ~2k everyday.
I want to do a single SELECT query that returns the latest 3 records (ordered by Date
timestamp) for each ID
. For each given ID
, the Date
values are always unique, so no need to worry about ties for Date
here.
I've tried a self-join approach, inspired by this answer, but it took quite a few seconds to run and returned nothing:
SELECT p1.ID, p1.Date, p1.Value FROM MyTable AS p1
LEFT JOIN MyTable AS p2
ON p1.ID=p2.ID AND p1.Date<=p2.Date
GROUP BY p1.ID
HAVING COUNT(*)<=5
ORDER BY p1.ID, p1.Date DESC;
What would be a fast solution here?
This one-liner is the simplest query in the list, to get the last 3 number of records in a table. The TOP clause in SQL Server returns the first N number of records or rows from a table. Applying the ORDER BY clause with DESC, will return rows in descending order. Hence, we get the last 3 rows.
We can use the ORDER BY statement and LIMT clause to extract the last data. The basic idea is to sort the sort the table in descending order and then we will limit the number of rows to 1. In this way, we will get the output as the last row of the table. And then we can select the entry which we want to retrieve.
In SQL Server, we can easily select the last 10 records from a table by using the “SELECT TOP” statement. The TOP clause in SQL Server is used to control the number or percentage of rows from the result. And to select the records from the last, we have to arrange the rows in descending order.
SELECT id, category_id, post_title FROM posts WHERE id IN ( SELECT MAX(id) FROM posts GROUP BY category_id ); This will return the posts with the highest IDs in each group. Save this answer.
You could look up the three most recent dates for each ID:
SELECT ID, Date, Value
FROM MyTable
WHERE Date IN (SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 3)
Alternatively, look up the third most recent date for each ID, and use it as a limit:
SELECT ID, Date, Value
FROM MyTable
WHERE Date >= IFNULL((SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 1 OFFSET 2),
0)
Both queries should get good performance from the primary key's index.
First, here is the correct query for the inequality method:
SELECT p1.ID, p1.Date, p1.Value
FROM MyTable p1 LEFT JOIN
MyTable AS p2
ON p1.ID = p2.ID AND p2.Date <= p1.Date
--------------------------^ fixed this condition
GROUP BY p1.ID, p1.Date, p1.Value
HAVING COUNT(*) <= 5
ORDER BY p1.ID, p1.Date DESC;
I'm not sure if there is a fast way to do this in SQLite. In most other databases, you can use the ANSI standard row_number()
function. In MySQL, you can use variables. Both of these are difficult in SQLite. Your best solution may be to use a cursor.
The above can benefit from an index on MyTable(Id, Date)
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With