Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLITE equivalent of ROW_NUMBER

I have a table in SQLite

/* Create a table called NAMES */
CREATE TABLE EVENTS(Id integer , Eventtype integer,value integer,Timestamp DATETIME);

/* Create few records in this table */
INSERT INTO EVENTS VALUES(1,2,1,'2009-01-01 10:00:00');  --ROW1
INSERT INTO EVENTS VALUES(1,2,2,'2007-01-01 10:00:00');  --ROW2
INSERT INTO EVENTS VALUES(2,2,3,'2008-01-01 10:00:00’);  --ROW3

What is needed from the query is ROW1 and ROW3. The query should take most latest row based timestamp for duplicate ID & Eventtype combination. ROW1 and ROW2 are having same eventtype and id but ROW1 is latest so it should be picked.

like image 809
Puneet Avatar asked Jan 10 '23 22:01

Puneet


2 Answers

In SQLite 3.7.11 or later, you can use GROUP BY with MAX() to select which row in a group to return:

SELECT *, MAX(timestamp)
FROM events
GROUP BY id, eventtype

In earlier versions, you have to look up some unique ID of the largest row in a group with a subquery (as in you answer).

like image 190
CL. Avatar answered Jan 16 '23 19:01

CL.


I'm a bit late to this question, but I wasn't satisfied with the current answers as they mostly use correlated subqueries which can seriously ruin performance.

In many situations, single-column analytical functions can simulated using a standard join:

SELECT e.*
FROM events e
JOIN
(
    -- Our simulated window with analytical result
    SELECT id, eventtype, MAX(timestamp) AS timestamp
    FROM events
    GROUP BY id, eventtype
) win
USING (id, eventtype, timestamp)

In general, the pattern is:

SELECT main.*
FROM main
JOIN
(
    SELECT
        partition_columns,
        FUNCTION(analyzed_column) AS analyzed_column
    FROM main
    GROUP BY partition_columns
) win
USING (partition_columns, analyzed_column)

These simulated windows aren't perfect:

  1. If your data has ties for your analyzed column result then you may need to remove duplicates from your result set. Otherwise you'll select every row from your partition that matches your analyzed column result.
  2. If you analytical function requires ordering by more than one column, you will need to use correlated subqueries instead. The other answers can be modified to achieve the desired result.
like image 29
Mr. Llama Avatar answered Jan 16 '23 17:01

Mr. Llama