
SQL: selecting rows where column value changed from previous row

Tags:

sql

mysql

Let's say I have this (MySQL) database, sorted by increasing timestamp:

Timestamp    System  StatusA  StatusB
2011-01-01   A       Ok       Ok
2011-01-02   B       Ok       Ok
2011-01-03   A       Fail     Fail
2011-01-04   B       Ok       Fail
2011-01-05   A       Fail     Ok
2011-01-06   A       Ok       Ok
2011-01-07   B       Fail     Fail

How do I select the rows where StatusA changed from the previous row for that system? StatusB doesn't matter (I show it in this question only to illustrate that there may be many consecutive rows for each system where StatusA doesn't change). In the example above, the query should return the rows 2011-01-03 (StatusA changed between 2011-01-01 and 2011-01-03 for system A), 2011-01-06, and 2011-01-07.

The query should execute quickly with the table having tens of thousands of records.

Thanks

Jimmy asked Jul 02 '11 22:07




2 Answers

SELECT a.*
FROM tableX AS a
WHERE a.StatusA <>
      ( SELECT b.StatusA
        FROM tableX AS b
        WHERE a.System = b.System
          AND a.Timestamp > b.Timestamp
        ORDER BY b.Timestamp DESC
        LIMIT 1
      )
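As a quick sanity check, the correlated subquery above can be run against the sample rows from the question. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for MySQL (an assumption made so the example is self-contained). The index name idx_system_ts and the helper changed_timestamps are illustrative, and the outer ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2011-01-01", "A", "Ok", "Ok"),
    ("2011-01-02", "B", "Ok", "Ok"),
    ("2011-01-03", "A", "Fail", "Fail"),
    ("2011-01-04", "B", "Ok", "Fail"),
    ("2011-01-05", "A", "Fail", "Ok"),
    ("2011-01-06", "A", "Ok", "Ok"),
    ("2011-01-07", "B", "Fail", "Fail"),
]

# Same shape as the answer's query; only the timestamp is selected
# and an ORDER BY is appended for a stable result.
QUERY = """
SELECT a.Timestamp
FROM tableX AS a
WHERE a.StatusA <>
      ( SELECT b.StatusA
        FROM tableX AS b
        WHERE a.System = b.System
          AND a.Timestamp > b.Timestamp
        ORDER BY b.Timestamp DESC
        LIMIT 1
      )
ORDER BY a.Timestamp
"""


def changed_timestamps():
    """Load the sample data into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableX "
        "(Timestamp TEXT, System TEXT, StatusA TEXT, StatusB TEXT)"
    )
    # Hypothetical index name; it lets the subquery find the previous row
    # of the same system without scanning the whole table.
    conn.execute("CREATE INDEX idx_system_ts ON tableX (System, Timestamp)")
    conn.executemany("INSERT INTO tableX VALUES (?, ?, ?, ?)", ROWS)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(changed_timestamps())
```

The first row of each system has no predecessor, so the subquery yields NULL, the comparison evaluates to NULL, and that row is filtered out.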

You can also try this (with an index on (System, Timestamp)):

SELECT System, Timestamp, StatusA, StatusB
FROM
  ( SELECT (@statusPre <> statusA AND @systemPre=System) AS statusChanged
         , System, Timestamp, StatusA, StatusB
         , @statusPre := StatusA
         , @systemPre := System
    FROM tableX
       , (SELECT @statusPre:=NULL, @systemPre:=NULL) AS d
    ORDER BY System
           , Timestamp
  ) AS good
WHERE statusChanged ;
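On engines with window functions (MySQL 8.0 and later, which this answer predates), the same previous-row comparison can probably be written with LAG() instead of user variables. This is a swapped-in technique, not the query above. A minimal sketch, run here against SQLite (3.25 or newer assumed) through Python's sqlite3 module only so that it is self-contained; the alias prevStatusA and the helper rows_with_changed_status are illustrative names.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2011-01-01", "A", "Ok", "Ok"),
    ("2011-01-02", "B", "Ok", "Ok"),
    ("2011-01-03", "A", "Fail", "Fail"),
    ("2011-01-04", "B", "Ok", "Fail"),
    ("2011-01-05", "A", "Fail", "Ok"),
    ("2011-01-06", "A", "Ok", "Ok"),
    ("2011-01-07", "B", "Fail", "Fail"),
]

# LAG(StatusA) returns the StatusA of the previous row within the same
# System (ordered by Timestamp), or NULL for the first row of a system.
LAG_QUERY = """
SELECT Timestamp, System, StatusA, StatusB
FROM
  ( SELECT Timestamp, System, StatusA, StatusB
         , LAG(StatusA) OVER (PARTITION BY System
                              ORDER BY Timestamp) AS prevStatusA
    FROM tableX
  ) AS withPrev
WHERE StatusA <> prevStatusA
ORDER BY Timestamp
"""


def rows_with_changed_status():
    """Return the rows whose StatusA differs from the previous row of the same system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableX "
        "(Timestamp TEXT, System TEXT, StatusA TEXT, StatusB TEXT)"
    )
    conn.executemany("INSERT INTO tableX VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(LAG_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rows_with_changed_status():
        print(row)
```

Unlike the user-variable version, this form does not depend on the order in which expressions in the SELECT list are evaluated.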
ypercubeᵀᴹ answered Sep 20 '22 11:09


select a.Timestamp, a.System, a.StatusA, a.StatusB
from tableX as a
cross join tableX as b
where a.System = b.System
and a.Timestamp > b.Timestamp
and not exists (select *
    from tableX as c
    where a.System = c.System
    and a.Timestamp > c.Timestamp
    and c.Timestamp > b.Timestamp
)
and a.StatusA <> b.StatusA;

Update addressing a comment: Why not use an inner join instead of a cross join?

The question asks for a MySQL solution. According to the documentation:

"In MySQL, CROSS JOIN is a syntactic equivalent to INNER JOIN (they can replace each other). In standard SQL, they are not equivalent. INNER JOIN is used with an ON clause, CROSS JOIN is used otherwise."

This means that either of these joins would work.

"The conditional_expr used with ON is any conditional expression of the form that can be used in a WHERE clause. Generally, you should use the ON clause for conditions that specify how to join tables, and the WHERE clause to restrict which rows you want in the result set."

The condition a.System = b.System probably falls under the 'how to join tables' category, so using an INNER JOIN with that condition in the ON clause would be nicer in this case.

Since both produce the same results, the only difference might be in performance. To say which will be faster, I would need to know how the joins are implemented internally: whether they use indexes or hashing to do the joining.
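To illustrate that the two spellings return the same rows, here is a minimal sketch that runs the CROSS JOIN query and an INNER JOIN ... ON rewrite side by side on the sample data from the question. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption for self-containment, and it says nothing about MySQL's performance); the helper run_both and the added ORDER BY are illustrative.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2011-01-01", "A", "Ok", "Ok"),
    ("2011-01-02", "B", "Ok", "Ok"),
    ("2011-01-03", "A", "Fail", "Fail"),
    ("2011-01-04", "B", "Ok", "Fail"),
    ("2011-01-05", "A", "Fail", "Ok"),
    ("2011-01-06", "A", "Ok", "Ok"),
    ("2011-01-07", "B", "Fail", "Fail"),
]

# The query as posted, with every condition in the WHERE clause.
CROSS_JOIN_QUERY = """
select a.Timestamp
from tableX as a
cross join tableX as b
where a.System = b.System
and a.Timestamp > b.Timestamp
and not exists (select *
    from tableX as c
    where a.System = c.System
    and a.Timestamp > c.Timestamp
    and c.Timestamp > b.Timestamp
)
and a.StatusA <> b.StatusA
order by a.Timestamp
"""

# Rewrite: the "how to join" condition moves into ON,
# the row-restricting conditions stay in WHERE.
INNER_JOIN_QUERY = """
select a.Timestamp
from tableX as a
inner join tableX as b
    on a.System = b.System
where a.Timestamp > b.Timestamp
and not exists (select *
    from tableX as c
    where a.System = c.System
    and a.Timestamp > c.Timestamp
    and c.Timestamp > b.Timestamp
)
and a.StatusA <> b.StatusA
order by a.Timestamp
"""


def run_both():
    """Run both query variants on the same data and return their results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableX "
        "(Timestamp TEXT, System TEXT, StatusA TEXT, StatusB TEXT)"
    )
    conn.executemany("INSERT INTO tableX VALUES (?, ?, ?, ?)", ROWS)
    cross = [row[0] for row in conn.execute(CROSS_JOIN_QUERY)]
    inner = [row[0] for row in conn.execute(INNER_JOIN_QUERY)]
    conn.close()
    return cross, inner


if __name__ == "__main__":
    cross, inner = run_both()
    print(cross == inner, cross)
```

Which form runs faster on a real MySQL table is best checked with EXPLAIN on the actual data.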

Jiri answered Sep 16 '22 11:09