Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Write a SQL query to find all numbers that appear at least three times

Tags:

select

mysql

I'm practicing SQL language and got a question like:

Write a SQL query to find all numbers that appear at least three times consecutively.

+----+-----+
| Id | Num |
+----+-----+
| 1  |  1  |
| 2  |  1  |
| 3  |  1  |
| 4  |  2  |
| 5  |  1  |
| 6  |  2  |
| 7  |  2  |
+----+-----+

For example, given the above Logs table, 1 is the only number that appears consecutively for at least three times.

I got a solution online and test it. But I really do not understand it. The big picture of the solution is clear. sq table counts the occurrences. But I did not understand the part of computing the sq. I've done a lot of research on the MYSQL. @counter := IF(@prev = Num, @counter + 1, 1) means if prev = Num, making counter = counter + 1, otherwise counter = 1. (SELECT @counter:=1, @prev:=NULL) vars means create a table vars which includes two columns counter and pre.

Can anyone help me explain the logic of the sq part? Or is there any tutorial for this kind of expressions in the SELECT? I'm totally new to SQL and I know this question may be pretty simple. Thanks for your help!

SELECT  DISTINCT(Num) AS ConsecutiveNums
FROM (
    SELECT
    Num,
    @counter := IF(@prev = Num, @counter + 1, 1) AS how_many_cnt_in_a_row,
    @prev := Num
    FROM Logs y, (SELECT @counter:=1, @prev:=NULL) vars
) sq
WHERE how_many_cnt_in_a_row >= 3
like image 608
Fihop Avatar asked Jul 23 '15 03:07

Fihop


1 Answers

Let's go through each record and see how this query works. It's well written.

SELECT...FROM

What does SELECT...FROM Logs y, (...) vars mean?

If you had a table like this: create table test(field1 int) that contains 3 rows like so:

field1
-------
1
2
3

Doing select * from test, (select @counter:=1, @prev:=NULL) vars will result in

field1  @counter:=1  @prev=NULL
------- ------------ -----------
1       1            NULL
2       1            NULL
3       1            NULL

@counter and @prev are session variables. They are initialized to 1 and NULL respectively. All rows are combined with these variables to give you what you see above.

Row by row analysis of the subquery

Focus on just this subquery.

SELECT
Num,
@counter := IF(@prev = Num, @counter + 1, 1) AS how_many_cnt_in_a_row,
@prev := Num
FROM Logs y, (SELECT @counter:=1, @prev:=NULL) vars

The query selects the first row of ID=1, Num=1, and chooses Num as it's first column.

For the 2nd column, it does some math. It checks if @prev = Num. Well, @prev is NULL because that's how it was initialized. So, @prev = Num results in false. IF is generally written as IF(condition, what-to-do-if-condition-is-true, what-to-do-if-condition-is-false).

IF(@prev = Num, @counter + 1, 1)
   -----------  ------------  --
   condition    do this       do this if condition
                if true       is false

Since @prev is NULL and not equal to Num, 1 is returned.

For the 3rd column, the query just resets @prev to Num. That's really all. Now let's see how SELECT goes line by line and does its magic.

Num  @prev was  @counter was  @counter calculation      @prev reset to Num
---  ---------  ------------  -----------------------   ------------------
1    NULL       1             is @prev = 1? No. So 1      1
1    1          1             is @prev = 1? Yes! So 2     1
1    1          2             is @prev = 1? Yes! So 3     1
2    1          3             is @prev = 2? No. So 1      2
1    2          1             is @prev = 1? No. So 1      1
2    1          1             is @prev = 2? No. So 1      2
2    2          1             is @prev = 2? Yes! So 2     2

The 2nd and 3rd column above are for understanding purposes only.

Now that the subquery has done its job, SELECT DISTINCT... comes and asks: From the result above, give me only the row that has @counter of 3 or higher. The result is going to be

Num   @counter  @prev
----  --------  -----
1     3         1

If your dataset had five 1s one after the other, 3rd, 4th and 5th 1 will be retrieved. Therefore, DISTINCT(Num) is used to select only a single 1. It's just smart thinking. It may be possible to change the WHERE clause to read WHERE ... = 3 insted of >= 3.

Hope this makes sense.

like image 148
zedfoxus Avatar answered Nov 06 '22 17:11

zedfoxus