I have a table with measurements. Measurement is done every minute. I need to select only rows having the same sample_value more than once consecutively for the same device_id.
Here are initial data:
sample_date sample_time device_id sample_value
20180701 1010 111 11
20180701 1011 111 12
20180701 1012 111 13
20180701 1013 222 11
20180701 1014 222 11
20180701 1015 222 12
20180701 1016 111 12
20180701 1017 111 11
20180701 1018 222 13
20180701 1019 222 12
20180701 1020 222 13
20180701 1021 222 12
20180701 1022 222 12
20180701 1023 111 12
20180701 1024 111 13
20180701 1025 111 13
20180701 1026 111 12
20180701 1027 111 13
20180701 1028 222 14
20180701 1029 222 13
20180701 1030 222 14
20180701 1031 222 14
20180701 1032 222 14
20180701 1033 222 14
20180701 1034 222 14
20180701 1035 222 14
20180701 1036 111 13
20180701 1037 111 13
20180701 1038 111 14
20180701 1039 111 13
This is result I'm looking for:
sample_date sample_time device_id sample_value
20180701 1013 222 11
20180701 1014 222 11
20180701 1021 222 12
20180701 1022 222 12
20180701 1024 111 13
20180701 1025 111 13
20180701 1030 222 14
20180701 1031 222 14
20180701 1032 222 14
20180701 1033 222 14
20180701 1034 222 14
20180701 1035 222 14
20180701 1036 111 13
20180701 1037 111 13
Here are test data:
IF OBJECT_ID('samples', 'U') IS NOT NULL
DROP TABLE samples;
create table samples (
sample_date int,
sample_time int,
device_id int,
sample_value int
)
insert samples
values
(20180701, 1010, 111, 11)
,(20180701, 1011, 111, 12)
,(20180701, 1012, 111, 13)
,(20180701, 1013, 222, 11)
,(20180701, 1014, 222, 11)
,(20180701, 1015, 222, 12)
,(20180701, 1016, 111, 12)
,(20180701, 1017, 111, 11)
,(20180701, 1018, 222, 13)
,(20180701, 1019, 222, 12)
,(20180701, 1020, 222, 13)
,(20180701, 1021, 222, 12)
,(20180701, 1022, 222, 12)
,(20180701, 1023, 111, 12)
,(20180701, 1024, 111, 13)
,(20180701, 1025, 111, 13)
,(20180701, 1026, 111, 12)
,(20180701, 1027, 111, 13)
,(20180701, 1028, 222, 14)
,(20180701, 1029, 222, 13)
,(20180701, 1030, 222, 14)
,(20180701, 1031, 222, 14)
,(20180701, 1032, 222, 14)
,(20180701, 1033, 222, 14)
,(20180701, 1034, 222, 14)
,(20180701, 1035, 222, 14)
,(20180701, 1036, 111, 13)
,(20180701, 1037, 111, 13)
,(20180701, 1038, 111, 14)
,(20180701, 1039, 111, 13)
select * from samples
Here is SQL I'm trying to use, but I don't know how to set correct partitioning.
select *
from (select sample_date,
sample_time,
device_id,
sample_value,
row_number() over (partition by sample_date,
device_id,
sample_value
order by sample_date,
sample_time,
device_id) as occurrence
from samples) t
where occurrence > 1
Similar topics:
Select statement to find duplicates on certain fields
How to find consecutive rows based on the value of a column?
To select duplicate values, you need to create groups of rows with the same values and then select the groups with counts greater than one. You can achieve that by using GROUP BY and a HAVING clause.
Approach: Using DISTINCT and WHERE clause [Accepted] Then we can select any Num column from the above table to get the target data. However, we need to add a keyword DISTINCT because it will display a duplicated number if one number appears more than 3 times consecutively.
Here's the SQL query to compare each row with previous row. In the above query, we join sales table with itself using an INNER JOIN condition g2.id=g1.id + 1 that allows you to compare each row with its previous row. Please note, this condition depends on the fact that our id column has consecutive numbers.
If you wanted to do this without using LEAD
or LAG
then you could do something like this instead:
WITH Ordered AS (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY sample_date, sample_time) AS order_id
FROM
samples)
SELECT
s1.sample_date,
s1.sample_time,
s1.device_id,
s1.sample_value
FROM
Ordered s1
INNER JOIN Ordered s2 ON s2.device_id = s1.device_id AND s2.sample_value = s1.sample_value AND s2.order_id = s1.order_id + 1
UNION
SELECT
s2.sample_date,
s2.sample_time,
s2.device_id,
s2.sample_value
FROM
Ordered s1
INNER JOIN Ordered s2 ON s2.device_id = s1.device_id AND s2.sample_value = s1.sample_value AND s2.order_id = s1.order_id + 1
ORDER BY
1, 2;
Results are:
sample_date sample_time device_id sample_value
20180701 1013 222 11
20180701 1014 222 11
20180701 1021 222 12
20180701 1022 222 12
20180701 1024 111 13
20180701 1025 111 13
20180701 1030 222 14
20180701 1031 222 14
20180701 1032 222 14
20180701 1033 222 14
20180701 1034 222 14
20180701 1035 222 14
20180701 1036 111 13
20180701 1037 111 13
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With