Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way to find repeated runs of rows, remove, & count

Tags:

r

I have a data set with repeating rows. I want to remove consecutive repeated and count them but only if they're consecutive. I'm looking for an efficient way to do this. Can't think of how in dplyr or data.table.

MWE

dat <- data.frame(     x = c(6, 2, 3, 3, 3, 1, 1, 6, 5, 5, 6, 6, 5, 4),     y = c(7, 5, 7, 7, 7, 5, 5, 7, 1, 2, 7, 7, 1, 7),     z = c(rep(LETTERS[1:2], each=7)) )  ##        x     y     z ## 1      6     7     A ## 2      2     5     A ## 3      3     7     A ## 4      3     7     A ## 5      3     7     A ## 6      1     5     A ## 7      1     5     A ## 8      6     7     B ## 9      5     1     B ## 10     5     2     B ## 11     6     7     B ## 12     6     7     B ## 13     5     1     B ## 14     4     7     B 

Desired output

       x     y     z   n 1      6     7     A   1 2      2     5     A   1 3      3     7     A   3 4      1     5     A   2 5      6     7     B   1 6      5     1     B   1 7      5     2     B   1 8      6     7     B   2 9      5     1     B   1  10     4     7     B   1 
like image 261
Tyler Rinker Avatar asked Apr 18 '16 01:04

Tyler Rinker


People also ask

How do I remove repetitive rows in Excel?

Select the range of cells that has duplicate values you want to remove. Tip: Remove any outlines or subtotals from your data before trying to remove duplicates. Click Data > Remove Duplicates, and then Under Columns, check or uncheck the columns where you want to remove the duplicates.

How do you identify duplicates in Excel and delete them?

In Excel, there are several ways to filter for unique values—or remove duplicate values: To filter for unique values, click Data > Sort & Filter > Advanced. To remove duplicate values, click Data > Data Tools > Remove Duplicates.


1 Answers

With data.table:

library(data.table) setDT(dat)  dat[, c(.SD[1L], .N), by=.(g = rleidv(dat))][, g := NULL]      x y z N  1: 6 7 A 1  2: 2 5 A 1  3: 3 7 A 3  4: 1 5 A 2  5: 6 7 B 1  6: 5 1 B 1  7: 5 2 B 1  8: 6 7 B 2  9: 5 1 B 1 10: 4 7 B 1 
like image 178
Frank Avatar answered Oct 06 '22 06:10

Frank