 

Remove consecutive duplicate entries

How can I remove consecutive duplicate entries in R? I think with() may be used, but I can't work out how to use it. Here is an example:

d <- read.table(text = "
   a        t1
   b        t2
   b        t3
   b        t4
   c        t5
   c        t6
   b        t7
   d        t8")

Sample Data: D

    events    time
       a        t1
       b        t2
       b        t3
       b        t4
       c        t5
       c        t6
       b        t7
       d        t8

Required Outcome:

     events     time
       a        t1
       b        t4
       c        t6
       b        t7
       d        t8


anu asked Jul 15 '13 09:07





2 Answers

Yet another one, assuming your data.frame is named d:

# as.character() works whether the first column is a factor or a character vector
d[cumsum(rle(as.character(d[, 1]))$lengths), ]
  V1 V2
1  a t1
4  b t4
6  c t6
7  b t7
8  d t8
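
To see why this works, the one-liner can be unpacked into steps. This is a minimal sketch, assuming the question's data is read with the default V1/V2 column names; runs and last_of_run are illustrative names, and rle() is applied to the character column directly.

```r
# Rebuild the question's data; V1/V2 are read.table's default column names.
d <- read.table(text = "
   a        t1
   b        t2
   b        t3
   b        t4
   c        t5
   c        t6
   b        t7
   d        t8", stringsAsFactors = FALSE)

# rle() describes the column as runs of consecutive equal values.
runs <- rle(d$V1)

# The cumulative sum of the run lengths is the row index where each run ends.
last_of_run <- cumsum(runs$lengths)

# Keep only the last row of every run.
result <- d[last_of_run, ]
```

Because rle() only looks at neighbouring values, the second run of b (row 7) is kept as its own group.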
johannes answered Oct 01 '22 22:10



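You can also use the duplicated() function. EDIT: this is not exactly correct for the question's data, as it only shows one b row: duplicated() flags every repeat of a value, not just consecutive repeats.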

x <- read.table(text = "    events    time
   a        t1
   b        t2
   b        t3
   b        t4
   c        t5
   c        t6
   d        t7", header = TRUE)
# Sort by events, then time, so the last row of each group is the latest one
x <- x[order(x[, 1], x[, 2]), ]
# Keep only the last occurrence of each events value
x[!duplicated(x[, 1], fromLast = TRUE), ]
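
For data like the question's, where b reappears after c, one possible alternative is to compare each value with the next one instead of sorting. This is only a sketch, assuming the events/time column names from the example above and at least one row of data; keep_last is a name introduced here for illustration.

```r
# The question's full data, including the second run of b at t7.
x <- read.table(text = "    events    time
   a        t1
   b        t2
   b        t3
   b        t4
   c        t5
   c        t6
   b        t7
   d        t8", header = TRUE, stringsAsFactors = FALSE)

n <- nrow(x)

# A row is the last of its run if the next value differs;
# the final row always ends a run, hence the trailing TRUE.
keep_last <- c(x$events[-n] != x$events[-1], TRUE)

result <- x[keep_last, ]
```

No sorting is needed here, so the original row order is preserved. To keep the first row of each run instead, the same comparison can be shifted: c(TRUE, x$events[-1] != x$events[-n]).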
Xachriel answered Oct 01 '22 23:10
