
Throw away first and last n rows

Tags: r, data.table

I have a data.table in R from which I want to throw away the first and the last n rows. I want to apply some filtering first and then truncate the result. I know I can do it this way:

example=data.table(row1=seq(1,1000,1),row2=seq(2, 3000,3))
e2=example[row1%%2==0]
e2[100:(nrow(e2)-100)]

Is there a possibility of doing this in one line? I thought of something like:

example[row1%%2==0][100:-100]

This of course does not work, but is there a simpler solution that does not require an additional variable?

theomega asked Apr 11 '12 17:04


2 Answers

 example=data.table(row1=seq(1,1000,1),row2=seq(2, 3000,3))
 n = 5
 str(example[!rownames(example) %in% 
                 c( head(rownames(example), n), tail(rownames(example), n)), ])
Classes ‘data.table’ and 'data.frame':  990 obs. of  2 variables:
 $ row1: num  6 7 8 9 10 11 12 13 14 15 ...
 $ row2: num  17 20 23 26 29 32 35 38 41 44 ...
 - attr(*, ".internal.selfref")=<externalptr> 

Added a one-liner version that includes the selection criterion:

str( 
     (res <- example[row1 %% 2 == 0])[ n:( nrow(res)-n ),  ] 
      )
Classes ‘data.table’ and 'data.frame':  491 obs. of  2 variables:
 $ row1: num  10 12 14 16 18 20 22 24 26 28 ...
 $ row2: num  29 35 41 47 53 59 65 71 77 83 ...
 - attr(*, ".internal.selfref")=<externalptr> 

And further added this version, which does not use an intermediate named value:

str(  
example[row1 %% 2 == 0][n:(sum( row1 %% 2==0)-n ),  ] 
   )
Classes ‘data.table’ and 'data.frame':  491 obs. of  2 variables:
 $ row1: num  10 12 14 16 18 20 22 24 26 28 ...
 $ row2: num  29 35 41 47 53 59 65 71 77 83 ...
 - attr(*, ".internal.selfref")=<externalptr> 
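Note that an index of the form n:(nrow(x) - n) keeps row n itself, so it drops n - 1 rows from the top and n rows from the bottom (hence 491 rows rather than 490 in the output above). If exactly n rows should go from each end, one possible approach is to pass a negative n to head() and tail(). This is a minimal sketch, assuming a data.table version whose head() and tail() methods accept negative n; the name trimmed is only illustrative.

```r
library(data.table)

example <- data.table(row1 = seq(1, 1000, 1), row2 = seq(2, 3000, 3))
n <- 5

# tail(x, -n) drops the first n rows; head(x, -n) drops the last n rows,
# so chaining them removes n rows from each end of the filtered table
trimmed <- head(tail(example[row1 %% 2 == 0], -n), -n)

nrow(trimmed)        # 500 filtered rows minus 2 * n
range(trimmed$row1)  # first and last surviving values of row1
```

This also avoids naming the intermediate filtered table, at the cost of two extra function calls.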
IRTFM answered Nov 10 '22 03:11


In this case you know the name of one column that exists (row1). Inside the second pair of brackets, column names are evaluated against the unnamed temporary data.table, so length(<any column>) returns the number of rows of that temporary table:

example=data.table(row1=seq(1,1000,1),row2=seq(2, 3000,3))

e2=example[row1%%2==0]
ans1 = e2[100:(nrow(e2)-100)]

ans2 = example[row1%%2==0][100:(length(row1)-100)]

identical(ans1,ans2)
[1] TRUE
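The same idea can be written without relying on a known column name. As a hedged sketch: in reasonably recent data.table versions the special symbol .N can be used in i, where it evaluates to the number of rows of the table being subset. The names ans_len and ans_N are illustrative and not part of the original answer.

```r
library(data.table)

example <- data.table(row1 = seq(1, 1000, 1), row2 = seq(2, 3000, 3))
n <- 100

# Both calls index the unnamed filtered table produced by the first [ ].
# length(row1) counts its rows via a column; .N gives the row count directly.
ans_len <- example[row1 %% 2 == 0][n:(length(row1) - n)]
ans_N   <- example[row1 %% 2 == 0][n:(.N - n)]

identical(ans_len, ans_N)
```

As with the original code, n:(.N - n) keeps row n itself; use (n + 1):(.N - n) if exactly n rows should be dropped from each end.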
Matt Dowle answered Nov 10 '22 04:11