Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I find the first and last occurrences of an element in a data.frame?

Tags:

dataframe

r

sas

I have searched exhaustively for a direct R translation for the FIRST. and LAST. pointers in SAS DATA steps but can't seem to find one. For those not familiar with SAS, FIRST. is a boolean that identifies the first appearance of a given element in a table and LAST. is a boolean that identifies the last appearance. For instance, consider the following sorted table:

V1    V2    V3
1     1     1
1     1     2
1     2     3
1     2     4
2     3     5
2     3     6
2     4     7
2     4     8
3     5     9
3     5     10
3     6     11
3     6     12

Because SAS DATA steps read tables line by line, I can use a statement like:

IF FIRST.V1 THEN DO ...

FIRST.V1 will return TRUE if and only if this is the first time the observation has been encountered in V1. In other words, it will return true for V1[1] (the first appearance of '1'), V1[5] (the first appearance of '2'), and V1[9] (the first appearance of '3'). The LAST. pointer functions in analogous fashion, but with the final appearance of that element.

Is there anything in R that emulates this?

like image 907
asteri Avatar asked Jul 18 '12 17:07

asteri


1 Answers

You can do this with duplicated and rev (for LAST):

> v1=c(1,1,1,2,2,3,3,3,3,4,4,5)

> data.frame(v1,FIRST=!duplicated(v1),LAST=rev(!duplicated(rev(v1))))
   v1 FIRST  LAST
1   1  TRUE FALSE
2   1 FALSE FALSE
3   1 FALSE  TRUE
4   2  TRUE FALSE
5   2 FALSE  TRUE
6   3  TRUE FALSE
7   3 FALSE FALSE
8   3 FALSE FALSE
9   3 FALSE  TRUE
10  4  TRUE FALSE
11  4 FALSE  TRUE
12  5  TRUE  TRUE
like image 154
Spacedman Avatar answered Oct 19 '22 05:10

Spacedman