I have searched exhaustively for a direct R translation for the FIRST. and LAST. pointers in SAS DATA steps but can't seem to find one. For those not familiar with SAS, FIRST. is a boolean that identifies the first appearance of a given element in a table and LAST. is a boolean that identifies the last appearance. For instance, consider the following sorted table:
V1 V2 V3
1 1 1
1 1 2
1 2 3
1 2 4
2 3 5
2 3 6
2 4 7
2 4 8
3 5 9
3 5 10
3 6 11
3 6 12
Because SAS DATA steps read tables line by line, I can use a statement like:
IF FIRST.V1 THEN DO ...
FIRST.V1 will return TRUE if and only if this is the first time the current value has been encountered in V1. In other words, it will return TRUE for V1[1] (the first appearance of '1'), V1[5] (the first appearance of '2'), and V1[9] (the first appearance of '3'). The LAST. pointer functions in analogous fashion, but with the final appearance of that element.
Is there anything in R that emulates this?
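You can do this with duplicated and rev (for LAST). !duplicated(v1) is TRUE at the first appearance of each value; running the same test on the reversed vector and reversing the result back marks the last appearance: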
> v1=c(1,1,1,2,2,3,3,3,3,4,4,5)
> data.frame(v1,FIRST=!duplicated(v1),LAST=rev(!duplicated(rev(v1))))
   v1 FIRST  LAST
1   1  TRUE FALSE
2   1 FALSE FALSE
3   1 FALSE  TRUE
4   2  TRUE FALSE
5   2 FALSE  TRUE
6   3  TRUE FALSE
7   3 FALSE FALSE
8   3 FALSE FALSE
9   3 FALSE  TRUE
10  4  TRUE FALSE
11  4 FALSE  TRUE
12  5  TRUE  TRUE
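The same idea can probably be applied to the table from the question, including a two-variable BY group. The following is only a sketch: flag_first_last is a hypothetical helper name (not part of base R), the FIRST.*/LAST.* column names are chosen here to mirror SAS, and the data are assumed to be already sorted by the key columns, as they would be for a SAS BY statement. It uses the fromLast argument of duplicated in place of the double rev.

```r
# Sketch only: flag_first_last() is a hypothetical helper, not a base R function.
# Assumes the rows are already sorted by the key columns (as in a SAS BY step).
flag_first_last <- function(keys) {
  keys <- as.data.frame(keys)
  list(
    first = !duplicated(keys),                  # first row of each key combination
    last  = !duplicated(keys, fromLast = TRUE)  # last row of each key combination
  )
}

# The sorted table from the question
df <- data.frame(
  V1 = rep(1:3, each = 4),
  V2 = rep(1:6, each = 2),
  V3 = 1:12
)

by_v1    <- flag_first_last(df["V1"])           # analogous to BY V1
by_v1_v2 <- flag_first_last(df[c("V1", "V2")])  # analogous to BY V1 V2

df$FIRST.V1 <- by_v1$first
df$LAST.V1  <- by_v1$last
df$FIRST.V2 <- by_v1_v2$first
df$LAST.V2  <- by_v1_v2$last

which(df$FIRST.V1)  # rows where a new V1 group starts
which(df$LAST.V2)   # rows where a (V1, V2) group ends
```

Note that duplicated compares against every earlier row, not just the previous one, so this only matches SAS when equal keys are contiguous. If a value can reappear after a gap, a comparison with the previous row would be closer to what SAS does with NOTSORTED.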