Does anybody know what the best R alternative to the SAS FIRST. and LAST. operators is? I could not find one.
SAS has the FIRST. and LAST. automatic variables, which identify the first and last record within a group of records that share the same value of a particular variable; so in the following dataset FIRST.model and LAST.model are defined:
Model,SaleID,First.Model,Last.Model
Explorer,1,1,0
Explorer,2,0,0
Explorer,3,0,0
Explorer,4,0,1
Civic,5,1,0
Civic,6,0,0
Civic,7,0,1
In SAS, if we wanted to run multiple linear regressions using different predictor variables, we could use a simple SAS macro to iterate over the independent variables. In R, we can simplify this even more by making use of the apply() function.
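A minimal sketch of that idea, assuming the built-in mtcars data and an arbitrary set of predictors (neither appears in the original text). It uses lapply(), a list-returning relative of apply(), because each fit is a model object rather than a row or column of a matrix.

```r
# Hypothetical example: regress mpg on each predictor separately.
predictors <- c("wt", "hp", "disp")

fits <- lapply(predictors, function(p) {
  # reformulate() builds the formula mpg ~ <p> from a character string
  lm(reformulate(p, response = "mpg"), data = mtcars)
})
names(fits) <- predictors

# Pull the slope (second coefficient) out of every fitted model
slopes <- sapply(fits, function(f) coef(f)[[2]])
print(slopes)
```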
If you would like to see the first and last values of each group in separate columns, first create a column that labels each retained row as the first or the last of its group. After that, use pivot_wider to turn those rows into columns.
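One way this could look, as a sketch that assumes the dplyr and tidyr packages are installed and that every group has at least two rows. The sales data frame and the position column are illustrative names taken from, or added to, the example in the question.

```r
library(dplyr)
library(tidyr)

# Example data from the question
sales <- data.frame(
  Model  = c(rep("Explorer", 4), rep("Civic", 3)),
  SaleID = 1:7
)

wide <- sales %>%
  group_by(Model) %>%
  slice(c(1, n())) %>%                      # keep first and last row per group
  mutate(position = c("first", "last")) %>% # label the two retained rows
  ungroup() %>%
  pivot_wider(names_from = position, values_from = SaleID)

print(wide)
```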
FIRST.VARIABLE assigns the value of 1 for the first observation in a BY group and the value of 0 for all other observations in the BY group. LAST.VARIABLE assigns the value of 1 for the last observation in a BY group and the value of 0 for all other observations in the BY group.
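These 1/0 flags can be approximated in base R by comparing each value with its neighbour. The sketch below assumes a non-empty vector g (a made-up example) whose rows are already ordered so that each group forms one contiguous run.

```r
# Hypothetical grouping vector, already ordered by group
g <- c("a", "a", "b", "b", "b", "c")
n <- length(g)

# A row starts a group when its value differs from the previous row
first_flag <- as.integer(c(TRUE, g[-1] != g[-n]))
# A row ends a group when its value differs from the next row
last_flag  <- as.integer(c(g[-n] != g[-1], TRUE))

print(data.frame(g, first_flag, last_flag))
```

A group with a single row, such as "c" here, gets 1 in both columns, which matches the definition above.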
You can use the OBS= option to specify the last observation that SAS processes from a data set. In contrast, you can use the FIRSTOBS= option to specify the first observation that SAS processes. If you combine the FIRSTOBS= and OBS= options, you can select a range of observations.
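For comparison, a rough R analogue of selecting a range of observations is plain row indexing. This is a sketch using the built-in Puromycin data; the row numbers 3 and 5 are arbitrary choices for illustration.

```r
d <- datasets::Puromycin

# Roughly what FIRSTOBS=3 OBS=5 selects: observations 3 through 5
rows_3_to_5 <- d[3:5, ]

# Roughly what OBS=5 alone selects: the first five observations
first_5 <- head(d, 5)

print(rows_3_to_5)
```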
It sounds like you're looking for !duplicated, with the fromLast argument being FALSE or TRUE.
d <- datasets::Puromycin
d$state
# [1] treated treated treated treated treated treated treated
# [8] treated treated treated treated treated untreated untreated
#[15] untreated untreated untreated untreated untreated untreated untreated
#[22] untreated untreated
#Levels: treated untreated
!duplicated(d$state)
# [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[13] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
!duplicated(d$state,fromLast=TRUE)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
There are some caveats and edge-case behaviors to this function, which you can find out through the help files (?duplicated).
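Applied to the dataset from the question, the same idea might look like the sketch below. It assumes the rows of each Model are contiguous, because duplicated() looks at the whole vector rather than at runs; the sales data frame is simply the question's table retyped.

```r
# The example table from the question, without the flag columns
sales <- data.frame(
  Model  = c(rep("Explorer", 4), rep("Civic", 3)),
  SaleID = 1:7
)

# 1 for the first row of each Model, 0 otherwise
sales$First.Model <- as.integer(!duplicated(sales$Model))
# 1 for the last row of each Model, 0 otherwise
sales$Last.Model  <- as.integer(!duplicated(sales$Model, fromLast = TRUE))

print(sales)
```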