
What is the equivalent of Stata function inlist() in R?

Tags: command, r, stata

Stata's inlist allows us to refer to the real or string values of a variable. I was wondering whether R has such a function.

Examples:

I want to choose seven states from the variable state (you can think of this as the column state in any data frame, where state takes 50 string values, the states of the United States).

    inlist(state,"NC","AZ","TX","NY","MA","CA","NJ")

I want to choose nine values of age from the variable age (you can think of this as the column age in any data frame, where age takes numerical values from 0 to 90).

    inlist(age,16, 24, 45, 54, 67,74, 78, 79, 85) 

Question:

age<-c(0:10) # for this problem age takes values from 0 to 10 only
data<-as.data.frame(age) # age is a variable of data frame data
data$m<-ifelse(c(1,7,9)%in%data$age,0,1) # generate a variable m that takes the value 0 if age is 1, 7, or 9, and 1 otherwise
Expected output: 
   age m
1    0 1
2    1 0
3    2 1
4    3 1
5    4 1
6    5 1
7    6 1
8    7 0
9    8 1
10   9 0
11  10 1
Metrics asked Jan 12 '13 16:01


1 Answer

I think you want %in%:

statevec <- c("NC","AZ","TX","NY","MA","CA","NJ")
state <- c("AZ","VT")
state %in% statevec ## TRUE FALSE
agevec <- c(16, 24, 45, 54, 67,74, 78, 79, 85) 
age <- c(34,45)
age %in% agevec ## FALSE TRUE

edit: working on updated question.

Copying from @NickCox's link:

inlist(z,a,b,...)
      Domain:       all reals or all strings
      Range:        0 or 1
      Description:  returns 1 if z is a member of the remaining arguments;
                        otherwise, returns 0.  All arguments must be reals
                        or all must be strings.  The number of arguments is
                        between 2 and 255 for reals and between 2 and 10 for
                        strings.

However, I'm not quite sure how this matches up with the original question. I don't know Stata well enough to know whether z can be a vector: the description doesn't read that way, in which case the original question (which treats z=state as a vector) doesn't make sense. If z can be a vector, then I think the answer would be as.numeric(state %in% statevec).
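If a function-style call is preferred, the idea can be wrapped up as below. This is only a minimal sketch, assuming z may be a vector and a 0/1 result is wanted; the name inlist and its signature are hypothetical and not part of base R.

```r
# Hypothetical helper (not part of base R): mimics the documented
# behaviour of Stata's inlist() by testing each element of z against
# the remaining arguments and returning 1 (member) or 0 (not a member).
inlist <- function(z, ...) {
  as.numeric(z %in% c(...))
}

state <- c("AZ", "VT")
inlist(state, "NC", "AZ", "TX", "NY", "MA", "CA", "NJ")  ## 1 0

age <- c(34, 45)
inlist(age, 16, 24, 45, 54, 67, 74, 78, 79, 85)          ## 0 1
```

Unlike the Stata function, this sketch puts no limit on the number of arguments and does not check that they are all reals or all strings.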

Edit: Update by Ananda

Using your updated data, here's one approach, again using %in%:

data <- data.frame(age=0:10)
within(data, {
    m <- as.numeric(!age %in% c(1, 7, 9))
})
   age m
1    0 1
2    1 0
3    2 1
4    3 1
5    4 1
6    5 1
7    6 1
8    7 0
9    8 1
10   9 0
11  10 1

This matches your expected output, by using ! (NOT) to invert the sense of %in%. It seems to be a little backwards from the way I would think about it (normally, 0=FALSE="is not in the list" and 1=TRUE="is in the list") and my reading of Stata's definition, but if it's what you want ...

Or one can use ifelse for more potential flexibility (i.e. values other than 0/1): substitute within(data, { m <- ifelse(age %in% c(1, 7, 9),0,1)}) in the code above.
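As a small illustration of that flexibility, here is a sketch that assigns labels instead of 0/1. The column name group and the labels "listed" and "other" are made up for this example and do not come from the question.

```r
data <- data.frame(age = 0:10)

# ifelse() can return any pair of values, not only 0/1.
# "listed" and "other" are illustrative labels only.
data$group <- ifelse(data$age %in% c(1, 7, 9), "listed", "other")

data$age[data$group == "listed"]  ## 1 7 9
```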

Ben Bolker answered Oct 06 '22 19:10