
What is the difference between `%in%` and `==`?



df <- structure(list(x = 1:10, time = c(0.5, 0.5, 1, 2, 3, 0.5, 0.5, 1, 2, 3)),
                .Names = c("x", "time"), row.names = c(NA, -10L), class = "data.frame")

df[df$time %in% c(0.5, 3), ]
##     x time
## 1   1  0.5
## 2   2  0.5
## 5   5  3.0
## 6   6  0.5
## 7   7  0.5
## 10 10  3.0

df[df$time == c(0.5, 3), ]
##     x time
## 1   1  0.5
## 7   7  0.5
## 10 10  3.0

What is the difference between %in% and == here?

user1320502 asked Mar 12 '13 09:03



2 Answers

The problem is vector recycling.

Your first line does exactly what you'd expect. It checks which elements of df$time are in c(0.5, 3) and returns the rows for which that is true.

Your second line is trickier. It's actually equivalent to

df[df$time == rep(c(0.5,3), length.out=nrow(df)),] 

To see this, consider what happens if you compare the vector rep(0.5, 10) with c(0.5, 3):

rep(0.5, 10) == c(0.5, 3)
[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE

Notice that the result is TRUE only at the odd positions. Essentially, each 0.5 is being compared against the recycled vector c(0.5, 3, 0.5, 3, 0.5, ...).

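You can also construct a vector that produces no matches at all this way. Take the vector rep(c(3, 0.5), 5):

rep(c(3, 0.5), 5) == c(0.5, 3)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE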

They're all FALSE; you are matching every 0.5 with 3 and vice versa.
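The equivalence with the explicit rep() call can be checked directly on the data frame from the question. The following is a minimal sketch; the names recycled, implicit, explicit and warned are made up for illustration. It also shows why no warning appeared in the question: 10 is a multiple of 2, so the recycling is silent, whereas lengths that do not divide evenly still recycle but raise a warning.

```r
# Rebuild the data frame from the question.
df <- data.frame(x = 1:10, time = c(0.5, 0.5, 1, 2, 3, 0.5, 0.5, 1, 2, 3))

# What == does implicitly: recycle the shorter vector to the longer length.
recycled <- rep(c(0.5, 3), length.out = nrow(df))

implicit <- df$time == c(0.5, 3)   # recycling done by R
explicit <- df$time == recycled    # recycling written out by hand

identical(implicit, explicit)      # TRUE
which(implicit)                    # 1 7 10, the rows shown in the question

# Lengths 5 and 2 do not divide evenly, so this comparison warns.
warned <- tryCatch({ 1:5 == c(1, 2); FALSE }, warning = function(w) TRUE)
warned                             # TRUE
```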

sebastian-c answered Sep 20 '22 15:09



In

df$time == c(0.5,3)

the vector c(0.5,3) is first recycled (broadcast) to the length of df$time, i.e. c(0.5,3,0.5,3,0.5,3,0.5,3,0.5,3). Then the two vectors are compared element by element.

On the other hand,

df$time %in% c(0.5,3) 

checks whether each element of df$time belongs to the set {0.5, 3}.
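The set-membership reading can be sketched in two other ways that give the same result. The R help page for match() states that %in% is currently defined as match(x, table, nomatch = 0) > 0, and for a small set of values the test can also be spelled out with == and |. The variable names below are illustrative only.

```r
time <- c(0.5, 0.5, 1, 2, 3, 0.5, 0.5, 1, 2, 3)

# Membership test as used in the question.
via_in <- time %in% c(0.5, 3)

# The same test written with match(): 0 means "not found".
via_match <- match(time, c(0.5, 3), nomatch = 0L) > 0L

# The same test written as one == per candidate value, combined with |.
via_or <- time == 0.5 | time == 3

identical(via_in, via_match)  # TRUE
identical(via_in, via_or)     # TRUE
which(via_in)                 # 1 2 5 6 7 10
```

Each == here compares against a single value, so recycling only repeats that one value and the position-dependent behaviour of == c(0.5, 3) does not arise.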

NPE answered Sep 21 '22 15:09