What is the difference between `%in%` and `==`?

Tags:

r

df <- structure(list(x = 1:10, time = c(0.5, 0.5, 1, 2, 3, 0.5, 0.5, 1, 2, 3)), .Names = c("x", "time"), row.names = c(NA, -10L), class = "data.frame")

df[df$time %in% c(0.5, 3), ]
##     x time
## 1   1  0.5
## 2   2  0.5
## 5   5  3.0
## 6   6  0.5
## 7   7  0.5
## 10 10  3.0

df[df$time == c(0.5, 3), ]
##     x time
## 1   1  0.5
## 7   7  0.5
## 10 10  3.0

What is the difference between %in% and == here?

292

asked Mar 12 '13 09:03

user1320502

2 Answers

The problem is vector recycling.

Your first line does exactly what you'd expect. It checks which elements of df$time are in c(0.5, 3) and returns the rows where that is true.

Your second line is trickier. It's actually equivalent to

df[df$time == rep(c(0.5,3), length.out=nrow(df)),]

To see this, let's see what happens if you use the vector rep(0.5, 10):

rep(0.5, 10) == c(0.5, 3)
[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE

See how it is TRUE at every odd position. Essentially it's comparing each 0.5 against the corresponding element of the recycled vector c(0.5, 3, 0.5, 3, 0.5, ...).

You can manipulate a vector to produce no matches this way. Take the vector: rep(c(3, 0.5), 5):

rep(c(3, 0.5), 5) == c(0.5, 3)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

They're all FALSE; you are matching every 0.5 with 3 and vice versa.
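The equivalence with the rep() form can be checked directly. This is a minimal sketch, assuming the df from the question (rebuilt here with data.frame() so the block runs on its own); the names recycled, by_eq, and by_rep are made up for illustration:

```r
# Rebuild the question's data frame so this block is self-contained
df <- data.frame(x = 1:10,
                 time = c(0.5, 0.5, 1, 2, 3, 0.5, 0.5, 1, 2, 3))

# What R does implicitly: recycle the shorter vector to the longer length
recycled <- rep(c(0.5, 3), length.out = nrow(df))

by_eq  <- df$time == c(0.5, 3)   # implicit recycling
by_rep <- df$time == recycled    # explicit recycling

identical(by_eq, by_rep)   # TRUE
which(by_eq)               # 1 7 10
```

The positions 1, 7, and 10 are the same rows the question's second subset returned.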

101

answered Sep 20 '22 15:09

sebastian-c

df$time == c(0.5,3)

Here c(0.5,3) is first recycled to the length of df$time, i.e. c(0.5,3,0.5,3,0.5,3,0.5,3,0.5,3). Then the two vectors are compared element by element.

On the other hand,

df$time %in% c(0.5,3)

checks whether each element of df$time belongs to the set {0.5, 3}.
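To make the set-membership reading concrete: R's documentation defines %in% in terms of match(), and the same mask can be built from one == comparison per value. A sketch using the time column from the question; time_vals, in_mask, match_mask, and or_mask are names made up for illustration:

```r
# The time column from the question
time_vals <- c(0.5, 0.5, 1, 2, 3, 0.5, 0.5, 1, 2, 3)

# Set membership, three ways
in_mask    <- time_vals %in% c(0.5, 3)
match_mask <- match(time_vals, c(0.5, 3), nomatch = 0) > 0
or_mask    <- time_vals == 0.5 | time_vals == 3

identical(in_mask, match_mask)  # TRUE
identical(in_mask, or_mask)     # TRUE
which(in_mask)                  # 1 2 5 6 7 10
```

In the | form each == has a length-1 right-hand side, so recycling compares every element against the same value, which is presumably what was intended in the question.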

answered Sep 21 '22 15:09

NPE
