I have two data.frames. For example's sake, let's say they look like this:
df1 <- data.frame(x=rep(letters[1:26], 16))
df2 <- data.frame(y=letters[1:4])
What I would like to do is subset 'df1' to contain the rows whose first column value matches any value within the first column of 'df2'.
Now, I've tried:
subset(df1, df1$x == df2$y)
But this tells me that I need equally sized data.frames. Thoughts?
Both %in% and match() can be used for this. Here is the former:
> which( df1$x %in% df2$y )
[1] 1 2 3 4 27 28 29 30 53 54 55 56 79 80 81 82 105
[18] 106 107 108 131 132 133 134 157 158 159 160 183 184 185 186 209 210
[35] 211 212 235 236 237 238 261 262 263 264 287 288 289 290 313 314 315
[52] 316 339 340 341 342 365 366 367 368 391 392 393 394
> table(df1[ which( df1$x %in% df2$y ), "x"])

 a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y
16 16 16 16  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 z
 0
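To get the subset itself rather than the row indices, the logical vector from %in% can index the rows directly, and match() gives the same rows once its NA results are filtered out. The attempt in the question compares with ==, which works element by element (recycling the shorter vector) instead of testing membership. The following is a minimal sketch on the question's example data; the names res_in, res_match and res_subset are made up for illustration.

```r
# Same example data as in the question
df1 <- data.frame(x = rep(letters[1:26], 16))
df2 <- data.frame(y = letters[1:4])

# %in% returns one TRUE/FALSE per row of df1, so it can index rows directly
res_in <- df1[df1$x %in% df2$y, , drop = FALSE]

# match() returns the position of each df1$x value in df2$y, or NA if absent
res_match <- df1[!is.na(match(df1$x, df2$y)), , drop = FALSE]

# subset() also works once the condition yields one value per row
res_subset <- subset(df1, x %in% df2$y)

nrow(res_in)                  # 64 rows: 4 letters x 16 repetitions
identical(res_in, res_match)  # TRUE
```

The drop = FALSE argument keeps the result a data.frame even though df1 has only one column; without it, the single column would be returned as a plain vector.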