I have a large dataframe which contains a column of names, and given the nature of my data the names repeat. I also have a vector holding a subset of those names that I need to eliminate from the dataframe. So I want to identify the row number of every instance where the name in the dataframe matches a name in the list of names to be dropped. Here is an example of what I'm trying to do, but I can't get the code to work. Thanks!
a=c("tom", "bill", "sue", "jim", "tom", "amy")
b=c(12,15,7,22,45,5)
ab=data.frame(a,b)
ab
drop=which(ab$a==c("tom", "sue")) #only identifies those matching "tom"
drop
ab2=ab[-drop,]
ab2
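You're looking for %in%. The == comparison recycles c("tom", "sue") along ab$a and compares position by position, so it only catches names that happen to line up with the recycled vector; %in% instead tests each name for membership in the set: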
drop=which(ab$a %in% c("tom", "sue"))
However, more succinctly, you can skip which() and subset with the negated logical vector directly:
ab[!ab$a %in% c('tom', 'sue'),]
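To make the difference concrete, here is a small sketch on the question's own data. The names to_drop, eq_hits, in_hits and none are just illustrative, not from the original post. It also shows one reason to prefer the logical form: if nothing matches, which() returns an empty vector, and negative indexing with an empty vector drops every row instead of none.

```r
a <- c("tom", "bill", "sue", "jim", "tom", "amy")
b <- c(12, 15, 7, 22, 45, 5)
ab <- data.frame(a, b)
to_drop <- c("tom", "sue")  # illustrative name for the drop list

# == recycles to_drop along ab$a (tom, sue, tom, sue, tom, sue)
# and compares element by element, so "sue" in row 3 is missed
eq_hits <- which(ab$a == to_drop)

# %in% checks each element of ab$a for membership in to_drop
in_hits <- which(ab$a %in% to_drop)

# keep only the rows whose name is NOT in to_drop
ab2 <- ab[!ab$a %in% to_drop, ]

# pitfall: when nothing matches, which() is empty and ab[-none, ] has 0 rows
none <- which(ab$a %in% "zed")
rows_with_which <- nrow(ab[-none, ])
rows_with_logical <- nrow(ab[!ab$a %in% "zed", ])
```

Here eq_hits only picks up the two "tom" rows, while in_hits also includes the "sue" row, and the logical subset keeps all six rows when there is nothing to drop.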
You could also have a look at the sqldf package, which lets you run SQL SELECT statements on R data frames.
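As a rough sketch of what that might look like for this question, assuming sqldf is installed (the requireNamespace() guard is only there so the snippet does nothing if it is not), a NOT IN clause does the same job as the negated %in%:

```r
a <- c("tom", "bill", "sue", "jim", "tom", "amy")
b <- c(12, 15, 7, 22, 45, 5)
ab <- data.frame(a, b)

# assumes the sqldf package is available; skipped otherwise
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # sqldf looks up the data frame named in the query by its R variable name
  ab2_sql <- sqldf("SELECT * FROM ab WHERE a NOT IN ('tom', 'sue')")
  print(ab2_sql)
}
```

For a simple filter like this the base R one-liner is probably easier, but the SQL form can be convenient if you already think in SELECT statements.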