
R Not in subset [duplicate]

Tags: r, subset

Possible Duplicate:
Standard way to remove multiple elements from a dataframe

I know in R that if you are searching for a subset of another group or matching based on id you'd use something like

subset(df1, df1$id %in% idNums1) 

My question is how to do the opposite or choose items NOT matching a vector of ids.

I tried using ! but the following gives an error:

subset(df1, df1$id !%in% idNums1) 

I think my backup is to do something like this:

matches <- subset(df1, df1$id %in% idNums1)
nonMatches <- df1[(-matches[,1]),]

but I'm hoping there's something a bit more efficient.

screechOwl asked Mar 24 '12 15:03



1 Answer

The expression df1$id %in% idNums1 produces a logical vector. To negate it, you need to negate the whole vector:

!(df1$id %in% idNums1) 
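As a minimal sketch of how this fits into the original subset() call: the data below is made up for illustration (df1 and idNums1 only stand in for the asker's objects), and %notin% is a hypothetical helper defined here, not something the original post uses.

```r
# Made-up example data standing in for the asker's df1 and idNums1
df1 <- data.frame(id = 1:6, value = letters[1:6])
idNums1 <- c(2, 4, 6)

# Keep the rows whose id is NOT in idNums1 by negating the whole logical vector
nonMatches <- subset(df1, !(id %in% idNums1))

# The same selection with bracket indexing instead of subset()
nonMatchesIdx <- df1[!(df1$id %in% idNums1), ]

# Optional convenience: a hypothetical helper operator built with Negate()
`%notin%` <- Negate(`%in%`)
nonMatchesOp <- subset(df1, id %notin% idNums1)
```

The ! has to come before the whole comparison because !%in% is not an operator in R. Inside subset() the column can be referred to as id directly, so the df1$ prefix is not needed there.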
Ari B. Friedman answered Oct 08 '22 17:10