How to combine multiple conditions to subset a data-frame using "OR"?


my.data.frame <- subset(data, V1 > 2 | V2 < 4)

An alternative solution that mimics the behavior of this function and would be more appropriate for inclusion within a function body:

new.data <- data[ which( data$V1 > 2 | data$V2 < 4) , ]
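
A minimal sketch of how the which() form might sit inside a function. The helper name rows_matching, its lower and upper arguments, and the toy data are hypothetical; the columns V1 and V2 are assumed to be numeric, as in the question:

```r
# Hypothetical helper: keep rows where V1 > lower OR V2 < upper.
# which() returns only the positions where the condition is TRUE,
# so rows whose condition evaluates to NA are dropped.
rows_matching <- function(data, lower = 2, upper = 4) {
  keep <- which(data$V1 > lower | data$V2 < upper)
  data[keep, , drop = FALSE]
}

# Toy data (made up for illustration); row 3 has a missing V1.
example <- data.frame(V1 = c(1, 3, NA, 1), V2 = c(5, 5, 5, 2))
result <- rows_matching(example)
print(result)
```

Unlike subset(), this form refers to the columns explicitly through data$..., so it does not rely on non-standard evaluation of the condition.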

Some people criticize the use of which() as unnecessary, but it does prevent NA values from producing unwanted results. The equivalent of the two options shown above without which() (i.e., not returning NA rows for any NAs in V1 or V2) would be:

 new.data <- data[ !is.na(data$V1 > 2 | data$V2 < 4) & ( data$V1 > 2 | data$V2 < 4) , ]

Note: I want to thank the anonymous contributor who attempted to fix the error in the code immediately above, a fix that was rejected by the moderators. There was actually an additional error that I noticed while correcting the first one. The clause that checks for NA values has to be combined with the comparison using '&' if NA rows are to be handled as I intended, since a FALSE on either side of '&' gives FALSE even when the other side is NA, while a TRUE (or any non-zero number) leaves the NA in place:

> NA & 1
[1] NA
> 0 & NA
[1] FALSE

The result of '&' depends on the values, not on the order of the arguments: NA & 0 is also FALSE.
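
The NA handling of the different forms can be sketched on a small made-up data frame. The values below are assumptions chosen so that one row has an NA condition and another has an NA in V1 but still satisfies the condition through V2:

```r
# Toy data (made up): V1 is missing in rows 3 and 4.
data <- data.frame(V1 = c(1, 3, NA, NA), V2 = c(5, 5, 5, 1))

# Row-wise condition: FALSE, TRUE, NA (NA | FALSE), TRUE (NA | TRUE)
cond <- data$V1 > 2 | data$V2 < 4

plain     <- data[cond, ]                   # an NA in the index yields an all-NA row
via_which <- data[which(cond), ]            # which() skips the NA position
guarded   <- data[!is.na(cond) & cond, ]    # FALSE & NA collapses to FALSE
via_sub   <- subset(data, V1 > 2 | V2 < 4)  # subset() treats NA as FALSE

print(plain)
print(via_which)
```

Here the plain logical index returns three rows, one of them filled with NA, while the other three forms return only rows 2 and 4.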

You are looking for "|". See http://cran.r-project.org/doc/manuals/R-intro.html#Logical-vectors

my.data.frame <- data[(data$V1 > 2) | (data$V2 < 4), ]
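
As a brief illustration of why the single "|" is the operator to use here: it works element by element and returns one logical value per row, which is what row indexing needs, whereas "||" is meant for single values. The vectors below are made up for the example:

```r
# Two made-up numeric vectors standing in for the columns V1 and V2.
v1 <- c(1, 3, 5)
v2 <- c(5, 6, 3)

# Element-wise OR: one TRUE/FALSE per position.
keep <- (v1 > 2) | (v2 < 4)
print(keep)
```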

Just for the sake of completeness, we can use the operators [ and [[:

set.seed(1)
df <- data.frame(v1 = runif(10), v2 = letters[1:10])

Several options:

df[df[1] < 0.5 | df[2] == "g", ] 
df[df[[1]] < 0.5 | df[[2]] == "g", ] 
df[df["v1"] < 0.5 | df["v2"] == "g", ]

df$name is equivalent to df[["name", exact = FALSE]]
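
A small sketch of what that equivalence means in practice: "$" allows partial matching of column names, while "[[" matches exactly unless exact = FALSE is given. The data frame df2 and its column name value1 are hypothetical:

```r
# Hypothetical data frame with a single column called "value1".
df2 <- data.frame(value1 = 1:3)

by_dollar  <- df2$val                      # partial match on "value1"
by_partial <- df2[["val", exact = FALSE]]  # same partial matching, spelled out
by_exact   <- df2[["val"]]                 # exact matching: no column "val", so NULL

print(by_dollar)
print(is.null(by_exact))
```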

Using dplyr:

library(dplyr)
filter(df, v1 < 0.5 | v2 == "g")

Using sqldf:

library(sqldf)
sqldf("SELECT *
      FROM df
      WHERE v1 < 0.5 OR v2 = 'g'")

Output for the above options (the base R versions keep the original row names 1, 2, 5, 7, 10):

          v1 v2
1 0.26550866  a
2 0.37212390  b
3 0.20168193  e
4 0.94467527  g
5 0.06178627  j
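
To check that the base R variants select the same rows, something along these lines could be used. It reuses the seed and data frame from the answer; the result names by_index, by_name and by_subset are made up for the example:

```r
set.seed(1)
df <- data.frame(v1 = runif(10), v2 = letters[1:10])

# Three base R ways of expressing the same OR condition.
by_index  <- df[df[[1]] < 0.5 | df[[2]] == "g", ]
by_name   <- df[df$v1 < 0.5 | df$v2 == "g", ]
by_subset <- subset(df, v1 < 0.5 | v2 == "g")

# Row names show which original rows were kept.
print(rownames(by_index))
print(as.character(by_index$v2))
```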
