How to find if all elements in a subset of a data.frame row are TRUE

Tags:

r

I have a data.frame with a block of columns that are logicals, e.g.

> tmp <- data.frame(a=c(13, 23, 52),
+                   b=c(TRUE,FALSE,TRUE),
+                   c=c(TRUE,TRUE,FALSE),
+                   d=c(TRUE,TRUE,TRUE))
> tmp
   a     b     c    d
1 13  TRUE  TRUE TRUE
2 23 FALSE  TRUE TRUE
3 52  TRUE FALSE TRUE

I'd like to compute a summary column (say: e) that is a logical AND over the whole range of logical columns. In other words, for a given row, if all b:d are TRUE, then e would be TRUE; if any b:d are FALSE, then e would be FALSE.

My expected result is:

> tmp
   a     b     c    d     e
1 13  TRUE  TRUE TRUE  TRUE
2 23 FALSE  TRUE TRUE FALSE
3 52  TRUE FALSE TRUE FALSE

I want to indicate the range of columns by indices, as I have a bunch of columns, and the names are cumbersome. The following code works, but I'd rather use a vectorized approach to improve performance.

> tmp$e <- NA
> for(i in 1:nrow(tmp)){
+     tmp[i,"e"] <- all(tmp[i,2:(ncol(tmp)-1)]==TRUE)
+ }
> tmp
   a     b     c    d     e
1 13  TRUE  TRUE TRUE  TRUE
2 23 FALSE  TRUE TRUE FALSE
3 52  TRUE FALSE TRUE FALSE

Any way to do this without using a for loop to step through the rows of the data.frame?


asked Jul 09 '12 22:07

mac

1 Answer

You can use rowSums to total each row without an explicit loop (TRUE counts as 1, FALSE as 0), plus a little footwork to pick out the logical columns automatically:

# identify the logical columns
boolCols <- sapply(tmp, is.logical)
# sum each row of the logical columns and compare to the
# total number of logical columns; drop = FALSE keeps a
# data.frame even when only one column is logical
tmp$e <- rowSums(tmp[, boolCols, drop = FALSE]) == sum(boolCols)
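
If you would rather select the block by position, as the question asks, something along these lines should work. Treat it as a sketch rather than the definitive way: `cols`, `e_sums` and `e_and` are names made up for illustration, and it assumes the logical block sits in columns 2 to 4 of the example data. It shows the same rowSums comparison next to an element-wise AND folded over the columns with Reduce.

```r
# Rebuild the example data from the question
tmp <- data.frame(a = c(13, 23, 52),
                  b = c(TRUE, FALSE, TRUE),
                  c = c(TRUE, TRUE, FALSE),
                  d = c(TRUE, TRUE, TRUE))

# Positions of the logical block (an assumption: columns 2 to 4 here)
cols <- 2:4

# Option 1: a row is all TRUE when its count of TRUE values
# equals the number of selected columns.
# drop = FALSE keeps a data.frame even if cols has length 1.
e_sums <- rowSums(tmp[, cols, drop = FALSE]) == length(cols)

# Option 2: fold an element-wise AND across the selected columns
e_and <- Reduce(`&`, tmp[cols])

# Either result can be stored as the summary column
tmp$e <- e_sums
```

The two options agree on this data. They can differ when a row contains NA: the rowSums comparison yields NA for that row, while the AND version still returns FALSE if any other value in the row is FALSE.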

answered Oct 06 '22 04:10

Joshua Ulrich
