Comparing Boolean Vectors

Tags: r, boolean

I have a data frame with four logical columns, v1, v2, v3, and v4, each TRUE or FALSE. I need to classify each row based on the combination of those columns (for example, "None", "v1 only", "v1 and v3", "All", etc.). I would like to do this without subsetting the data frame or nesting ifelse statements. Any suggestions for the best way to do this? Thanks!

Boom Shakalaka asked Jan 18 '23 01:01


1 Answer

Looks like I've arrived late at this party. Still, I might as well share what I've brought!

This works by treating the FALSE/TRUE possibilities like bits, and operating on them to assign to each combination of v1, v2, and v3 a unique integer between 1 and 8 (much like chmod can represent permission bits on *NIX systems). The integer is then used as an index to select the appropriate element of a vector of textual descriptors.
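To make the arithmetic concrete, here is a minimal sketch of the index calculation for a single row, assuming weights of 1, 2, and 4 for v1, v2, and v3. The names `row`, `weights`, and `code` are only for illustration.

```r
# One row where v1 and v3 are TRUE and v2 is FALSE
row <- c(v1 = TRUE, v2 = FALSE, v3 = TRUE)

# Each column gets its own power of two, like a bit position
weights <- c(1, 2, 4)

# Logicals are coerced to 0/1, so this is 1*1 + 0*2 + 1*4 = 5
code <- sum(row * weights)

# Add 1 because R vectors are indexed from 1, not 0
index <- 1 + code
index
# [1] 6
```

An index of 6 picks the sixth descriptor, which is "v1 and v3" in the vector defined in the code that follows.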

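(For the demonstration, I've used just three columns, but this approach scales up nicely: each additional column takes the next power of two as its weight and doubles the number of descriptions.)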

# CONSTRUCT VECTOR OF DESCRIPTIONS
description <- c("None", "v1", "v2", "v1 and v2",
                 "v3", "v1 and v3", "v2 and v3", "All")

# DEFINE DESCRIPTION FUNCTION
getDescription <- function(X) {
    # X is one row of logicals; weight v1, v2, v3 as bits worth 1, 2, 4
    index <- 1 + sum(X * c(1, 2, 4))
    description[index]
}

# TRY IT OUT ON ALL COMBOS OF v1, v2, and v3
df <- expand.grid(v1=c(FALSE, TRUE),
                  v2=c(FALSE, TRUE),
                  v3=c(FALSE, TRUE))
# Pass only the logical columns so the call still works if rerun
df$description <- apply(df[c("v1", "v2", "v3")], 1, getDescription)

# YEP, IT WORKS.
df
#      v1    v2    v3 description
# 1 FALSE FALSE FALSE        None
# 2  TRUE FALSE FALSE          v1
# 3 FALSE  TRUE FALSE          v2
# 4  TRUE  TRUE FALSE   v1 and v2
# 5 FALSE FALSE  TRUE          v3
# 6  TRUE FALSE  TRUE   v1 and v3
# 7 FALSE  TRUE  TRUE   v2 and v3
# 8  TRUE  TRUE  TRUE         All
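
Since the question involves four columns, here is one possible sketch of the same idea extended to v1 through v4. It is not part of the code above: it builds the 16 labels programmatically and swaps the row-by-row `apply` for a single matrix product. The names `df4`, `cols`, `weights`, `combos`, and `labels` are hypothetical.

```r
# Hypothetical four-column data frame covering every combination
df4 <- expand.grid(v1 = c(FALSE, TRUE),
                   v2 = c(FALSE, TRUE),
                   v3 = c(FALSE, TRUE),
                   v4 = c(FALSE, TRUE))

cols <- c("v1", "v2", "v3", "v4")

# One power of two per column: 1, 2, 4, 8
weights <- 2^(seq_along(cols) - 1)

# Build the 16 labels in the same bit order as the index
# (expand.grid varies the first column fastest, matching weight 1)
combos <- expand.grid(rep(list(c(FALSE, TRUE)), length(cols)))
labels <- apply(combos, 1, function(x) paste(cols[x], collapse = " and "))
labels[1] <- "None"
labels[length(labels)] <- "All"

# data.matrix() turns the logicals into 0/1 integers; the matrix
# product then computes every row's index in one step
index <- 1 + drop(data.matrix(df4[cols]) %*% weights)
df4$description <- labels[index]
```

With this layout, adding a fifth column would only require extending `cols`; the weights and labels follow from it.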
Josh O'Brien answered Jan 20 '23 14:01