Making a new column in R based on results of previous columns

Question

I'm trying to make a new column in a table based on two existing columns. Both columns contain logical values, and I want the new column to assign a group based on them (so if the value is FALSE in both columns the row belongs to group 1, if it is FALSE in Choice1 but TRUE in Choice2 the row belongs to group 2, etc.)

My data table looks as follows:

Choice1  Choice2 
TRUE     FALSE     
TRUE     TRUE      
FALSE    FALSE     
FALSE    TRUE      
TRUE     TRUE

I want to create a new column where individuals are grouped into one of 4 groups: 1,2,3,4.

1 = FALSE, FALSE

2 = FALSE, TRUE

3 = TRUE, FALSE

4 = TRUE, TRUE

So the results for the above table would be 3,4,1,2,4

I initially tried code along the lines of what I've written below, but it doesn't work.

df$newcolumn <- c("(df$Choice1=FALSE,df$Choice2=FALSE)"="1", "(df$Choice1=FALSE, df$Choice2=TRUE)"="2"), "(df$Choice1=TRUE, df$Choice2=FALSE)"="3", "(df$Choice1=TRUE, df$Choice2=TRUE)", ="4")

I've also looked online for similar examples that I could adapt to my case, but haven't been able to find any. Any help would be appreciated.

G. Grothendieck · Accepted Answer

1) linear formula The formula below gives the group number as a factor. The conversion to factor was requested in a comment.

transform(dat1, group = factor(1 + 2*Choice1 + Choice2))
##   Choice1 Choice2 group
## 1    TRUE   FALSE     3
## 2    TRUE    TRUE     4
## 3   FALSE   FALSE     1
## 4   FALSE    TRUE     2
## 5    TRUE    TRUE     4
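
As a rough illustration of why this formula works (my own sketch, not part of the original answer): logicals are coerced to 0/1 in arithmetic, so 2*Choice1 + Choice2 behaves like a two-bit binary number, and adding 1 shifts it onto the groups 1 to 4. The combos table below is a hypothetical helper that enumerates all four cases.

```r
# Hypothetical helper table with every combination of the two logical columns.
# expand.grid varies its first argument fastest, so Choice2 is listed first.
combos <- expand.grid(Choice2 = c(FALSE, TRUE), Choice1 = c(FALSE, TRUE))

# FALSE/TRUE act as 0/1 here: 2*Choice1 + Choice2 runs over 0..3,
# and the leading 1 shifts that range to 1..4.
combos$group <- with(combos, 1 + 2 * Choice1 + Choice2)

combos[, c("Choice1", "Choice2", "group")]
```

Each of the four rows should get a distinct group, in the order the question asks for.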

We can derive the above formula using regression. For this to work it is essential that every combination be represented in dat1, which is, in fact, the case here.

y <- c(3, 4, 1, 2, 4)
fm <- lm(y ~., dat1)
fm
##
## Call:
## lm(formula = y ~ ., data = dat1)
##
## Coefficients:
## (Intercept)  Choice1TRUE  Choice2TRUE  
##           1            2            1  

# check that it is exact, i.e. residual sum of squares is 0
deviance(fm)
## [1] 3.574526e-31
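
To tie the fitted coefficients back to the formula in 1), here is a small self-contained sketch (assuming the same dat1 and y as above; the rounding step is my addition to strip floating-point noise):

```r
dat1 <- data.frame(Choice1 = c(TRUE, TRUE, FALSE, FALSE, TRUE),
                   Choice2 = c(FALSE, TRUE, FALSE, TRUE, TRUE))
y <- c(3, 4, 1, 2, 4)
fm <- lm(y ~ ., dat1)

# Rounded coefficients: intercept 1, weight 2 for Choice1, weight 1 for Choice2.
b <- round(coef(fm))
b

# Plugging those numbers in by hand gives the linear formula, which matches y.
grp <- with(dat1, 1 + 2 * Choice1 + Choice2)
all(grp == y)
```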

2) explicit combinations Another approach is to explicitly consider each combination:

transform(dat1, group = factor(
  1 * (!Choice1 & !Choice2) +
  2 * (!Choice1 & Choice2) + 
  3 * (Choice1 & !Choice2) + 
  4 * (Choice1 & Choice2)))

giving:

  Choice1 Choice2 group
1    TRUE   FALSE     3
2    TRUE    TRUE     4
3   FALSE   FALSE     1
4   FALSE    TRUE     2
5    TRUE    TRUE     4
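
The sum in 2) works because each row satisfies exactly one of the four conditions, so only one term is non-zero. A minimal sketch to check that on dat1 (conds is a hypothetical name for the matrix of conditions):

```r
dat1 <- data.frame(Choice1 = c(TRUE, TRUE, FALSE, FALSE, TRUE),
                   Choice2 = c(FALSE, TRUE, FALSE, TRUE, TRUE))

# One column per condition, in the same order as the groups 1..4.
conds <- with(dat1, cbind(!Choice1 & !Choice2,
                          !Choice1 &  Choice2,
                           Choice1 & !Choice2,
                           Choice1 &  Choice2))

# Every row should match exactly one condition.
hits <- rowSums(conds)

# The index of the matching column is the group number.
grp <- apply(conds, 1, which)
grp
```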

3) interaction A third approach is to use interaction to create the group factor.

transform(dat1, group = factor(interaction(Choice2, Choice1), labels = ""))

giving:

  Choice1 Choice2 group
1    TRUE   FALSE     3
2    TRUE    TRUE     4
3   FALSE   FALSE     1
4   FALSE    TRUE     2
5    TRUE    TRUE     4
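
The reason Choice2 is passed before Choice1 in 3) can be seen by inspecting the levels. As far as I understand interaction, with its defaults the first factor varies fastest, so this argument order puts the levels in the 1 to 4 order from the question. A small sketch (g is a hypothetical name):

```r
dat1 <- data.frame(Choice1 = c(TRUE, TRUE, FALSE, FALSE, TRUE),
                   Choice2 = c(FALSE, TRUE, FALSE, TRUE, TRUE))

g <- with(dat1, interaction(Choice2, Choice1))

# Levels are written as Choice2.Choice1, with Choice2 varying fastest.
levels(g)

# The integer codes of the factor are the group numbers.
as.integer(g)
```

One caveat, stated as my own assumption rather than something from the answer: wrapping the result in factor() drops unused levels, so if some combination never occurred in the data the groups would be renumbered. Using as.integer(g) directly keeps all four levels and avoids that.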

Note

The input in reproducible form is:

dat1 <- structure(list(Choice1 = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  Choice2 = c(FALSE, TRUE, FALSE, TRUE, TRUE)),
  class = "data.frame", row.names = c(NA, -5L))

TarJae · Answer

Here is a solution using case_when from the dplyr package:

library(dplyr)

# your data:
df <- tribble(
~Choice1,  ~Choice2, 
TRUE,     FALSE,     
TRUE,     TRUE,      
FALSE,    FALSE,     
FALSE,    TRUE ,     
TRUE,     TRUE) 

df1 <- df %>% 
  mutate(new_col = case_when(Choice1 == FALSE & Choice2 == FALSE ~ 1,
                             Choice1 == FALSE & Choice2 == TRUE ~ 2,
                             Choice1 == TRUE & Choice2 == FALSE ~ 3,
                             Choice1 == TRUE & Choice2 == TRUE ~ 4))
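
For reference, a self-contained variant of the same idea that tests the logical columns directly instead of comparing them with TRUE/FALSE, and then prints the new column. This is only a sketch; it assumes dplyr is installed and reuses the column name new_col from above.

```r
library(dplyr)

df <- tribble(
  ~Choice1, ~Choice2,
  TRUE,     FALSE,
  TRUE,     TRUE,
  FALSE,    FALSE,
  FALSE,    TRUE,
  TRUE,     TRUE)

# Each condition is already logical, so !Choice1 stands in for Choice1 == FALSE.
df1 <- df %>%
  mutate(new_col = case_when(
    !Choice1 & !Choice2 ~ 1,
    !Choice1 &  Choice2 ~ 2,
     Choice1 & !Choice2 ~ 3,
     Choice1 &  Choice2 ~ 4))

df1$new_col
```

For this data the new column should come out as 3, 4, 1, 2, 4. Rows that match none of the conditions (for example, when a value is NA) would get NA, and the result can be wrapped in factor() if a factor is needed.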


Tags:

r

Eams
