 

Making a new column in R based on results of previous columns

Tags:

r

I'm trying to add a new column to a table based on two existing columns. Both columns contain logical values, and I want the new column to encode their combination: if the value is FALSE in both columns the row belongs to group 1, if it is FALSE in Choice1 but TRUE in Choice2 the row belongs to group 2, and so on.

My data table looks as follows:

Choice1  Choice2 
TRUE     FALSE     
TRUE     TRUE      
FALSE    FALSE     
FALSE    TRUE      
TRUE     TRUE      

I want to create a new column that assigns each individual to one of four groups (1, 2, 3, 4), defined by the values of (Choice1, Choice2):

1 = FALSE, FALSE

2 = FALSE, TRUE

3 = TRUE, FALSE

4 = TRUE, TRUE

So the result for the above table would be 3, 4, 1, 2, 4.

I initially tried code along the lines of what is shown below, but it does not work.

df$newcolumn <- c("(df$Choice1=FALSE,df$Choice2=FALSE)"="1", "(df$Choice1=FALSE, df$Choice2=TRUE)"="2"), "(df$Choice1=TRUE, df$Choice2=FALSE)"="3", "(df$Choice1=TRUE, df$Choice2=TRUE)", ="4")

I've also looked online for similar examples that I could adapt, but haven't been able to find any. Any help would be appreciated.

Eams asked Nov 30 '25 20:11



2 Answers

1) linear formula The formula below gives the group number as a factor. The conversion to factor was requested in a comment.

transform(dat1, group = factor(1 + 2*Choice1 + Choice2))
##   Choice1 Choice2 group
## 1    TRUE   FALSE     3
## 2    TRUE    TRUE     4
## 3   FALSE   FALSE     1
## 4   FALSE    TRUE     2
## 5    TRUE    TRUE     4
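
The formula works because R coerces logicals to 0/1 in arithmetic, so 2*Choice1 + Choice2 is effectively a two-bit number with Choice1 as the high bit. A minimal sketch of that idea on plain vectors (the vectors repeat the question's data; the name group is only for illustration):

```r
# The question's data as plain logical vectors.
Choice1 <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
Choice2 <- c(FALSE, TRUE, FALSE, TRUE, TRUE)

# TRUE counts as 1 and FALSE as 0, so 2*Choice1 + Choice2 runs from 0 to 3;
# adding 1 shifts it to the requested 1 to 4.
group <- 1 + 2 * Choice1 + Choice2
group
## [1] 3 4 1 2 4
```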

We can derive the above formula using regression on the desired group numbers y. For this to work, every combination of Choice1 and Choice2 must be represented in dat1, which is the case here.

y <- c(3, 4, 1, 2, 4)
fm <- lm(y ~., dat1)
fm
##
## Call:
## lm(formula = y ~ ., data = dat1)
##
## Coefficients:
## (Intercept)  Choice1TRUE  Choice2TRUE  
##           1            2            1  

# check that the fit is exact, i.e. the residual sum of squares is 0 (up to floating-point error)
deviance(fm)
## [1] 3.574526e-31

2) explicit combinations Another approach is to explicitly consider each combination:

transform(dat1, group = factor(
  1 * (!Choice1 & !Choice2) +
  2 * (!Choice1 & Choice2) + 
  3 * (Choice1 & !Choice2) + 
  4 * (Choice1 & Choice2)))

giving:

  Choice1 Choice2 group
1    TRUE   FALSE     3
2    TRUE    TRUE     4
3   FALSE   FALSE     1
4   FALSE    TRUE     2
5    TRUE    TRUE     4

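3) interaction A third approach is to use interaction to create the group factor. Choice2 is listed first because interaction varies its first argument fastest, and labels = "" relabels the resulting levels as 1 to 4.

transform(dat1, group = factor(interaction(Choice2, Choice1), labels = ""))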

giving:

  Choice1 Choice2 group
1    TRUE   FALSE     3
2    TRUE    TRUE     4
3   FALSE   FALSE     1
4   FALSE    TRUE     2
5    TRUE    TRUE     4
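
To see why the argument order matters, it can help to look at the levels that interaction produces. A small sketch on plain vectors (the vectors repeat the question's data; the name g is only for illustration):

```r
# The question's data as plain logical vectors.
Choice1 <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
Choice2 <- c(FALSE, TRUE, FALSE, TRUE, TRUE)

# Each level is written as <Choice2>.<Choice1>; the first argument varies fastest.
g <- interaction(Choice2, Choice1)
levels(g)
## [1] "FALSE.FALSE" "TRUE.FALSE"  "FALSE.TRUE"  "TRUE.TRUE"

# The position of each row's level is exactly the requested group number.
as.integer(g)
## [1] 3 4 1 2 4
```

With the arguments swapped, the second and third levels would trade places, so groups 2 and 3 would be exchanged.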

Note

The input in reproducible form is:

dat1 <- structure(list(Choice1 = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  Choice2 = c(FALSE, TRUE, FALSE, TRUE, TRUE)),
  class = "data.frame", row.names = c(NA, -5L))
G. Grothendieck answered Dec 03 '25 12:12



Here is a solution with case_when from the dplyr package:

library(dplyr)

# your data:
df <- tribble(
~Choice1,  ~Choice2, 
TRUE,     FALSE,     
TRUE,     TRUE,      
FALSE,    FALSE,     
FALSE,    TRUE ,     
TRUE,     TRUE) 

df1 <- df %>% 
  mutate(new_col = case_when(Choice1 == FALSE & Choice2 == FALSE ~ 1,
                             Choice1 == FALSE & Choice2 == TRUE ~ 2,
                             Choice1 == TRUE & Choice2 == FALSE ~ 3,
                             Choice1 == TRUE & Choice2 == TRUE ~ 4))
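
As a quick check of the case_when approach, the sketch below rebuilds the same data, writes the conditions with ! instead of comparisons to TRUE/FALSE, and prints the new column. It assumes dplyr is installed. The column group_f is a made-up name showing one possible way to store the result as a factor, as in the first answer; it is not part of the original solution.

```r
library(dplyr)

# The question's data (tibble() is re-exported by dplyr).
df <- tibble(
  Choice1 = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  Choice2 = c(FALSE, TRUE, FALSE, TRUE, TRUE)
)

df1 <- df %>%
  mutate(
    # One condition per combination, in the order of the requested groups.
    new_col = case_when(
      !Choice1 & !Choice2 ~ 1,
      !Choice1 &  Choice2 ~ 2,
       Choice1 & !Choice2 ~ 3,
       Choice1 &  Choice2 ~ 4
    ),
    # group_f is a hypothetical name: the same groups stored as a factor.
    group_f = factor(new_col, levels = 1:4)
  )

df1$new_col
## [1] 3 4 1 2 4
```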

TarJae answered Dec 03 '25 12:12



