
R: data frame header columns are ordinal rankings spread across several columns for each observation

Tags:

r

I have questionnaire data that looks like this:

   items no_stars1  no_stars2   no_stars3   average satisfied   bad
1     A         1           0           0         0         0     1
2     B         0           1           0         1         0     0
3     C         0           0           1         0         1     0
4     D         0           1           0         0         1     0
5     E         0           0           1         1         0     0
6     F         0           0           1         0         1     0
7     G         1           0           0         0         0     1

Basically, the header columns (star rating and satisfaction) are ordinal ranks for each item, one-hot encoded across columns. I would like to collapse the no_stars columns (columns 2:4) and the satisfaction columns (columns 5:7) into one column each, so that the output looks like this:

   items    no_stars    satisfactory    
1     A         1           1           
2     B         2           2           
3     C         3           3           
4     D         2           3           
5     E         3           2           
6     F         3           3           
7     G         1           1         

no_stars: 1 is for no_stars1, 2 for no_stars2, 3 for no_stars3

satisfactory: 1 is for bad, 2 for average, 3 for satisfied

I have tried the code below:

df$no_stars2[df$no_stars2 == 1] <- 2
df$no_stars3[df$no_stars3 == 1] <- 3

df$average[df$average == 1] <- 2
df$satisfied[df$satisfied == 1] <- 3

no_stars <- df$no_stars1 + df$no_stars2 + df$no_stars3
satisfactory <- df$bad + df$average + df$satisfied

tidy_df <- data.frame(items = df$items, no_stars, satisfactory)
tidy_df

Is there a function in R that does the same thing, or does anyone have a better and simpler solution?

Thanks

PybabyR asked Jan 21 '26 14:01


2 Answers

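Just use max.col, which returns, for each row, the index of the column holding the largest value. Pass the columns in the order you want them coded and that index is the ordinal code: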

starsOrder <- c("no_stars1", "no_stars2", "no_stars3")
satOrder <- c("bad", "average", "satisfied")
data.frame(items = df$items,
           no_stars = max.col(df[, starsOrder]),
           satisfactory = max.col(df[, satOrder]))
#  items no_stars satisfactory
#1     A        1            1
#2     B        2            2
#3     C        3            3
#4     D        2            3
#5     E        3            2
#6     F        3            3
#7     G        1            1
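One caveat: max.col breaks ties at random by default (ties.method = "random"), so the result above is only well defined because every row has exactly one 1 per group. A minimal sketch that checks this and pins the tie rule; the data frame is rebuilt from the table in the question, and the rowSums check and the name res are added for illustration:

```r
# Data rebuilt from the table in the question
df <- data.frame(
  items     = c("A", "B", "C", "D", "E", "F", "G"),
  no_stars1 = c(1, 0, 0, 0, 0, 0, 1),
  no_stars2 = c(0, 1, 0, 1, 0, 0, 0),
  no_stars3 = c(0, 0, 1, 0, 1, 1, 0),
  average   = c(0, 1, 0, 0, 1, 0, 0),
  satisfied = c(0, 0, 1, 1, 0, 1, 0),
  bad       = c(1, 0, 0, 0, 0, 0, 1)
)

starsOrder <- c("no_stars1", "no_stars2", "no_stars3")
satOrder   <- c("bad", "average", "satisfied")

# Assumption: exactly one 1 per row in each group; stop early otherwise
stopifnot(rowSums(df[, starsOrder]) == 1,
          rowSums(df[, satOrder]) == 1)

# ties.method = "first" keeps the result deterministic even with ties
res <- data.frame(
  items        = df$items,
  no_stars     = max.col(df[, starsOrder], ties.method = "first"),
  satisfactory = max.col(df[, satOrder], ties.method = "first")
)
res
```

If a row could be all zeros (an unanswered item), the check fails rather than silently assigning a code, which is usually what you want for questionnaire data.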
nicola answered Jan 23 '26 09:01


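A tidyverse solution: reshape from wide to long twice with gather, and use factor-to-integer conversion to encode no_stars and satisfactory. The no_stars levels sort alphabetically into the right order, while the satisfactory levels are set explicitly: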

library(tidyverse)
df %>%
    gather(no_stars, v1, starts_with("no_stars")) %>%
    mutate(no_stars = as.integer(factor(no_stars))) %>%
    gather(satisfactory, v2, average, satisfied, bad) %>%
    filter(v1 > 0 & v2 > 0) %>%
    mutate(satisfactory = as.integer(factor(
        satisfactory, levels = c("bad", "average", "satisfied")))) %>%
    select(-v1, -v2) %>%
    arrange(items)
#  items no_stars satisfactory
#1     A        1            1
#2     B        2            2
#3     C        3            3
#4     D        2            3
#5     E        3            2
#6     F        3            3
#7     G        1            1
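Since gather is superseded in current tidyr, the same idea can be sketched with pivot_longer and pivot_wider (tidyr 1.0.0 or later). This is one possible variant, not a drop-in replacement for the code above: the data frame is rebuilt from the question's table, and the names star_levels, sat_levels, group, code and res are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Data rebuilt from the table in the question
df <- data.frame(
  items     = c("A", "B", "C", "D", "E", "F", "G"),
  no_stars1 = c(1, 0, 0, 0, 0, 0, 1),
  no_stars2 = c(0, 1, 0, 1, 0, 0, 0),
  no_stars3 = c(0, 0, 1, 0, 1, 1, 0),
  average   = c(0, 1, 0, 0, 1, 0, 0),
  satisfied = c(0, 0, 1, 1, 0, 1, 0),
  bad       = c(1, 0, 0, 0, 0, 0, 1)
)

# Column names listed in the order they should be coded (1, 2, 3)
star_levels <- c("no_stars1", "no_stars2", "no_stars3")
sat_levels  <- c("bad", "average", "satisfied")

res <- df %>%
  # One row per item and indicator column
  pivot_longer(-items, names_to = "name", values_to = "value") %>%
  # Keep only the indicator that is switched on
  filter(value == 1) %>%
  mutate(
    group = if_else(name %in% star_levels, "no_stars", "satisfactory"),
    # Position within its own group is the ordinal code
    code  = coalesce(match(name, star_levels), match(name, sat_levels))
  ) %>%
  select(items, group, code) %>%
  # Back to one row per item, one column per variable
  pivot_wider(names_from = group, values_from = code)

res
```

A single long reshape avoids the intermediate cross product that two successive gather calls create, at the cost of spelling out both level orders.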
Maurits Evers answered Jan 23 '26 09:01


