Trying to use dplyr to group_by and apply scale()

Trying to use dplyr to group_by the stud_ID variable in the following data frame, as in this SO question:

> str(df)
'data.frame':   4136 obs. of  4 variables:
 $ stud_ID         : chr  "ABB112292" "ABB112292" "ABB112292" "ABB112292" ...
 $ behavioral_scale: num  3.5 4 3.5 3 3.5 2 NA NA 1 2 ...
 $ cognitive_scale : num  3.5 3 3 3 3.5 2 NA NA 1 1 ...
 $ affective_scale : num  2.5 3.5 3 3 2.5 2 NA NA 1 1.5 ...

I tried the following to obtain scale scores by student (rather than scale scores for observations across all students):

scaled_data <-
  df %>%
  group_by(stud_ID) %>%
  mutate(behavioral_scale_ind = scale(behavioral_scale),
         cognitive_scale_ind = scale(cognitive_scale),
         affective_scale_ind = scale(affective_scale))

Here is the result:

> str(scaled_data)
Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 4136 obs. of  7 variables:
 $ stud_ID             : chr  "ABB112292" "ABB112292" "ABB112292" "ABB112292" ...
 $ behavioral_scale    : num  3.5 4 3.5 3 3.5 2 NA NA 1 2 ...
 $ cognitive_scale     : num  3.5 3 3 3 3.5 2 NA NA 1 1 ...
 $ affective_scale     : num  2.5 3.5 3 3 2.5 2 NA NA 1 1.5 ...
 $ behavioral_scale_ind: num [1:12, 1] 0.64 1.174 0.64 0.107 0.64 ...
  ..- attr(*, "scaled:center")= num 2.9
  ..- attr(*, "scaled:scale")= num 0.937
 $ cognitive_scale_ind : num [1:12, 1] 1.17 0.64 0.64 0.64 1.17 ...
  ..- attr(*, "scaled:center")= num 2.4
  ..- attr(*, "scaled:scale")= num 0.937
 $ affective_scale_ind : num [1:12, 1] 0 1.28 0.64 0.64 0 ...
  ..- attr(*, "scaled:center")= num 2.5
  ..- attr(*, "scaled:scale")= num 0.782

The three scaled variables (behavioral_scale_ind, cognitive_scale_ind, and affective_scale_ind) have only 12 observations, which is the number of observations for the first student, ABB112292.

What's going on here? How can I obtain scaled scores by individual?

asked Mar 03 '16 15:03

Joshua Rosenberg

1 Answer

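The problem seems to be in the base scale() function, which is built for matrices: even when given a plain vector, it returns a one-column matrix carrying "scaled:center" and "scaled:scale" attributes (the num [1:12, 1] columns in your output). Try writing your own scaling function that returns an ordinary numeric vector.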

scale_this <- function(x){
  (x - mean(x, na.rm=TRUE)) / sd(x, na.rm=TRUE)
}
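To see the difference concretely, here is a small base-R sketch comparing the two on a made-up vector (the values in x are illustrative, not taken from the question's data). It only shows what each function returns; it is not a claim about how any particular dplyr version handles matrix columns.

```r
# Hand-written scaler from the answer above: returns a plain numeric vector.
scale_this <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Made-up scores with a missing value (illustrative only).
x <- c(3.5, 4, 3.5, 3, NA, 2)

s <- scale(x)        # base R: one-column matrix plus attributes
v <- scale_this(x)   # plain numeric vector, same length as x

is.matrix(s)             # TRUE
dim(s)                   # 6 1
names(attributes(s))     # includes "scaled:center" and "scaled:scale"
is.matrix(v)             # FALSE

# The numbers agree once the matrix structure is stripped off.
all.equal(as.numeric(s), v)   # TRUE
```

As an alternative not used in the answer, wrapping the call as as.numeric(scale(x)) should also give a plain vector, since as.numeric drops the dim and the scaling attributes.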

Then this works:

library("dplyr")

# reproducible sample data
set.seed(123)
n <- 1000
df <- data.frame(stud_ID = sample(LETTERS, size=n, replace=TRUE),
                 behavioral_scale = runif(n, 0, 10),
                 cognitive_scale = runif(n, 1, 20),
                 affective_scale = runif(n, 0, 1) )

scaled_data <-
  df %>%
  group_by(stud_ID) %>%
  mutate(behavioral_scale_ind = scale_this(behavioral_scale),
         cognitive_scale_ind = scale_this(cognitive_scale),
         affective_scale_ind = scale_this(affective_scale))
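If you want to avoid repeating the call for each column, and to confirm the scaling really happened within each student, something like the following may work. It is a sketch that assumes dplyr 1.0.0 or later (for across()) and reuses the same simulated data; the names check, mean_ind and sd_ind are just illustrative.

```r
library(dplyr)

scale_this <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Same simulated data as in the answer.
set.seed(123)
n <- 1000
df <- data.frame(stud_ID = sample(LETTERS, size = n, replace = TRUE),
                 behavioral_scale = runif(n, 0, 10),
                 cognitive_scale = runif(n, 1, 20),
                 affective_scale = runif(n, 0, 1))

# Scale every column ending in "_scale" within each student,
# writing the results to new "<column>_ind" columns.
scaled_data <- df %>%
  group_by(stud_ID) %>%
  mutate(across(ends_with("_scale"), scale_this, .names = "{.col}_ind")) %>%
  ungroup()

# Sanity check: within each student the scaled scores should have
# mean ~0 and standard deviation ~1.
check <- scaled_data %>%
  group_by(stud_ID) %>%
  summarise(mean_ind = mean(behavioral_scale_ind),
            sd_ind = sd(behavioral_scale_ind))
```

The ungroup() at the end is a design choice: it keeps later operations on scaled_data from silently running per student.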

Or, if you're open to a data.table solution:

library("data.table")

setDT(df)

cols_to_scale <- c("behavioral_scale","cognitive_scale","affective_scale")

df[, lapply(.SD, scale_this), .SDcols = cols_to_scale, keyby = factor(stud_ID)]
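Note that the data.table call above returns only the grouping column and the scaled columns. If you would rather keep the original columns and add the scaled ones alongside them, as in the dplyr version, a variant using := might look like this. It is a sketch on the same simulated data; dt and new_cols are illustrative names, not part of the original answer.

```r
library(data.table)

scale_this <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Same simulated data as in the answer, built directly as a data.table.
set.seed(123)
n <- 1000
dt <- data.table(stud_ID = sample(LETTERS, size = n, replace = TRUE),
                 behavioral_scale = runif(n, 0, 10),
                 cognitive_scale = runif(n, 1, 20),
                 affective_scale = runif(n, 0, 1))

cols_to_scale <- c("behavioral_scale", "cognitive_scale", "affective_scale")
new_cols <- paste0(cols_to_scale, "_ind")

# Add the per-student scaled columns by reference, keeping the originals.
dt[, (new_cols) := lapply(.SD, scale_this), by = stud_ID, .SDcols = cols_to_scale]
```

Because := modifies dt in place, no assignment is needed; the table keeps all 1000 rows in their original order.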

answered Oct 07 '22 00:10

C8H10N4O2
