Calculating standard deviation of each row

Tags:

r

I am trying to use rowSds() to calculate each row's standard deviation so that I can pick the rows with high SDs to graph.

My data frame, called xx, looks like this:

head(xx,1)
     Job     variable 2012-02-23 2012-02-24 2012-02-25 2012-02-27 2012-02-28 2012-02-29 2012-03-01 2012-03-02 2012-03-03 2012-03-05 2012-03-06 2012-03-07 2012-03-08 2012-03-09 2012-03-10 2012-03-12 2012-03-13 2012-03-14
1 A Duration        152        424         NA        499        320        117        211        363         NA        605         76        309        204        185         NA         25        733        500
  2012-03-15 2012-03-16 2012-03-17 2012-03-19 2012-03-20 2012-03-21 2012-03-22 2012-03-23 2012-03-24 2012-03-26 2012-03-27 2012-03-28 2012-03-29 2012-03-30 2012-03-31 2012-04-02 2012-04-03 2012-04-04 2012-04-05 2012-04-06
1        521        601         NA        229        758        421        334        659         NA        419        423        444        289        594         NA        327        533        183        211        235
  2012-04-07 2012-04-09 2012-04-10 2012-04-11 2012-04-12 2012-04-13 2012-04-14 2012-04-16 2012-04-17 2012-04-18 2012-04-19 2012-04-20 2012-04-21 2012-04-23 2012-04-24 2012-04-25 2012-04-26 2012-04-27 2012-04-28 2012-04-30
1         NA        225        419        236        218        188         NA        205        547        153        196        200         NA        259        257        208        302        244         NA        806
  2012-05-01 2012-05-02 2012-05-03 2012-05-04 2012-05-05 2012-05-07 2012-05-08 2012-05-09 2012-05-10 2012-05-11 2012-05-12 2012-05-14 2012-05-15 2012-05-16 2012-05-17 2012-05-18 2012-05-19 2012-05-21 2012-05-22 2012-05-23
1        402        492       1078        440         NA        382        576       1105        511        368         NA        360        381       1152        718        353         NA        408        413        935
  2012-05-24 2012-05-25 2012-05-26 2012-05-28 2012-05-29 2012-05-30 2012-05-31 2012-06-01 2012-06-02 2012-06-04 2012-06-05 2012-06-06 2012-06-07 2012-06-08 2012-06-09 2012-06-11 2012-06-12 2012-06-13 2012-06-14 2012-06-15
1        306        277         NA        253        367        977        557        432         NA        328        521        467        972       1556         NA        386       1394        401        857        857
  2012-06-16 2012-06-18 2012-06-19 2012-06-20 2012-06-21 2012-06-22 2012-06-23 2012-06-25 2012-06-26 2012-06-27 2012-06-28 2012-06-29 2012-06-30 2012-07-02 2012-07-03 2012-07-04 2012-07-05 2012-07-06 2012-07-07 2012-07-09
1         NA       1056        324        329        327        325         NA        341        268        231        245        301         NA        283        365        297        310        260         NA        254
  2012-07-10 2012-07-11 2012-07-12 2012-07-13 2012-07-14 2012-07-16 2012-07-17 2012-07-18 2012-07-19 2012-07-20 2012-07-21 2012-07-23 2012-07-24 2012-07-25 2012-07-26 2012-07-27 2012-07-28 2012-07-30 2012-07-31 2012-08-01
1        283        395        273        273         NA        278        243        210        356        267         NA        442        483        271        327        271         NA        716        598        577
  2012-08-02 2012-08-03 2012-08-06 2012-08-07 2012-08-08 2012-08-09 2012-08-10 2012-08-13 2012-08-14 2012-08-15 2012-08-16 2012-08-17 2012-08-20 2012-08-21 2012-08-22 2012-08-23 2012-08-24 2012-08-27 2012-08-28 2012-08-29
1        345        403        318        522        333        259        404        244        240        288        245         22        738        530        390        648        294        403        381        724
  2012-08-30 2012-08-31 2012-09-03 2012-09-04 2012-09-05 2012-09-06 2012-09-07 2012-09-10 2012-09-11 2012-09-12 2012-09-13 2012-09-14 2012-09-17 2012-09-18 2012-09-19 2012-09-20 2012-09-21 2012-09-24 2012-09-25 2012-09-26
1        740        575        558        785        883        501        901        500        285        174        562       1047        603        990        289        173        253        512        236        278
  2012-09-27 2012-09-28 2012-10-01 2012-10-02 2012-10-03 2012-10-04 2012-10-05 2012-10-08 2012-10-09 2012-10-10 2012-10-11
1        173        277        217        291        197        308        124        387        369        250        242

I am trying to calculate each row's standard deviation and assign it to a column named sd:

xx$sd<-rowSds(xx)

I get this error:

Error in apply(na.omit(as.matrix(x), ...), 1, FUN, ...) : 
  error in evaluating the argument 'X' in selecting a method for function 'apply': Error in na.omit(as.matrix(x), ...) : 
  error in evaluating the argument 'object' in selecting a method for function 'na.omit': Error in `colnames<-`(`*tmp*`, value = c("2012-02-23", "2012-02-24", "2012-02-25",  : 
  length of 'dimnames' [2] not equal to array extent

Any ideas on how I can omit NA values when calculating the SD? Is my syntax correct?

628

asked Oct 12 '12 14:10

user1471980

1 Answer

You can use the apply and transform functions:

set.seed(007)
X <- data.frame(matrix(sample(c(10:20, NA), 100, replace=TRUE), ncol=10))
transform(X, SD=apply(X,1, sd, na.rm = TRUE))
   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10       SD
1  NA 12 17 18 19 16 12 13 20  14 3.041381
2  14 12 13 13 14 18 16 17 20  10 3.020302
3  11 19 NA 12 19 19 19 20 12  20 3.865805
4  10 11 20 12 15 17 18 17 18  12 3.496029
5  12 15 NA 14 20 18 16 11 14  18 2.958040
6  19 11 10 20 13 14 17 16 10  16 3.596294
7  14 16 17 15 10 11 15 15 11  16 2.449490
8  NA 10 15 19 19 12 15 15 19  14 3.201562
9  11 NA NA 20 20 14 14 17 14  19 3.356763
10 15 13 14 15 NA 13 15 NA 15  12 1.195229

From ?apply you can see that the ... argument passes optional arguments on to FUN; in this case you can use na.rm=TRUE to omit NA values.
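As a minimal sketch of how the ... forwarding behaves, the toy matrix below (values made up for illustration, not taken from the question) compares apply with and without na.rm = TRUE:

```r
# Toy matrix with one missing value per row (illustrative values only)
m <- rbind(c(1, 2, 3, NA),
           c(10, NA, 20, 30))

# Without na.rm, sd() returns NA for any row that contains an NA
sd_with_na <- apply(m, 1, sd)

# na.rm is not an argument of apply() itself; it is forwarded via ... to sd()
sd_no_na <- apply(m, 1, sd, na.rm = TRUE)

print(sd_with_na)  # NA NA
print(sd_no_na)    # 1 10
```

The same pattern works for any extra argument that FUN accepts, not only na.rm.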

Using rowSds from the matrixStats package also requires setting na.rm=TRUE to omit NA values. rowSds expects a matrix, so convert the data frame first:

library(matrixStats)
transform(X, SD=rowSds(as.matrix(X), na.rm=TRUE)) # same result as before.
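The data frame in the question also has non-numeric Job and variable columns, which are likely to get in the way of a row-wise SD. The sketch below is one possible way to handle that, assuming a small stand-in for xx with made-up values and a hypothetical cutoff of 50 for "high" SDs:

```r
# Toy stand-in for xx: two label columns followed by numeric date columns
# (the values and the three-row shape are assumptions for illustration)
xx <- data.frame(
  Job = c("A", "B", "C"),
  variable = "Duration",
  `2012-02-23` = c(152, 300, 10),
  `2012-02-24` = c(424, 310, NA),
  `2012-02-25` = c(NA, 305, 12),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep only the numeric columns so as.matrix() yields a numeric matrix
num_cols <- sapply(xx, is.numeric)
xx$sd <- apply(as.matrix(xx[, num_cols]), 1, sd, na.rm = TRUE)

# Rows whose SD exceeds the hypothetical cutoff, ready for plotting
high_sd <- xx[xx$sd > 50, ]
print(high_sd$Job)  # "A"
```

Selecting the columns with is.numeric before computing the SD avoids hard-coding column positions; if matrixStats is available, rowSds(as.matrix(xx[, num_cols]), na.rm = TRUE) could be swapped in for the apply call.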

130

answered Oct 22 '22 16:10

Jilber Urbina
