 

standard deviation on dataframe does not work

Tags:

r

I get an unexpected (to me, at least) error when calculating a standard deviation. The idea [*] is to recode every missing value as 1 and everything else as 0, then extract the variables that have some, but not all, values missing before computing correlations. I attempt that extraction step with sd(), but it fails. Why?

library(VIM)
data(sleep) # dataset with missing values

x = as.data.frame(abs(is.na(sleep))) # converts all NA to 1, otherwise 0
y = x[which(sd(x) > 0)] # attempt to extract variables with missing values

Error in is.data.frame(x) : 
(list) object cannot be coerced to type 'double'

# convert to double    
z = as.data.frame(apply(x, 2, as.numeric))
y = z[which(sd(z) > 0)]

Error in is.data.frame(x) : 
(list) object cannot be coerced to type 'double'

[*] R in Action, Robert Kabacoff

Henk asked Jun 05 '14 10:06


1 Answer

Calling sd() on a data frame has been defunct since R 3.0.0, as the NEWS entries show:

> ## Build a db of all R news entries.
> db <- news()
> ## sd
> news(grepl("sd", Text), db=db)
Changes in version 3.0.3:

PACKAGE INSTALLATION

    o   The new field SysDataCompression in the DESCRIPTION file allows
        user control over the compression used for sysdata.rda objects in
        the lazy-load database.

Changes in version 3.0.0:

DEPRECATED AND DEFUNCT

    o   mean() for data frames and sd() for data frames and matrices are
        defunct.

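Use sapply(x, sd) instead. It applies sd() to each column and returns a named numeric vector. Converting the columns with as.numeric does not help, because z is still a data frame.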
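A minimal sketch of the column-wise approach, assuming a small made-up data frame (df, with hypothetical columns a, b and c) in place of the VIM sleep data so that it runs without extra packages:

```r
# Hypothetical stand-in for the sleep data: columns with no, some,
# and all values missing.
df <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(1, NA, 3, NA),
  c = c(NA_real_, NA_real_, NA_real_, NA_real_)
)

# 1 where a value is missing, 0 otherwise
x <- as.data.frame(abs(is.na(df)))

# Column-wise standard deviation; sd() itself expects a numeric vector
col_sd <- sapply(x, sd)

# Keep only the indicator columns that vary,
# i.e. variables with some, but not all, values missing
y <- x[which(col_sd > 0)]
names(y)
```

With the real data, the same two lines (sapply(x, sd) followed by the which() subset) should replace the failing sd(x) call.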

Joshua Ulrich answered Sep 18 '22 20:09