
Getting "NA" when I run a standard deviation

Quick question. I read my CSV file into the variable data. It has a column labeled var, which holds numerical values.

When I run the command

sd(data$var)

I get

[1] NA 

instead of my standard deviation.

Could you please help me figure out what I am doing wrong?

evt asked Apr 21 '11 04:04


2 Answers

I've made the mistake a time or two of reusing variable names within a dplyr pipeline, which can also produce an NA from sd().

library(dplyr)

mtcars %>%
  group_by(gear) %>%
  mutate(ave = mean(hp)) %>%
  ungroup() %>%
  group_by(cyl) %>%
  summarise(med = median(ave),
            ave = mean(ave), # should've named this variable something different
            sd = sd(ave))    # sd of the one-value "ave" created on the line above, not the original column, so it is NA
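
A minimal sketch of one way around this, assuming dplyr is installed. The names sd_ave and mean_ave are made up for illustration; the point is only that the summary columns no longer shadow ave, so sd() still sees the full per-group vector.

```r
library(dplyr)

# A single value has no sample standard deviation, which is why the
# overwritten `ave` above yields NA.
single_value_sd <- sd(5)

# Same pipeline, but with distinct (hypothetical) output names so that
# `ave` is never replaced by its own one-value summary.
result <- mtcars %>%
  group_by(gear) %>%
  mutate(ave = mean(hp)) %>%
  ungroup() %>%
  group_by(cyl) %>%
  summarise(med = median(ave),
            mean_ave = mean(ave),
            sd_ave = sd(ave))

print(single_value_sd)
print(result)
```

Computing sd(ave) before the line that overwrites ave would probably work as well, but distinct names are easier to read.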
Jeff Parker answered Sep 24 '22 04:09


Try sd(data$var, na.rm=TRUE); any NAs in the column var will then be ignored. It also pays to check your data to make sure the NAs really should be NAs and that nothing went wrong when the file was read in. Commands like head(data), tail(data), and str(data) should help with that.
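
As a rough illustration in base R, here is a small made-up data frame standing in for the CSV from the question. The values are an assumption, not the asker's data.

```r
# Hypothetical stand-in for the data frame read from the CSV
data <- data.frame(var = c(2, 4, NA, 8))

sd_with_na <- sd(data$var)                  # NA, because one value is missing
n_missing  <- sum(is.na(data$var))          # how many values are missing
sd_clean   <- sd(data$var, na.rm = TRUE)    # sd of the remaining values 2, 4, 8

print(sd_with_na)
print(n_missing)
print(sd_clean)

# Inspect the column type and the first rows to spot read-in problems
str(data)
head(data)
```

If str(data) shows var as character or factor rather than numeric, the file probably contains non-numeric entries that are worth tracking down before dropping the NAs.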

nzcoops answered Sep 22 '22 04:09