
How to calculate/normalize Zero mean and unit variance

What is "zero mean and unit variance", and how do I normalize the values in a single-column file to it in R? I also want to divide the normalized values into two classes:

  1. normalized value at least 0.5 standard deviation (SD) above the mean
  2. normalized value at least 0.5 standard deviation (SD) below the mean

Thanks

mona asked Jun 09 '16 15:06


1 Answer

The phrase "zero mean and unit variance" means that the normalized variable has a mean of 0 and a variance of 1 (and therefore a standard deviation of 1 as well). One way to normalize a variable in R is the scale function, which by default subtracts the mean and divides by the sample standard deviation. Here is an example:

# create vector
set.seed(1234)
temp <- rnorm(20, 3, 7)

# take a look
> mean(temp)
[1] 1.245352
> sd(temp)
[1] 7.096653

# scale vector (scale() returns a one-column matrix; c() turns it back into a plain vector)
tempScaled <- c(scale(temp))

# take a look
> mean(tempScaled)
[1] 1.112391e-17
> sd(tempScaled)
[1] 1

# find values at least 0.5 standard deviations below the mean in scaled vector
tempScaled[tempScaled <= -0.5]
# find values at least 0.5 standard deviations above the mean in scaled vector
tempScaled[tempScaled >= 0.5]
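
If you want a class label for every value rather than two separate subsets, a nested ifelse is one way to do it. This is only a sketch: the names tempClass and result and the labels "above", "below" and "middle" are my own choices, and I use >= and <= because the question asks for "at least" 0.5 SD.

```r
# same example vector as above
set.seed(1234)
temp <- rnorm(20, 3, 7)
tempScaled <- c(scale(temp))

# label each value by where it sits relative to the mean
# (labels are illustrative; values within 0.5 SD of the mean get "middle")
tempClass <- ifelse(tempScaled >= 0.5, "above",
             ifelse(tempScaled <= -0.5, "below", "middle"))

# count how many values fall into each class
table(tempClass)

# keep the original values, scaled values and class side by side
result <- data.frame(value = temp, scaled = tempScaled, class = tempClass)
head(result)
```

Subsetting with result[result$class != "middle", ] then leaves only the two classes asked for.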

You could also scale the variable by hand pretty easily:

tempScaled2 <- (temp - mean(temp)) / sd(temp) 

> all.equal(tempScaled, tempScaled2)
[1] TRUE
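
Since the question mentions a single-column file, here is a rough sketch of the same steps starting from a file. I am assuming a plain text file with one number per line and no header; the file contents below are made up and written to a temporary file only so the example runs on its own.

```r
# write a small single-column file so the example is self-contained
# (made-up values; point 'path' at your own file instead)
path <- tempfile(fileext = ".txt")
writeLines(c("2.5", "7.1", "3.3", "9.8", "1.2", "5.0"), path)

# scan() reads a headerless column of numbers into a numeric vector
x <- scan(path, quiet = TRUE)

# normalize by hand: subtract the mean, divide by the sample standard deviation
xScaled <- (x - mean(x)) / sd(x)

# original values at least 0.5 SD above / below the mean
above <- x[xScaled >= 0.5]
below <- x[xScaled <= -0.5]
```

If your file has a header line, read.table(path, header = TRUE) and picking the column out of the resulting data frame would be the usual alternative to scan().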
lmo answered Oct 19 '22 15:10