What does "zero mean and unit variance" mean, and how do I normalize a single-column file this way in R? I also want to divide the normalized values into two classes:
Thanks
The phrase "zero mean and unit variance" means that the normalized variable has a mean of 0 and a standard deviation (and therefore a variance) of 1. One way to normalize variables in R is to use the scale function, which by default subtracts the mean and divides by the sample standard deviation. Here is an example:
# create vector
set.seed(1234)
temp <- rnorm(20, 3, 7)
# take a look
> mean(temp)
[1] 1.245352
> sd(temp)
[1] 7.096653
# scale vector
tempScaled <- c(scale(temp))
# take a look
> mean(tempScaled)
[1] 1.112391e-17
> sd(tempScaled)
[1] 1
# find values more than 0.5 standard deviations below the mean in scaled vector
tempScaled[tempScaled < -0.5]
# find values more than 0.5 standard deviations above the mean in scaled vector
tempScaled[tempScaled > 0.5]
You could also scale the variable by hand pretty easily:
tempScaled2 <- (temp - mean(temp)) / sd(temp)
> all.equal(tempScaled, tempScaled2)
[1] TRUE
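Since the question is about a single-column file, here is a rough sketch of how the same steps might look when the data come from disk. The file is a temporary one created just so the example runs, and the class labels "low" and "high", the 0.5 cutoffs, and the assumption that the file has no header are all mine, not from the question; adjust them to your data.

```r
# Hypothetical single-column input file, written here so the sketch runs as-is
infile <- tempfile(fileext = ".txt")
set.seed(1234)
writeLines(as.character(rnorm(20, 3, 7)), infile)

# Read the single column (assumes no header row) into a numeric vector
x <- read.table(infile, header = FALSE)[[1]]

# Normalize to zero mean and unit variance
z <- c(scale(x))

# Assumed classes: more than 0.5 SD below the mean vs. more than 0.5 SD above;
# values in between are left unclassified (NA)
cls <- ifelse(z < -0.5, "low", ifelse(z > 0.5, "high", NA_character_))

# Keep the original values, scaled values, and class side by side
result <- data.frame(value = x, scaled = z, class = cls)
table(result$class, useNA = "ifany")
```

If every value has to land in one of the two classes, a single cutoff such as ifelse(z < 0, "low", "high") would do that instead.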