I have a matrix I want to transform, such that every feature in the transformed dataset has mean of 0 and variance of 1.
I have tried to use the following code:
scale <- function(train, test)
{
trainmean <- mean(train)
trainstd <- sd(train)
xout <- test
for (i in 1:length(train[1,])) {
xout[,i] = xout[,i] - trainmean(i)
}
for (i in 1:lenght(train[1,])) {
xout[,i] = xout[,i]/trainstd[i]
}
}
invisible(xout)
normalized <- scale(train, test)
This is, however, not working for me. Am I on the right track?
Edit: I am very new to the syntax!
You can use the built-in scale
function for this.
Here's an example, where we fill a matrix with random uniform variates between 0 and 1 and centre and scale them to have 0 mean and unit standard deviation:
m <- matrix(runif(1000), ncol=4)
m_scl <- scale(m)
Confirm that the column means are 0 (within tolerance) and their standard deviations are 1:
colMeans(m_scl)
# [1] -1.549004e-16 -2.490889e-17 -6.369905e-18 -1.706621e-17
apply(m_scl, 2, sd)
# [1] 1 1 1 1
See ?scale
for more details.
To write your own normalisation function, you could use:
my_scale <- function(x) {
apply(m, 2, function(x) {
(x - mean(x))/sd(x)
})
}
m_scl <- my_scale(m)
or the following, which is probably faster on larger matrices
my_scale <- function(x) sweep(sweep(x, 2, colMeans(x)), 2, apply(x, 2, sd), '/')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With