Finding Optimal Lambda for Box-Cox Transform in R

Tags: optimization, r, normalization

I am trying to transform data in a vector in R.

This is not for linear regression, so I don't have a predictor and response relationship. I am simply using a model whose accuracy will improve if my data are normalized. (Hence, as far as I can tell, I can't use the boxcox function, since it only works with linear models.)

The data I'm trying to transform is:

vect
 [1]  99.64  49.71 246.84  96.17  16.67 352.00 421.25  81.77 105.00  37.85

I have looked at this post.

It was not clear to me what was being done there or how the optimize function was being used, but I did manage to adapt its code into a function that I would like to minimize.

# skewness() is not part of base R; it comes from an add-on package
xskew <- function(data, par) {
  abs(skewness((data^par - 1) / par))
}

I would like to input a sequence of values for lambda (perhaps between 0.5 and 1 with jumps of 0.01) and find which one of those values minimizes xskew for my dataset.

I have tried to do this with the optim function but with no luck, so I suspect it is not the right function for this. How do I perform this calculation?

Edit: I would like something along the lines of:

 x <- seq(0.51,0.99,by=0.01)
 which(xskew(vect,x) < 0.05)

So perhaps I would find a value under some threshold. This code obviously produces an error.
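
The grid search described here can be sketched with sapply, which calls xskew once per candidate lambda instead of passing the whole sequence in at once. This is only a minimal sketch under stated assumptions: skew is a hypothetical base R stand-in for skewness() (which the question presumably takes from a package such as e1071 or moments), and the names lambdas, skews, best_lambda and under_threshold are illustrative.

```r
vect <- c(99.64, 49.71, 246.84, 96.17, 16.67,
          352.00, 421.25, 81.77, 105.00, 37.85)

# Hypothetical stand-in for skewness(): moment-based sample skewness
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Same objective as in the question, evaluated for a single lambda
xskew <- function(data, par) abs(skew((data^par - 1) / par))

# Evaluate the objective once per candidate lambda
lambdas <- seq(0.51, 0.99, by = 0.01)
skews <- sapply(lambdas, function(l) xskew(vect, l))

best_lambda <- lambdas[which.min(skews)]   # grid value with the smallest |skewness|
under_threshold <- lambdas[skews < 0.05]   # may be empty if no value qualifies

# Continuous alternative: optimize() searches over the first argument of the
# function it is given, so wrap xskew to put lambda first
opt <- optimize(function(l) xskew(vect, l), interval = c(0.5, 1))
opt$minimum
```

Because xskew takes data as its first argument, handing it to optimize or optim unwrapped would search over the data rather than over lambda, which may be why the earlier attempt failed.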

asked Oct 28 '14 20:10

Michal

2 Answers

Note that y~1 counts as a linear model in R, so you can use the boxcox function from MASS:

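library(MASS)                  # provides boxcox()
tmp <- exp(rnorm(10))          # example data: a positive, right-skewed sample
out <- boxcox(lm(tmp~1))       # profile log-likelihood over a grid of lambda values
# approximate 95% confidence interval for lambda
range(out$x[out$y > max(out$y)-qchisq(0.95,1)/2])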

I think that the most important part of that function is not that it finds a "best" lambda, but that it finds the confidence interval for lambda, then encourages you to think about what the different transformations mean and combine that with the science behind the data. If the "best" lambda for your data is 0.41, but the interval contains 0.5 and there is scientific reasoning why a square root transform makes sense, then why use 0.41 instead of 0.5?
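
Applied to the vector from the question, the same idea might look like the sketch below. The finer lambda grid, plotit = FALSE, and the names best_lambda and ci are choices made for this illustration, not part of the original answer.

```r
library(MASS)  # provides boxcox()

vect <- c(99.64, 49.71, 246.84, 96.17, 16.67,
          352.00, 421.25, 81.77, 105.00, 37.85)

# Profile the log-likelihood over a fine grid of lambda values, without plotting
out <- boxcox(lm(vect ~ 1), lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Grid value with the highest profile log-likelihood
best_lambda <- out$x[which.max(out$y)]

# Approximate 95% confidence interval for lambda, as in the code above
ci <- range(out$x[out$y > max(out$y) - qchisq(0.95, 1) / 2])

best_lambda
ci
```

If the interval contains a conventional value such as 0 (log) or 0.5 (square root), that value may be easier to justify than the exact grid maximum.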

answered Oct 24 '22 03:10

Greg Snow

To apply a Box-Cox transformation to a vector, use the forecast package in R:

library(forecast)
# find the optimal lambda (vect is the numeric vector from the question)
lambda <- BoxCox.lambda(vect)
# transform the vector using that lambda
trans.vect <- BoxCox(vect, lambda)

answered Oct 24 '22 02:10

TheMI
