Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Power calculation for minimum effect hypothesis (non-central f distribution) in R

Tags:

r

At work I was asked to calculate the statistical power for a Wald test statistic. We are testing against a null hypothesis of 1.54% of explained variance which is why the F distribution we are comparing against is non-central.

Now I was told to at first calculate the critical F value under the null hypothesis for the significance level (we used .05). I did this in R like this:

`groups <- 2
N <- 4440
PV <- 0.0154

min.eff.hyp <- function(N, groups=2, PV=.01, alpha=.05){
  df1 <- groups - 1 # numerator degrees of freedom
  df2 <- N-groups # denominator degrees of freedom
  ncp <- ((N-1)*PV)/(1-PV) # non-centrality parameter
  g <- (df1+ncp)^2/(df1+2*ncp) # adjusted df1
  k <- (df1+ncp)/df1 # to adjust for non-centrality
  crit.central <- qf(1-alpha,g,df2)
  crit.val <- crit.central * k
  return(paste('Kritischer F-Wert (PV=',PV,'): ',round(crit.val,3), sep = ''))
}

min.eff.hyp(N,groups,PV)`

And then I am supposed to use this critical F value to determine the alpha' under the alternative hypothesis. This alpha' can then be used to calculate the statistical power.

Unfortunately, I don't know how to get alpha'. Again I tried using R: pf(crit.val,g,df2,ncp)

I think, this should calculate the probability of my data given that the alternative hypothesis is true. But to be true, I am at a loss. I don't know how to implement the non-centrality in power calculations and somehow I can't find somebody who has had the same problem before and actually found a solution.

How can I calculate the statistical power of a test when I'm testing against a minimum effect hypothesis and thus comparing against non-central distributions?

Thank you very much for taking the time and helping me!

Greetings, Maria

like image 412
Maria Karllein Avatar asked Feb 02 '16 12:02

Maria Karllein


1 Answers

We can calculate the statistical power for a F-test statistic with just: # of groups, # of individuals N, significance level alpha, and percentage of explained variance PV.

One way is to use non-central F distribution with the estimated non-centrality parameter NCP. Another way, as in your example, is to use an adjusted central F distribution (Murphy et al. 2009).

The simulation results suggest that the first method is more accurate for power estimation.

>        theoretical  simulation  method_1   method_2
> Power    0.4631185      0.4647  0.460971  0.5521764
> F mean    4.518145    4.529506  4.497807          -
> PV       0.0089153   0.0089509         -          -
> NCP            3.5           -  3.479744          -

The simulation of a non-central F-distribution:

## Simulation to get PV and power
groups <- 2
N <- 500
alpha <- 0.05
ncp <- 3.5 #Non-Centrality Parameter

#simulated Sum of Squares for Model and Error
set.seed(1)
df1 <- groups - 1 #main effect degrees of freedom 
df2 <- N - groups #error degrees of freedom
SSM = SSE = rep(NA,10000)
for (i in 1:10000)
  {
    SSM[i] <- sum(rnorm(df1, mean=sqrt(ncp/df1))^2)
    SSE[i] <- sum(rnorm(df2)^2)
  }

#mean F
F <- SSM/df1 / (SSE/df2)
mean(F) #estimated
> 4.529506
(df1+ncp) / df1 / ((df2-2)/df2) #theoretical
> 4.518145

#PV: Percentage of explained Variance
PV <- SSM / (SSM + SSE)
mean(PV)
> 0.008950899
(ncp + df1) / (ncp + df1 + df2) #theoretical
> 0.008915272

#statistical Power
F_crit <- qf(1-alpha, df1, df2) #critical F value
sum(F > F_crit) / length(F) #estimated
> 0.4647
1 - pf(F_crit, df1, df2, ncp) #theoretical
> 0.4631185

Method 1 (non-central F-distribution):

groups <- 2
N <- 500
alpha <- 0.05
PV <- 0.008950899
ncp <- NA

df1 <- groups - 1
df2 <- N - groups

#mean F
PV/df1 / ((1-PV) / df2)
> 4.497807

#estimated Non-Centrality Parameter
ncp <- (df2 - 2) * PV / (1-PV) - df1
> 3.479744

#estimated statistical power
F_crit <- qf(1-alpha, df1, df2) #critical F value
1 - pf(F_crit, df1, df2, ncp=ncp)
> 0.460971

Method 2 (adjusted central F-distribution):

lambda <- df2 * PV / (1 - PV) # another definition of non-centrality parameter
g <- (df1 + lambda)^2 / (df1 + 2 * lambda) # adjusted df1
k <- (df1 + lambda) / df1 # to adjust for non-centrality

#estimated statistical power
F_crit <- qf(1-alpha, df1, df2)
1 - pf(F_crit/k, g, df2)
> 0.5521764

The algebra behind method 1 (to estimate non-centrality parameter):

abc.

abc

where the last equal is from wikipedia.

abc.

like image 140
Ryan SY Kwan Avatar answered Sep 20 '22 05:09

Ryan SY Kwan