Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Power regression in R similar to excel

Tags:

r

ggplot2

I have a simple dataset and I am trying to use the power trend to best fit the data. The sample data is very small and is as follows:

structure(list(Discharge = c(250, 300, 500, 700, 900), Downstream = c(0.3, 
0.3, 0.3, 0.3, 0.3), Age = c(1.32026239202165, 1.08595138888889, 
0.638899189814815, 0.455364583333333, 0.355935185185185)), .Names = c("Discharge", 
"Downstream", "Age"), row.names = c(NA, 5L), class = "data.frame")

Data looks as follows:

> new
  Discharge Downstream       Age
1       250        0.3 1.3202624
2       300        0.3 1.0859514
3       500        0.3 0.6388992
4       700        0.3 0.4553646
5       900        0.3 0.3559352

I tried to plot the above data using ggplot2

ggplot(new)+geom_point(aes(x=Discharge,y=Age))

I could add the linear line using geom_smooth(method="lm") but I am not sure what code do I need to show the power line.

The output is as follows:

enter image description here

How Can I add a power linear regression line as done in excel ? The excel figure is shown below:

enter image description here

like image 572
Jd Baba Avatar asked Aug 19 '13 03:08

Jd Baba


People also ask

How do you do a power regression in Excel?

To do so, click the Data tab along the top ribbon. Then click the Data Analysis option within the Analyze section. If you don't see this option available, you need to first load the Analysis ToolPak. In the dropdown window that appears, click Regression and then click OK.

What is the difference between exponential and power regression?

A variable grows exponentially if it is multiplied by a fixed number greater than 1 in each equal time period. Exponential decay occurs when the factor is less than one. Power Regression is one in which the response variable is proportional to the explanatory variable raised to a power.

What is power regression used for?

The power regression equation will be used to predict y-values that lie within the plotted values of x (interpolate). If you are undecided as to which regression model would be best suited for your scatter plot, it will be necessary to investigate the possible choices a little more closely in relation to your data.


2 Answers

While mnel's answer is correct for a nonlinear least squares fit, note that Excel isn't actually doing anything nearly that sophisticated. It's really just log-transforming the response and predictor variables, and doing an ordinary (linear) least squares fit. To reproduce this in R, you would do:

lm(log(Age) ~ log(Discharge), data=df)

Call:
lm(formula = log(Age) ~ log(Discharge), data = df)

Coefficients:
   (Intercept)  log(Discharge)  
         5.927          -1.024  

As a check, the coefficient for log(Discharge) is identical to that from Excel while exp(5.927) ~ 375.05.

While I'm not sure how to use this as a trendline in ggplot2, you can do it in base graphics thusly:

m <- lm(log(y) ~ log(x), data=df)

newdf <- data.frame(Discharge=seq(min(df$Discharge), max(df$Discharge), len=100))
plot(Age ~ Discharge, data=df)
lines(newdf$Discharge, exp(predict(m, newdf)))

text(600, .8, substitute(b0*x^b1, list(b0=exp(coef(m)[1]), b1=coef(m)[2])))
text(600, .75, substitute(plain("R-square: ") * r2, list(r2=summary(m)$r.squared)))
like image 109
Hong Ooi Avatar answered Sep 29 '22 05:09

Hong Ooi


Use nls (nonlinear least squares) as your smoother

eg

ggplot(DD,aes(x = Discharge,y = Age)) +
  geom_point() + 
  stat_smooth(method = 'nls', formula = 'y~a*x^b', start = list(a = 1,b=1),se=FALSE)

Noting Doug Bates comments on R-squared values and non-linear models here, you could use the ideas in Adding Regression Line Equation and R2 on graph

to append the regression line equation

# note that you have to give it sensible starting values
# and I haven't worked out why the values passed to geom_smooth work!
power_eqn = function(df, start = list(a =300,b=1)){
  m = nls(Discharge ~ a*Age^b, start = start, data = df);
  eq <- substitute(italic(y) == a  ~italic(x)^b, 
               list(a = format(coef(m)[1], digits = 2), 
                    b = format(coef(m)[2], digits = 2)))
  as.character(as.expression(eq));                 
}

ggplot(DD,aes(x = Discharge,y = Age)) +
  geom_point() + 
  stat_smooth(method = 'nls', formula = 'y~a*x^b', start = list(a = 1,b=1),se=FALSE) +  
  geom_text(x = 600, y = 1, label = power_eqn(DD), parse = TRUE)
like image 30
mnel Avatar answered Sep 29 '22 05:09

mnel