For the last two days I've been trying to fit a curve to the following data set. As you can see from the image, the data forms a near-perfect curve on its own, but I haven't been able to find a way to represent it mathematically for both interpolation and extrapolation.
y = c(0.2966, 0.2793, 0.2147, 0.1523, 0.1177, 0.1026, 0.0934, 0.0767,
0.0729, 0.0693, 0.0658, 0.0624, 0.0561, 0.0502, 0.0424, 0.04, 0.0356,
0.0335, 0.0316, 0.0279, 0.0231, 0.0217, 0.0203, 0.019, 0.02, 0.016,
0.0151, 0.0134, 0.0127, 0.0119, 0.0113, 0.0106, 0.01, 0.0094, 0.0089,
0.0084, 0.0074, 0.007, 0.0062, 0.0059, 0.0053, 0.0048, 0.0043,
0.0041, 0.0037, 0.0033, 0.0032, 0.003, 0.0029, 0.0025, 0.0024,
0.0023, 0.0021, 0.002, 0.0016, 0.0016, 0.0014, 0.0012, 0.001,
0.0007, 0.0006, 0.0004, 0.0003)
x = c(0.77894, 0.79452, 0.85683, 0.92694, 0.97367, 0.99704, 1.01262,
1.04378, 1.05157, 1.05936, 1.06714, 1.07493, 1.09051, 1.10609,
1.12946, 1.13725, 1.15283, 1.16062, 1.16841, 1.18399, 1.20735,
1.21514, 1.22293, 1.23072, 1.2463, 1.25409, 1.26188, 1.27746,
1.28525, 1.29304, 1.30083, 1.30862, 1.3164, 1.32419, 1.33198,
1.33977, 1.35535, 1.36314, 1.37872, 1.38651, 1.40209, 1.41767,
1.43325, 1.44103, 1.45661, 1.47219, 1.47998, 1.48777, 1.49556,
1.51893, 1.52672, 1.53451, 1.55009, 1.55788, 1.58903, 1.59682,
1.6124, 1.63577, 1.67472, 1.75261, 1.79156, 1.86945, 1.92398)
Here is the data plotted with an exponential curve (pink) and a 4th-order polynomial (red). The exponential curve has quite a bit of error, while the 4th-order polynomial fits well but can't be used for extrapolation, and it doesn't always work when applied to similar data sets.
For what I'm working on I really need something that fits the curve precisely, but I haven't figured out how to do it yet. Thanks.

An object lesson in the perils of extrapolation.
In the absence of a theoretical model, your data is fit about equally well using either a logistic function (f1(...)) or a scaled log-normal density function (f2(...)). There are probably other functions that would also fit well.
df <- data.frame(x, y)
library(minpack.lm)   # for nlsLM(...)

# candidate 1: logistic-type function
f1 <- function(x,a,b,c,d) a*exp(-(b*x))/(1+c*exp(-d*x))
fit.1 <- nlsLM(y~f1(x,a,b,c,d), df,
               start=c(a=1, b=1, c=100, d=0), control=list(maxiter=500))

# candidate 2: scaled log-normal density
f2 <- function(x,a,m,s) a*dlnorm(x, meanlog=m, sdlog=s)
fit.2 <- nlsLM(y~f2(x,a,m,s), df,
               start=c(a=1, m=0, s=1), control=list(maxiter=500))
plot(y~x,df)
curve(predict(fit.1,data.frame(x)),add=TRUE, col="blue")
curve(predict(fit.2,data.frame(x)),add=TRUE, col="red")
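
To put a number on "about equally well", a quick sketch: nlsLM() returns ordinary nls objects, so base R's residuals() and AIC() apply directly.

# compare the two fits numerically: residual sum of squares and AIC
sum(residuals(fit.1)^2)
sum(residuals(fit.2)^2)
AIC(fit.1, fit.2)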

But look what happens when you extrapolate.
plot(y~x, df, xlim=c(0.5,2), ylim=c(0,.6))
curve(predict(fit.1,data.frame(x)),add=TRUE, col="blue")
curve(predict(fit.2,data.frame(x)),add=TRUE, col="red")

It turns out that the log-normal density function is the slightly better fit, in that its residuals are more nearly normal, although there is a strong pattern in the residuals in both cases. The point is that, just by looking at the data and the fitted curves, you might accept either function, yet they give very different results on extrapolation, and in fact neither one is a great fit. You really need a theoretical model.
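To check the residual claim yourself, a minimal sketch using base R's diagnostics (residuals() and qqnorm() work directly on the fitted nls objects):

# residual diagnostics: look for patterns vs. x and departures from normality
par(mfrow=c(2,2))
plot(df$x, residuals(fit.1), main="f1 residuals vs. x"); abline(h=0, lty=2)
plot(df$x, residuals(fit.2), main="f2 residuals vs. x"); abline(h=0, lty=2)
qqnorm(residuals(fit.1), main="f1 normal Q-Q"); qqline(residuals(fit.1))
qqnorm(residuals(fit.2), main="f2 normal Q-Q"); qqline(residuals(fit.2))
par(mfrow=c(1,1))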
A LOESS regression seems to work quite well with that data.
plot(y~x)
fit.lo <- loess(y~x, span = 0.5)   # local regression; span controls smoothness
pr <- predict(fit.lo, x)           # x is already sorted, so lines() draws cleanly
lines(x, pr, col = "red", lwd = 2)
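
One caveat, as a sketch: by default predict() on a loess fit does not extrapolate; points outside the range of x come back as NA. Fitting with loess.control(surface = "direct") allows extrapolation, though it is just as hazardous here as with the parametric fits above.

xx <- seq(0.5, 2.2, length.out = 200)
predict(fit.lo, xx)                # NA outside range(x)
fit.dir <- loess(y ~ x, span = 0.5,
                 control = loess.control(surface = "direct"))
plot(y ~ x, xlim = c(0.5, 2.2), ylim = c(0, 0.6))
lines(xx, predict(fit.dir, xx), col = "blue", lty = 2)  # extrapolates; use caution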
