
Gnuplot behaves oddly in polynomial fit. Why is that?

Tags:

gnuplot

A friend of mine discovered some odd behavior in gnuplot regarding a simple polynomial fit. Can somebody explain this?

Here is the file:

#!/usr/bin/gnuplot -p

f(x) = B*(x**4) + A
fit f(x) "data.txt" using ($1+273.14):2 via A, B

plot    "data.txt" using ($1+273.14):2 notitle,\
        f(x) notitle

The data is:

# content of data.txt
350 3.856
330 3.242
290 2.391
250 1.713
210 1.181
170 0.763
130 0.437

The resulting plot is the green line. The blue line shows a far better fit using another function of basically the same form. For the blue line, A was replaced by a constant value (A = 0.2123 which is about B*300^4).

[Figure: odd fitting behavior in gnuplot]

So the green line is clearly not the best fit here, since f(x) = B*(x**4) - 0.2123 yields far better results and is also of the form B*x^4 + A. In the green fit the parameter A is simply ignored by gnuplot and remains unchanged by the fitting algorithm. Setting different initial values for A and B doesn't seem to help much: the value of A never changes from its initial value. My friend and I are using the standard gnuplot version that comes with Ubuntu: gnuplot 4.4 patchlevel 3.

con-f-use asked Jun 17 '12 18:06


2 Answers

This is a very good (and involved) question that I don't have a complete answer for, but the following will hopefully be illuminating.

fit uses a least-squares fitting routine (Levenberg–Marquardt), which iteratively converges on a "good" solution. How good a solution is required is determined by the FIT_LIMIT variable: the iteration stops once the sum of squared residuals changes by less than this relative amount from one step to the next. By default, FIT_LIMIT is set to (a conservative) 1.e-5. Apparently, your fit improves much faster by changing the value of B in the iterative routine than by changing A. In fact, as you've noticed, you can get under the convergence threshold without even touching the variable A. However, if you crank up your expectations (you expect a better fit, so you set FIT_LIMIT to a lower value; I set it to 1.e-14), you'll get a much better result. The price you pay is that the fit may take much longer to converge (or it may even diverge; I'm not an expert in fitting). One take-away here is that function fitting is more of an art than a science: there is no such thing as a best fit, only a good enough fit.

Also note that the algorithm searches for a local minimum of the sum of the squares of the residuals (one that meets the tolerance you've given). It doesn't guarantee that it finds a global minimum.

#!/usr/bin/gnuplot -p

FIT_LIMIT=1.e-14
f(x) =A +  B*(x**4)
fit f(x) "data.txt" using ($1+273.14):2 via A, B

plot     "data.txt" using ($1+273.14):2 notitle,\
         f(x) notitle
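
To see whether tightening the limit really changes anything, one option is to run the fit twice and compare the final sum of squared residuals. The following is only a sketch: it assumes your gnuplot sets FIT_WSSR after each fit (worth checking in the documentation for your version), it starts both passes from A = B = 1.0, which is the value gnuplot gives to parameters that have not been defined, and the names wssr_default, wssr_tight and A_default are made up for illustration.

```gnuplot
f(x) = A + B*(x**4)

# Pass 1: FIT_LIMIT is not defined yet, so the default limit applies
A = 1.0; B = 1.0
fit f(x) "data.txt" using ($1+273.14):2 via A, B
wssr_default = FIT_WSSR   # hypothetical name: residual sum from pass 1
A_default = A             # hypothetical name: A as left by pass 1

# Pass 2: same starting point, much tighter convergence limit
A = 1.0; B = 1.0
FIT_LIMIT = 1.e-14
fit f(x) "data.txt" using ($1+273.14):2 via A, B
wssr_tight = FIT_WSSR     # hypothetical name: residual sum from pass 2

print sprintf("default limit: A = %g, WSSR = %g", A_default, wssr_default)
print sprintf("tight limit:   A = %g, WSSR = %g", A, wssr_tight)
```

If the second WSSR comes out clearly smaller and A has moved away from its starting value, the first pass simply stopped too early.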

Also note that if you find that gnuplot is converging on the wrong minimum, you can "seed" the fit routine by doing:

FIT_LIMIT=1.e-14
f(x) =A +  B*(x**4)
A=1.3  #initial guess for A
fit f(x) "data.txt" using ($1+273.14):2 via A, B

plot     "data.txt" using ($1+273.14):2 notitle,\
         f(x) notitle
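
The same seeding idea can be applied to B as well. The sketch below reads a rough starting value for B off the largest data point, on the assumption that the x**4 term dominates there; both seeds are guesses for illustration, not fitted values, and there is no guarantee that they are the best starting point.

```gnuplot
f(x) = A + B*(x**4)

# Rough seeds taken from data.txt (assumptions, not results):
# at the largest point, x = 350 + 273.14 and y = 3.856, so B is roughly y / x**4
B = 3.856 / (350 + 273.14)**4
A = 0.0   # start the offset at zero and let the fit adjust it

fit f(x) "data.txt" using ($1+273.14):2 via A, B
print sprintf("A = %g, B = %g", A, B)
```

Starting B at its natural size means the routine does not have to cross many orders of magnitude before A begins to matter.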
mgilson answered Nov 08 '22 14:11


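Just try the code below. The trick is to ensure that the ranges of the x and y variables are of a similar order of magnitude. Without rescaling, x**4 is of order 1e10 to 1e11 for this data, so the coefficient of x**4 has to be many orders of magnitude smaller than the constant term, which is what seems to trip up the fit routine.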

reset;
plot 'data.txt' u ($1+273.14):2 w p;
f(x, a, b) = a*(1e-2*x)**4 + b; # note the 1e-2 multiplicative factor
a = 1; b = 1; # initial parameters
fit f(x,a,b) 'data.txt' u (($1+273.14)):2 via a, b
#plot 'data.txt' u (($1+273.14)):2 w p, f(x, a, b) w l
plot 'data.txt' u (($1+273.14)):2 w p, (a*(1e-2)**4)*x**4+b w l
print sprintf("Fit parameters for the fit function a*x^4 + b are :\n\ta = %e, \n\tb = %f", a*(1e-2)**4, b)
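
A variation on the same rescaling idea, offered as a sketch and not as part of the code above: because the model is linear in x**4, the using clause can compute u = (x/100)**4 itself, and the fit then reduces to a straight line in u. The names g and u and the divisor 100 are assumptions chosen for illustration.

```gnuplot
reset

# Straight line in the rescaled variable u = (x/100)**4
g(u) = a*u + b
a = 1; b = 1   # initial parameters

# The using clause hands u, not x, to the fit as the independent variable
fit g(x) 'data.txt' using ((($1+273.14)/100.0)**4):2 via a, b

# Convert back to the original form: a*(x/100)**4 = (a*1e-8)*x**4
print sprintf("B = %e, A = %f", a*1e-8, b)

# Plot on the original x axis by applying the same rescaling to x
plot 'data.txt' using ($1+273.14):2 with points, \
     g((x/100.0)**4) with lines
```

Here a plays the role of the rescaled B and b the role of A, so the printed values can be compared directly with the parameters in the question.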

[Figure: image of the graph]

kvv answered Nov 08 '22 14:11