Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

trying to get reasonable values from scipy powerlaw fit

I'm trying to fit some data from a simulation code I've been running in order to figure out a power law dependence. When I plot a linear fit, the data does not fit very well.

Here's the python script I'm using to fit the data:

#!/usr/bin/env python
from scipy import optimize
import numpy

xdata=[ 0.00010851,  0.00021701,  0.00043403,  0.00086806,  0.00173611, 0.00347222]
ydata=[ 29.56241016,  29.82245508,  25.33930469,  19.97075977,  12.61276074, 7.12695312]

fitfunc = lambda p, x: p[0] + p[1] * x ** (p[2])
errfunc = lambda p, x, y: (y - fitfunc(p, x))

out,success = optimize.leastsq(errfunc, [1,-1,-0.5],args=(xdata, ydata),maxfev=3000)

print "%g + %g*x^%g"%(out[0],out[1],out[2])

the output I get is: -71205.3 + 71174.5*x^-9.79038e-05

While on the plot the fit looks about as good as you'd expect from a leastsquares fit, the form of the output bothers me. I was hoping the constant would be close to where you'd expect the zero to be (around 30). And I was expecting to find a power dependence of a larger fraction than 10^-5.

I've tried rescaling my data and playing with the parameters to optimize.leastsq with no luck. Is what I'm trying to accomplish possible or does my data just not allow it? The calculation is expensive, so getting more data points is non-trivial.

Thanks!

like image 448
zje Avatar asked Apr 16 '12 20:04

zje


2 Answers

It is much better to first take the logarithm, then use leastsquare to fit to this linear equation, which will give you a much better fit. There is a great example in the scipy cookbook, which I've adapted below to fit your code.

The best fits like this are: amplitude = 0.8955, and index = -0.40943265484

As we can see from the graph (and your data), if its a power law fit we would not expect the amplitude value to be near 30. As in the power law equation f(x) == Amp * x ** index, so with a negative index: f(1) == Amp and f(0) == infinity.

enter image description here

from pylab import *
from scipy import *
from scipy import optimize

xdata=[ 0.00010851,  0.00021701,  0.00043403,  0.00086806,  0.00173611, 0.00347222]
ydata=[ 29.56241016,  29.82245508,  25.33930469,  19.97075977,  12.61276074, 7.12695312]

logx = log10(xdata)
logy = log10(ydata)

# define our (line) fitting function
fitfunc = lambda p, x: p[0] + p[1] * x
errfunc = lambda p, x, y: (y - fitfunc(p, x))

pinit = [1.0, -1.0]
out = optimize.leastsq(errfunc, pinit,
                       args=(logx, logy), full_output=1)

pfinal = out[0]
covar = out[1]

index = pfinal[1]
amp = 10.0**pfinal[0]

print 'amp:',amp, 'index', index

powerlaw = lambda x, amp, index: amp * (x**index)
##########
# Plotting data
##########
clf()
subplot(2, 1, 1)
plot(xdata, powerlaw(xdata, amp, index))     # Fit
plot(xdata, ydata)#, yerr=yerr, fmt='k.')  # Data
text(0.0020, 30, 'Ampli = %5.2f' % amp)
text(0.0020, 25, 'Index = %5.2f' % index)
xlabel('X')
ylabel('Y')

subplot(2, 1, 2)
loglog(xdata, powerlaw(xdata, amp, index))
plot(xdata, ydata)#, yerr=yerr, fmt='k.')  # Data
xlabel('X (log scale)')
ylabel('Y (log scale)')

savefig('power_law_fit.png')
show()
like image 103
fraxel Avatar answered Nov 15 '22 14:11

fraxel


It helps to rescale xdata so the numbers are not all so small. You could work in a new variable xprime = 1000*x. Then fit xprime versus y.

Least squares will find parameters q fitting

y = q[0] + q[1] * (xprime ** q[2]) 
  = q[0] + q[1] * ((1000*x) ** q[2])

So let

p[0] = q[0]
p[1] = q[1] * (1000**q[2])
p[2] = q[2]

Then y = p[0] + p[1] * (x ** p[2])

It also helps to change the initial guess to something closer to your desired result, such as [max(ydata), -1, -0.5].

from scipy import optimize
import numpy as np

def fitfunc(p, x):
    return p[0] + p[1] * (x ** p[2])
def errfunc(p, x, y):
    return y - fitfunc(p, x)

xdata=np.array([ 0.00010851,  0.00021701,  0.00043403,  0.00086806,
                 0.00173611, 0.00347222])
ydata=np.array([ 29.56241016,  29.82245508,  25.33930469,  19.97075977,
                 12.61276074, 7.12695312])

N = 5000
xprime = xdata * N

qout,success = optimize.leastsq(errfunc, [max(ydata),-1,-0.5],
                               args=(xprime, ydata),maxfev=3000)

out = qout[:]
out[0] = qout[0]
out[1] = qout[1] * (N**qout[2])
out[2] = qout[2]
print "%g + %g*x^%g"%(out[0],out[1],out[2])

yields

40.1253 + -282.949*x^0.375555

like image 33
unutbu Avatar answered Nov 15 '22 14:11

unutbu