scipy.optimize.curve_fit producing nonsensical curve fits

I am trying to fit a curve to some generated data that resemble an exponential function when plotted. I am using scipy.optimize.curve_fit because it seems to be the best (and best-documented) tool for the job. The actual data are newly generated each time I run the code, but here is an example set:

import pandas
import matplotlib.pyplot as plt  # needed for the plt.* calls below
import scipy.optimize as opt

x1 = [0.4145392937447818, 0.7807888116968482, 0.7903528929788539, 
1.5081613036989836, -0.295895237606155, -0.0855307279546107, 
1.0523973736479486, -0.6967509832843239, -0.30499200990688413, 
1.1990545631966807, -1.270460772249312, 0.9531042718153095, 1.5747175535222993, 
-0.6483709650867473, 0.47820180254528477, 1.14266851615097, 0.6237953640100202, 
0.0664027559951128, 0.877280002485417, 0.9432317053343211, 1.0367424879878504, 
-0.6410400513164749, 1.667835241401498, -0.20484029870424125, 
2.887026948755316]

y1 = [0.718716626591187, 0.579938466590508, 0.722005637974309, 
1.61842778379047, 0.331301712743162, 0.342649242449043, 1.14950611092907, 
0.299221762023701, 0.345063839940754, 1.08398125906313, 0.315433168226251, 
1.3343730617376, 1.32514210008176, 0.308702648499771, 0.495749985226691, 
0.406025683910759, 0.445087968405107, 0.423578575247177, 0.816264419038205, 
1.16110461165631, 1.81572974380867, 0.420890068255763, 0.821468286117842, 
0.416275933630732, 4.7877353794036]

data = pandas.DataFrame({"Pi_values": x1, 
                         "CO2_at_solubility": y1})

Then, I do the curve fitting business...

# Define the model function to fit
def func(x, m, c, c0):
    return c0 + m**x * c

#draw the figure
fig, ax1 = plt.subplots()
plt.xlabel('Pi Parameter')
plt.ylabel('CO2 wt%')

# Plot the generated data.
# Tried converting the pandas columns to NumPy arrays based on an issue
# another user was having, but it does not help.
x1 = data["Pi_values"].values
y1 = data["CO2_at_solubility"].values

# Curve fitting with scipy.optimize.curve_fit
popt, pcov = opt.curve_fit(func, x1, y1)
# Use the optimized parameters to plot the best fit
plt.plot(x1, y1, 'o', x1, func(x1, *popt))

And here is the very weird result. No matter what form of the equation I try in func, whenever it manages to fit a "curve" at all, it looks like this mess:

Click here to see the plot

Or this mess...

Click here to see another plot

Any idea what could be going on here? I've not been able to find any other examples like this. I'm running Python 3.5 in a Jupyter notebook.

Other things I tried that didn't work: other forms of the equation; other equations; changing initial guess values; scaling values in case y values were too small.

asked Jan 28 '23 11:01 by keirasan


2 Answers

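The least-squares fit does not depend on the order of the points, but plt.plot connects them in the order given, so with unsorted x values the fitted line zigzags back and forth. You just need to sort the data by x with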

data.sort_values(by='Pi_values', ascending=True, inplace=True)

before curve_fit:

x1 = data["Pi_values"].values
y1 = data["CO2_at_solubility"].values
# Curve fitting with scipy.optimize.curve_fit
popt, pcov = opt.curve_fit(func, x1, y1)
# Use the optimized parameters to plot the best fit
plt.plot(x1, y1, 'o', x1, func(x1, *popt))
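A minimal sketch of the same idea without touching the DataFrame, assuming NumPy is available. The data here are synthetic and made up for illustration, and the starting guess p0 is an arbitrary assumption rather than anything from the question. It fits on both the unsorted and the sorted data to show that the parameters agree, and uses np.argsort only to order the points for the line plot:

```python
import numpy as np
import scipy.optimize as opt

def func(x, m, c, c0):
    # Same model as in the question: c0 + c * m**x
    return c0 + m**x * c

# Synthetic, deliberately unsorted data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 3.0, size=25)
y = func(x, 2.5, 0.3, 0.2) + rng.normal(0.0, 0.02, size=x.size)

p0 = (2.0, 1.0, 0.0)  # assumed starting guess

# Fit once on the unsorted data and once on the sorted data
popt_unsorted, _ = opt.curve_fit(func, x, y, p0=p0)
order = np.argsort(x)  # indices that sort x in ascending order
popt_sorted, _ = opt.curve_fit(func, x[order], y[order], p0=p0)

# Only the line plot needs ordered x values
x_line = x[order]
y_line = func(x_line, *popt_unsorted)
```

Passing x_line and y_line to plt.plot (for example plt.plot(x, y, 'o', x_line, y_line)) should then draw the fit as one continuous line, while the scatter points can stay in their original order.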


answered Jan 31 '23 23:01 by Sandipan Dey


The x values need to be sorted before plotting, because plt.plot draws line segments between consecutive points in the order they are given.

Example:

x1, y1 = zip(*sorted(zip(x1, y1)))
# Curve fitting with scipy.optimize.curve_fit
popt, pcov = opt.curve_fit(func, x1, y1)
# Use the optimized parameters to plot the best fit
plt.plot(x1, y1, 'o', x1, func(x1, *popt))

which results in the fitted curve being drawn as one continuous line from left to right.
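An alternative that avoids sorting altogether is to evaluate the fitted function on its own evenly spaced grid of x values and plot that instead. The sketch below assumes NumPy is available; the data are synthetic, and the starting guess p0 and the grid size of 200 are arbitrary choices for illustration, not values from the question:

```python
import numpy as np
import scipy.optimize as opt

def func(x, m, c, c0):
    # Same model as in the question: c0 + c * m**x
    return c0 + m**x * c

# Synthetic, unsorted data (hypothetical, for illustration only)
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 3.0, size=25)
y = func(x, 2.5, 0.3, 0.2) + rng.normal(0.0, 0.02, size=x.size)

# The fit can use the data in any order
popt, pcov = opt.curve_fit(func, x, y, p0=(2.0, 1.0, 0.0))

# Evaluate the fitted model on an ordered, evenly spaced grid for plotting
x_grid = np.linspace(x.min(), x.max(), 200)
y_grid = func(x_grid, *popt)
```

Plotting with plt.plot(x, y, 'o', x_grid, y_grid) should then give a smooth curve even where the data points are sparse, since the line no longer depends on where the samples happen to fall.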

answered Jan 31 '23 23:01 by abc