I am writing a function to do non-linear curve fitting and am running into this error:
TypeError: Improper input: N=2 must not exceed M=1.
I don't know why it thinks I am trying to use too large of an array when I am only reading in columns from a csv file.
import math

import pandas
from numpy import sqrt, diag
from scipy.optimize import curve_fit

# stolen sig-fig function <-- trust but verify
def round_figures(x, n):
    return round(x, int(n - math.ceil(math.log10(abs(x)))))

# NOTE: the model function v passed to curve_fit below is not shown in the question
def try_michaelis_menten_fit( df, pretty=False ):
    # auto-guess
    p0 = ( df['productFinal'].max(), df['substrateConcentration'].mean() )
    popt, pcov = curve_fit( v, df['substrateConcentration'], df['productFinal'], p0=p0 )
    perr = sqrt( diag( pcov ) )
    kcat_km = popt[0] / popt[1]
    # error propagation
    kcat_km_err = (sqrt( (( (perr[0]) / popt[0])**2) + (( (perr[1]) / popt[1])**2) ))
    kcat = ( popt[0] )
    kcat_std_err = ( perr[0] )
    km_uM = ( popt[1] * 1000000 )
    km_std_err = ( perr[1] *1000000)
    if pretty:
        results = {
            'kcat': round_figures(kcat, 3),
            'kcat_std_err': round_figures(kcat_std_err, 3),
            'km_uM': round_figures(km_uM, 5),
            'km_std_err': round_figures(km_std_err, 3),
            'kcat/km': round_figures(kcat_km, 2),
            'kcat/km_err': round_figures(kcat_km_err, 2),
        }
        return pandas.Series( results )
    else:
        return popt, perr

df = pandas.read_csv( 'PNP_Raw2Fittr.csv' )
fits = df.groupby('sample').apply( try_michaelis_menten_fit, pretty=True )
fits.to_csv( 'fits_pretty_output.csv' )
print( fits )
I am reading in a data frame that is an expanded version of something like this:
     sample  yield  dilution  time  productAbsorbance  substrateConcentration  internalStandard
0  PNPH_I_4  2.604     10000  2400              269.6                0.007000            2364.0
1  PNPH_I_4  2.604     10000  2400              215.3                0.002333            2515.7
2  PNPH_I_4  2.604     10000  2400              160.3                0.000778            2252.2
3  PNPH_I_4  2.604     10000  2400              104.1                0.000259            2302.4
4  PNPH_I_4  2.604     10000  2400               60.9                0.000086            2323.5
5  PNPH_I_4  2.604     10000  2400               35.4                0.000029            2367.9
6  PNPH_I_4  2.604     10000  2400                0.0                0.000000            2165.3
When I call this function on this smaller version of my data frame it seems to work, but when I use it on the large one I get this error. The error began when I added the internalStandard column; everything worked perfectly before that. To make matters even more confusing, when I revert to the old code with an old version of the data frame it works fine. If I add that line I get the error, as would be expected. HOWEVER, when I delete the same line in my data frame and run the code again I STILL get the same error!
I have figured out that if I pass in method='trf' instead of lm for my optimization method, I instead get the error OverflowError: cannot convert float infinity to integer. However, I do use df.dropna(inplace=True). Is there a similar method that is specific for infinity?
I believe this error refers to the fact that the length of your x and y input data (e.g. df['substrateConcentration'] and df['productFinal']) is less than the number of fitting parameters given to curve_fit, as defined in your fitting function v. In the message, N is the number of parameters and M is the number of data points. This is a consequence of the mathematics: you are attempting to perform curve fitting (optimization) with too few constraints.
I reproduced the same error with scipy.optimize.curve_fit by providing a fit function that expects 4 fitting parameters together with input arrays of shape (2,), e.g.
import numpy as np
from scipy.optimize import curve_fit
x, y = np.array([0.5, 4.0]), np.array([1.5, 0.6])
def func(x, a, b, c, d):
    return a*x**3. + b*x**2. - c/x + d
popt, pcov = curve_fit(func, x, y)
TypeError: Improper input: N=4 must not exceed M=2
However, since you have not provided your fit function v in the question, it is not possible to confirm that this is the specific cause of your problem.
Maybe your input data is not formatted exactly the way you think it is. I suggest that you check how your arrays look when they are passed to curve_fit. You might be parsing the data incorrectly, so that the number of rows ends up being very small.
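One way to run that check is sketched below. It assumes the fit is applied once per `sample` group (as in the question's `groupby('sample').apply(...)`) and that the model has two parameters. It counts the usable rows in each group and lists the groups that have fewer rows than parameters. The helper name `groups_too_small` and the toy data are hypothetical, for illustration only.

```python
import numpy as np
import pandas as pd

def groups_too_small(df, group_col, x_col, y_col, n_params):
    """Return counts of finite (x, y) rows for groups with fewer than n_params rows."""
    # Coerce to numbers so that unparsable strings become NaN
    xy = df[[x_col, y_col]].apply(pd.to_numeric, errors="coerce")
    # A row is usable only if both x and y are finite (not NaN, not +/-inf)
    usable = np.isfinite(xy).all(axis=1)
    counts = usable.groupby(df[group_col]).sum()
    return counts[counts < n_params]

# Toy data (invented): group "B" has a single row, so a 2-parameter fit must fail
toy = pd.DataFrame({
    "sample": ["A", "A", "A", "B"],
    "substrateConcentration": [0.007, 0.002333, 0.000778, 0.007],
    "productFinal": [269.6, 215.3, 160.3, 250.0],
})
too_small = groups_too_small(
    toy, "sample", "substrateConcentration", "productFinal", n_params=2
)
print(too_small)
```

Any group listed here would hand curve_fit fewer data points than parameters, which matches the N=2, M=1 in the reported message.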
I have figured out that I pass in method='trf' instead of lm for my optimization method I instead get the error OverflowError: cannot convert float infinity to integer, however I do use the df.dropna(inplace=True), is there a similar method that is specific for infinity?
Yes. Different optimization methods check the input data differently and so raise different errors. This suggests, again, that there is some kind of problem with your input data. The first method ('lm') is probably rejecting (ignoring) the rows that 'trf' raises this error for, and perhaps ending up with no rows at all.
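On the question about infinities: `dropna` only removes NaN by default, so a common approach is to convert infinite values to NaN first and then drop them. Below is a minimal sketch on a made-up frame, with column names borrowed from the question and invented values. It also shows that `math.ceil` raises this same OverflowError for an infinite argument. One possible source of the error, then, is an infinite value reaching `round_figures`, but that is an assumption, since the full traceback is not shown.

```python
import math

import numpy as np
import pandas as pd

# Toy data (invented): one infinite and one missing value
toy = pd.DataFrame({
    "substrateConcentration": [0.007, 0.002333, 0.0, 0.000259],
    "productFinal": [269.6, np.inf, 0.0, np.nan],
})

# Turn +/-inf into NaN, then drop rows with NaN in the fitted columns
clean = toy.replace([np.inf, -np.inf], np.nan).dropna(
    subset=["substrateConcentration", "productFinal"]
)
print(len(toy), "->", len(clean))

# math.ceil cannot convert an infinite float to an integer
try:
    math.ceil(float("inf"))
except OverflowError as exc:
    message = str(exc)
    print(message)
```

Assigning the result back (rather than using `inplace=True`) keeps the original frame available for comparison.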