I have a problem with this piece of Python code:
import matplotlib
matplotlib.use("Agg")
import numpy as np
import pylab as pl
A1=np.loadtxt('/tmp/A1.txt',delimiter=',')
A1_extrema = [min(A1),max(A1)]
A2=np.loadtxt('/tmp/A2.txt',delimiter=',')
pl.close()
ab = np.polyfit(A1,A2,1)
print(ab)
fit = np.poly1d(ab)
print(fit)
r2 = np.corrcoef(A1,A2)[0,1]
print(r2)
pl.plot(A1,A2,'r.', label='TMP36 vs. DS18B20', alpha=0.7)
pl.plot(A1_extrema,fit(A1_extrema),'c-')
pl.annotate('{0}'.format(r2) , xy=(min(A1)+0.5,fit(min(A1))), size=6, color='r' )
pl.title('Sensor correlations')
pl.xlabel("T(x) [degC]")
pl.ylabel("T(y) [degC]")
pl.grid(True)
pl.legend(loc='upper left', prop={'size':8})
pl.savefig('/tmp/C123.png')
A1 and A2 are arrays containing temperature readings from different sensors. I want to find a correlation between the two and show that graphically.
However, sensor read errors occur occasionally, and in that case a NaN is written to one of the files instead of a temperature value. np.polyfit then refuses to fit the data and returns [nan, nan] as a result. Everything after that fails as well.
My question: how can I convince numpy.polyfit to ignore the NaN values?
N.B.: Datasets are relatively small at the moment. I expect that they may grow to about 200k...600k elements once deployed.
I know this is a little old, but if your arrays contain NaNs, you have to "clean them up" by considering only the indexes where both values are finite. The way to do this is:
idx = np.isfinite(x) & np.isfinite(y)
ab = np.polyfit(x[idx], y[idx], 1)
That way you pass only the "good" points to polyfit.
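As a rough sketch of how this could slot into the script from the question: the same mask can be reused for np.corrcoef and for the end points of the fitted line, since those also break on NaN input. The sample values below are made up for illustration and stand in for the contents of A1.txt and A2.txt; the name x_extrema is likewise just a placeholder.

```python
import numpy as np

# Hypothetical sample data standing in for A1 and A2;
# NaN marks a failed sensor read.
x = np.array([20.0, 21.0, np.nan, 23.0, 24.0, 25.0])
y = np.array([41.0, 43.0, 45.0, np.nan, 49.0, 51.0])

# Keep only the positions where BOTH readings are finite, so the
# two arrays stay aligned pair by pair.
idx = np.isfinite(x) & np.isfinite(y)

ab = np.polyfit(x[idx], y[idx], 1)       # slope and intercept
fit = np.poly1d(ab)                      # callable fitted line
r = np.corrcoef(x[idx], y[idx])[0, 1]    # correlation coefficient

# End points for drawing the fitted line, taken from the cleaned data
# so that a NaN cannot end up as the minimum or maximum.
x_extrema = np.array([x[idx].min(), x[idx].max()])
```

The mask is a single vectorized pass over the data, so it should remain cheap for arrays of a few hundred thousand elements.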