So, I looked up information about the weights parameter in the polyfit (numpy.polynomial.polynomial.polyfit) function in Python and it seems like it has something to do with the error associated with the individual points. (How to include measurement errors in numpy.polyfit)
However, what I am trying to do has nothing to do with the error, but weights. I have an image in the form of a numpy array which indicates the amount of charge deposited in the detector. I convert that image to a scatter plot and then do a fit. But I want that fit to give more weight to the points which have more charge deposited and less to the ones that have less charge. Is that what the weights parameter is for?
Here's an example image: Here's my code:
def get_best_fit(image_array, fixedX, fixedY):
weights = np.array(image_array)
x = np.where(weights>0)[1]
y = np.where(weights>0)[0]
size = len(image_array) * len(image_array[0])
y = np.zeros((len(image_array), len(image_array[0])))
for i in range(len(np.where(weights>0)[0])):
y[np.where(weights>0)[0][i]][np.where(weights>0)[1][i]] = np.where(weights>0)[0][i]
y = y.reshape(size)
x = np.array(range(len(image_array)) * len(image_array[0]))
weights = weights.reshape((size))
b, m = polyfit(x, y, 1, w=weights)
angle = math.atan(m) * 180/math.pi
return b, m, angle
Let me explain to you the code:
The first line assigns the charged deposited in a variable called weights. The next two lines get the points where the charge deposited is >0, so there's some charge deposited to capture the coordinates for the scatter plot. Then I get the size of the entire image to later convert to just a one dimensional array for plotting. I then go through the image and try to get the coordinates of the points where there's some charge deposited (remember that the amount of charge is stored in the variable weights
). I then reshape the y coordinates to get a one dimensional array and get the x coordinates for all the corresponding y coordinates from the image, then change the shape of the weights too to be just one dimensional.
Edit: if there's a way of doing this using the np.linalg.lstsq
function, that would be ideal since I'm also trying to get the fit to go through the vertex of the plot. I could just reposition the plot so the vertex is at zero and then use np.linalg.lstsq
, but that wouldn't allow me to use the weights.
You can use sklearn.linear_model.LinearRegression
. It allows you to not fit the intercept (i.e. line goes through the origin, or, with some finagling, the point of your choice). It also deals with weighted data.
e.g. (mostly stolen shamelessly from @Hiho's answer)
import numpy as np
import matplotlib.pyplot as plt
import sklearn.linear_model
y = np.array([1.0, 3.3, 2.2, 4.25, 4.8, 5.1, 6.3, 7.5])
x = np.arange(y.shape[0]).reshape((-1,1))
w = np.linspace(1,5,y.shape[0])
model = sklearn.linear_model.LinearRegression(fit_intercept=False)
model.fit(x, y, sample_weight=w)
line_x = np.linspace(min(x), max(x), 100).reshape((-1,1))
pred = model.predict(line_x)
plt.scatter(x, y)
plt.plot(line_x, pred)
plt.show()
So I might be misunderstanding the problem, but I just tried to fit a straight line to a scatter plot and then change the fit to prioritise specific points using the weights parameter.
I tried this with np.polyfit and np.polynomial.polynomial.polyfit, I would have expected them both to behave the same as they are both minimising squared error (at least thats my understanding).
However the fits were quite different, see below. Not quite sure what to make of that.
import numpy as np
import matplotlib.pyplot as plt
def func(p1, p2, x):
return p1 * x + p2
y = np.array([1.0, 3.3, 2.2, 4.25, 4.8, 5.1, 6.3, 7.5])
x = np.arange(y.shape[0])
plt.scatter(x, y)
w = np.ones(x.shape[0])
w[1] = 12
# p1, p2 = np.polyfit(x, y, 1, w=w)
p1, p2 = np.polynomial.polynomial.polyfit(x, y, 1, w=w)
print(p1, p2, w)
plt.plot(x, func(p1, p2, x))
plt.show()
With no weights (or all set 1)
With the weight of the 2nd point set to 12, all other weights are 1
No weights
With the weight of the 2nd point set to 12, all other weights are 1
So np.polyfit behaves as I would expect, however I don't really know whats going on with np.polynomial.polynomial.polyfit even the fit without any weights doesn't make any sense to me.
But I think np.polyfit does what you are after? Changing the weight parameter clearly gives more weight to higher weighted points.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With