I have sets of x,y,z points in 3D space and another variable called charge
which represents the amount of charge that was deposited in a specific x,y,z coordinate. I would like to do a weighted (weighted by the amount of charge deposited in the detector, which just corresponds to a higher weight for more charge) for this data such that it passes through a given point, the vertex.
Now, when I did this for 2D, I tried all sorts of methods (bringing the vertex to the origin and doing the same transformation for all the other points and forcing the fit to go through the origin, giving the vertex really high weight) but none of them were as good as the answer given here by Jaime: How to do a polynomial fit with fixed points
It uses the method of Lagrange multipliers, which I'm vaguely familiar with from an undergraduate Advanced Multi variable course, but not much else and it doesn't seem like the transformation of that code will be as easy as just adding a z coordinate. (Note that even though the code doesn't take into consideration the amount of charge deposited, it still gave me the best results). I was wondering if there was a version of the same algorithm out there, but in 3D. I also contacted the author of the answer in Gmail, but didn't hear back from him.
Here is some more information about my data and what I'm trying to do in 2D: How to weigh the points in a scatter plot for a fit?
Here is my code for doing this in a way where I force the vertex to be at the origin and then fit the data setting fit_intercept=False
. I'm currently pursuing this method for 2D data since I'm not sure if there's a 3D version out there for Lagrange multipliers, but there are linear regression ways of doing this in 3D, for instance, here: Fitting a line in 3D:
import numpy as np
import sklearn.linear_model
def plot_best_fit(image_array, vertexX, vertexY):
weights = np.array(image_array)
x = np.where(weights>0)[1]
y = np.where(weights>0)[0]
size = len(image_array) * len(image_array[0])
y = np.zeros((len(image_array), len(image_array[0])))
for i in range(len(np.where(weights>0)[0])):
y[np.where(weights>0)[0][i]][np.where(weights>0)[1][i]] = np.where(weights>0)[0][i]
y = y.reshape(size)
x = np.array(range(len(image_array)) * len(image_array[0]))
weights = weights.reshape((size))
for i in range(len(x)):
x[i] -= vertexX
y[i] -= vertexY
model = sklearn.linear_model.LinearRegression(fit_intercept=False)
model.fit(x.reshape((-1, 1)),y,sample_weight=weights)
line_x = np.linspace(0, 512, 100).reshape((-1,1))
pred = model.predict(line_x)
m, b = np.polyfit(np.linspace(0, 512, 100), np.array(pred), 1)
angle = math.atan(m) * 180/math.pi
return line_x, pred, angle, b, m
image_array
is a numpy array and vertexX
and vertexY
are the x and y coordinates of the vertex, respectively. Here's my data: https://uploadfiles.io/bbhxo. I cannot create a toy data as there is not a simple way of replicating this data, it was produced by Geant4 simulation of a neutrino interacting with an argon nucleus. I don't want to get rid of the complexity of the data. And this specific event happens to be the one for which my code does not work, I'm not sure if I can generate a data specifically so my code doesn't work on it.
To perfectly fit a polynomial to data points, an order polynomial is required. To restate slightly differently, any set of points can be modeled by a polynomial of order . It can be shown that such a polynomial exists and that there is only one polynomial that exactly fits those points.
In python, the most common way of doing curve fitting is using the curve fit function in Scipy. This is a good approach as the method can be used for fitting all functions, not just polynomials and the only code that you need to change is the code of the function that you want to fit in your data.
Polyfit and Polyval. Polyfit is a Matlab function that computes a least squares polynomial for a given set of data. Polyfit generates the coefficients of the polynomial, which can be used to model a curve to fit the data. Polyval evaluates a polynomial for a given set of x values.
This is more a hand made solution using basic optimization. It is straight forward. One just measures the distance of a point to the line to be fitted and minimizes the weighted distances using basic optimize.leastsq
. Code is as follows:
# -*- coding: utf-8 -*
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm as cm
from scipy import optimize
import numpy as np
def rnd( a ):
return a * ( 1 - 2 * np.random.random() )
def affine_line( s, theta, phi, x0, y0, z0 ):
a = np.sin( theta) * np.cos( phi )
b = np.sin( theta) * np.sin( phi )
c = np.cos( theta )
return np.array( [ s * a + x0, s * b + y0, s * c + z0 ] )
def point_to_line_distance( x , y, z , theta, phi, x0, y0, z0 ):
xx = x - x0
yy = y - y0
zz = z - z0
a = np.sin( theta) * np.cos( phi )
b = np.sin( theta) * np.sin( phi )
c = np.cos( theta )
r = np.array( [ xx, yy, zz ] )
t = np.array( [ a, b, c ] )
return np.linalg.norm( r - np.dot( r, t) * t )
def residuals( parameters, fixpoint, data, weights=None ):
theta, phi = parameters
x0, y0, z0 = fixpoint
if weights is None:
w = np.ones( len( data ) )
else:
w = np.array( weights )
diff = np.array( [ point_to_line_distance( x , y, z , theta, phi , *fixpoint ) for x, y, z in data ] )
diff = diff * w
return diff
### some test data
fixpoint = [ 1, 2 , -.3 ]
trueline = np.array( [ affine_line( s, .7, 1.7, *fixpoint ) for s in np.linspace( -1, 2, 50 ) ] )
rndData = np.array( [ np.array( [ a + rnd( .6), b + rnd( .35 ), c + rnd( .45 ) ] ) for a, b, c in trueline ] )
zData = [ 20 * point_to_line_distance( x , y, z , .7, 1.7, *fixpoint ) for x, y, z in rndData ]
### unweighted
bestFitValuesUW, ier= optimize.leastsq(residuals, [ 0, 0],args=( fixpoint, rndData ) )
print bestFitValuesUW
uwLine = np.array( [ affine_line( s, bestFitValuesUW[0], bestFitValuesUW[1], *fixpoint ) for s in np.linspace( -2, 2, 50 ) ] )
### weighted ( chose inverse distance as weight....would be charge in OP's case )
bestFitValuesW, ier= optimize.leastsq(residuals, [ 0, 0],args=( fixpoint, rndData, [ 1./s for s in zData ] ) )
print bestFitValuesW
wLine = np.array( [ affine_line( s, bestFitValuesW[0], bestFitValuesW[1], *fixpoint ) for s in np.linspace( -2, 2, 50 ) ] )
### plotting
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1, projection='3d' )
ax.plot( *np.transpose(trueline ) )
ax.scatter( *fixpoint, color='k' )
ax.scatter( rndData[::,0], rndData[::,1], rndData[::,2] , c=zData, cmap=cm.jet )
ax.plot( *np.transpose( uwLine ) )
ax.plot( *np.transpose( wLine ) )
ax.set_xlim( [ 0, 2.5 ] )
ax.set_ylim( [ 1, 3.5 ] )
ax.set_zlim( [ -1.25, 1.25 ] )
plt.show()
which returns
>> [-0.68236386 -1.3057938 ]
>> [-0.70928735 -1.4617517 ]
The fix point is shown in black. The original line in blue. The unweighted and weighted fit are in orange and green, respectively. Data is colored according to distance to line.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With