I'm creating a linear regression model for supervised learning.
I have a set of data points plotted on a graph: (x1, y1), (x2, y2), (x3, y3), etc., where the x's are the real data and the y's are the training data values.
As part of the next step in writing a basic nearest neighbor algorithm, I want to create a distance metric to measure the distance (and similarity) between two instances.
If I wanted to write a generic function to compute the L-norm distance in IPython, I know that a lot of people use numpy.linalg.norm(arr, ord=, axis=). What I'm confused about is how to format my array of data points so that it properly calculates the L-norm values.
If I had just two data points, say (3, 4) and (5, 9), would my array need to look like this with each data point's values in one row?
arry = ([[3,4],
         [5,9]])
or would it need to look like this, where all the x values are in one row and all the y values in another?
arry = ([[3,5],
         [4,9]])
numpy.linalg.norm() is used to calculate the norm of a vector or a matrix. It takes ord=None by default, which gives the 2-norm of a vector and the Frobenius norm of a matrix, so np.linalg.norm(a - b) computes the Euclidean distance between a and b.
The norm of a vector is a measure of its distance from the origin in the vector space. To calculate it you can use either NumPy or SciPy; both offer a similar norm() function.
Depending on the value of its ord parameter, numpy.linalg.norm() returns one of eight different matrix norms or one of an infinite number of vector norms.
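For example, for the two points (3, 4) and (5, 9) from the question, a minimal sketch of using norm() as a distance looks like this:
import numpy as np

a = np.array([3, 4])
b = np.array([5, 9])

# The default ord gives the 2-norm of the difference vector,
# i.e. the ordinary Euclidean distance between the two points.
dist = np.linalg.norm(a - b)   # sqrt((3-5)**2 + (4-9)**2) = sqrt(29) ≈ 5.39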
numpy.linalg.norm(x) == numpy.linalg.norm(x.T)
where .T denotes the transpose. So for the default norm it doesn't matter how you orient the array.
For example:
>>> import numpy as np
>>> x = np.random.rand(5000, 2)
>>> x.shape
(5000, 2)
>>> x.T.shape
(2, 5000)
>>> np.linalg.norm(x)
57.82467111195578
>>> np.linalg.norm(x.T)
57.82467111195578
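Note that this equivalence is specific to the default ord (the Frobenius norm, i.e. the 2-norm of the flattened array). If you pass axis= to get one norm per point, the orientation does matter. A quick sketch with the two points from the question:
x = np.array([[3, 4], [5, 9]])   # one point per row
np.linalg.norm(x, axis=1)        # array([5., 10.2956...]) -- one norm per point
np.linalg.norm(x.T, axis=1)      # array([5.8309..., 9.8488...]) -- one norm per coordinate row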
Edit:
Given that your array is basically
x = [[real_1, training_1],
[real_2, training_2],
...
[real_n, training_n]]
then the Frobenius norm is basically computing
np.sqrt(np.sum(x**2))
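A quick way to convince yourself of that, reusing a random array like the one above:
x = np.random.rand(5000, 2)
np.allclose(np.linalg.norm(x), np.sqrt(np.sum(x**2)))   # True: Frobenius norm == 2-norm of the flattened array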
Are you sure this is the right metric? There are a whole bunch of other norms. Here are three:
np.sqrt(np.sum((x[:,0] - x[:,1])**2))           # N-dimensional Euclidean distance between the two columns
np.sqrt(np.sum(x[:,0]**2) + np.sum(x[:,1]**2))  # L2 norm of the whole array (same as the Frobenius norm)
np.sqrt(x[:,0].dot(x[:,1]))                     # square root of the dot product of the two columns
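If what you actually want for a nearest neighbour step is one distance per stored instance (so you can pick the closest one), that is usually done with the axis argument rather than a single whole-array norm. A rough sketch, assuming one instance per row as in the first layout from the question, with a made-up query point purely for illustration:
import numpy as np

points = np.array([[3, 4], [5, 9]])             # stored instances, one per row
query = np.array([4, 7])                        # hypothetical query point, not from the question
dists = np.linalg.norm(points - query, axis=1)  # Euclidean distance from the query to each instance
nearest = np.argmin(dists)                      # index of the closest stored instance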