 

Understanding numpy.linalg.norm() in IPython

I'm creating a linear regression model for supervised learning.

I have a bunch of data points plotted on a graph, (x1, y1), (x2, y2), (x3, y3), etc., where the x values are the real data and the y values are the training data.

As part of the next step in writing a basic nearest neighbor algorithm, I want to create a distance metric to measure the distance (and similarity) between two instances.

If I wanted to write a generic function to compute the L-Norm distance in ipython, I know that a lot of people use numpy.linalg.norm(arr, ord = , axis=). What I'm confused about is how to format my array of data points so that it properly calculates the L-norm values.

If I had just two data points, say (3, 4) and (5, 9), would my array need to look like this with each data point's values in one row?

arry = [[3, 4],
        [5, 9]]

or would it need to look like this where all the x-axis values are in one row and y in another?

arry = [[3, 5],
        [4, 9]]
asked Feb 25 '14 22:02 by geo_coder




1 Answer

With the default arguments (ord=None, axis=None), numpy.linalg.norm(x) == numpy.linalg.norm(x.T), where .T denotes the transpose. So for that default norm, the layout doesn't matter.

For example:

>>> import numpy as np
>>> x = np.random.rand(5000, 2)
>>> x.shape
(5000, 2)
>>> x.T.shape
(2, 5000)
>>> np.linalg.norm(x)
57.82467111195578
>>> np.linalg.norm(x.T)
57.82467111195578
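
The equality above concerns the default norm taken over the whole array. If the goal is the distance between the two points from the question, one common approach is to take the norm of their difference. This is a minimal sketch under that assumption; `lp_distance` is a hypothetical helper name, not part of NumPy.

```python
import numpy as np

def lp_distance(a, b, p=2):
    """Hypothetical helper: L-p distance between two points a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # For a 1-D input, ord=p gives the vector p-norm of the difference.
    return np.linalg.norm(a - b, ord=p)

d2 = lp_distance([3, 4], [5, 9])              # sqrt(2**2 + 5**2) = sqrt(29)
d1 = lp_distance([3, 4], [5, 9], p=1)         # |2| + |5| = 7
dinf = lp_distance([3, 4], [5, 9], p=np.inf)  # max(|2|, |5|) = 5
```

Here each point is passed as its own 1-D array, so the row-versus-column question does not arise.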

Edit:

Given that your array is basically

x = [[real_1, training_1],
     [real_2, training_2],
      ...
     [real_n, training_n]]

then the Frobenius norm is basically computing

np.sqrt(np.sum(x**2))
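
As a quick check of that statement, the sketch below compares the default `np.linalg.norm` result with the manual formula on random data. The seed and array shape are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, for reproducibility only
x = rng.random((5000, 2))

fro = np.linalg.norm(x)               # default: Frobenius norm of the 2-D array
manual = np.sqrt(np.sum(x**2))        # square root of the sum of all squared entries

matches_manual = np.isclose(fro, manual)
matches_transpose = np.isclose(fro, np.linalg.norm(x.T))
```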

Are you sure this is the right metric? There are a whole bunch of other norms. Here are three:

np.sqrt(np.sum((x[:,0] - x[:,1])**2)) # Euclidean distance between the two columns
np.sqrt(np.sum(x[:,0]**2) + np.sum(x[:,1]**2)) # L^2 norm (same value as the Frobenius norm above)
np.sqrt(x[:,0].dot(x[:,1])) # square root of the dot product of the two columns
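
For the nearest-neighbor use case mentioned in the question, the `axis` argument is where the array layout starts to matter. The sketch below assumes one point per row (shape `(n, 2)`); the sample points and the query are made up for illustration.

```python
import numpy as np

# Assumed layout: one data point per row.
points = np.array([[3.0, 4.0],
                   [5.0, 9.0],
                   [0.0, 0.0]])
query = np.array([4.0, 5.0])          # hypothetical query point

# axis=1 computes one Euclidean norm per row of the difference array.
dists = np.linalg.norm(points - query, axis=1)
nearest = int(np.argmin(dists))       # index of the closest point
```

With the transposed layout (one coordinate per row), the same computation would need `axis=0` and a query reshaped to a column.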
answered Oct 05 '22 04:10 by wflynny