Suppose you have a function 'normalize' which takes a list of numbers (representing a vector) as input and returns the normalized vector. What should the result be when the vector is all zeros or the sum of its components is zero?
To normalize a vector is to take a vector of any length and, keeping it pointing in the same direction, change its length to 1, turning it into what is called a unit vector. Since a unit vector describes a vector's direction without regard to its length, it is useful to have it readily accessible.
Normalizing a vector only changes its magnitude, not its direction. Also, every vector pointing in the same direction gets normalized to the same unit vector (since magnitude and direction uniquely define a vector). Hence, unit vectors are extremely useful for representing directions.
The L2 norm is calculated as the square root of the sum of the squared vector values. The L2 norm of a vector can be calculated in NumPy using the norm() function with default parameters. First, a 1×3 vector is defined, then the L2 norm of the vector is calculated.
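The snippet described above can be sketched as follows (the vector values here are illustrative):

```python
import numpy as np
from numpy.linalg import norm

# Define a 1x3 vector.
a = np.array([1.0, 2.0, 3.0])

# Calculate the L2 norm with norm()'s default parameters:
# sqrt(1^2 + 2^2 + 3^2) = sqrt(14)
l2 = norm(a)
print(l2)
```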
Mathematically speaking, the zero vector cannot be normalized: its length will always remain 0.
For a given vector v = (v1, v2, ..., vn) we have ||v|| = sqrt(v1^2 + v2^2 + ... + vn^2). Let us remember that a normalized vector is one that has ||v|| = 1.
So for v = 0 we have ||0|| = sqrt(0^2 + 0^2 + ... + 0^2) = 0. You can never normalize that.
It is also important to note that, to ensure consistency, you should not return NaN or any other null value. The normalized form of v = 0 is indeed v = 0.
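A minimal sketch of the convention above — the zero vector maps to itself, never to NaN (the function name is just for illustration):

```python
import numpy as np

def normalize(v):
    """Return v scaled to unit length; the zero vector maps to itself."""
    v = np.asarray(v, dtype=float)
    n = np.linalg.norm(v)
    if n == 0.0:
        return v  # convention: the normalized zero vector is the zero vector
    return v / n

print(normalize([3.0, 4.0]))  # [0.6 0.8]
print(normalize([0.0, 0.0]))  # [0. 0.]
```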
It's even worse than Yuval suggests.
Mathematically, given a vector x, you are looking for a new vector x/||x||, where ||.|| is the norm, which you are probably thinking of as the Euclidean norm:
||x|| = sqrt(dot(x, x)) = sqrt(sum_i x_i^2)
These are floating point numbers, so it's not enough to just guard against dividing by zero; you also have a floating point issue if the x_i are all small (their squares may underflow and you lose the magnitude).
Basically what it all boils down to is that if you really need to be able to handle small vectors properly, you'll have to do some more work.
If small and zero vectors don't make sense in your application, you can test against the magnitude of the vector and do something appropriate.
(Note that as soon as you start dealing with floating point, rather than real, numbers, squaring and then square-rooting numbers (or sums of them) is problematic at both the large and small ends of the representable range.)
Bottom line: doing numerical work correctly over all cases is trickier than it first looks; the above are, off the top of my head, the potential problems with doing this (normalization) operation in a naive way.
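One common way to do the "more work" mentioned above is to rescale by the largest component before squaring, so tiny vectors don't underflow and huge ones don't overflow. This is a sketch, not a definitive implementation, and the function name and `eps` parameter are my own:

```python
import numpy as np

def robust_normalize(v, eps=0.0):
    """Normalize v, rescaling by the largest component first so that
    squaring cannot underflow (tiny vectors) or overflow (huge vectors).
    Returns the zero vector if the input is (effectively) zero."""
    v = np.asarray(v, dtype=float)
    m = np.max(np.abs(v))
    if m <= eps:               # zero (or negligibly small) vector
        return np.zeros_like(v)
    scaled = v / m             # components now lie in [-1, 1]
    return scaled / np.linalg.norm(scaled)

# Each square is 1e-400, which underflows to 0 in float64,
# so a naive sqrt(sum of squares) would report a zero magnitude.
tiny = np.array([1e-200, 1e-200])
print(robust_normalize(tiny))  # approximately [0.7071, 0.7071]
```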
Mathematically speaking, the zero vector cannot be normalized. This is an example of what we call in computational geometry a "degenerate case", and this is a huge topic, causing many headaches for geometry algorithm designers. I can imagine the following approaches to the problem.
1. Declare that the normalized form of the zero vector is the zero vector itself, and return the input unchanged.
2. Throw a degenerate_case_exception.
3. Add a boolean is_degenerate_case output parameter to your procedure.
Personally, in my code I use the third approach everywhere. One of its advantages is that it does not let the programmer forget to deal with degenerate cases.
Note that, due to the limited range of floating point numbers, even if the input vector is not equal to the zero vector, you may still get infinite coordinates in the output vector. Because of this, I do not consider the first approach to be a bad design decision.
What I can recommend is to avoid the exception-throwing solution. If the degenerate cases are rare among the inputs, then the exception throwing will not slow down the program; but the problem is that in most cases you cannot know that degenerate cases will be rare.
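The third approach can be sketched like this in Python, where the "output parameter" becomes a second return value (the function name is my own invention):

```python
import numpy as np

def normalize_flagged(v):
    """Return (unit_vector, is_degenerate_case), so the caller is
    forced to acknowledge the degenerate case explicitly."""
    v = np.asarray(v, dtype=float)
    n = np.linalg.norm(v)
    if n == 0.0:
        return np.zeros_like(v), True   # degenerate: zero vector
    return v / n, False

u, degenerate = normalize_flagged([0.0, 0.0, 0.0])
if degenerate:
    print("degenerate case: input was the zero vector")
```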