
Normalized Double Vector Not Unit Length To Machine Precision

I have a Java application that uses high-dimensional vectors of doubles. It normalizes these vectors by multiplying the vector components by the reciprocal of the Euclidean norm. Sometimes, the resulting vector has a norm that is not equal to 1 to machine precision. That this occurs does not surprise me.

My question is: how do I normalize the vector such that the resulting vector has unit length to machine precision?

These are the methods for my Vector class to compute the norm and normalize the vector:

public double getFrobeniusNorm() {
    return Math.sqrt(getFrobeniusNormSquared());
}

public double getFrobeniusNormSquared() {
    double normSquared = 0.0;
    int numberOfRows = getRowDimension();
    int numberOfColumns = getColumnDimension();
    for(int i = 0; i < numberOfRows; ++i) {
        for(int j = 0; j < numberOfColumns; ++j) {
            double matrixElement = get(i,j);
            normSquared += matrixElement*matrixElement;
        }
    }
    return normSquared;
}

public void normalize() {
    double norm = getFrobeniusNorm();
    if (norm == 0) {
        throw new ArithmeticException("Cannot get a unit vector from the zero vector.");
    } else {
        double oneOverNorm = 1.0 / norm;
        multiplyEquals(oneOverNorm);
    }
}

Since this is Java, I can't use techniques specific to the operating system and processor, but otherwise this seems like a standard floating-point algorithm issue.

I can improve the norm calculation using Kahan summation and/or dividing out the largest component, but the consistency between normalizing and calculating the norm is the real issue. The norm is more important than the direction, so I see this as finding the floating point vector closest in direction to the original vector with the constraint that the norm is 1 to machine precision. For my purposes, the original vector is exact.
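For reference, the two improvements mentioned above (dividing out the largest component and Kahan summation) could be combined roughly as follows. This is only a sketch: the class name StableNorm and the use of a plain double[] instead of the Vector class are assumptions made for illustration, and it assumes all components are finite.

```java
final class StableNorm {
    /**
     * Euclidean norm computed by first dividing every component by the
     * largest magnitude (avoids overflow/underflow in the squares) and then
     * accumulating the squares with Kahan (compensated) summation.
     * Assumes all components are finite.
     */
    static double norm(double[] v) {
        double max = 0.0;
        for (double x : v) {
            max = Math.max(max, Math.abs(x));
        }
        if (max == 0.0) {
            return 0.0; // zero vector (or empty array)
        }
        double sum = 0.0;
        double c = 0.0; // running compensation for lost low-order bits
        for (double x : v) {
            double s = x / max;      // scaled component, |s| <= 1
            double y = s * s - c;    // apply the correction from the last step
            double t = sum + y;      // low-order bits of y may be lost here
            c = (t - sum) - y;       // recover what was lost
            sum = t;
        }
        return max * Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Squaring these components directly would overflow to infinity.
        double[] big = {3e200, 4e200};
        System.out.println(norm(big));
    }
}
```

This reduces the error in the norm itself, but it does not by itself guarantee that the normalized vector's computed norm rounds to exactly 1.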

Suppose the original vector is u. I call u.normalize(). Then, if I compute Math.abs(u.getFrobeniusNorm() - 1d), in some cases the result is hundreds of ulps. This is the problem. I can accept that the vector norm has error. I just want to normalize the vector such that the norm as calculated by u.getFrobeniusNorm() is within the smallest possible number of ulps of 1. Improving u.getFrobeniusNorm() makes sense, but I don't think that solves the consistency issue.
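To make the measurement concrete, here is a minimal sketch of how the deviation could be expressed in ulps. It mirrors the norm and normalize logic above on a plain double[]; the class name NormDeviation, the helper ulpsFromOne, the vector length of 1000, and the random seed are all hypothetical choices for illustration.

```java
import java.util.Random;

final class NormDeviation {
    // Naive sum of squares followed by sqrt, mirroring getFrobeniusNorm().
    static double norm(double[] v) {
        double sumSq = 0.0;
        for (double x : v) {
            sumSq += x * x;
        }
        return Math.sqrt(sumSq);
    }

    // Multiplies by the reciprocal of the norm, mirroring normalize().
    static void normalize(double[] v) {
        double n = norm(v);
        if (n == 0) {
            throw new ArithmeticException("Cannot get a unit vector from the zero vector.");
        }
        double oneOverNorm = 1.0 / n;
        for (int i = 0; i < v.length; i++) {
            v[i] *= oneOverNorm;
        }
    }

    // Distance of a value from 1.0 in multiples of Math.ulp(1.0), i.e. 2^-52.
    static double ulpsFromOne(double value) {
        return Math.abs(value - 1.0) / Math.ulp(1.0);
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // arbitrary fixed seed
        double[] u = new double[1000];
        for (int i = 0; i < u.length; i++) {
            u[i] = rng.nextGaussian();
        }
        normalize(u);
        System.out.println("deviation in ulps: " + ulpsFromOne(norm(u)));
    }
}
```

Note that doubles just below 1.0 are spaced half as far apart as those just above it, so this count is in units of the spacing above 1.0.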

wdb asked Sep 15 '25 13:09


2 Answers

Simple: your requirement cannot be met. Assuming any imaginable vector is possible, it can't even be met with any precision less than infinite.

You can get reasonably close to 1.0, and that should be good enough in most cases (it should already be with your code).

If it turns out the accuracy is insufficient for your case, you need to perform an error analysis (since you're asking the question in the first place, get someone with experience to do the error analysis for you; this will cost money).

The basics of floating-point accuracy are explained here: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html (What Every Computer Scientist Should Know About Floating-Point Arithmetic)

Durandal answered Sep 17 '25 04:09


You may be able to reformulate your stated question by doing some awkward greedy rounding hack. (You can also probably formulate it as an even-more-awkward network flow problem.) I don't think you can guarantee a "nice" rounding here where the stuff rounded up all has larger fractional parts than all of the stuff rounded down.

Backing up a little bit, I'm not sure why you got yourself into a position where you need the norm of a vector to be exactly 1, rather than within n*machine epsilon of 1. It might be a better idea to rethink the code that uses the normalised vector than to rethink the normalisation itself.
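As a rough illustration of what "within n*machine epsilon of 1" could look like in the consuming code, here is a small sketch. It assumes n is the vector dimension and takes machine epsilon to be Math.ulp(1.0); both are my reading rather than anything stated in the question, and the names UnitCheck and isUnitWithinTolerance are made up for the example.

```java
final class UnitCheck {
    /**
     * Returns true if the computed Euclidean norm of v is within
     * n * Math.ulp(1.0) of 1, where n = v.length (an assumed reading of
     * "n * machine epsilon").
     */
    static boolean isUnitWithinTolerance(double[] v) {
        double normSquared = 0.0;
        for (double x : v) {
            normSquared += x * x;
        }
        double norm = Math.sqrt(normSquared);
        double tolerance = v.length * Math.ulp(1.0);
        return Math.abs(norm - 1.0) <= tolerance;
    }

    public static void main(String[] args) {
        System.out.println(isUnitWithinTolerance(new double[] {0.6, 0.8}));
        System.out.println(isUnitWithinTolerance(new double[] {1.0, 1.0}));
    }
}
```

Code that uses the normalised vector would then test against this tolerance instead of comparing the norm to 1.0 exactly.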

(You also say this: "As for the question of unity, the unit vector has norm 1 exactly, and all my equations use that fact. I want the floating point representation closest to that unit vector (by inner product)." This changes the game completely; the closest vector in Euclidean norm to the exactly-normalised vector will be the rounded normalised vector.)

tmyklebu answered Sep 17 '25 03:09