I have a set of 3 million vectors (300 dimensions each), and I'm looking for a new point in this 300-dimensional space that is approximately equally distant from all the other points (vectors).
What I could do is initialize a random vector v, and run an optimization over v with the objective:

Where d_xy is the distance between vector x and vector y, but this would be very computationally expensive.
I'm looking for an approximate solution vector for this problem that can be found quickly over very large sets of vectors. (Libraries that do something like this, in any language, would also be welcome.)
From this question on the Math StackExchange:
There is no point that is equidistant from 4 or more points in general position in the plane, or n+2 points in n dimensions.
Criteria for representing a collection of points by one point are considered in statistics, machine learning, and computer science. The centroid is the optimal choice in the least-squares sense, but there are many other possibilities.
The centroid is the point C in the plane for which the sum of squared distances $\sum |CP_i|^2$ is minimum. One could also optimize a different measure of centrality, or insist that the representative be one of the points (such as a graph-theoretic center of a weighted spanning tree), or assign weights to the points in some fashion and take the centroid of those.
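As a quick check of the least-squares claim (this aside is not part of the quoted answer), the minimizer can be derived directly. The notation $N$ for the number of points and $f$ for the cost is introduced here only for illustration:

$$
f(C) = \sum_{i=1}^{N} |C - P_i|^2, \qquad
\nabla f(C) = 2 \sum_{i=1}^{N} (C - P_i) = 0
\;\Longrightarrow\;
C = \frac{1}{N} \sum_{i=1}^{N} P_i .
$$

Since $f$ is strictly convex, this stationary point is the unique minimum, and the same argument holds in any number of dimensions, not only in the plane.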
Note, specifically, "the centroid is the optimal choice in the least-squares sense", so the optimal solution to your cost function (which is a least-squares cost) is simply to average all the coordinates of your points (which will give you the centroid).
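To make the averaging step concrete, here is a minimal NumPy sketch. The helper names `centroid_in_chunks` and `sum_squared_distances`, the chunk size, and the small random array standing in for the real 3 million x 300 data are all assumptions made for illustration. It computes the centroid and then reports how much the distances to it vary, which is a cheap way to see how close to "equidistant" the centroid is for a given data set.

```python
import numpy as np

def centroid_in_chunks(vectors, chunk_size=100_000):
    """Mean of all rows, accumulated chunk by chunk in float64.

    Chunking is optional; it only keeps temporary memory small when the
    array is large or memory-mapped.
    """
    total = np.zeros(vectors.shape[1], dtype=np.float64)
    for start in range(0, vectors.shape[0], chunk_size):
        chunk = vectors[start:start + chunk_size]
        total += chunk.sum(axis=0, dtype=np.float64)
    return total / vectors.shape[0]

def sum_squared_distances(point, vectors):
    """Least-squares cost: sum of squared Euclidean distances to point."""
    diffs = vectors - point
    return float(np.einsum("ij,ij->", diffs, diffs))

# Small synthetic stand-in for the real data (assumption: rows are vectors).
rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 300)).astype(np.float32)

centroid = centroid_in_chunks(vectors, chunk_size=1_000)

# Distance from the centroid to every vector; the standard deviation
# shows how far from equidistant the centroid is for this data.
distances = np.linalg.norm(vectors - centroid, axis=1)
print("mean distance:", distances.mean())
print("std of distances:", distances.std())
```

The centroid costs a single pass over the data, so it scales to millions of vectors. Whether it is "equidistant enough" depends on the data, which is why the sketch prints the spread of distances rather than assuming it is small.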