The point that minimizes the sum of Euclidean distances to a set of n points

Tags: algorithm, mathematical-optimization, computational-geometry, convex-optimization

I have a set of points W={(x1, y1), (x2, y2),..., (xn, yn)} on the 2D plane. Can you find an algorithm that takes these points as the input and returns a point (x, y) on the 2D plane which has the minimum sum of distances from the points in W? In other words, if

di = Euclidean_distance((x, y), (xi, yi))

I want to minimize:

d1 + d2 + ... + dn
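
As a minimal sketch of the quantity being minimized, the objective can be written as a small NumPy function. The name `sum_of_distances` and its argument layout are assumptions made for illustration; they are not part of the original question.

```python
import numpy as np

def sum_of_distances(point, pts):
    """Return d1 + d2 + ... + dn for a candidate point (hypothetical helper)."""
    point = np.asarray(point, dtype=float)   # candidate (x, y)
    pts = np.asarray(pts, dtype=float)       # one row per (xi, yi)
    # Euclidean distance from the candidate to every point, then the total
    return np.linalg.norm(pts - point, axis=1).sum()
```

For example, `sum_of_distances((0, 0), [(3, 4), (0, 5)])` evaluates to 10, since both points lie at distance 5 from the origin.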

asked Jul 30 '19 17:07

mhsnk

1 Answer

The Problem

You're looking for the geometric median.

An Easy Solution

There is no closed-form solution to this problem, so iterative or probabilistic methods are used. The easiest way to find this is probably with Weiszfeld's algorithm:

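Starting from an initial guess $y_0$, each iteration of Weiszfeld's algorithm replaces the current estimate with a weighted average of the data points $x_i$, where each weight is the inverse of the distance from that point to the current estimate:

$$y_{k+1} = \left( \sum_{i=1}^{n} \frac{x_i}{\lVert x_i - y_k \rVert} \right) \Bigg/ \left( \sum_{i=1}^{n} \frac{1}{\lVert x_i - y_k \rVert} \right)$$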

We can implement this in Python as follows:

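import numpy as np
from numpy.linalg import norm as npnorm

#Generate random test data
POINT_NUM = 100
pts       = np.random.rand(POINT_NUM,2)

c_pt_old = np.random.rand(2)      #Previous estimate; random so the loop runs at least once
c_pt_new = np.array([0.0,0.0])    #Current estimate (starting guess)

#Iterate until the estimate moves by less than the tolerance
while npnorm(c_pt_old-c_pt_new)>1e-6:
    num   = 0
    denom = 0
    for i in range(POINT_NUM):
        dist   = npnorm(c_pt_new-pts[i,:])  #Distance from the estimate to point i
        num   += pts[i,:]/dist              #Point weighted by inverse distance
        denom += 1/dist                     #Running sum of the weights
    c_pt_old = c_pt_new
    c_pt_new = num/denom

print(c_pt_new)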

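There's a chance that Weiszfeld's algorithm won't converge (for example, if an iterate lands exactly on a data point, the distance is zero and the update divides by zero), so it might be best to run it several times from different starting points.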
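
One way to sketch the restart idea is shown below. The function names `weiszfeld` and `best_of_restarts`, the clamping constant `eps`, and the choice of sampling starts uniformly inside the bounding box of the data are all assumptions made for illustration. Clamping tiny distances only avoids the division by zero; an iterate that sits on a data point will tend to stay there, which is why the best of several starts is kept.

```python
import numpy as np

def weiszfeld(pts, start, tol=1e-6, max_iter=1000, eps=1e-12):
    """Weiszfeld iteration from one starting point (hypothetical helper)."""
    pts = np.asarray(pts, dtype=float)
    current = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        dists = np.linalg.norm(pts - current, axis=1)
        # Assumed guard: clamp tiny distances so the division is always defined
        inv = 1.0 / np.maximum(dists, eps)
        # Weighted average of the points, weights = inverse distances
        updated = (pts * inv[:, None]).sum(axis=0) / inv.sum()
        if np.linalg.norm(updated - current) <= tol:
            return updated
        current = updated
    return current  # may not have converged within max_iter

def best_of_restarts(pts, n_starts=5, seed=0):
    """Run weiszfeld from several random starts and keep the lowest-cost result."""
    pts = np.asarray(pts, dtype=float)
    rng = np.random.default_rng(seed)
    lo, hi = pts.min(axis=0), pts.max(axis=0)   # bounding box of the data
    best, best_cost = None, np.inf
    for _ in range(n_starts):
        candidate = weiszfeld(pts, rng.uniform(lo, hi))
        cost = np.linalg.norm(pts - candidate, axis=1).sum()
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost
```

Because the objective is convex, every start that converges should land on essentially the same point; the restarts mainly protect against a run that stalls on a data point.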

A General Solution

You can also find this using second-order cone programming (SOCP). In addition to solving your specific problem, this general formulation then allows you to easily add constraints and weightings, such as variable uncertainty in the location of each data point.

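To do so, you create one auxiliary variable per data point, each constrained to be at least the distance between the proposed center point and that data point.

You then minimize the sum of these auxiliary variables. At the optimum each constraint is tight, so the sum equals the total distance. The code is as follows: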

import cvxpy as cp
import numpy as np
import matplotlib.pyplot as plt

#Generate random test data
POINT_NUM = 100
pts       = np.random.rand(POINT_NUM,2)

c_pt      = cp.Variable(2)           #The center point we wish to locate
distances = cp.Variable(POINT_NUM)   #Distance from the center point to each data point

#Generate constraints. These are used to hold distances.
constraints = []                     
for i in range(POINT_NUM):
    constraints.append( cp.norm(c_pt-pts[i,:])<=distances[i] ) 

objective = cp.Minimize(cp.sum(distances))

problem = cp.Problem(objective,constraints)

optimal_value = problem.solve()

print("Optimal value = {0}".format(optimal_value))
print("Optimal location = {0}".format(c_pt.value))

plt.scatter(x=pts[:,0], y=pts[:,1], s=1)
plt.scatter(c_pt.value[0], c_pt.value[1], s=10)
plt.show()

SOCPs can be solved with a number of solvers, including CPLEX, Elemental, ECOS, ECOS_BB, GUROBI, MOSEK, CVXOPT, and SCS.

I've tested and the two approaches give the same answers to within tolerance.
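
A solver-independent sanity check is also possible. When the geometric median does not coincide with a data point, the unit vectors pointing from it to the data points sum to zero, because that sum is the negative gradient of the objective. The sketch below computes the length of that sum; the name `median_residual` is an assumption made for illustration, and the function assumes the candidate is not itself one of the data points.

```python
import numpy as np

def median_residual(center, pts):
    """Length of the summed unit vectors from center to each point.

    Hypothetical helper; assumes center does not coincide with any data point.
    A value near zero means center satisfies the first-order optimality condition.
    """
    center = np.asarray(center, dtype=float)
    pts = np.asarray(pts, dtype=float)
    diffs = pts - center                      # vectors from the candidate to each point
    dists = np.linalg.norm(diffs, axis=1)     # their lengths
    unit_sum = (diffs / dists[:, None]).sum(axis=0)
    return np.linalg.norm(unit_sum)
```

Calling it on the output of either method should give a value close to zero, limited by the tolerance used by that method.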

Weiszfeld, E. (1937). "Sur le point pour lequel la somme des distances de n points donnés est minimum". Tohoku Mathematical Journal. 43: 355–386.

answered Nov 15 '22 07:11

Richard
