Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to place a mailbox to minimize the total distance that the residents travel to get their mail

Assume the input is specified as an array of building objects, where each building has a number of residents an its distance from the start of the street.

Total distance = SUM(distance[i] * #residents[i])

I found here two questions that are similar but they have slightly different requirements:

  • Minimizing weighted sum: The solution of this question finds the minimum path crossing all points. Here I am looking for minimal sum of total distances from each building to the place where the mailbox is.

  • Minimum Total Distance From Locations: It uses 2D coordinates, and more important, the solution doesn't consider the weight (number of residents) on each location.

I saw this problem while reading Elements of Programming Interviews (really nice book, BTW), and this is listed as a variant of the quickselect algorithm. Considering that the median is the point that minimizes the sum of the distances, it looks like the solution would involve the quick selection to find the building at the "median" position of the street in O(N).

But I can't figure out how to account the residents on each building and still keep the solution linear.

like image 320
carlos22 Avatar asked Jun 23 '17 16:06

carlos22


Video Answer


2 Answers

We can use the deltas to determine direction. I'll explain what I mean. As it relates to choosing a mailbox location at one of the buildings' (that is, not in between two buildings):

Choose one of the buildings as a pivot (potential mailbox location). Partition the buildings according to their location in relation to the pivot. While partitioning, keep a record of the closest building on each side of the pivot, as well as (1) the total number of residents on each side of the pivot, and (2) f(side, pivot) representing the total sum of each buildings' distance from the pivot multiplied by the number of residents in that building.

Now we have:

L pivot R

To determine if an improvement can be made for our choice, try each of the closest buildings we recorded earlier:

If we were to move our choice one building to the left, how would the results change? Let's call the closest building on the left build_l, and the right, build_r. So the new results moving our choice one building to the left would be:

Left side:

  f(L, pivot)
- distance(build_l, pivot) * num_residents(build_l)

Right side:

  f(R, pivot) 
  // we saved this earlier
+ total_residents(R) * distance(pivot, build_l)
+ num_residents(pivot) * distance(pivot, build_l)

Perform a similar calculation for moving the choice one building to the right to see which yields a smaller total. Then pick the side with the building that yields an improvement and partition it recursively in similar quickselect fashion until an optimal result is found. For the other side we keep track of the total number of residents, and total result for f so far, which we can update with the new additions as we go.

like image 195
גלעד ברקן Avatar answered Oct 05 '22 05:10

גלעד ברקן


I'd solve it in the following fashion (pseudocode below).

Pass array left to right and compute cost of putting mailbox in the house i for all residents j <= i.

# When we place mailbox at building i, all its residents contribute 0 to the total cost.
current_number_of_residents += residents_at_building[i-1]

# For each resident we've seen so far, the cost is increased by building_location[i] - building_location[i-1]
distance_delta = building_location[i] - building_location[i-1]

C_left[i] = C_left[i-1] + distance_delta * current_number_of_residents

Then we process array right-to-left in similar fashion. Now we can find the optimal location by checking for minimum sum:

min_total_distance = min(min_total_distance, C_left[i] + C_right[i])

Time complexity is O(n) since we make 3 passes over the array. Space complexity is O(n) to keep C_left and C_right arrays.

Quickselect algorithm complexity (suggested by @גלעד-ברקן) is also O(n) on average, but can be O(n^2) in the worst case. So I don't see the benefit of that approach over what I suggested. Any comments are welcome.

like image 36
Pavel Podlipensky Avatar answered Oct 05 '22 04:10

Pavel Podlipensky