
dynamic programming algorithm during an interview [closed]

This question was asked to me in an interview, and it embarrassingly exposed my shortcomings in dynamic programming. I would appreciate it if someone could help me crack this one. It would also be very helpful to me (and others) if you could explain your thinking process as you devise the solution: I can usually follow a solution that uses the dynamic programming paradigm when I see one, but I struggle to come up with my own.

Without further ado, here is the question I was asked.

Given an integer i and a set X of k points x1, x2, ..., xk on the real line, select i points from X so as to minimize the sum of the distances from every point in X to a selected point, using dynamic programming.

user976078 asked Oct 03 '11 05:10





1 Answer

With most DP problems I try to find a kind of reduce-and-conquer relation. That is, a relation that lets me cut away at the problem size with each step (like divide and conquer, except that it usually doesn't divide the problem, it just removes a small part). In this problem (like many others) we can make a very simple observation: either the first point is in the set of i points, or it isn't.

Some notation: Let's say X = {x1, x2, ..., xk}, and denote the reduced set Xn = {xn, x(n+1), ..., xk}, so that X1 = X and Xk = {xk}.

So the observation is that either x1 is one of the i points, or it isn't. Let's call our i-set finding function MSD(i, Xn) (minimum sum of distances). We can express that cut-away observation as follows:

MSD(i, X1) = either MSD(i-1, X2) U {x1} or MSD(i, X2)

and in general MSD(i, Xn) = either MSD(i-1, X(n+1)) U {xn} or MSD(i, X(n+1)).

We can formalise the "either or" part with a simple way of checking which of those two options it actually is: we run through the set X, calculate the sum of the distances for each option, and keep whichever is smaller. Note that this check has a running time of O(ki), since we naively run through each of the k points and take the minimum distance to the points in the set of size i.
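A minimal sketch of that naive check, assuming each point's distance is measured to its nearest selected point (the interpretation adopted further down). The name naive_cost is made up for illustration.

```python
def naive_cost(points, chosen):
    """Sum, over every point, of the distance to its nearest chosen point.

    Runs in O(k * i) for k points and i chosen points.
    """
    if not chosen:
        # Assumption: with nothing selected no point can be served.
        return float("inf")
    return sum(min(abs(x - c) for c in chosen) for x in points)


if __name__ == "__main__":
    # Points 1 and 3 are one unit from 2; points 2 and 10 are selected.
    print(naive_cost([1, 2, 3, 10], [2, 10]))
```

Comparing naive_cost for the two candidate sets is all the "either or" step needs.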

We make two simple observations regarding base cases:

MSD(i, Xn) = Xn when Xn has exactly i points (that is, n = k-i+1)
MSD(0, Xn) = {}

The first is that when looking for i points in a set of size i we obviously just take the whole set.
The second is that when looking for no points in a set, we return the empty set. This inductively ensures that MSD returns sets of size i (it's true for the case where i=0 and by induction is true according to our definition of MSD above).

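That's it. The recurrence returns a set of i points; because each subproblem keeps only a single candidate set, it is worth verifying the result on small inputs rather than assuming it is optimal. Runtime complexity is upper bounded by O(ik * step), where step is our O(ik) check from above. This is because, with memoized results, MSD is run with a count ranging over 0..i and a reduced set ranging over X1..Xk, which is a total of about ik possible arguments.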

That leaves us with a runtime of O((ik)^2).
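The recurrence and base cases above can be sketched as follows. This is a literal transcription under some assumptions of mine: the names msd, solve and cost are hypothetical, the cost of a candidate set is always measured against the full set X using the nearest-selected-point distance, and results are memoized with functools.lru_cache so each (count, start) pair is solved once.

```python
from functools import lru_cache


def msd(points, i):
    """Follow the MSD recurrence literally; returns a tuple of i points."""
    xs = tuple(points)
    k = len(xs)
    if not 0 <= i <= k:
        raise ValueError("i must be between 0 and the number of points")

    def cost(chosen):
        # Naive O(k * i) check: nearest chosen point for every point in X.
        if not chosen:
            return float("inf")
        return sum(min(abs(x - c) for c in chosen) for x in xs)

    @lru_cache(maxsize=None)
    def solve(count, start):
        # Pick `count` points from the reduced set xs[start:].
        if count == 0:
            return ()                      # MSD(0, Xn) = {}
        if k - start == count:
            return xs[start:]              # exactly `count` points remain
        with_first = (xs[start],) + solve(count - 1, start + 1)
        without_first = solve(count, start + 1)
        return min(with_first, without_first, key=cost)

    return solve(i, 0)


if __name__ == "__main__":
    print(msd([1, 2, 3, 10], 2))
```

For [1, 2, 3, 10] with i = 2 this sketch returns (3, 10), whose cost is 3, whereas (2, 10) has cost 2. So it is best read as an illustration of the recurrence, and comparing it against a brute-force search over all i-subsets on small inputs is a cheap sanity check.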

The following part is based on my understanding of the OP's question. I'm not sure whether the distance of a point in X from the i-sized subset means the sum of its distances to every point in the subset, or its distance to the subset itself.
That is, either the sum over x in X of (the sum of the distances from x to every point in the subset), OR the sum over x in X of (the distance from x to the subset, i.e. the minimum distance from x to any point in the subset).

I assume the latter.

We can reduce the runtime by optimising the O(ik) check from above. We notice that the selected elements are actually sorted (albeit in reverse order in the current notation), since we always add them on from the right-hand side. Assuming the points are sorted to begin with, the selected set will still be sorted when MSD returns. If they weren't sorted to begin with we can sort them, which will only cost O(k log k) anyway.

Once sorted, checking the distance of each point from a point in the set will take O(k log i), since for each point we do a binary search. This yields a total running time of O(ik * k log i + k log k) = O(k^2 * i log i).

Finally, since i <= k, we can bound that by O(k^3 log k). Not the fastest solution, but a solution.
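The binary-search check can be sketched with Python's bisect module. This assumes the selected points are already in ascending order; sorted_cost is a name chosen only for this example.

```python
from bisect import bisect_left


def sorted_cost(points, chosen_sorted):
    """Sum of nearest-selected-point distances in O(k log i).

    `chosen_sorted` must be sorted in ascending order.
    """
    total = 0
    for x in points:
        # Index of the first chosen point that is >= x.
        pos = bisect_left(chosen_sorted, x)
        best = float("inf")
        if pos < len(chosen_sorted):
            best = chosen_sorted[pos] - x                # neighbour on the right
        if pos > 0:
            best = min(best, x - chosen_sorted[pos - 1])  # neighbour on the left
        total += best
    return total


if __name__ == "__main__":
    print(sorted_cost([1, 2, 3, 10], [2, 10]))
```

Only the two neighbours around the insertion position can be the nearest selected point, which is why a single binary search per point is enough.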

I'm sure there are even more optimisations, but that's my 2c.

davin answered Sep 24 '22 14:09
