Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Minimum number of days required to solve a list of questions

There are N problems numbered 1..N which you need to complete. You've arranged the problems in increasing difficulty order, and the ith problem has estimated difficulty level i. You have also assigned a rating vi to each problem. Problems with similar vi values are similar in nature. On each day, you will choose a subset of the problems and solve them. You've decided that each subsequent problem solved on the day should be tougher than the previous problem you solved on that day. Also, to not make it boring, consecutive problems you solve should differ in their vi rating by at least K. What is the least number of days in which you can solve all problems?

Input: The first line contains the number of test cases T. T test cases follow. Each case contains an integer N and K on the first line, followed by integers v1,...,vn on the second line.

Output: Output T lines, one for each test case, containing the minimum number of days in which all problems can be solved.

Constraints:
1 <= T <= 100
1 <= N <= 300
1 <= vi <= 1000
1 <= K <= 1000

Sample Input:
2
3 2
5 4 7
5 1
5 3 4 5 6

Sample Output:
2
1

This is one of the challenge from interviewstreet.
Below is my approach
Start from 1st question and find out max possible number of question can be solve and remove these questions from the question list.Now again start from first element of the remainning list and do this till now size of the question list is 0. I am getting wrong answer from this method so looking for some algo to solve this challenge.

like image 539
archit Avatar asked Apr 24 '12 18:04

archit


2 Answers

Construct a DAG of problems in the following way. Let pi and pj be two different problems. Then we will draw a directed edge from pi to pj if and only if pj can be solved directly after pi on the same day, consecutively. Namely, the following conditions have to be satisfied:

  1. i < j, because you should solve the less difficult problem earlier.
  2. |vi - vj| >= K (the rating requirement).

Now notice that each subset of problems that is chosen to be solved on some day corresponds to the directed path in that DAG. You choose your first problem, and then you follow the edges step by step, each edge in the path corresponds to the pair of problems that have been solved consecutively on the same day. Also, each problem can be solved only once, so any node in our DAG may appear only in exactly one path. And you have to solve all the problems, so these paths should cover all the DAG.

Now we have the following problem: given a DAG of n nodes, find the minimal number of non-crossing directed paths that cover this DAG completely. This is a well-known problem called Path cover. Generally speaking, it is NP-hard. However, our directed graph is acyclic, and for acyclic graphs it can be solved in polynomial time using reduction to the matching problem. Maximal matching problem, in its turn, can be solved using Hopcroft-Karp algorithm, for example. The exact reduction method is easy and can be read, say, on Wikipedia. For each directed edge (u, v) of the original DAG one should add an undirected edge (au, bv) to the bipartite graph, where {ai} and {bi} are two parts of size n.

The number of nodes in each part of the resulting bipartite graph is equal to the number of nodes in the original DAG, n. We know that Hopcroft-Karp algorithm runs in O(n2.5) in the worst case, and 3002.5 ≈ 1 558 845. For 100 tests this algorithm should take under a 1 second in total.

like image 80
Skiminok Avatar answered Sep 19 '22 02:09

Skiminok


The algorithm is simple. First, sort the problems by v_i, then, for each problem, find the number of problems in the interval (v_i-K, v_i]. The maximum of those numbers is the result. The second phase can be done in O(n), so the most costly operation is sorting, making the whole algorithm O(n log n). Look here for a demonstration of the work of the algorithm on your data and K=35 in a spreadsheet.

Why does this work

Let's reformulate the problem to the problem of graph coloring. We create graph G as follows: vertices will be the problems and there will be an edge between two problems iff |v_i - v_j| < K.

In such graph, independent sets exactly correspond to sets of problems doable on the same day. (<=) If the set can be done on a day, it is surely an independent set. (=>) If the set doesn't contain two problems not satisfying the K-difference criterion, you can just sort them according to the difficulty and solve them in this order. Both condition will be satisfied this way.

Therefore, it easily follows that colorings of graph G exactly correspond to schedules of the problems on different days, with each color corresponding to one day.

So, we want to find the chromaticity of graph G. This will be easy once we recognize the graph is an interval graph, which is a perfect graph, those have chromaticity equal to cliqueness, and both can be found by a simple algorithm.

Interval graphs are graphs of intervals on the real line, edges are between intervals that intersect. Our graph, as can be easily seen, is an interval graph (for each problem, assign an interval (v_i-K, v_i]. It can be easily seen that the edges of this interval graph are exactly the edges of our graph).

Lemma 1: In an interval graph, there exist a vertex whose neighbors form a clique.

Proof is easy. You just take the interval with the lowest upper bound (or highest lower bound) of all. Any intervals intersecting it have the upper bound higher, therefore, the upper bound of the first interval is contained in the intersection of them all. Therefore, they intersect each other and form a clique. qed

Lemma 2: In a family of graphs closed on induced subgraphs and having the property from lemma 1 (existence of vertex, whose neighbors form a clique), the following algorithm produces minimal coloring:

  1. Find the vertex x, whose neighbors form a clique.
  2. Remove x from the graph, making its subgraph G'.
  3. Color G' recursively
  4. Color x by the least color not found on its neighbors

Proof: In (3), the algorithm produces optimal coloring of the subgraph G' by induction hypothesis + closeness of our family on induced subgraphs. In (4), the algorithm only chooses a new color n if there is a clique of size n-1 on the neighbors of x. That means, with x, there is a clique of size n in G, so its chromaticity must be at least n. Therefore, the color given by the algorithm to a vertex is always <= chromaticity(G), which means the coloring is optimal. (Obviously, the algorithm produces a valid coloring). qed

Corollary: Interval graphs are perfect (perfect <=> chromaticity == cliqueness)

So we just have to find the cliqueness of G. That is easy easy for interval graphs: You just process the segments of the real line not containing interval boundaries and count the number of intervals intersecting there, which is even easier in your case, where the intervals have uniform length. This leads to an algorithm outlined in the beginning of this post.

like image 26
jpalecek Avatar answered Sep 20 '22 02:09

jpalecek