Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm puzzle interview

Tags:

algorithm

I found this interview question, and I couldn't come up with an algorithm better than O(N^2 * P):

Given a vector of P natural numbers (1,2,3,...,P) and another vector of length N whose elements are from the first vector, find the longest subsequence in the second vector, such that all elements are uniformly distributed (have the same frequency).

Example : (1,2,3) and (1,2,1,3,2,1,3,1,2,3,1). The longest subsequence is in the interval [2,10], because it contains all the elements from the first sequence with the same frequency (1 appears three times, 2 three times, and 3 three times).

The time complexity should be O(N * P).

like image 279
flowerpower Avatar asked Apr 17 '12 13:04

flowerpower


People also ask

Are puzzle interviews effective for employee selection Why or why not?

Puzzle based interview questions aim at predicting a candidate's reaction to a certain environment or situation. Very little to no evidence exists supporting the correlation between personality traits and their impact on the person's future performance. The correlations are limited and the predictions extremely broad.


Video Answer


1 Answers

"Subsequence" usually means noncontiguous. I'm going to assume that you meant "sublist".

Here's an O(N P) algorithm assuming we can hash (assumption not needed; we can radix sort instead). Scan the array keeping a running total for each number. For your example,

  1  2  3  --------   0  0  0 1    1  0  0 2   1  1  0 1   2  1  0 3   2  1  1 2   2  2  1 1   3  2  1 3   3  2  2 1   4  2  2 2   4  3  2 3   4  3  3 1   5  3  3 

Now, normalize each row by subtracting the minimum element. The result is

 0: 000  1: 100  2: 110  3: 210  4: 100  5: 110  6: 210  7: 100  8: 200  9: 210 10: 100 11: 200. 

Prepare two hashes, mapping each row to the first index at which it appears and the last index at which it appears. Iterate through the keys and take the one with maximum last - first.

000: first is at 0, last is at 0 100: first is at 1, last is at 10 110: first is at 2, last is at 5 210: first is at 3, last is at 9 200: first is at 8, last is at 11 

The best key is 100, since its sublist has length 9. The sublist is the (1+1)th element to the 10th.

This works because a sublist is balanced if and only if its first and last unnormalized histograms are the same up to adding a constant, which occurs if and only if the first and last normalized histograms are identical.

like image 187
uty Avatar answered Oct 09 '22 00:10

uty