Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Disperse Duplicates in an Array

Source : Google Interview Question

Write a routine to ensure that identical elements in the input are maximally spread in the output?

Basically, we need to place the same elements,in such a way , that the TOTAL spreading is as maximal as possible.

Example:

Input: {1,1,2,3,2,3}

Possible Output: {1,2,3,1,2,3}  

Total dispersion = Difference between position of 1's + 2's + 3's = 4-1 + 5-2 + 6-3 = 9 .

I am NOT AT ALL sure, if there's an optimal polynomial time algorithm available for this.Also,no other detail is provided for the question other than this .

What i thought is,calculate the frequency of each element in the input,then arrange them in the output,each distinct element at a time,until all the frequencies are exhausted.

I am not sure of my approach .

Any approaches/ideas people .

like image 644
Spandan Avatar asked Jul 27 '13 15:07

Spandan


2 Answers

I believe this simple algorithm would work:

  • count the number of occurrences of each distinct element.
  • make a new list
  • add one instance of all elements that occur more than once to the list (order within each group does not matter)
  • add one instance of all unique elements to the list
  • add one instance of all elements that occur more than once to the list
  • add one instance of all elements that occur more than twice to the list
  • add one instance of all elements that occur more than trice to the list
  • ...

Now, this will intuitively not give a good spread:
for {1, 1, 1, 1, 2, 3, 4} ==> {1, 2, 3, 4, 1, 1, 1}
for {1, 1, 1, 2, 2, 2, 3, 4} ==> {1, 2, 3, 4, 1, 2, 1, 2}

However, i think this is the best spread you can get given the scoring function provided. Since the dispersion score counts the sum of the distances instead of the squared sum of the distances, you can have several duplicates close together, as long as you have a large gap somewhere else to compensate.

for a sum-of-squared-distances score, the problem becomes harder. Perhaps the interview question hinged on the candidate recognizing this weakness in the scoring function?

like image 160
HugoRune Avatar answered Oct 11 '22 22:10

HugoRune


In perl

@a=(9,9,9,2,2,2,1,1,1);

then make a hash table of the counts of different numbers in the list, like a frequency table

map { $x{$_}++ } @a;

then repeatedly walk through all the keys found, with the keys in a known order and add the appropriate number of individual numbers to an output list until all the keys are exhausted

@r=();
$g=1; 
while( $g == 1 ) { 
   $g=0;
   for my $n (sort keys %x) 
      {
      if ($x{$n}>1) {
                    push @r, $n;
                    $x{$n}--;
                    $g=1
                    }
      } 
}

I'm sure that this could be adapted to any programming language that supports hash tables

like image 36
Vorsprung Avatar answered Oct 11 '22 22:10

Vorsprung