Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Expected number of maxima

I have is algorithm, which takes an array as an argument, and returns its maximum value.

find_max(as) :=
    max = as[0]
    for i = 1 ... len(as) {
        if max < as[i] then max = as[i]
   }
    return max

My question is: given that the array is initially in a (uniformly) random permutation and that all its elements are distinct, what's the expected number of times the max variable is updated (ignoring the initial assignment).

For example, if as = [1, 3, 2], then the number of updates to max would be 1 (when reading the value 3).

like image 673
notsogeek Avatar asked Sep 29 '13 06:09

notsogeek


1 Answers

Assume the original array contains the values 1, 2, ..., N.

Let X_i, i = 1..N be random variables that take the value 1 if i is, at some point during the algorithm, the maximum value.

Then the number of maximums the algorithm takes is the random variable: M = X_1 + X_2 + ... + X_N.

The average is (by definition) E(M) = E(X_1 + X_2 + ... + X_N). Using linearity of expectation, this is E(X_1) + E(X_2) + .. + E(X_N), which is prob(1 appears as a max) + prob(2 appears as a max) + ... + prob(N appears as a max) (since each X_i takes the value 0 or 1).

When does i appear as a maximum? It's when it appears first in the array amongst the i, i+1, i+2, ..., N. The probability of this is 1/(N-i+1) (since each of those numbers are equally likely to be first).

So... prob(i appears as a max) = 1/(N-i+1), and the overall expectation is 1/N + 1/(N-1) + ..+ 1/3 + 1/2 + 1/1

This is Harmonic(N) which is approximated closely by ln(N) + emc where emc ~= 0.5772156649, the Euler-Mascheroni constant.

Since in the problem you don't count the initial setting of the maximum to the first value as a step, the actual answer is Harmonic(N) - 1, or approximately ln(N) - 0.4227843351.

A quick check for some simple cases:

  • N=1, only one permutation, and no maximum updates. Harmonic(1) - 1 = 0.
  • N=2, permutations are [1, 2] and [2, 1]. The first updates the maximum once, the second zero times, so the average is 1/2. Harmonic(2) - 1 = 1/2.
  • N=3, permutations are [1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]. Maximum updates are 2, 1, 1, 1, 0, 0 respectively. Average is (2+1+1+1)/6 = 5/6. Harmonic(3) - 1 = 1/2 + 1/3 = 5/6.

So the theoretical answer looks good!

like image 152
Paul Hankin Avatar answered Oct 04 '22 19:10

Paul Hankin