I am trying to derive the average case running time for deterministic linear search algorithm. The algorithm searches an element x in an unsorted array A in the order A[1], A[2], A[3]...A[n]. It stops when it finds the element x or proceeds until it reaches the end of the array. I searched on wikipedia and the answer given was (n+1)/(k+1) where k is the number of times x is present in the array. I approached in another way and am getting a different answer. Can anyone please give me the correct proof and also let me know whats wrong with my method?
E(T)= 1*P(1) + 2*P(2) + 3*P(3) ....+ n*P(n) where P(i) is the probability that
the algorithm runs for 'i' time (i.e. compares 'i' elements).
P(i)= (n-i)C(k-1) * (n-k)! / n!
Here, (n-i)C(k-1) is (n-i) Choose (k-1). As the algorithm has reached the ith
step, the rest of k-1 x's must be in the last n-i elements. Hence (n-i)C(k-i).
(n-k)! is the total number of ways of arranging the rest non x numbers, and n!
is the total number of ways of arranging the n elements in the array.
I am not getting (n+1)/(k+1) on simplifying.
If element P is not in the list, then Linear Search will do N comparisons. The dominant term in "Average number of comparisons" is N/2. So, the Average Case Time Complexity of Linear Search is O(N).
Average Case Complexity When the element to be searched is in the middle of the array, the average case of the Linear Search Algorithm is O(n).
We say that an algorithm requires average time proportional to f(n) (or that it has average-case complexity O(f(N)) if there are constants c and n0 such that the average time the algorithm requires to process an input set of size n is no more than c∗f(n) time units whenever n≥n0.
Time Complexity In linear search, best-case complexity is O(1) where the element is found at the first index. Worst-case complexity is O(n) where the element is found at the last index or element is not present in the array. In binary search, best-case complexity is O(1) where the element is found at the middle index.
You've forgotten to account for the permutations of the k
copies of x
. The correct definition of P(i)
is
P(i) = (n-i)C(k-1) * k! * (n-k)! / n! = (n-i)C(k-1) / nCk.
^^
I'll turn things over to Mathematica:
In[1]:= FullSimplify[Sum[i Binomial[n-i, k-1]/Binomial[n, k], {i, 1, n}], 0 <= k <= n]
1 + n
Out[1]= -----
1 + k
To elaborate on my comment below: assume that all elements are distinct, let X be the set of matches, and let Y be the set of non-matches. By assumption, |X| = k and |Y| = n-k. The expected number of reads is equal to the sum over elements e of the probability that e is read.
Given a set of elements S, each element in S comes before all of the others with probability 1/|S|.
An element x in X is read if and only if it comes before every other element of X, which is probability 1/k. The total contribution of elements in X is |X| (1/k) = 1.
An element y in Y is read if and only if it comes before every element of X, which is probability 1/(k+1). The total contribution of elements in Y is |Y| (1/(k+1)) = (n-k)/(k+1).
We have 1 + (n-k)/(k+1) = (n+1)/(k+1).
Here is a solution that uses Cormen terms:
Let S
be the set of the other n-k
elements.
Let the indicator random variable Xi=1
, if we encounter the i
'th element
of the set S
in the course of our execution.Pr{Xi=1}=1/(k+1)
and therefore E[Xi]=1/(k+1)
.
Let the indicator random variable Y=1
, if we encounter any of the k
elements that we are searching for in the course of our execution.Pr{Y=1}=1
and therefore E[Y]=1
.
Let the random variable X=Y+X1+X2+...X(n-k)
be the sum of the elements that we
encounter in the course of our execution.E[X]=E[Y+X1+X2+...X(n-k)]=E[Y]+E[X1]+E[X2]+...E[X(n-k)]=1+(n-k)/(k+1)=(n+1)/(k+1)
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With