Given an array of numbers a[0], a[1], ..., a[n-1], we get queries of the kind:
output the k-th largest number in the range a[i], a[i+1], ..., a[j].
Can these queries be answered in polylogarithmic time (in n) per query? If not, is it possible to average results and still get a good amortized complexity?
EDIT: this can be solved using persistent segment trees http://blog.anudeep2011.com/persistent-segment-trees-explained-with-spoj-problems/
The first order statistic is the smallest sample value (i.e. the minimum), once the values have been placed in order. For example, in the sample 9, 2, 11, 5, 7, 4 the first order statistic is 2. In notation, that's x(1) = 2. The second order statistic x(2) is the next smallest value.
The rth order statistic X(r) of a sample of n random variables X1,...,Xn is equal to its rth smallest value. Thus X(1) denotes min{X1,...,Xn} (the minimum of the Xi's), X(2) denotes min({X1,...,Xn}\{X(1)}) (the second minimum), and in general, X(r) = min({X1,...,Xn}\{X(1),...,X(r−1)}). Thus X(n) = max{X1,...,Xn}.
The ith order statistic of a set of n elements is the ith smallest element. For example, the minimum of a set of elements is the first order statistic (i = 1), and the maximum is the nth order statistic (i = n). A median, informally, is the "halfway point" of the set.
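As a minimal illustration of the definition above, the r-th order statistic can be read off a sorted copy of the sample. The function name `order_statistic` is only an illustrative choice, and sorting the whole sample is the simplest approach rather than the fastest one.

```python
def order_statistic(sample, r):
    """Return the r-th smallest value (1-based) of a non-empty sample."""
    if not 1 <= r <= len(sample):
        raise ValueError("r must be between 1 and len(sample)")
    # Sorting puts the values in order; the r-th smallest sits at index r - 1.
    return sorted(sample)[r - 1]


sample = [9, 2, 11, 5, 7, 4]
first = order_statistic(sample, 1)            # the minimum
last = order_statistic(sample, len(sample))   # the maximum
```

With the sample 9, 2, 11, 5, 7, 4 used earlier, `first` is the minimum of the sample and `last` is the maximum.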
Order statistics are employed in many ways in acceptance sampling. First, order statistics are used to improve the robustness of sampling plans by variables. Second, in life testing, order statistics are used to shorten test times.
Yes, these queries can be answered in polylog time if O(n log n) space is available.
Preprocess the given array by constructing a segment tree of depth log(n), so that the leaf nodes are identical to the source array, the next-level nodes contain sorted 2-element sub-arrays, the level above that consists of 4-element arrays produced by merging those 2-element arrays, etc. In other words, perform merge sort but keep the result of each merge step in a separate array. Here is an example:
root: | 1 2 3 5 5 7 8 9 |
| 1 2 5 8 | 3 5 7 9 |
| 1 5 | 2 8 | 7 9 | 3 5 |
source: | 5 | 1 | 2 | 8 | 7 | 9 | 5 | 3 |
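The preprocessing can be sketched roughly as follows. This is only an illustration, not a reference implementation: the name `build_merge_sort_tree` and the list-of-levels layout are assumptions made here, and the sketch assumes the array length is a power of two, as in the example.

```python
import heapq


def build_merge_sort_tree(a):
    """Return the tree as a list of levels, leaves first and root last.

    Each level is a list of sorted blocks. Assumes len(a) is a power of two.
    """
    levels = [[[x] for x in a]]  # leaves: one-element blocks, same as the source
    while len(levels[-1]) > 1:
        prev = levels[-1]
        merged = []
        for i in range(0, len(prev), 2):
            # Merge two adjacent sorted blocks, as one merge sort step would.
            merged.append(list(heapq.merge(prev[i], prev[i + 1])))
        levels.append(merged)
    return levels


levels = build_merge_sort_tree([5, 1, 2, 8, 7, 9, 5, 3])
```

Every level stores all n elements once and there are log(n) + 1 levels, which is where the O(n log n) space comes from.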
To answer a query, split the given range into at most 2*log(n) subranges. For example, range [0, 4] should be split into [0, 3] and [4], which gives two sorted arrays [1 2 5 8] and [7]. Now the problem is reduced to finding the k-th element in several sorted arrays. The easiest way to solve it is a nested binary search: first use binary search to choose a candidate element from every array, starting from the largest one; then use binary search in the other (smaller) arrays to determine the rank of this candidate element. This finds the k-th element in O(log(n)^4) time. Probably some optimization (like fractional cascading) or some other algorithm could do this faster...
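Here is a rough, self-contained sketch of the query. Note that it uses a simplified variant of the nested search rather than the exact procedure described above: candidates are taken only from the root's sorted array, and the rank of each candidate is computed by binary search in every subrange array, which should take roughly O(log(n)^3) time per query. All names (`build`, `canonical_nodes`, `kth_smallest`) and the heap-style layout are assumptions made for this illustration.

```python
import bisect
import heapq


def build(a):
    """Heap-style merge sort tree: tree[1] is the root, leaves start at index size."""
    size = 1
    while size < len(a):
        size *= 2
    tree = [[] for _ in range(2 * size)]  # padding leaves stay empty
    for i, x in enumerate(a):
        tree[size + i] = [x]
    for v in range(size - 1, 0, -1):
        tree[v] = list(heapq.merge(tree[2 * v], tree[2 * v + 1]))
    return tree, size


def canonical_nodes(size, lo, hi):
    """Indices of the O(log n) nodes whose blocks exactly cover a[lo..hi]."""
    nodes = []
    left, right = lo + size, hi + size + 1  # half-open interval over the leaves
    while left < right:
        if left & 1:
            nodes.append(left)
            left += 1
        if right & 1:
            right -= 1
            nodes.append(right)
        left >>= 1
        right >>= 1
    return nodes


def kth_smallest(tree, size, lo, hi, k):
    """k-th smallest (1-based) of a[lo..hi]; requires 1 <= k <= hi - lo + 1."""
    arrays = [tree[v] for v in canonical_nodes(size, lo, hi)]
    root = tree[1]
    left, right = 0, len(root) - 1
    # Find the smallest candidate with at least k range elements <= it.
    while left < right:
        mid = (left + right) // 2
        rank = sum(bisect.bisect_right(arr, root[mid]) for arr in arrays)
        if rank >= k:
            right = mid
        else:
            left = mid + 1
    return root[left]


a = [5, 1, 2, 8, 7, 9, 5, 3]
tree, size = build(a)
third = kth_smallest(tree, size, 0, 4, 3)  # range [0, 4] holds 5 1 2 8 7
```

Since the question asks for the k-th largest number, that query can be mapped onto this sketch as `kth_smallest(tree, size, i, j, (j - i + 1) - k + 1)`.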