A is an array of the integers from 1 to n in random order. I need random access to the i-th largest element of the first j elements in log(n) time or better.
What I've come up with so far is an n x n matrix M, where the element in the (i, j) position is the i-th largest of the first j. This gives me constant-time random access, but requires n^2 storage.
By construction, M is sorted by row and column. Further, each column differs from its neighbors by a single value.
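Here is roughly what I mean, as a small Python sketch of that matrix (0-based indices here, and the names are just for illustration):

```python
import bisect

def build_matrix(A):
    """Build the n x n matrix described above (0-based here: M[i][j] is
    the (i+1)-th largest of the first j+1 elements of A). O(n^2) space."""
    n = len(A)
    M = [[None] * n for _ in range(n)]
    prefix = []                       # first j+1 elements, negated, ascending
    for j, x in enumerate(A):
        bisect.insort(prefix, -x)     # keeps the prefix sorted descending
        for i in range(j + 1):
            M[i][j] = -prefix[i]
    return M

A = [3, 1, 4, 2]
M = build_matrix(A)
assert M[0][2] == 4   # largest of the first 3 elements
assert M[1][3] == 3   # 2nd largest of the first 4 elements
```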
Can anyone suggest a way to compress M down to n log(n) space or better, with log(n) or better random access time?
I believe you can perform the access in O(log(N)) time, given O(N log(N)) preprocessing time and O(N log(N)) extra space. Here's how.
You can augment a red-black tree to support a select(i) operation which retrieves the element at rank i in O(log(N)) time. For example, see this PDF or the appropriate chapter of Introduction to Algorithms.
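To make the idea concrete, here is a rough sketch of select(i) on a size-augmented BST. This is my own illustrative node layout, not the code from either reference; a real red-black tree would also carry a color field.

```python
class Node:
    """BST node augmented with the size of its subtree (color omitted)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size_of(left) + size_of(right)

def size_of(node):
    return node.size if node is not None else 0

def select(node, i):
    """Return the key of rank i (0-based) in node's subtree, in sort order."""
    left = size_of(node.left)
    if i < left:
        return select(node.left, i)
    if i == left:
        return node.key
    return select(node.right, i - left - 1)
```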
You can implement a red-black tree (even one augmented to support select(i)) in a functional manner, such that the insert operation returns a new tree which shares all but O(log(N)) nodes with the old tree. See for example Purely Functional Data Structures by Chris Okasaki.
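A hedged sketch of the path-copying idea, reusing the Node class from the sketch above and omitting the red-black rebalancing for brevity (Okasaki shows how to do the rebalancing functionally, and it still copies only O(log(N)) nodes per insert):

```python
def insert(node, key, less_than=lambda a, b: a < b):
    """Return a new tree containing key. Every node off the insertion path
    is shared, untouched, with the old tree; only O(depth) nodes are copied."""
    if node is None:
        return Node(key)
    if less_than(key, node.key):
        return Node(node.key, insert(node.left, key, less_than), node.right)
    return Node(node.key, node.left, insert(node.right, key, less_than))
```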
We will build an array T of purely functional augmented red-black trees, such that the tree T[j] stores the indexes 0 ... j of the first j+1 elements of A, sorted largest to smallest.
Base case: At T[0], create an augmented red-black tree with just one node, whose data is the number 0, which is the index of the 0th largest element in the first 1 elements of your array A.
Inductive step: For each j from 1 to N-1, at T[j] create an augmented red-black tree by purely functionally inserting a new node with index j into the tree T[j-1]. This creates at most O(log(j)) new nodes; the remaining nodes are shared with T[j-1]. This takes O(log(j)) time.
The total time to construct the array T is O(N log(N)) and the total space used is also O(N log(N)).
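Putting the sketches together, building T might look like this (again illustrative only; the trees store indexes, ordered so that an in-order walk goes from largest to smallest value of A):

```python
def build_T(A):
    # Order indexes by the value they point at, descending, so that rank i
    # in T[j] is the index of the i-th largest among the first j+1 elements.
    def comes_first(p, q):
        return A[p] > A[q]

    T = [None] * len(A)
    T[0] = Node(0)                                   # base case: index 0 only
    for j in range(1, len(A)):
        T[j] = insert(T[j - 1], j, comes_first)      # shares nodes with T[j-1]
    return T
```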
Once T[j-1] is created, you can access the i-th largest element of the first j elements of A by performing T[j-1].select(i). This takes O(log(j)) time. Note that you can create T[j-1] lazily the first time it is needed. If A is very large and j is always relatively small, this will save a lot of time and space.
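Usage with the sketches above (the array and expected values are just an example): the i-th largest of the first j elements is A indexed by select(T[j-1], i).

```python
A = [3, 1, 4, 2]
T = build_T(A)

# 2nd largest (rank 1, 0-based) of the first 3 elements of A:
assert A[select(T[2], 1)] == 3

# Largest of the first 4 elements:
assert A[select(T[3], 0)] == 4
```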
Unless I misunderstand, you are just finding the k-th order statistic of an array which is the prefix of another array.
This can be done using an algorithm that I think is called 'quickselect' or something along those lines. Basically, it's like quicksort: pick a random pivot, partition around it, and then recurse only into the side that contains the k-th element instead of both sides.
There's a (much) better description here under the Quickselect and Quicker Select headings, and also just generally on the internet if you search for k-th order quicksort solutions.
Although the worst-case time for this algorithm is O(n^2), like quicksort, its expected case is much better (also like quicksort) if you properly select your random pivots. I think the space complexity would just be O(n); you only need one copy of the prefix for the algorithm to reorder as it works.
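For reference, a minimal quickselect sketch (random pivot, k-th largest, 0-based; the names are mine):

```python
import random

def quickselect(items, k):
    """Return the k-th largest (0-based) element of items.
    Expected O(n) time, O(n^2) worst case, O(n) extra space."""
    pivot = random.choice(items)
    larger = [x for x in items if x > pivot]
    equal = [x for x in items if x == pivot]
    if k < len(larger):
        return quickselect(larger, k)
    if k < len(larger) + len(equal):
        return pivot
    smaller = [x for x in items if x < pivot]
    return quickselect(smaller, k - len(larger) - len(equal))

A = [3, 1, 4, 2]
assert quickselect(A[:3], 1) == 3   # 2nd largest of the first 3 elements
```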