Give a data structure that stores comparable objects and supports add() and get(k) operations, where get(k) returns the kth smallest element in the data structure (1 <= k <= n). get(k) must be O(1) and add() must be O(log n), where n is the number of objects added to the data structure. Give another structure where get(k) is O(log n) and add() is O(1).
If we sort the array in ascending order, the kth element (1-indexed) is the kth smallest element. The kth largest element is the same as the (N - k + 1)th smallest, where N is the size of the array, so the kth-smallest approach answers kth-largest queries too: after an ascending sort, the kth largest sits at 0-based index N - k.
For example, given the unsorted array [3, 2, 1, 5, 6, 4] and k = 2, the result is 5: sorting gives [1, 2, 3, 4, 5, 6], and the element at index n - k = 4 is 5 (when k = 1, this is simply the last element).
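Here is a minimal sketch of that sort-based approach in Python (the function names are mine, just for illustration):

```python
def kth_smallest(arr, k):
    """kth smallest (1-indexed) of an unsorted list, via sorting: O(n log n)."""
    return sorted(arr)[k - 1]

def kth_largest(arr, k):
    """kth largest is the (n - k + 1)th smallest, i.e. index n - k
    after an ascending sort (0-indexed)."""
    return sorted(arr)[len(arr) - k]

# Example from above: the 2nd largest of [3, 2, 1, 5, 6, 4] is 5.
print(kth_largest([3, 2, 1, 5, 6, 4], 2))  # 5
```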
The best data structure for keeping track of the top k elements is a heap. Iterate through the array one element at a time, maintaining the k largest elements seen so far in a min-heap; each time we find a number larger than the smallest number in the heap, we do two things: take the smallest number out of the heap, and insert the new number. The root of the heap is then always the kth largest element seen so far.
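A short sketch of that heap idea using Python's heapq; the min-heap of size k holds the k largest elements seen so far, so its root is the kth largest (the function name is mine):

```python
import heapq

def kth_largest_stream(nums, k):
    """Keep the k largest elements seen so far in a min-heap of size k.
    The heap's root is then the kth largest element."""
    heap = []
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k:
            # pop the smallest and push x in one step.
            heapq.heapreplace(heap, x)
    return heap[0]

print(kth_largest_stream([3, 2, 1, 5, 6, 4], 2))  # 5
```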
If I got this interview question, I would respond by saying that I am unaware of any such data structures and suspect that they don't exist. However, I suspect that the data structures the interviewer is thinking of are a "sorted array" and a "skip list", respectively.
I would then explain that retrieving any element of an array by position is O(1), and figuring out where to insert a new element is O(log n) with binary search. However, the actual insertion is O(n) because the rest of the array has to be shifted, although that O(n) step comes with very good constants.
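A rough Python sketch of that sorted-array structure (class and method names are mine); bisect.insort does the O(log n) search, but the underlying list shift still makes add() O(n) overall, as noted above:

```python
import bisect

class SortedArrayKth:
    """get(k) is O(1); add() is O(log n) to find the slot plus O(n) to shift."""
    def __init__(self):
        self._data = []

    def add(self, x):
        bisect.insort(self._data, x)   # binary search + list insertion

    def get(self, k):
        return self._data[k - 1]       # kth smallest, 1-indexed

s = SortedArrayKth()
for x in [3, 2, 1, 5, 6, 4]:
    s.add(x)
print(s.get(2))  # 2 (the 2nd smallest)
```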
For the skip list, retrieving is O(log n). Inserting modifies only one pointer half of the time, two pointers a quarter of the time, three pointers an eighth of the time, and so on, for an average of two pointers, which is O(1). However, you cannot insert an element without figuring out where to put it, and that lookup is O(log n). (To make the insert truly O(1), you either need to collect O(log n) data during the lookup and make it available to the insert, or you need to create the moral equivalent of a doubly linked skip list.)
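For reference, here is a rough sketch of an indexable skip list in Python, where each forward pointer also stores a width (how many positions it skips) so that get(k) can be answered by rank. As written, both operations are expected O(log n), since add() still performs the lookup mentioned above; all names are mine:

```python
import random

MAX_LEVEL = 16

class _Node:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * level   # next node at each level
        self.width = [1] * level        # positions skipped by each pointer

class IndexableSkipList:
    """Skip list whose pointers carry widths, so get(k) works by rank."""

    def __init__(self):
        self.head = _Node(None, MAX_LEVEL)
        self.level = 1
        self.size = 0

    def _random_level(self):
        lvl = 1
        while lvl < MAX_LEVEL and random.random() < 0.5:
            lvl += 1
        return lvl

    def add(self, value):
        update = [None] * MAX_LEVEL   # rightmost node visited at each level
        pos = [0] * MAX_LEVEL         # rank of that node
        node, rank = self.head, 0
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].value < value:
                rank += node.width[i]
                node = node.forward[i]
            update[i], pos[i] = node, rank
        lvl = self._random_level()
        if lvl > self.level:
            for i in range(self.level, lvl):
                update[i], pos[i] = self.head, 0
                self.head.width[i] = self.size  # new head pointer spans the list
            self.level = lvl
        new = _Node(value, lvl)
        for i in range(lvl):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new
            # split the old width between the predecessor and the new node
            new.width[i] = pos[i] + update[i].width[i] - pos[0]
            update[i].width[i] = pos[0] - pos[i] + 1
        for i in range(lvl, self.level):
            update[i].width[i] += 1   # higher pointers now skip one more element
        self.size += 1

    def get(self, k):
        """Return the kth smallest element (1 <= k <= size) by walking widths."""
        node, rank = self.head, 0
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and rank + node.width[i] <= k:
                rank += node.width[i]
                node = node.forward[i]
        return node.value

sl = IndexableSkipList()
for x in [3, 2, 1, 5, 6, 4]:
    sl.add(x)
print(sl.get(2))  # 2
```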
There's no way to build a deterministic comparison-based data structure with amortized O(1)-time adds and worst-case O(log n)-time gets. The other configuration (O(1)-time gets with O(log n)-time adds) cannot be ruled out by an information-theoretic lower bound, but I seriously doubt that anyone knows how to do it.
For the experts: the adversary first adds n items, answering the algorithm's O(n) comparisons in such a way as to leave an antichain of size at least log² n. It then chooses k in such a way that computing get(k) forces the algorithm to do selection on the antichain, incurring a cost of Ω(log² n), which is incompatible with worst-case O(log n) gets.
Why can the adversary force such a large antichain? Suppose that the algorithm always left no antichain of more than log² n elements. By Dilworth's theorem, the elements can be partitioned into at most log² n chains, which can be merged using O(n log log n) comparisons, giving a sorting algorithm that uses o(n log n) comparisons and thus a contradiction.
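Spelling out that count, using the standard fact that m sorted chains of total length n can be merged with O(n log m) comparisons:

\[
\underbrace{O(n)}_{\text{adds, amortized } O(1) \text{ each}}
\;+\;
\underbrace{O\bigl(n \log(\log^2 n)\bigr)}_{\text{merging at most } \log^2 n \text{ chains}}
\;=\; O(n \log\log n) \;=\; o(n \log n),
\]

which contradicts the Ω(n log n) comparison lower bound for sorting.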
What could your interviewer have meant? It's conceivable to me that if both operations are amortized, then there's a solution. This is a non-canonical relaxation of the requirements, however.