Counting according to query

Tags:

Given an array of N positive elements. Lets suppose we list all N × (N+1) / 2 non-empty continuous subarrays of the array A and then replaced all the subarrays with the maximum element present in the respective subarray. So now we have N × (N+1) / 2 elements where each element is maximum among its subarray.

Now we are having Q queries, where each query is one of 3 types :

1 K : We need to count of numbers strictly greater than K among those N × (N+1) / 2 elements.

2 K : We need to count of numbers strictly less than K among those N × (N+1) / 2 elements.

3 K : We need to count of numbers equal to K among those N × (N+1) / 2 elements.

Now main problem am facing is N can be upto 10^6. So i can't generate all those N × (N+1) / 2 elements. Please help to solve this porblem.

Example : Let N=3 and we have Q=2. Let array A be [1,2,3] then all sub arrays are :

[1] -> [1]
[2] -> [2]
[3] -> [3]
[1,2] -> [2]
[2,3] -> [3]
[1,2,3] -> [3]

So now we have [1,2,3,2,3,3]. As Q=2 so :

Query 1 : 3 3

It means we need to tell count of numbers equal to 3. So answer is 3 as there are 3 numbers equal to 3 in the generated array.

Query 2 : 1 4

It means we need to tell count of numbers greater than 4. So answer is 0 as no one is greater than 4 in generated array.

Now both N and Q can be up to 10^6. So how to solve this problem. Which data structure should be suitable to solve it.

495

asked Aug 08 '15 07:08

mat7

2 Answers

I believe I have a solution in O(N + Q*log N) (More about time complexity). The trick is to do a lot of preparation with your array before even the first query arrives.

For each number, figure out where is the first number on left / right of this number that is strictly bigger.

Example: for array: 1, 8, 2, 3, 3, 5, 1 both 3's left block would be position of 8, right block would be the position of 5.

This can be determined in linear time. This is how: Keep a stack of previous maximums in a stack. If a new maximum appears, remove maximums from the stack until you get to a element bigger than or equal to the current one. Illustration:

In this example, in the stack is: [15, 13, 11, 10, 7, 3] (you will of course keep the indexes, not the values, I will just use value for better readability).

Now we read 8, 8 >= 3 so we remove 3 from stack and repeat. 8 >= 7, remove 7. 8 < 10, so we stop removing. We set 10 as 8's left block, and add 8 to the maximums stack.

Also, whenever you remove from the stack (3 and 7 in this example), set the right block of removed number to the current number. One problem though: right block would be set to the next number bigger or equal, not strictly bigger. You can fix this with simply checking and relinking right blocks.

Compute what number is how many times a maximum of some subsequence.

Since for each number you now know where is the next left / right bigger number, I trust you with finding appropriate math formula for this.

Then, store the results in a hashmap, key would be a value of a number, and value would be how many times is that number a maximum of some subsequence. For example, record [4->12] would mean that number 4 is the maximum in 12 subsequences.

Lastly, extract all key-value pairs from the hashmap into an array, and sort that array by the keys. Finally, create a prefix sum for the values of that sorted array.

Handle a request

For request "exactly k", just binary search in your array, for more/less thank``, binary search for key k and then use the prefix array.

124

answered Oct 01 '22 02:10

kajacx

This answer is an adaptation of this other answer I wrote earlier. The first part is exactly the same, but the others are specific for this question.

Here's an implemented a O(n log n + q log n) version using a simplified version of a segment tree.

Creating the segment tree: O(n)

In practice, what it does is to take an array, let's say:

A = [5,1,7,2,3,7,3,1]

And construct an array-backed tree that looks like this:

segment tree

In the tree, the first number is the value and the second is the index where it appears in the array. Each node is the maximum of its two children. This tree is backed by an array (pretty much like a heap tree) where the children of the index i are in the indexes i*2+1 and i*2+2.

Then, for each element, it becomes easy to find the nearest greater elements (before and after each element).

To find the nearest greater element to the left, we go up in the tree searching for the first parent where the left node has value greater and the index lesser than the argument. The answer must be a child of this parent, then we go down in the tree looking for the rightmost node that satisfies the same condition.

Similarly, to find the nearest greater element to the right, we do the same, but looking for a right node with an index greater than the argument. And when going down, we look for the leftmost node that satisfies the condition.

Creating the cumulative frequency array: O(n log n)

From this structure, we can compute the frequency array, that tells how many times each element appears as maximum in the subarray list. We just have to count how many lesser elements are on the left and on the right of each element and multiply those values. For the example array ([1, 2, 3]), this would be:

[(1, 1), (2, 2), (3, 3)]

This means that 1 appears only once as maximum, 2 appears twice, etc.

But we need to answer range queries, so it's better to have a cumulative version of this array, that would look like:

[(1, 1), (2, 3), (3, 6)]

The (3, 6) means, for example, that there are 6 subarrays with maxima less than or equal to 3.

Answering `q` queries: O(q log n)

Then, to answer each query, you just have to make binary searches to find the value you want. For example. If you need to find the exact number of 3, you may want to do: query(F, 3) - query(F, 2). If you want to find those lesser than 3, you do: query(F, 2). If you want to find those greater than 3: query(F, float('inf')) - query(F, 3).

Implementation

I've implemented it in Python and it seems to work well.

import sys, random, bisect
from collections import defaultdict
from math import log, ceil

def make_tree(A):
    n = 2**(int(ceil(log(len(A), 2))))
    T = [(None, None)]*(2*n-1)

    for i, x in enumerate(A):
        T[n-1+i] = (x, i)

    for i in reversed(xrange(n-1)):
        T[i] = max(T[i*2+1], T[i*2+2])

    return T

def print_tree(T):
    print 'digraph {'
    for i, x in enumerate(T):
        print '    ' + str(i) + '[label="' + str(x) + '"]'
        if i*2+2 < len(T):
            print '    ' + str(i)+ '->'+ str(i*2+1)
            print '    ' + str(i)+ '->'+ str(i*2+2)

    print '}'

def find_generic(T, i, fallback, check, first, second):
    j = len(T)/2+i
    original = T[j]
    j = (j-1)/2

    #go up in the tree searching for a value that satisfies check
    while j > 0 and not check(T[second(j)], original):
        j = (j-1)/2

    #go down in the tree searching for the left/rightmost node that satisfies check
    while j*2+1<len(T):
        if check(T[first(j)], original):
            j = first(j)
        elif check(T[second(j)], original):
            j = second(j)
        else:
            return fallback

    return j-len(T)/2


def find_left(T, i, fallback):
    return find_generic(T, i, fallback, 
        lambda a, b: a[0]>b[0] and a[1]<b[1],  #value greater, index before
        lambda j: j*2+2,                       #rightmost first
        lambda j: j*2+1                        #leftmost second
    ) 


def find_right(T, i, fallback):
    return find_generic(T, i, fallback,
        lambda a, b: a[0]>=b[0] and a[1]>b[1], #value greater or equal, index after
        lambda j: j*2+1,                       #leftmost first
        lambda j: j*2+2                        #rightmost second
    )       

def make_frequency_array(A):
    T = make_tree(A)

    D = defaultdict(lambda: 0)
    for i, x in enumerate(A):
        left = find_left(T, i, -1)
        right = find_right(T, i, len(A))
        D[x] += (i-left) * (right-i)

    F = sorted(D.items())
    for i in range(1, len(F)):
        F[i] = (F[i][0], F[i-1][1] + F[i][1])

    return F

def query(F, n):
    idx = bisect.bisect(F, (n,))
    if idx>=len(F): return F[-1][1]
    if F[idx][0]!=n: return 0
    return F[idx][1]

F = make_frequency_array([1,2,3])
print query(F, 3)-query(F, 2) #3 3
print query(F, float('inf'))-query(F, 4) #1 4
print query(F, float('inf'))-query(F, 1) #1 1
print query(F, 2) #2 3

answered Oct 01 '22 02:10

Juan Lopes

Related questions
                            
                                1d Array - determine best container size with minimum waste
                            
                                C-style array of pointers to Objective-C objects under ARC
                            
                                How to count groups of same cells in a 2d array?
                            
                                Temporary const array not binding to rvalue reference
                            
                                Combining LinkedList with String Reversal as Goal is to understand LinkedList
                            
                                Spread an array into multiple arguments for a function
                            
                                How to dynamically set array keys in php
                            
                                Reason for .get() without index in jQuery API?
                            
                                PHP Weird Undefined index error
                            
                                Deleting both values from array if duplicate - JavaScript/jQuery
                            
                                Holding Arrays not in RAM
                            
                                Constructing a PHP array from separate files
                            
                                Cython: Create memoryview without NumPy array?
                            
                                Overload function for both arrays-of-const and pointers-to-const
                            
                                Dynamic/computed keys in JMESPath?
                            
                                Haskell: Can I read integers directly into an array?
                            
                                How to Create JSON Array in Java
                            
                                Abstracting dimensionality of arrays in Java
                            
                                add and remove items between arrays in angular [duplicate]
                            
                                ExpandableListView with multiple choice save selected item into an array

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Counting according to query

Tags:

arrays

algorithm

mat7

People also ask

2 Answers

kajacx

Creating the segment tree: O(n)

Creating the cumulative frequency array: O(n log n)

Answering `q` queries: O(q log n)

Implementation

Juan Lopes

Recent Activity

Donate For Us

Counting according to query

Tags:

arrays

algorithm

mat7

People also ask

2 Answers

kajacx

Creating the segment tree: O(n)

Creating the cumulative frequency array: O(n log n)

Answering q queries: O(q log n)

Implementation

Juan Lopes

Related questions

Recent Activity

Donate For Us

Answering `q` queries: O(q log n)