
Number of Distinct Subarrays

I want to find an algorithm to count the number of distinct subarrays of an array.

For example, in the case of A = [1,2,1,2], the number of distinct subarrays is 7:

{ [1] , [2] , [1,2] , [2,1] , [1,2,1] , [2,1,2] , [1,2,1,2] }

and in the case of B = [1,1,1], the number of distinct subarrays is 3:

{ [1] , [1,1] , [1,1,1] }

A sub-array is a contiguous subsequence, or slice, of an array. Distinct means different contents; for example:

[1] from A[0:1] and [1] from A[2:3] are not distinct.

and similarly:

B[0:1], B[1:2], B[2:3] are not distinct.

Mod asked Jul 07 '13 15:07



2 Answers

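Construct a suffix tree for this array, then add together the lengths of all edges in the tree. Every distinct non-empty sub-array corresponds to exactly one position along an edge, so this sum is the count (if the tree is built with a terminator symbol, leave that symbol out of the edge lengths).

The time needed to construct the suffix tree is O(n) with a proper algorithm (Ukkonen's or McCreight's). The time needed to traverse the tree and add together the edge lengths is also O(n).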
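
A full Ukkonen or McCreight implementation is fairly long, so here is a minimal sketch of the same counting idea that swaps in a suffix automaton rather than a suffix tree. This is a different data structure from the one named above; the class name DistinctSubarrayCounter and the method count are hypothetical names chosen for illustration. Each automaton state v represents len(v) - len(link(v)) distinct sub-arrays, and summing that over all states gives the total. With hash-map transitions the construction should run in expected linear time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration: counts distinct non-empty sub-arrays
// with a suffix automaton (NOT the suffix tree described in the answer).
class DistinctSubarrayCounter {

    // One automaton state: length of the longest sub-array it represents,
    // its suffix link, and its outgoing transitions keyed by array value.
    private static final class State {
        int len = 0;
        int link = -1;
        Map<Integer, Integer> next = new HashMap<>();
    }

    static long count(int[] a) {
        List<State> st = new ArrayList<>();
        st.add(new State());                 // state 0 is the empty sub-array
        int last = 0;

        for (int c : a) {
            int cur = st.size();
            State curState = new State();
            curState.len = st.get(last).len + 1;
            st.add(curState);

            // Add the new transition along the suffix-link chain.
            int p = last;
            while (p != -1 && !st.get(p).next.containsKey(c)) {
                st.get(p).next.put(c, cur);
                p = st.get(p).link;
            }

            if (p == -1) {
                curState.link = 0;
            } else {
                int q = st.get(p).next.get(c);
                if (st.get(p).len + 1 == st.get(q).len) {
                    curState.link = q;
                } else {
                    // Split q: the clone keeps only the shorter sub-arrays.
                    int cloneIdx = st.size();
                    State clone = new State();
                    clone.len = st.get(p).len + 1;
                    clone.link = st.get(q).link;
                    clone.next = new HashMap<>(st.get(q).next);
                    st.add(clone);
                    while (p != -1 && st.get(p).next.getOrDefault(c, -1) == q) {
                        st.get(p).next.put(c, cloneIdx);
                        p = st.get(p).link;
                    }
                    st.get(q).link = cloneIdx;
                    curState.link = cloneIdx;
                }
            }
            last = cur;
        }

        // Each non-initial state contributes len(v) - len(link(v)) sub-arrays.
        long total = 0;
        for (int v = 1; v < st.size(); v++) {
            total += st.get(v).len - st.get(st.get(v).link).len;
        }
        return total;
    }
}
```

For the arrays in the question, DistinctSubarrayCounter.count(new int[]{1, 2, 1, 2}) returns 7 and DistinctSubarrayCounter.count(new int[]{1, 1, 1}) returns 3.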

Evgeny Kluev answered Oct 19 '22 22:10


Edit: I thought about how to reduce the number of iterations and comparisons, and I found a way to do it: if a sub-array of size n has already been retrieved, then every shorter sub-array starting at the same position has already been added.

Here is the updated code.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class DistinctSubarrays {
        public static void main(String[] args) {
            List<Integer> A = new ArrayList<>(Arrays.asList(1, 2, 1, 2));

            System.out.println("global list to study: " + A);

            // list of the distinct sub-arrays found so far
            List<List<Integer>> listOfUniqueList = new ArrayList<>();

            // iterate over the start position in the list, beginning at 0
            for (int initialPos = 0; initialPos < A.size(); initialPos++) {

                // iterate over the sub-array size: start with the longest, then decrease
                for (int currentListSize = A.size() - initialPos; currentListSize > 0; currentListSize--) {

                    // initialize the current sub-array
                    List<Integer> currentList = new ArrayList<>();

                    // copy the corresponding elements of the global list
                    for (int i = 0; i < currentListSize; i++) {
                        currentList.add(A.get(initialPos + i));
                    }

                    // ensure uniqueness
                    if (!listOfUniqueList.contains(currentList)){
                        listOfUniqueList.add(currentList);
                    } else {
                        continue;
                    }
                }
            }

            System.out.println("list retrieved: " + listOfUniqueList);
            System.out.println("size of list retrieved: " + listOfUniqueList.size());
        }
    }

global list to study: [1, 2, 1, 2]

list retrieved: [[1, 2, 1, 2], [1, 2, 1], [1, 2], [1], [2, 1, 2], [2, 1], [2]]

size of list retrieved: 7

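As written, the check if (!listOfUniqueList.contains(currentList)){ runs once for every sub-array, that is n(n+1)/2 times for a list of size n: 10 times for your example [1, 2, 1, 2], and 36 times for the input [1, 2, 1, 2, 1, 2, 1, 2], which contains 15 distinct sub-arrays. Replacing continue with break applies the observation above, because once a sub-array is already in the list, all shorter sub-arrays starting at the same position are in it too. For a list that repeats the same pattern many times, the number of iterations and comparisons then becomes quite low.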
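
The early-exit idea from the edit at the top of this answer can be sketched as follows. This is a hedged variant, not the code above: it assumes a HashSet instead of the list lookup, stops the inner loop at the first sub-array that is already known, and keeps a counter of membership checks purely for illustration. The names EarlyExitSubarrays, distinct, and checks are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical variant for illustration: same enumeration order as the
// answer's code, but with a hash set and an early exit from the inner loop.
class EarlyExitSubarrays {

    // Number of membership checks made by the most recent call to distinct().
    static int checks;

    static Set<List<Integer>> distinct(List<Integer> a) {
        Set<List<Integer>> seen = new HashSet<>();
        checks = 0;

        for (int start = 0; start < a.size(); start++) {
            // Longest sub-array first, then shorter ones from the same start.
            for (int size = a.size() - start; size > 0; size--) {
                // Copy the slice so the set does not hold a view of 'a'.
                List<Integer> slice = new ArrayList<>(a.subList(start, start + size));
                checks++;
                // add() returns false if the slice was already present. In that
                // case every shorter slice from this start is present too, so stop.
                if (!seen.add(slice)) {
                    break;
                }
            }
        }
        return seen;
    }
}
```

With this variant, the input [1, 2, 1, 2] still yields 7 distinct sub-arrays but needs 9 checks instead of 10, and [1, 2, 1, 2, 1, 2, 1, 2] yields 15 distinct sub-arrays with 21 checks instead of 36.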

skoll answered Oct 19 '22 22:10