Number of Distinct Subarrays

Tags: arrays, algorithm

I want to find an algorithm to count the number of distinct subarrays of an array.

For example, in the case of A = [1,2,1,2], the number of distinct subarrays is 7:

{ [1] , [2] , [1,2] , [2,1] , [1,2,1] , [2,1,2], [1,2,1,2]}

and in the case of B = [1,1,1], the number of distinct subarrays is 3:

{ [1] , [1,1] , [1,1,1] }

A sub-array is a contiguous subsequence, or slice, of an array. Distinct means different contents; for example:

[1] from A[0:1] and [1] from A[2:3] are not distinct.

and similarly:

B[0:1], B[1:2], B[2:3] are not distinct.


asked Jul 07 '13 15:07

Mod

2 Answers

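Construct a suffix tree for this array, then add together the lengths of all edges in the tree. Every distinct non-empty subarray ends at exactly one position along some edge, so the sum of the edge lengths is the number of distinct subarrays. (If a terminator symbol is appended to build the tree, exclude it from each leaf edge's length.)

The time needed to construct the suffix tree is O(n) with a proper algorithm (Ukkonen's or McCreight's), assuming child lookups take constant time, e.g. a bounded alphabet or hashing. The time needed to traverse the tree and add together the lengths is also O(n).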
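To see why the edge-length sum gives the right count, it may help to look at the uncompressed version of the same structure. The sketch below is not Ukkonen's or McCreight's algorithm: it inserts every suffix into a plain trie, which takes O(n^2) time and space, and counts the nodes it creates. Each node corresponds to one distinct subarray, and a suffix tree merely compresses chains of such nodes into edges, so the node count here equals the sum of edge lengths there. The class and method names are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class DistinctSubarraysTrie {

    // One trie node; children are keyed by array value.
    private static final class Node {
        final Map<Integer, Node> children = new HashMap<>();
    }

    // Counts distinct non-empty contiguous subarrays by inserting every
    // suffix of the array into an uncompressed trie (quadratic, not linear).
    static long countDistinctSubarrays(int[] a) {
        Node root = new Node();
        long count = 0;
        for (int start = 0; start < a.length; start++) {
            Node node = root;
            for (int i = start; i < a.length; i++) {
                Node child = node.children.get(a[i]);
                if (child == null) {
                    child = new Node();
                    node.children.put(a[i], child);
                    count++; // a new node is a subarray not seen before
                }
                node = child;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countDistinctSubarrays(new int[] {1, 2, 1, 2})); // prints 7
        System.out.println(countDistinctSubarrays(new int[] {1, 1, 1}));    // prints 3
    }
}
```

For large inputs the quadratic memory use of this trie is exactly what a real suffix tree (or a suffix array with an LCP array) avoids.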


answered Oct 19 '22 22:10

Evgeny Kluev

Edit: I thought about how to reduce the number of iterations and comparisons. The idea: sizes are tried from largest to smallest, so if a sub-array of size n is found to be already in the list, every shorter sub-array with the same starting position has already been added too.

Here is the updated code.

    import java.util.ArrayList;
    import java.util.List;

    public class DistinctSubarrays {
        public static void main(String[] args) {
            List<Integer> A = new ArrayList<Integer>();
            A.add(1);
            A.add(2);
            A.add(1);
            A.add(2);

            System.out.println("global list to study: " + A);

            // list of the distinct sub-arrays found so far
            List<List<Integer>> listOfUniqueList = new ArrayList<List<Integer>>();

            // iterate over the start position in the list, beginning at 0
            for (int initialPos = 0; initialPos < A.size(); initialPos++) {

                // iterate over the sub-array size: start with the longest one, then shrink
                for (int currentListSize = A.size() - initialPos; currentListSize > 0; currentListSize--) {

                    // initialize the current sub-array
                    List<Integer> currentList = new ArrayList<Integer>();

                    // copy the corresponding elements of the global list
                    for (int i = 0; i < currentListSize; i++) {
                        currentList.add(A.get(initialPos + i));
                    }

                    // ensure uniqueness
                    if (!listOfUniqueList.contains(currentList)) {
                        listOfUniqueList.add(currentList);
                    } else {
                        // already present: move on to the next (smaller) size
                        continue;
                    }
                }
            }

            System.out.println("list retrieved: " + listOfUniqueList);
            System.out.println("size of list retrieved: " + listOfUniqueList.size());
        }
    }

Output:

    global list to study: [1, 2, 1, 2]
    list retrieved: [[1, 2, 1, 2], [1, 2, 1], [1, 2], [1], [2, 1, 2], [2, 1], [2]]
    size of list retrieved: 7

With a list containing the same pattern many times, the number of iterations and comparisons will be quite low. For your example [1, 2, 1, 2], the line if (!listOfUniqueList.contains(currentList)){ is executed 10 times. It only rises to 36 for the input [1, 2, 1, 2, 1, 2, 1, 2], which contains 15 different sub-arrays.
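A note on those counts: as far as I can tell, the check in the code above runs once per (start, size) pair, i.e. n(n+1)/2 times for an input of length n (10 for n = 4, 36 for n = 8), because a continue at the very end of a loop body skips nothing. A possible variant, which is my own sketch and not part of the code above, stops shrinking as soon as a sub-array is already known, since all of its shorter prefixes were added when it was first stored. It also assumes a HashSet is acceptable in place of the list, so each membership test no longer scans every stored sub-array. The class name, method name, and the checks counter are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DistinctSubarraysEarlyExit {

    // Number of membership checks made by the last call (illustration only).
    static int checks;

    static Set<List<Integer>> distinctSubarrays(List<Integer> a) {
        checks = 0;
        Set<List<Integer>> seen = new HashSet<>();
        for (int start = 0; start < a.size(); start++) {
            // Longest sub-array first, then shrink.
            for (int size = a.size() - start; size > 0; size--) {
                // Copy the view so the stored list does not depend on 'a'.
                List<Integer> current = new ArrayList<>(a.subList(start, start + size));
                checks++;
                // add() returns false when the sub-array is already present;
                // its shorter prefixes were stored together with it, so stop.
                if (!seen.add(current)) {
                    break;
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<List<Integer>> result = distinctSubarrays(Arrays.asList(1, 2, 1, 2));
        System.out.println(result.size() + " distinct, " + checks + " checks"); // prints "7 distinct, 9 checks"
    }
}
```

This still builds and hashes every candidate sub-array it inspects, so it remains quadratic or worse in the general case; it only saves work on inputs with many repeats.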

answered Oct 19 '22 22:10

skoll
