Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the number of repeated subarray in an array

There is an array indexed from 0-n (i.e. size=n) containing elements from 0-m where m < n (assume m is 100 or 1000 times less than n i.e. m is much less than n), so many of the elements or sub array must be repeated and we have to find the number of such repeated sub array of size 1 or more than 1. For example, consider an array
1 2 4 1 2 4 1 5 6 3 5, here
1 is repeated 2 times
2 is repeated 1 time
4 is repeated 1 time
5 is repeated 1 time
1 2 is repeated 1 time
2 4 is repeated 1 time
4 1 is repeated 1 time
1 2 4 is repeated 1 time
2 4 1 is repeated 1 time
1 2 4 1 is repeated 1 time
So the final answer is sum of these i.e. 11
Can anybody please suggest some data structure or an efficient algorithm to solve this. I was thinking of hashing m and noting the index value where it occurs but didn't formalize it.
Assume n,m < 10^5

like image 419
justice league Avatar asked Jul 14 '13 10:07

justice league


1 Answers

  1. Create a hash that has integers as keys, and extendible lists (linked lists, e.g.) of integers as values.

  2. Read through your input list. Treat each input element being scanned as a key; append the index of the following element to the associated list of values.

  3. Now, your repeated 1-element count is n - |keys|

  4. Next, go through your keys. For each key, treat the indexed elements of the original list as a new input list. Create a new hash and continue as in step 1. Note that when we store indices we continue to use those of the original list.

Example: 1 2 4 1 2 4 1 5 6 3 5 (assume we're zero-indexed). Here, n=11.

  • 1 -> [1, 4, 7]
  • 2 -> [2, 5]
  • 3 -> [10]
  • 4 -> [3, 6]
  • 5 -> [8, nil]
  • 6 -> [9]

The number of repeated 1-elts is 11 - 6 = 5.

Now, we go through the keys. We start with 1. The index list is [1, 4, 7] which corresponds with elements [2, 2, 5].

  • 2 -> [2, 5]
  • 5 -> [8]

This picks up 3 - 2 = 1 duplicate 2-element list.

etc...

Running time: Linear in the input + output

like image 84
Dave Avatar answered Nov 13 '22 13:11

Dave