Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting inversions in an array

Tags:

algorithm

People also ask

How do you calculate inversion?

One way to help calculate the inversion number is to look at each position in the permutation and count how many smaller numbers are to the right, and then add those numbers up. An inversion in a permutation is a pair of numbers such that the larger number appears to the left of the smaller one in the permutation.

How many inversions does a sorted array have?

The inversions of an array indicate; how many changes are required to convert the array into its sorted form. When an array is already sorted, it needs 0 inversions, and in another case, the number of inversions will be maximum, if the array is reversed.

How do you find the inversion of an array in Java?

Given an array, find the total number of inversions of it. If (i < j) and (A[i] > A[j]) , then pair (i, j) is called an inversion of an array A . We need to count all such pairs in the array.


So here is O(n log n) solution in java.

long merge(int[] arr, int[] left, int[] right) {
    int i = 0, j = 0, count = 0;
    while (i < left.length || j < right.length) {
        if (i == left.length) {
            arr[i+j] = right[j];
            j++;
        } else if (j == right.length) {
            arr[i+j] = left[i];
            i++;
        } else if (left[i] <= right[j]) {
            arr[i+j] = left[i];
            i++;                
        } else {
            arr[i+j] = right[j];
            count += left.length-i;
            j++;
        }
    }
    return count;
}

long invCount(int[] arr) {
    if (arr.length < 2)
        return 0;

    int m = (arr.length + 1) / 2;
    int left[] = Arrays.copyOfRange(arr, 0, m);
    int right[] = Arrays.copyOfRange(arr, m, arr.length);

    return invCount(left) + invCount(right) + merge(arr, left, right);
}

This is almost normal merge sort, the whole magic is hidden in merge function. Note that while sorting algorithm remove inversions. While merging algorithm counts number of removed inversions (sorted out one might say).

The only moment when inversions are removed is when algorithm takes element from the right side of an array and merge it to the main array. The number of inversions removed by this operation is the number of elements left from the the left array to be merged. :)

Hope it's explanatory enough.


I've found it in O(n * log n) time by the following method.

  1. Merge sort array A and create a copy (array B)
  2. Take A[1] and find its position in sorted array B via a binary search. The number of inversions for this element will be one less than the index number of its position in B since every lower number that appears after the first element of A will be an inversion.

    2a. accumulate the number of inversions to counter variable num_inversions.

    2b. remove A[1] from array A and also from its corresponding position in array B

  3. rerun from step 2 until there are no more elements in A.

Here’s an example run of this algorithm. Original array A = (6, 9, 1, 14, 8, 12, 3, 2)

1: Merge sort and copy to array B

B = (1, 2, 3, 6, 8, 9, 12, 14)

2: Take A[1] and binary search to find it in array B

A[1] = 6

B = (1, 2, 3, 6, 8, 9, 12, 14)

6 is in the 4th position of array B, thus there are 3 inversions. We know this because 6 was in the first position in array A, thus any lower value element that subsequently appears in array A would have an index of j > i (since i in this case is 1).

2.b: Remove A[1] from array A and also from its corresponding position in array B (bold elements are removed).

A = (6, 9, 1, 14, 8, 12, 3, 2) = (9, 1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 6, 8, 9, 12, 14) = (1, 2, 3, 8, 9, 12, 14)

3: Rerun from step 2 on the new A and B arrays.

A[1] = 9

B = (1, 2, 3, 8, 9, 12, 14)

9 is now in the 5th position of array B, thus there are 4 inversions. We know this because 9 was in the first position in array A, thus any lower value element that subsequently appears would have an index of j > i (since i in this case is again 1). Remove A[1] from array A and also from its corresponding position in array B (bold elements are removed)

A = (9, 1, 14, 8, 12, 3, 2) = (1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 8, 9, 12, 14) = (1, 2, 3, 8, 12, 14)

Continuing in this vein will give us the total number of inversions for array A once the loop is complete.

Step 1 (merge sort) would take O(n * log n) to execute. Step 2 would execute n times and at each execution would perform a binary search that takes O(log n) to run for a total of O(n * log n). Total running time would thus be O(n * log n) + O(n * log n) = O(n * log n).

Thanks for your help. Writing out the sample arrays on a piece of paper really helped to visualize the problem.


I wonder why nobody mentioned binary-indexed trees yet. You can use one to maintain prefix sums on the values of your permutation elements. Then you can just proceed from right to left and count for every element the number of elements smaller than it to the right:

def count_inversions(a):
  res = 0
  counts = [0]*(len(a)+1)
  rank = { v : i+1 for i, v in enumerate(sorted(a)) }
  for x in reversed(a):
    i = rank[x] - 1
    while i:
      res += counts[i]
      i -= i & -i
    i = rank[x]
    while i <= len(a):
      counts[i] += 1
      i += i & -i
  return res

The complexity is O(n log n), and the constant factor is very low.


In Python

# O(n log n)

def count_inversion(lst):
    return merge_count_inversion(lst)[1]

def merge_count_inversion(lst):
    if len(lst) <= 1:
        return lst, 0
    middle = int( len(lst) / 2 )
    left, a = merge_count_inversion(lst[:middle])
    right, b = merge_count_inversion(lst[middle:])
    result, c = merge_count_split_inversion(left, right)
    return result, (a + b + c)

def merge_count_split_inversion(left, right):
    result = []
    count = 0
    i, j = 0, 0
    left_len = len(left)
    while i < left_len and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            count += left_len - i
            j += 1
    result += left[i:]
    result += right[j:]
    return result, count        


#test code
input_array_1 = []  #0
input_array_2 = [1] #0
input_array_3 = [1, 5]  #0
input_array_4 = [4, 1] #1
input_array_5 = [4, 1, 2, 3, 9] #3
input_array_6 = [4, 1, 3, 2, 9, 5]  #5
input_array_7 = [4, 1, 3, 2, 9, 1]  #8

print count_inversion(input_array_1)
print count_inversion(input_array_2)
print count_inversion(input_array_3)
print count_inversion(input_array_4)
print count_inversion(input_array_5)
print count_inversion(input_array_6)
print count_inversion(input_array_7)

I had a question similar to this for homework actually. I was restricted that it must have O(nlogn) efficiency.

I used the idea you proposed of using Mergesort, since it is already of the correct efficiency. I just inserted some code into the merging function that was basically: Whenever a number from the array on the right is being added to the output array, I add to the total number of inversions, the amount of numbers remaining in the left array.

This makes a lot of sense to me now that I've thought about it enough. Your counting how many times there is a greater number coming before any numbers.

hth.