Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting inversions in ranges

I participated in a programming contest in which I was unable to solve a problem, the problem was:

Given an array A of n integers, I need to count the number of inversions in given ranges. An integer m is provided which tells the number of ranges, then m lines follow, in each line two integers li and ri are given.

We have to count inversions in specified range only i.e. from li to ri inclusive(0 based indexing).

Two elements A[i] and A[j] add to inversion if A[i]>A[j] and i<j.

for example: A=[3 2 1 4]

The inversions are:

(2, 1), (3, 1), (3, 2) i.e. total number of inversions are 3.

Input:

3 2 1 4    //Array A 
3         // m - no. of ranges
1 2      // range
2 3
0 3

Output:

1
0
3

Constraints:

n<=2*10^4
m<=2*10^4
A[i]<=10^9

I know methods to compute inversion count in O(nlogn) (such as BIT or merge sort) on whole array and if I apply same here to every range the complexity would be O(mnlogn), that's surely not acceptable as time limit is 1 second.

like image 227
Akashdeep Saluja Avatar asked Feb 13 '14 19:02

Akashdeep Saluja


3 Answers

O((n + m) * sqrt(n) * log(n)) time, O(n + m) space algorithm with off-line queries (all query ranges must be known in advance):

  1. Divide array A into approximately sqrt(n) equal parts.
  2. For each part of the array do steps 3...7.
  3. Initialize 3 pointers (Pbegin, Pmid, Pend) so that they all point to the end of current part of the array (or even better to rightmost beginning of range belonging to this part).
  4. Advance Pend; update order-statistics tree that determines number of inversions between Pmid and Pend; when Pend coincides with end of some range which has beginning in current part of the array, do steps 5...7.
  5. Move Pbegin backwards until it coincides with the beginning of range found on step 4. Accumulate number of elements in order-statistics tree less than values pointed by Pbegin (but do not update the tree).
  6. Find number of inversions between Pbegin and Pmid (using either merge sort or separate order-statistics tree).
  7. Add together number of inversions found on steps 4, 5, 6. This is number of inversions for current range.

BIT/Fenwick tree may be used here as efficient implementation of order-statistics tree. In this case some pre-processing is needed: substitute array values by their indexes in sorted copy of the array to compress value range.


O((n + m) * sqrt(n)) time, O(n * sqrt(n)) space algorithm with on-line queries. As hinted by David Eisenstat.

Pre-processing:

  1. Substitute array values by their indexes in sorted copy of the array to compress value range.
  2. Divide array A into approximately sqrt(n) equal parts.
  3. For each part B of the array do steps 4...8.
  4. Zero-initialize bitset of size 'n'. Toggle bits corresponding to values of part 'B'. For this bitset compute array of prefix sums P and array of suffix sums S.
  5. For each part C preceding part B do step 6.
  6. Starting from last value v of part C and going backwards, add together all values P[v] and write sqrt(n) results to lookup table E. Complexity of this step is O(n * sqrt(n)).
  7. For each part D following part B do step 8.
  8. Starting from first value v of part D and going forwards, add together all values S[v] and write sqrt(n) results to lookup table F. Complexity of this step is O(n * sqrt(n)).
  9. Use incremental algorithm (Fenwick tree) to count number of inversions in all block suffixes (all sub-arrays starting at block boundary with length not greater than sqrt(n)). Write results to lookup table G. Complexity of this step is O(n * log n).
  10. Use incremental algorithm to count number of inversions in all block prefixes. Write results to lookup table H. Complexity of this step is O(n * log n).
  11. Use either of E or F and either of G or H to compute number of inversions in consecutive blocks (with any pair of blocks for starting and ending positions). Write results to lookup table R. Complexity of this step is O(n).

After pre-processing we have several LUTs containing number of inversions in block prefixes/suffixes (G, H, O(n) space), number of inversions between complete blocks and block prefixes/suffixes (E, F, O(n * sqrt(n)) space), and number of inversions in consecutive blocks (R, O(n) space). (Optionally we could merge E and F with R, which increases pre-processing time but allows O(1) time for first query step).

Query:

  1. Use R to get number of inversions in consecutive complete blocks, use E and F to add number of inversions between prefix/suffix and each of these complete blocks, use G and H to add number of inversions in prefix and in suffix.
  2. To get number of inversions between prefix and suffix, sort both with radix sort (sqrt(n) counters, 2 counting passes, and 2 passes to distribute values), and merge them.
  3. Add values from steps 1 and 2 to get number of inversions for query range.
like image 182
Evgeny Kluev Avatar answered Nov 09 '22 14:11

Evgeny Kluev


Here's an O((n + m) sqrt n log n)-time algorithm. That's probably not good enough to pass, but something seems not quite right here -- the usual programming contest tricks don't work. (O((n + m) sqrt n) might be achievable with more care.)

Divide the input array into sqrt n subarrays of length sqrt n, called blocks. Using an incremental algorithm for counting inversions, for each pair consisting of a block and a prefix of the array, compute the number of inversions where the first element comes from the former and the second element comes from the latter. (O(n sqrt n log n)) Do the same for prefix-block pairs.

For each input range, decompose it into the union of some blocks (blocked elements) and less than 2 sqrt n elements (unblocked elements). Using the precomputed results and inclusion-exclusion, find the number of inversions in the range where at least one element is blocked. (O(sqrt n)) Compute and add to this quantity the number of inversions in the range involving two unblocked elements. (O(sqrt n log n))

like image 43
David Eisenstat Avatar answered Nov 09 '22 13:11

David Eisenstat


Here's an elaboration of a previous answer, also with a potential gap filled. First you compute and store the number of inversions for all prefixes of your array, in O(n log n) time, by adding one element at a time from right to left and doing binary search tree search to find the element in the tree of all previous elements to determine the extra number of inversions added, and then insert the element into the binary tree (and maintain the tree as a self-balancing binary search tree). Then you similarly compute and store the number of inversions in all suffixes. Then, if you want to count the number of inversions in a range [L,R], you add the inversions for the prefix starting at L to the inversions in the suffix ending at R, and subtract the total number of inversions for the entire array. This almost gives you the answer but not quite, because it gives you the answer minus the number of inversions between the ranges [1,L-1] and [R+1,n]. So you need to be able to compute the number of inversions between an arbitrary prefix and suffix pair in your array. For this, you compute the number of inversions between arbitrary prefixes and the specific suffixes that start with a multiple of sqrt(n). You can do this in O(n^(3/2) log n) time by sorting each suffix and then, for each suffix, adding one element to the prefix from left to right, doing a binary search in the suffix to figure out how much to add to the number of inversions. Similarly you compute and store the number of inversions between each prefix that ends in a multiple of sqrt(n) and each element to the right of the prefix in O(n^(3/2) log n) time.

Then, for a given range, you take the prefix and the suffix and round the suffix off to end in the nearest multiple of sqrt(n) higher, and look up the number of inversions, in O(1) time. Then you take the remaining elements in the suffix and look up the number of inversions in the prefix that ends in the nearest multiple of sqrt(n) lower, in O(sqrt(n)) time. Then you take the remaining elements in the suffix and the remaining elements in the prefix (not included in the sqrt(n) endpoints), and brute-force compute the number of inversions between them in O(sqrt(n) log n) time. Total computation time is O(sqrt(n) log n) per range, giving overall runtime of O((m + n) sqrt(n) log n) time, which should satisfy the 1 second time limit.

like image 31
user2566092 Avatar answered Nov 09 '22 14:11

user2566092