Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find Kth Smallest Pair Distance - Analysis

Question:

This is a problem from LeetCode:

Given an integer array, return the k-th smallest distance among all the pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.

Example:

Input:
nums = [1,3,1]
k = 1
Output: 0 
Explanation:
Here are all the pairs:
(1,3) -> 2
(1,1) -> 0
(3,1) -> 2
Then the 1st smallest distance pair is (1,1), and its distance is 0.

My Problem

I solved it with a naive approach O(n^2) basically I find all distances and sort it then find the kth smallest. Now here is a better Solution. It is not my code I found it on the discussion forum on leetcode. But I am having trouble understanding a crucial part of the code.

The code below is basically doing a binary search. the low is the min distance and high is the max distance. calculate a mid like usual binary search. then it does countPairs(a, mid) to find number of pairs with absolute difference less than or equal to mid. then adjust low and high accordingly.

But WHY the binary Search result MUST be one of the distances? At first, low and high are got from the array, but the mid, is calculated by them, it may not be the distance. In the end we are returning low which the values changes during the binary search base on mid + 1. Why is mid + 1 guarantee to be one of the distance?

class Solution {
    // Returns index of first index of element which is greater than key
    private int upperBound(int[] a, int low, int high, int key) {
        if (a[high] <= key) return high + 1;
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (key >= a[mid]) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }

    // Returns number of pairs with absolute difference less than or equal to mid.
    private int countPairs(int[] a, int mid) {
        int n = a.length, res = 0;
        for (int i = 0; i < n; i++) {
            res += upperBound(a, i, n - 1, a[i] + mid) - i - 1;
        }
        return res;
    }

    public int smallestDistancePair(int a[], int k) {
        int n = a.length;
        Arrays.sort(a);

        // Minimum absolute difference
        int low = a[1] - a[0];
        for (int i = 1; i < n - 1; i++)
            low = Math.min(low, a[i + 1] - a[i]);

        // Maximum absolute difference
        int high = a[n - 1] - a[0];

        // Do binary search for k-th absolute difference
        while (low < high) {
            countPairs(a, mid)
            if (countPairs(a, mid) < k)
                low = mid + 1;
            else
                high = mid;
        }

        return low;
    }
}
like image 503
OLIVER.KOO Avatar asked Feb 11 '18 19:02

OLIVER.KOO


1 Answers

This type of binary search will find the first value x for which countPairs(a,x) >= k. (The topcoder tutorial explains this well.)

Therefore when the function terminates with final value low, we know that the number of pairs changes when the distance changes from low-1 to low, and therefore there must be a pair with distance low.

For example, suppose we have a target of 100 and know that:

countPairs(a,9) = 99
countPairs(a,10) = 100

There must be a pair of numbers with distance exactly 10, because if there was no such pair, then the number of pairs with distance less than or equal to 10 would be the same as the number of pairs with distance less than or equal to 9.

Note that this only applies because the loop is run until the interval under test is completely exhausted. If the code had instead used an early termination condition that quit the loop if the exact target value was found, then it could return incorrect answers.

like image 66
Peter de Rivaz Avatar answered Oct 27 '22 17:10

Peter de Rivaz