I am working on problem Contains Duplicate III - LeetCode
Given an array of integers, find out whether there are two distinct indices i and j in the array such that the absolute difference between nums[i] and nums[j]**is at most t and the **absolute difference between i and j is at most k.
Example 1:
Input: nums = [1,2,3,1], k = 3, t = 0 Output: true
Example 2:
Input: nums = [1,0,1,1], k = 1, t = 2 Output: true
Example 3:
Input: nums = [1,5,9,1,5,9], k = 2, t = 3 Output: false
Read a bucket sort solution from the discussion area
class Solution2:
def containsNearbyAlmostDuplicate(self, nums, k, t):
if t < 0: return False
lookup = {}
for i in range(len(nums)):
b_idx = nums[i] // (t+1)
if b_idx in lookup:
return True
if b_idx - 1 in lookup and abs(nums[i] - lookup[b_idx - 1]) < t+1:
return True
if b_idx + 1 in lookup and abs(nums[i] - lookup[b_idx + 1]) < t+1:
return True
lookup[b_idx] = nums[i]
if i >= k: del lookup[nums[i-k] // (t+1)]
return False
The explaination
The idea is like the bucket sort algorithm. Suppose we have consecutive buckets covering the range of nums with each bucket a width of (t+1). If there are two item with difference <= t, one of the two will happen:
(1) the two in the same bucket
(2) the two in neighbor buckets
I know the logic of bucket sort, but have no ideas how this solution works.
I think, it is necessary to compare between all the values with range of width, but the solution only compare the same bucket and the adjacent bucket,
del lookup[num[i-k]
, I cannot figure out what the purpose of this manipulation.
First Question: why only compare the same bucket and the adjacent bucket?
As author said, there are two situations if (a, b)
is a valid pair:
(1) the two in the same bucket
(2) the two in neighbor buckets
If b - a <= t
then only two situations as above said, you can understand it by bucket examples here:
<--a-- t+1 ---b-><----- t+1 -----> in same bucket
<----- t+1 --a--><---b- t+1 -----> in neighbor buckets
Bucket
is used here is because we want to divide the range into pariticular width, and decrease the compare times. This is a method of trading space for time.
Second Question: why del lookup[num[i-k]
?
Because the second restrict is the difference index i
, j
should at most k.
So in for i in range(len(nums)):
we should remove the previous index j from bucket, if i - j == k
. And difference equal k is included, so we should remove after logic.
If you don't do that, you will find the pairs which abs(nums[i]-nums[j])<=t
, but abs(i-j)>t
I hope I make it clear, and comment if you have further questions. : )
By the way, an advice: if you are confused or stucked, you can go through a example by print or debug, you will be more clear, especially for corner case.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With