Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find a shortest distance between two buckets of numbers

I have two buckets (unordered, 1-dimentional data structures) of numbers and I want to calculate a minimal distance between any elements of the two buckets. Is there a way to find the shortest distance between any number from different buckets in O(1)? What is my best bet?

Input
[B1] 1, 5, 2, 347, 50
[B2] 21, 17, 345

Output
2 // abs(347 - 345)

Edits

  • I expect to have more lookups than inserts
  • Distance between smallest and largest elements in any bucket is less than 10^5
  • Number of elements in any bucket is less than 10^5
  • Numbers in buckets are "nearly" sorted - these are timestamps of events. There's probably less than 1% of elements in the buckets that are out of order
  • The number of elements in buckets is small, but I need to lookup at an average rate of 2k/sec, and periodically drop stale buckets and replace them with new buckets, hence I want my lookups be in O(1)

See why I need this and what I have thought of in the previous question edition.

like image 811
oleksii Avatar asked Nov 29 '16 16:11

oleksii


People also ask

How do you find the shortest distance between two points?

The shortest distance between two points is a straight line. This distance can be calculated by using the distance formula. The distance between two points ( x 1 , y 1 ) and ( x 2 , y 2 ) can be defined as d = ( x 2 − x 1 ) 2 + ( y 2 − y 1 ) 2 .

How do you find the shortest distance between two points in Python?

Use dist in a nested loop inside shortestDist to compare each element of the list of points with every element in the list after it. So, basically, find the shortest distance between points in a list. That finds the distance alright between two points.

How do you find the minimum distance between elements in an array?

Approach: The task is to find the distance between two given numbers, So find the distance between any two elements using nested loops. The outer loop for selecting the first element (x) and the inner loop for traversing the array in search for the other element (y) and taking the minimum distance between them.


2 Answers

Let there be n numbers in total.
1. Write all numbers in binary. ==> O(n)
2. Append 0 or 1 in each number, according to whether from B1 or B2. ==> O(n)
3. Quicksort them, ignoring the first bit. ==> O(n log n) in average
4. for the whole list, iterate through sorted order. For every two adjacent numbers u and v, if they came from both B1 or B2, ignore.
Otherwise, set tmp <-- abs(u-v) whenever tmp > abs(u-v). Thus, tmp is the minimal distance so far, within adjacent numbers.
The final tmp is answer. ==> O(n)

in total: ==> O(n log n) in average

like image 184
Violapterin Avatar answered Sep 23 '22 16:09

Violapterin


Create a bitvector of 10^5 elements for each bucket. Keep track of the min distance (initially 10^5 until both buckets are nonempty).

Now, say you're adding an element x to one of the buckets. Do the following:

1. Set the bit x of the same bucket.
2. Check whether the other bitvector has any set elements within min_distance-1 of x
3. Update min_distance as appropriate

Running time: On inserting it's O(min_distance), which is technically O(1) since min_distance is capped. On polling it's O(1) since you're just returning min_distance.

edit If the elements aren't capped at 10^5 but just the distance between the min and the max is, this will need to be modified but will still work. I can detail the necessary changes if this matters.

like image 35
Dave Avatar answered Sep 24 '22 16:09

Dave