Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the fastest way to return x,y coordinates that are present in both list A and list B?

I have two lists (list A and list B) of x,y coordinates where 0 < x < 4000, 0 < y < 4000, and they will always be integers. I need to know what coordinates are in both lists. What would be your suggestion for how to approach this?

I have been thinking about representing the lists as two grids of bits and doing bitwise & possibly?

List A has about 1000 entries and changes maybe once every 10,000 requests. List B will vary wildly in length and will be different on every run through.

EDIT: I should mention that no coordinate will be in lists twice; 1,1 cannot be in list A more than once for example.

like image 421
boodle Avatar asked Mar 24 '11 16:03

boodle


2 Answers

Represent (x,y) as a single 24 bit number as described in the comments.

Maintain A in numerical order (you said it doesn't vary much, so this should be hardly any cost).

For each B do a binary search on the list. Since A is about 1000 items big, you'll need at most 10 integer comparisons (in the worst case) to check for membership.

If you have a bit more memory (about 2MB) to play with you could create a bit-vector to support all possible 24 bit numbers then then perform a single bit operation per item to test for membership. So A would be represented by a single 2^24 bit number with a bit-set if the value is there (otherwise 0). To test for membership you would just use an appropriate bit and operation.

like image 161
Jeff Foster Avatar answered Oct 13 '22 00:10

Jeff Foster


Put the coordinates of list A into some kind of a set (probably a hash, bst, or heap), then you can quickly see if the coordinate from list B is present.

Depending on whether you're expecting the list to be present or not present in the list would determine what underlying data structure you use.

Hashes are good at telling you if something is in it, though depending on how it's implemented, could behave poorly when trying to find something that isn't in it.

bst and heaps are equally good at telling you if something is in it or not, but don't perform theoretically as well as hashes when something is in it.

like image 22
helloworld922 Avatar answered Oct 12 '22 23:10

helloworld922