Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimize intersection of large sets

The premise is simple: I have two integers, a and b and I want to find i s.t. a + i and b + i are both in a given list. The list rs is very large (10e9 items). I have the following code:

def getlist(a,b):
    a1 = set([i - a for i in rs if i>a])
    b1 = set([i-b for i in rs if i>b]) 

    tomp = list(a1.intersection(b1))
    return tomp

The issue at hand is that a1 and b1 are pre-computed first which creates a memory problem. Can I optimize my code somehow? General comments about the method are also welcome.

Example input:

rs = [4,9,16]
a = 3
b = 8

Expected output:

getlist(3,8) = [1]
like image 413
S. L. Avatar asked Dec 29 '25 05:12

S. L.


1 Answers

You can optimize the memory usage by skipping the creation of the second set (and intermediate lists):

def getlist(a, b):
    a1 = {i - a for i in rs if i > a}
    return [i - b for i in rs if i > b and i - b in a1]

The time and space complexity of this solution is O(n).

like image 147
Eugene Yarmash Avatar answered Dec 31 '25 19:12

Eugene Yarmash