 

Is there a way to avoid this memory error?

I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.

from itertools import combinations
import time

def findanums(n):
    """Return the abundant numbers up to and including n."""
    l = []
    for i in range(1, n + 1):
        s = []
        for j in range(1, i):  # proper divisors of i
            if i % j == 0:
                s.append(j)
        if sum(s) > i:  # abundant: the sum of proper divisors exceeds the number
            l.append(i)
    return l

start = time.time() #start time

limit = 28123

anums = findanums(limit + 1) #abundant numbers (1..limit)
print "done finding abundants", time.time() - start

pairs = combinations(anums, 2)
print "done finding combinations", time.time() - start

sums = map(lambda x: x[0]+x[1], pairs)
print "done finding all possible sums", time.time() - start

print "start main loop"
answer = 0
for i in range(1,limit+1):
    if i not in sums:
        answer += i
print "ANSWER:",answer

When I run this I run into a MemoryError.

The traceback:

File "test.py", line 20, in <module>
    sums = map(lambda x: x[0]+x[1], pairs)

I've tried to prevent it by disabling garbage collection, a suggestion I found through Google, but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it, and I'm at a loss at this point.

SIDE NOTE: I'm using PyPy 2.0 Beta 2 (Python 2.7.4) because it is so much faster than any other Python implementation I've used, and SciPy/NumPy are over my head as I'm still just beginning to program and I don't have an engineering or strong math background.

asked Apr 18 '13 by Jesse Neff


3 Answers

As Kevin mentioned in the comments, your algorithm might be wrong, but in any case your code is not optimized.

When working with very big lists, it is common to use generators instead. There is a famous, great Stack Overflow answer explaining the concepts of yield and generators: What does the "yield" keyword do in Python?
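As a small illustration (my own sketch, not taken from the linked answer), a function with yield produces its values one at a time as they are requested, instead of building them all in memory first:

def squares(n):
    # Yields each square on demand instead of returning a full list
    i = 0
    while i < n:
        yield i * i
        i += 1

print sum(squares(10 ** 6))  # consumes the generator lazily, one value at a time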

The problem is that pairs = combinations(anums, 2) doesn't give you a generator here, but a large object that consumes much more memory.

I changed your code to use the following function; since you iterate over the collection only once, you can evaluate the values lazily:

import itertools

def generator_sol(anums1, s):
    # Lazily yield each combination of size s instead of holding them all at once
    for comb in itertools.combinations(anums1, s):
        yield comb

Now, instead of pairs = combinations(anums, 2), which builds a large object, use:

pairs = generator_sol(anums, 2)

Then, instead of using the lambda, I would use another generator expression:

sums_sol = (x[0]+x[1] for x in pairs)
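One practical caveat worth adding (my own note, not part of the original answer): a generator can only be consumed once, so for the repeated "i not in sums" membership tests in the question's final loop you would still need to collect the sums, for example into a set, which also makes each lookup fast. A rough sketch:

sums_set = set(sums_sol)  # materialize the generator once; set membership is O(1)

answer = 0
for i in range(1, limit + 1):
    if i not in sums_set:
        answer += i
print "ANSWER:", answer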

Another tip: have a look at xrange, which is more suitable here; it doesn't build a list but returns an xrange object, which is more memory efficient in many cases (such as this one).
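For example (a small illustration of my own, not from the original answer):

eager = range(10 ** 6)   # builds a list of one million integers immediately
lazy = xrange(10 ** 6)   # an xrange object; values are produced on demand

print sum(lazy)          # iterates lazily, never holding all the values at once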

answered by Ofiris


Let me suggest that you use generators. Try changing this:

sums = map(lambda x: x[0]+x[1], pairs)

to

sums = (a+b for (a,b) in pairs)

Ofiris's solution is also OK, but it implies that itertools.combinations returns a list, which is wrong. If you are going to keep solving Project Euler problems, have a look at the itertools documentation.
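A quick way to see this (a small check of my own, not part of the original answer) is that itertools.combinations hands back a lazy iterator you can pull values from one at a time:

from itertools import combinations

pairs = combinations([12, 18, 20, 24], 2)
print pairs        # <itertools.combinations object at 0x...>, not a list
print next(pairs)  # (12, 18) -- the pairs are produced lazily, on demand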

answered by Alex


The issue is that anums is big, about 28,000 elements long, so pairs would take 28000 * 28000 * 8 bytes = 6 GB. If you used numpy you could cast anums as a numpy.int16 array, in which case the result pairs would be 1.5 GB, which is more manageable:

import numpy as np

# Cast to a compact dtype to save memory
anums = np.array([anums], dtype=np.int16)

# Compute the sum of every pair via a broadcasted outer sum
pairs = (anums + anums.T).ravel()
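If you wanted to carry this through to the final answer, a possible continuation (my own sketch, not part of the answer; it assumes a wider dtype such as int32, since the pairwise sums can reach roughly 56,000 and would overflow int16) could look like:

import numpy as np

limit = 28123

# anums is assumed to be the plain Python list of abundant numbers found earlier
a = np.array([anums], dtype=np.int32)       # int32: the sums exceed the int16 range
pair_sums = (a + a.T).ravel()               # every pairwise sum, including i + i

reachable = np.unique(pair_sums)
reachable = reachable[reachable <= limit]   # only sums within the range matter

answer = limit * (limit + 1) // 2 - reachable.sum()
print "ANSWER:", answer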
answered by Patrick Mineault