I'm currently working through the problems on Project Euler, and so far I've come up with this code for a problem.
```python
from itertools import combinations
import time

def findanums(n):
    l = []
    for i in range(1, n + 1):
        s = []
        for j in range(1, i):
            if i % j == 0:
                s.append(j)
        if sum(s) > i:
            l.append(i)
    return l

start = time.time()  # start time
limit = 28123
anums = findanums(limit + 1)  # abundant numbers (1..limit)
print "done finding abundants", time.time() - start

pairs = combinations(anums, 2)
print "done finding combinations", time.time() - start

sums = map(lambda x: x[0]+x[1], pairs)
print "done finding all possible sums", time.time() - start

print "start main loop"
answer = 0
for i in range(1, limit + 1):
    if i not in sums:
        answer += i
print "ANSWER:", answer
```
When I run this, I run into a `MemoryError`.
The traceback:

```
File "test.py", line 20, in <module>
    sums = map(lambda x: x[0]+x[1], pairs)
```
I've tried to prevent it by disabling garbage collection, a suggestion I found through Google, but to no avail. Am I approaching this the wrong way? In my head this feels like the most natural way to do it, and I'm at a loss at this point.
SIDE NOTE: I'm using PyPy 2.0 Beta2 (Python 2.7.4) because it is so much faster than any other Python implementation I've used, and SciPy/NumPy are over my head, as I'm still just beginning to program and don't have an engineering or strong math background.
As Kevin mentions in the comments, your algorithm might be wrong, but in any case your code is not optimized.
When working with very big lists, it is common to use generators; there is a famous, great Stack Overflow answer explaining the concepts of `yield` and `generator`: What does the "yield" keyword do in Python?
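As a quick toy illustration (my own example, not from the linked answer) of how a generator produces values lazily instead of building a list:

```python
def count_up_to(n):
    # produce values one at a time; nothing is stored in a list
    i = 1
    while i <= n:
        yield i
        i += 1

gen = count_up_to(3)
print next(gen)  # 1
print next(gen)  # 2 -- each value is computed only when asked for
```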
The problem is that your `pairs = combinations(anums, 2)` doesn't give you a generator, but builds a large object, which is much more memory consuming.
I changed your code to use this function; since you iterate over the collection only once, you can evaluate the values lazily:

```python
import itertools

def generator_sol(anums1, s):
    # yield each combination one at a time instead of holding them all
    for comb in itertools.combinations(anums1, s):
        yield comb
```
Now, instead of `pairs = combinations(anums, 2)`, which generates a large object, use:

```python
pairs = generator_sol(anums, 2)
```
Then, instead of using the `lambda`, I would use another generator:

```python
sums_sol = (x[0] + x[1] for x in pairs)
```
Another tip: have a look at `xrange`, which is more suitable here; it doesn't generate a list but an `xrange` object, which is more efficient in many cases (such as this one).
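Putting those pieces together, here is a minimal sketch of how the tail of the script could look. Two caveats that are my additions, not part of the answer above: the sums are collected into a set, because a generator can only be iterated once while the final loop tests membership repeatedly; and this still omits sums of an abundant number with itself, which may be the algorithmic issue Kevin hinted at:

```python
pairs = generator_sol(anums, 2)
sums_sol = (x[0] + x[1] for x in pairs)

# a generator is exhausted after one pass, so materialize the sums;
# the set stays small because distinct sums are bounded by 2 * max(anums)
sums_set = set(sums_sol)

answer = 0
for i in xrange(1, limit + 1):  # xrange builds no full list
    if i not in sums_set:
        answer += i
print "ANSWER:", answer
```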
Let me suggest you use generators. Try changing this:

```python
sums = map(lambda x: x[0]+x[1], pairs)
```

to:

```python
sums = (a + b for (a, b) in pairs)
```
Ofiris' solution is also OK, but it implies that `itertools.combinations` returns a list, which is wrong. If you are going to keep solving Project Euler problems, have a look at the itertools documentation.
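You can check this in an interpreter: `combinations` already returns a lazy iterator, so it is not what exhausts memory; the full list built by `map` is (in Python 2, `map` always returns a list):

```python
from itertools import combinations

pairs = combinations([1, 2, 3], 2)
print pairs        # <itertools.combinations object at 0x...>, not a list
print next(pairs)  # (1, 2) -- items are produced one at a time
```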
The issue is that the collection of pairs is big: with roughly 7,000 abundant numbers below the limit, the matrix of all pairwise sums has about 49 million entries, around 390 MB at 8 bytes each. If you use numpy you can cast `anums` as a `numpy.uint16` array (no sum exceeds 65,535, so it fits, whereas `int16` would overflow), in which case `pairs` comes to roughly 98 MB, which is much more manageable:
```python
import numpy as np

# cast to a 2-D row vector; uint16 suffices since no pairwise sum exceeds 65,535
anums = np.array([anums], dtype=np.uint16)

# compute the sums of all pairs via broadcasting (an outer sum)
pairs = (anums + anums.T).ravel()
```
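From there, one way to finish (a sketch; `expressible` and the final summation are my additions, reusing the question's `limit`):

```python
# mark every value <= limit that occurs among the pairwise sums
expressible = np.zeros(limit + 1, dtype=bool)
expressible[pairs[pairs <= limit]] = True

# the answer is the sum of everything that was never marked
answer = np.arange(limit + 1)[~expressible].sum()
print "ANSWER:", answer
```

Note that the outer sum, unlike `combinations(anums, 2)`, also includes the `x + x` pairs on the diagonal, which this problem actually requires.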