I am working on Project Euler Problem 50, which states:
The prime 41, can be written as the sum of six consecutive primes:
41 = 2 + 3 + 5 + 7 + 11 + 13 This is the longest sum of consecutive primes that adds to a prime below one-hundred.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
Which prime, below one-million, can be written as the sum of the most consecutive primes?
For determining the terms in prime P (if it at all can be written as a sum of primes) I use a sliding window of all the primes (in increasing order) up to (but not including) P, and calculate the sum of all these windows, if the sum is equal to the prime considered, I count the length of the window...
This works fine for all primes up to 1000, but for primes up to 10**6 it is very slow, so I was hoping memozation would help; when calculating the sum of sliding windows, a lot of double work is done...(right?)
So I found the standard memoizaton implemention on the net and just pasted it in my code, is this correct? (I have no idea how it is supposed to work here...)
primes = tuple(n for n in range(1, 10**6) if is_prime(n)==True)
count_best = 0
##http://docs.python.org/release/2.3.5/lib/itertools-example.html:
## Slightly modified (first for loop)
from itertools import islice
def window(seq):
for n in range(2, len(seq) + 1):
it = iter(seq)
result = tuple(islice(it, n))
if len(result) == n:
yield result
for elem in it:
result = result[1:] + (elem,)
yield result
def memoize(function):
cache = {}
def decorated_function(*args):
if args in cache:
return cache[args]
else:
val = function(*args)
cache[args] = val
return val
return decorated_function
@memoize
def find_lin_comb(prime):
global count_best
for windows in window(primes[0 : primes.index(prime)]):
if sum(windows) == prime and len(windows) > count_best:
count_best = len(windows)
print('Prime: ', prime, 'Terms: ', count_best)
##Find them:
for x in primes[::-1]: find_lin_comb(x)
(btw, the tuple of prime numbers is generated "decently" fast)
All input is appreciated, I am just a hobby programmer, so please don´t get to advanced on me.
Thank you!
Edit: here is a working code paste that doesn´t have ruined indentations: http://pastebin.com/R1NpMqgb
This works fine for all primes up to 1000, but for primes up to 10**6 it is very slow, so I was hoping memozation would help; when calculating the sum of sliding windows, a lot of double work is done...(right?)
Yes, right. And of course it's slow for the primes up to 106.
Say you have n
primes up to N
, numbered in increasing order, p_1 = 2, p_2 = 3, ...
. When considering whether prime no. k
is the sum of consecutive primes, you consider all windows [p_i, ..., p_j]
, for pairs (i,j)
with i < j < k
. There are (k-1)*(k-2)/2
of them. Going through all k
to n
, you examine about n³/6
windows in total (counting multiplicity, you're examining w(i.j)
in total n-j
times). Even ignoring the cost of creating the window and summing it, you can see how it scales badly:
N = 1000
, there are n = 168
primes and about 790000 windows to examine (counting multiplicity).N = 10**6
, there are n = 78498
primes and about 8.3*10**13
windows to examine.Now factor in the work for creating and summing the windows, estimate it low at j-i+1
for summing the j-i+1
primes in w(i,j)
, the work for p_k
is about k³/6
, and the total work becomes roughly k**4/24
. Something like 33 million steps for N = 1000
, peanuts, but nearly 1.6*10**18
for N = 1000000
.
A year contains about 3.1*10**7
seconds, with a ~3GHz CPU, that's roughly 1017 clock cycles. So we're talking of an operation needing something like 100 CPU-years (may be a factor of 10 off or so).
You aren't willing to wait that long, I suppose;)
Now, with memoisation, you still look at each window multiple times, but you do the computation of each window only once. That means you need about n³/6
work for the computation of the windows, and look about n³/6
times at any window.
8.3*10**13
times, that's several hours even if looking cost only one cycle.8.3*10**13
windows to memoise. You don't have that much memory, unless you can use a really large HD.You can circumvent the memory problem by throwing away data you don't need anymore and only calculating data for the windows when it is needed, but once you know which data you may throw away when, you should be able to see a much better approach.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
What does this tell you about the window generating that sum? Where can it start, where can it stop? How can you use that information to create an efficient algorithm to solve the problem?
The memoize decorator adds a wrapper to a function to cache the return value for each value of the argument (each combination of values in case of multiple arguments). It is useful when the function is called multiple times with the same arguments. You can only use it with a pure function, i.e.
Your find_lin_comb function does not satisfy the above criteria. For one thing, it is called with a different argument every time, for another, the function does not return a value.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With