How would I go about improving/making this run faster? [closed]

Question

I'm a beginner in Python trying to get better, and I stumbled across the following exercise:

Let n be an integer greater than 1 and s(n) the sum of the dividors of n. For example,
s(12) 1 + 2 + 3 + 4 + 6 + 12 = 28
Also,
s(s(12)) = s(28) = 1 + 2 + 4 + 7 + 14 + 28 = 56
And
s(s(s(12))) = s(56) = 1 + 2 + 4 + 7 + 8 + 14 + 28 + 56 = 120
We use the notations:
s^1(n) = s(n)
s^2(n) = s(s(n))
s^3(n) = s(s(s(n)))
s^ m (n) = s(s(. . .s(n) . . .)), m times
For the integers n for which exists a positive integer k so that
 s^m(n) = k * n
are called (m, k)-perfect, for instance 12 is (3, 10)-perfect since s^3(12) = s(s(s(12))) = 120 = 10 * 12

Special categories:

For m =1 we have multiperfect numbers

A special case of the above exist for m = 1 and k = 2 which are called perfect numbers.

For m = 2 and k = 2 we have superperfect numbers.

Write a program which finds and prints all (m, k)-perfect numbers for m <= MAXM, which are less or equal to (<=) MAXNUM. If an integer belongs to one of the special categories mentioned above the program should print a relevant message. Also, the program has to print how many different (m, k)-perfect numbers were found, what percentage of the tested numbers they were, in how many occurrences for the different pairs of (m, k), and how many from each special category were found(perfect numbers are counted as multiperfect as well).

Here's my code:

import time
start_time = time.time()
def s(n):
    tsum = 0
    i = 1
    con = n
    while i < con:
        if n % i == 0:
            temp = n / i
            tsum += i
            if temp != i:
                tsum += temp 
            con = temp
        i += 1                    
    return tsum
#MAXM
#MAXNUM

i = 2

perc = 0
perc1 = 0
perf = 0
multperf = 0
supperf = 0
while i <= MAXNUM:
    pert = perc1
    num = i
    for m in xrange(1, MAXM + 1):        
        tsum = s(num)                
        if tsum % i == 0:            
            perc1 += 1
            k = tsum / i            
            mes = "%d is a (%d-%d)-perfect number" % (i, m, k)
            if m == 1:
                multperf += 1
                if k == 2:
                    perf += 1 
                    print mes + ", that is a perfect number"
                else:
                    print mes + ", that is a multiperfect number"               
            elif m == 2 and k == 2:
                supperf += 1
                print mes + ", that is a superperfect number"
            else:
                print mes        
        num = tsum        
    i += 1
    if pert != perc1: perc += 1
print "Found %d distinct (m-k)-perfect numbers (%.5f per cent of %d ) in %d occurrences" % (
perc, float(perc) / MAXNUM * 100, MAXNUM, perc1)
print "Found %d perfect numbers" % perf
print "Found %d multiperfect numbers (including perfect numbers)" % multperf
print "Found %d superperfect numbers" % supperf  
print time.time() - start_time, "seconds"

It works fine, but I would like suggestions on how to make it run faster. For instance is it faster to use

I = 1
while I <= MAXM:
    …..
    I += 1

instead of

for I in xrange(1, MAXM + 1)

Would it be better if instead of defining s(n) as a function I put the code into the main program? etc. If you have anything to suggest for me to read on how to make a program run faster, I'd appreciate it. And one more thing, originally the exercise required the program to be in C (which I don't know), having written this in Python, how difficult would it be for it to be made into C?

Daniel Fischer · Accepted Answer

The biggest improvements come from using a better algorithm. Things like

Would it be better if instead of defining s(n) as a function I put the code into the main program?

or whether to use a while loop instead of for i in xrange(1, MAXM + 1): don't make much difference, so should not be considered before one has reached a state where algorithmic improvements are at least very hard to come by.

So let's take a look at your algorithm and how we can drastically improve it without caring about minuscule things like whether a while loop or a for iteration are faster.

def s(n):
    tsum = 0
    i = 1
    con = n
    while i < con:
        if n % i == 0:
            temp = n / i
            tsum += i
            if temp != i:
                tsum += temp 
            con = temp
        i += 1                    
    return tsum

That already contains a good idea, you know that the divisors of n come in pairs and add both divisors once you found the smaller of the pair. You even correctly handle squares.

It works very well for numbers like 120: when you find the divisor 2, you set the stop condition to 60, when you find 3, to 40, ..., when you find 8, you set it to 15, when you find 10, you set it to 12, and then you have only the division by 11, and stop when i is incremented to 12. Not bad.

But it doesn't work so well when n is a prime, then con will never be set to a value smaller than n, and you need to iterate all the way to n before you found the divisor sum. It's also bad for numbers of the form n = 2*p with a prime p, then you loop to n/2, or n = 3*p (n/3, unless p = 2) etc.

By the prime number theorem, the number of primes not exceeding x is asymptotically x/log x (where log is the natural logarithm), and you have a lower bound of

Ω(MAXNUM² / log MAXNUM)

just for computing the divisor sums of the primes. That's not good.

Since you already consider the divisors of n in pairs d and n/d, note that the smaller of the two (ignoring the case d = n/d when n is a square for the moment) is smaller than the square root of n, so once the test divisor has reached the square root, you know that you have found and added all divisors, and you're done. Any further looping is futile wasted work.

So let us consider

def s(n):
    tsum = 0
    root = int(n**0.5)  # floor of the square root of n, at least for small enough n
    i = 1
    while i < root + 1:
        if n % i == 0:
            tsum += i + n/i
        i += 1
    # check whether n is a square, if it is, we have added root twice
    if root*root == n:
        tsum -= root
    return tsum

as a first improvement. Then you always loop to the square root, and computing s(n) for 1 <= n <= MAXNUM is Θ(MAXNUM^1.5). That's already quite an improvement. (Of course, you have to compute the iterated divisor sums, and s(n) can be larger than MAXNUM for some n <= MAXNUM, so you can't infer a complexity bound of O(MAXM * MAXNUM^1.5) for the total algorithm from that. But s(n) cannot be very much larger, so the complexity can't be much worse either.)

But we can still improve on that by using what twalberg suggested, using the prime factorisation of n to compute the divisor sum.

First, if n = p^k is a prime power, the divisors of n are 1, p, p², ..., p^k, and the divisor sum is easily computed (a closed formula for the geometric sum is

(p^(k+1) - 1) / (p - 1)

but whether one uses that or adds the k+1 powers of p dividing n is not important).

Next, if n = p^k * m with a prime p and an m such that p does not divide m, then

s(n) = s(p^k) * s(m)

An easy way to see that decomposition is to write each divisor d of n in the form d = p^a * g where p does not divide g. Then p^a must divide p^k, i.e. a <= k, and g must divide m. Conversely, for every 0 <= a <= k and every g dividing m, p^a * g is a divisor of n. So we can lay out the divisors of n (where 1 = g_1 < g_2 < ... < g_r = m are the divisors of m)

 1*g_1   1*g_2  ...  1*g_r
 p*g_1   p*g_2  ...  p*g_r
  :        :           :
p^k*g_1 p^k*g_2 ... p^k*g_r

and the sum of each row is p^a * s(m).

If we have a list of primes handy, we can then write

def s(n):
    tsum = 1
    for p in primes:
        d = 1
        # divide out all factors p of n
        while n % p == 0:
            n = n//p
            d = p*d + 1
        tsum *= d
        if p*p > n: # n = 1, or n is prime
            break
    if n > 1:       # one last prime factor to account for
        tsum *= 1 + n
    return tsum

The trial division goes to the second largest prime factor of n [if n is composite] or the square root of the largest prime factor of n, whichever is larger. It has a worst-case bound for the largest divisor tried of n**0.5, which is reached for primes, but for most composites, the division stops much earlier.

If we don't have a list of primes handy, we can replace the line for p in primes: with for p in xrange(2, n): [the upper limit is not important, since it is never reached if it is larger than n**0.5] and get a not too much slower factorisation. (But it can easily be sped up a lot by avoiding even trial divisors larger than 2, that is using a list [2] + [3,5,7...] - best as a generator - for the divisors, even more by also skipping multiples of 3 (except 3), [2,3] + [5,7, 11,13, 17,19, ...] and if you want of a few further small primes.)

Now, that helped, but computing the divisor sums for all n <= MAXNUM still takes Ω(MAXNUM^1.5 / log MAXNUM) time (I haven't analysed, that could be also an upper bound, or the MAXNUM^1.5 could still be a lower bound, anyway, a logarithmic factor rarely makes much of a difference [beyond a constant factor]).

And you compute a lot of divisor sums more than once (in your example, you compute s(56) when investigating 12, again when investigating 28, again when investigating 56). To alleviate the impact of that, memoizing s(n) would be a good idea. Then you need to compute each s(n) only once.

And now we have already traded space for time, so we can use a better algorithm to compute the divisor sums for all 1 <= n <= MAXNUM in one go, with a better time complexity (and also smaller constant factors). Instead of trying out each small enough (prime) number whether it divides n, we can directly mark only multiples, thus avoiding all divisions that leave a remainder - which is the vast majority.

The easy method to do that is

def divisorSums(n):
    dsums = [0] + [1]*n
    for k in xrange(2, n+1):
        for m in xrange(k, n+1, k):
            dsums[m] += k
    return dsums

with an O(n * log n) time complexity. You can do it a bit better (O(n * log log n) complexity) by using the prime factorisation, but that method is somewhat more complicated, I'm not adding it now, maybe later.

Then you can use the list of all divisor sums to look up s(n) if n <= MAXNUM, and the above implementation of s(n) to compute the divisor sum for values larger than MAXNUM [or you may want to memoize the values up to a larger limit].

dsums = divisorSums(MAXNUM)
def memo_s(n):
    if n <= MAXNUM:
        return dsums[n]
    return s(n)

That's not too shabby,

Found 414 distinct (m-k)-perfect numbers (0.10350 per cent of 400000 ) in 496 occurrences
Found 4 perfect numbers
Found 8 multiperfect numbers (including perfect numbers)
Found 7 superperfect numbers
12.709428072 seconds

for

import time
start_time = time.time()


def s(n):
    tsum = 1
    for p in xrange(2,n):
        d = 1
        # divide out all factors p of n
        while n % p == 0:
            n = n//p
            d = p*d + 1
        tsum *= d
        if p*p > n: # n = 1, or n is prime
            break
    if n > 1:       # one last prime factor to account for
        tsum *= 1 + n
    return tsum
def divisorSums(n):
    dsums = [0] + [1]*n
    for k in xrange(2, n+1):
        for m in xrange(k, n+1, k):
            dsums[m] += k
    return dsums

MAXM = 6
MAXNUM = 400000

dsums = divisorSums(MAXNUM)
def memo_s(n):
    if n <= MAXNUM:
        return dsums[n]
    return s(n)

i = 2

perc = 0
perc1 = 0
perf = 0
multperf = 0
supperf = 0
while i <= MAXNUM:
    pert = perc1
    num = i
    for m in xrange(1, MAXM + 1):
        tsum = memo_s(num)
        if tsum % i == 0:
            perc1 += 1
            k = tsum / i
            mes = "%d is a (%d-%d)-perfect number" % (i, m, k)
            if m == 1:
                multperf += 1
                if k == 2:
                    perf += 1
                    print mes + ", that is a perfect number"
                else:
                    print mes + ", that is a multiperfect number"
            elif m == 2 and k == 2:
                supperf += 1
                print mes + ", that is a superperfect number"
            else:
                print mes
        num = tsum
    i += 1
    if pert != perc1: perc += 1
print "Found %d distinct (m-k)-perfect numbers (%.5f per cent of %d ) in %d occurrences" % (
perc, float(perc) / MAXNUM * 100, MAXNUM, perc1)
print "Found %d perfect numbers" % perf
print "Found %d multiperfect numbers (including perfect numbers)" % multperf
print "Found %d superperfect numbers" % supperf
print time.time() - start_time, "seconds"

By memoizing also the needed divisor sums for n > MAXNUM:

d = {}
for i in xrange(1, MAXNUM+1):
    d[i] = dsums[i]
def memo_s(n):
    if n in d:
        return d[n]
    else:
        t = s(n)
        d[n] = t
        return t

the time drops to

3.33684396744 seconds

How would I go about improving/making this run faster? [closed]

Tags:

performance

python

optimization

python-2.7

user2516856

1 Answers

Daniel Fischer

Recent Activity

Donate For Us

How would I go about improving/making this run faster? [closed]

Tags:

performance

python

optimization

python-2.7

user2516856

1 Answers

Daniel Fischer

Related questions

Recent Activity

Donate For Us