I'm relatively new to python and I'm confused about the performance of two relatively simple blocks of code. The first function generates a prime factorization of a number n given a list of primes. The second generates a list of all factors of n. I would have though prime_factor
would be faster than factors
(for the same n), but this is not the case. I'm not looking for better algorithms, but rather I would like to understand why prime_factor
is so much slower than factors
.
def prime_factor(n, primes):
prime_factors = []
i = 0
while n != 1:
if n % primes[i] == 0:
factor = primes[i]
prime_factors.append(factor)
n = n // factor
else: i += 1
return prime_factors
import math
def factors(n):
if n == 0: return []
factors = {1, n}
for i in range(2, math.floor(n ** (1/2)) + 1):
if n % i == 0:
factors.add(i)
factors.add(n // i)
return list(factors)
Using the timeit module,
{ i:factors(i) for i in range(1, 10000) }
takes 2.5 seconds
{ i:prime_factor(i, primes) for i in range(1, 10000) }
takes 17 seconds
This is surprising to me. factors
checks every number from 1 to sqrt(n), while prime_factor
only checks primes. I would appreciate any help in understanding the performance characteristics of these two functions.
Thanks
Edit: (response to roliu)
Here is my code to generate a list of primes from 2 to up_to
:
def primes_up_to(up_to):
marked = [0] * up_to
value = 3
s = 2
primes = [2]
while value < up_to:
if marked[value] == 0:
primes.append(value)
i = value
while i < up_to:
marked[i] = 1
i += value
value += 2
return primes
Without seeing what you used for primes
, we have to guess (we can't run your code).
But a big part of this is simply mathematics: there are (very roughly speaking) about n/log(n)
primes less than n
, and that's a lot bigger than sqrt(n)
. So when you pass a prime, prime_factor(n)
does a lot more work: it goes through O(n/log(n))
operations before finding the first prime factor (n
itself!), while factors(n)
gives up after O(sqrt(n))
operations.
This can be very significant. For example, sqrt(10000)
is just 100, but there are 1229 primes less than 10000. So prime_factor(n)
can need to do over 10 times as much work to deal with the large primes in your range.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With