Been trying to implement Rabin-Miller Strong Pseudoprime Test today.
Have used Wolfram Mathworld as reference, lines 3-5 sums up my code pretty much.
However, when I run the program, it says (sometimes) that primes (even low such as 5, 7, 11) are not primes. I've looked over the code for a very long while and cannot figure out what is wrong.
For help I've looked at this site aswell as many other sites but most use another definition (probably the same, but since I'm new to this kind of math, I can't see the same obvious connection).
My Code:
import random
def RabinMiller(n, k):
# obviously not prime
if n < 2 or n % 2 == 0:
return False
# special case
if n == 2:
return True
s = 0
r = n - 1
# factor n - 1 as 2^(r)*s
while r % 2 == 0:
s = s + 1
r = r // 2 # floor
# k = accuracy
for i in range(k):
a = random.randrange(1, n)
# a^(s) mod n = 1?
if pow(a, s, n) == 1:
return True
# a^(2^(j) * s) mod n = -1 mod n?
for j in range(r):
if pow(a, 2**j*s, n) == -1 % n:
return True
return False
print(RabinMiller(7, 5))
How does this differ from the definition given at Mathworld?
A number of the points I'll make below were noted in other answers, but it seems useful to have them all together.
In the section
s = 0
r = n - 1
# factor n - 1 as 2^(r)*s
while r % 2 == 0:
s = s + 1
r = r // 2 # floor
you've got the roles of r and s swapped: you've actually factored n − 1 as 2sr. If you want to stick to the MathWorld notation, then you'll have to swap r
and s
in this section of the code:
# factor n - 1 as 2^(r)*s, where s is odd.
r, s = 0, n - 1
while s % 2 == 0:
r += 1
s //= 2
In the line
for i in range(k):
the variable i
is unused: it's conventional to name such variables _
.
You pick a random base between 1 and n − 1 inclusive:
a = random.randrange(1, n)
This is what it says in the MathWorld article, but that article is written from the mathematician's point of view. In fact it is useless to pick the base 1, since 1s = 1 (mod n) and you'll waste a trial. Similarly, it's useless to pick the base n − 1, since s is odd and so (n − 1)s = −1 (mod n). Mathematicians don't have to worry about wasted trials, but programmers do, so write instead:
a = random.randrange(2, n - 1)
(n needs to be at least 4 for this optimization to work, but we can easily arrange that by returning True
at the top of the function when n = 3, just as you do for n = 2.)
As noted in other replies, you've misunderstood the MathWorld article. When it says that "n passes the test" it means that "n passes the test for the base a". The distinguishing fact about primes is that they pass the test for all bases. So when you find that as = 1 (mod n), what you should do is to go round the loop and pick the next base to test against.
# a^(s) = 1 (mod n)?
x = pow(a, s, n)
if x == 1:
continue
There's an opportunity for optimization here. The value x that we've just computed is a20 s (mod n). So we could test it immediately and save ourselves one loop iteration:
# a^(s) = ±1 (mod n)?
x = pow(a, s, n)
if x == 1 or x == n - 1:
continue
In the section where you calculate a2j s (mod n) each of these numbers is the square of the previous number (modulo n). It's wasteful to calculate each from scratch when you could just square the previous value. So you should write this loop as:
# a^(2^(j) * s) = -1 (mod n)?
for _ in range(r - 1):
x = pow(x, 2, n)
if x == n - 1:
break
else:
return False
It's a good idea to test for divisibility by small primes before trying Miller–Rabin. For example, in Rabin's 1977 paper he says:
In implementing the algorithm we incorporate some laborsaving steps. First we test for divisibility by any prime p < N, where, say N = 1000.
Putting all this together:
from random import randrange
small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31] # etc.
def probably_prime(n, k):
"""Return True if n passes k rounds of the Miller-Rabin primality
test (and is probably prime). Return False if n is proved to be
composite.
"""
if n < 2: return False
for p in small_primes:
if n < p * p: return True
if n % p == 0: return False
r, s = 0, n - 1
while s % 2 == 0:
r += 1
s //= 2
for _ in range(k):
a = randrange(2, n - 1)
x = pow(a, s, n)
if x == 1 or x == n - 1:
continue
for _ in range(r - 1):
x = pow(x, 2, n)
if x == n - 1:
break
else:
return False
return True
In addition to what Omri Barel has said, there is also a problem with your for loop. You will return true
if you find one a
that passes the test. However, all a
have to pass the test for n
to be a probable prime.
I'm wondering about this piece of code:
# factor n - 1 as 2^(r)*s
while r % 2 == 0:
s = s + 1
r = r // 2 # floor
Let's take n = 7
. So n - 1 = 6
. We can express n - 1
as 2^1 * 3
. In this case r = 1
and s = 3
.
But the code above finds something else. It starts with r = 6
, so r % 2 == 0
. Initially, s = 0
so after one iteration we have s = 1
and r = 3
. But now r % 2 != 0
and the loop terminates.
We end up with s = 1
and r = 3
which is clearly incorrect: 2^r * s = 8
.
You should not update s
in the loop. Instead, you should count how many times you can divide by 2 (this will be r
) and the result after the divisions will be s
. In the example of n = 7
, n - 1 = 6
, we can divide it once (so r = 1
) and after the division we end up with 3 (so s = 3
).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With