Consider the following method:
public static boolean isPrime(int n) {
return ! (new String(new char[n])).matches(".?|(..+?)\\1+");
}
I've never been a regular expression guru, so can anyone fully explain how this method actually works? Furthermore, is it efficient compared to other possible methods for determining whether an integer is prime?
Java Guava | isPrime() method of IntMath ClassThe isPrime(int n) method of Guava's IntMath class is used to check whether the parameter passed to it is a prime number or not. If the parameter passed to it is prime, then it returns True otherwise it returns False.
First, note that this regex applies to numbers represented in a unary counting system, i.e.
1 is 1
11 is 2
111 is 3
1111 is 4
11111 is 5
111111 is 6
1111111 is 7
and so on. Really, any character can be used (hence the .
s in the expression), but I'll use "1".
Second, note that this regex matches composite (non-prime) numbers; thus negation detects primality.
Explanation:
The first half of the expression,
.?
says that the strings "" (0) and "1" (1) are matches, i.e. not prime (by definition, though arguable.)
The second half, in simple English, says:
Match the shortest string whose length is at least 2, for example, "11" (2). Now, see if we can match the entire string by repeating it. Does "1111" (4) match? Does "111111" (6) match? Does "11111111" (8) match? And so on. If not, then try it again for the next shortest string, "111" (3). Etc.
You can now see how, if the original string can't be matched as a multiple of its substrings, then by definition, it's prime!
BTW, the non-greedy operator ?
is what makes the "algorithm" start from the shortest and count up.
Efficiency:
It's interesting, but certainly not efficient, by various arguments, some of which I'll consolidate below:
As @TeddHopp notes, the well-known sieve-of-Eratosthenes approach would not bother to check multiples of integers such as 4, 6, and 9, having been "visited" already while checking multiples of 2 and 3. Alas, this regex approach exhaustively checks every smaller integer.
As @PetarMinchev notes, we can "short-circuit" the multiples-checking scheme once we reach the square root of the number. We should be able to because a factor greater than the square root must partner with a factor lesser than the square root (since otherwise two factors greater than the square root would produce a product greater than the number), and if this greater factor exists, then we should have already encountered (and thus, matched) the lesser factor.
As @Jesper and @Brian note with concision, from a non-algorithmic perspective, consider how a regular expression would begin by allocating memory to store the string, e.g. char[9000]
for 9000. Well, that was easy, wasn't it? ;)
As @Foon notes, there exist probabilistic methods which may be more efficient for larger numbers, though they may not always be correct (turning up pseudoprimes instead). But also there are deterministic tests that are 100% accurate and far more efficient than sieve-based methods. Wolfram's has a nice summary.
The unary characteristics of primes and why this works has already been covered. So here's a test using conventional approaches and this approach:
public class Main {
public static void main(String[] args) {
long time = System.nanoTime();
for (int i = 2; i < 10000; i++) {
isPrimeOld(i);
}
time = System.nanoTime() - time;
System.out.println(time + " ns (" + time / 1000000 + " ms)");
time = System.nanoTime();
for (int i = 2; i < 10000; i++) {
isPrimeRegex(i);
}
time = System.nanoTime() - time;
System.out.println(time + " ns (" + time / 1000000 + " ms)");
System.out.println("Done");
}
public static boolean isPrimeRegex(int n) {
return !(new String(new char[n])).matches(".?|(..+?)\\1+");
}
public static boolean isPrimeOld(int n) {
if (n == 2)
return true;
if (n < 2)
return false;
if ((n & 1) == 0)
return false;
int limit = (int) Math.round(Math.sqrt(n));
for (int i = 3; i <= limit; i += 2) {
if (n % i == 0)
return false;
}
return true;
}
}
This test computes whether or not the number is prime up to 9,999, starting from 2. And here's its output on a relatively powerful server:
8537795 ns (8 ms)
30842526146 ns (30842 ms)
Done
So it is grossly inefficient once the numbers get large enough. (For up to 999, the regex runs in about 400 ms.) For small numbers, it's fast, but it's still faster to generate the primes up to 9,999 the conventional way than it is to even generate primes up to 99 the old way (23 ms).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With