Is the big integer implementation in the num crate slow?

Question

I implemented the Miller-Rabin Strong Pseudoprime Test in Rust using BigUint to support arbitrary large primes. To run through the numbers between 5 and 10^6, it took about 40s with cargo run --release.

I implemented the same algorithm with Java's BigInteger and the same test took 10s to finish. Rust appears to be 4 times slower. I assume this is caused by the implementation of num::bigint.

Is this just the current state of num::bigint, or can anyone spot any obvious improvement in my code? (Mainly about how I used the language. Regardless whether my implementation of the algorithm is good or bad, it is almost implemented exactly the same in both languages - so does not cause the difference in performance.)

I did notice there are lots of clone() required, due to Rust's ownership model, that could well impact the speed to some level. But I guess there is no way around that, am I right?

Here is the code:

extern crate rand;
extern crate num;
extern crate core;
extern crate time;

use std::time::{Duration};
use time::{now, Tm};

use rand::Rng;
use num::{Zero, One};
use num::bigint::{RandBigInt, BigUint, ToBigUint};
use num::traits::{ToPrimitive};
use num::integer::Integer;
use core::ops::{Add, Sub, Mul, Div, Rem, Shr};

fn find_r_and_d(i: BigUint) -> (u64, BigUint) {
    let mut d = i;
    let mut r = 0;
    loop {
        if d.clone().rem(&2u64.to_biguint().unwrap()) == Zero::zero() {
            d = d.shr(1usize);
            r = r + 1;
        } else {
            break;
        }
    }
    return (r, d);
}

fn might_be_prime(n: &BigUint) -> bool {
    let nsub1 = n.sub(1u64.to_biguint().unwrap());
    let two = 2u64.to_biguint().unwrap();

    let (r, d) = find_r_and_d(nsub1.clone());
    'WitnessLoop: for kk in 0..6u64 {
        let a = rand::thread_rng().gen_biguint_range(&two, &nsub1);
        let mut x = mod_exp(&a, &d, &n);
        if x == 1u64.to_biguint().unwrap() || x == nsub1 {
            continue;
        }
        for rr in 1..r {
            x = x.clone().mul(x.clone()).rem(n);
            if x == 1u64.to_biguint().unwrap() {
                return false;
            } else if x == nsub1 {
                continue 'WitnessLoop;
            } 
        }
        return false;
    }
    return true;
}

fn mod_exp(base: &BigUint, exponent: &BigUint, modulus: &BigUint) -> BigUint {
    let one = 1u64.to_biguint().unwrap();
    let mut result = one.clone();
    let mut base_clone = base.clone();
    let mut exponent_clone = exponent.clone();

    while exponent_clone > 0u64.to_biguint().unwrap() {
        if exponent_clone.clone() & one.clone() == one {
            result = result.mul(&base_clone).rem(modulus);
        } 
        base_clone = base_clone.clone().mul(base_clone).rem(modulus);
        exponent_clone = exponent_clone.shr(1usize);
    }
    return result;
}

fn main() {  
    let now1 = now();

    for n in 5u64..1_000_000u64 {
        let b = n.to_biguint().unwrap();
        if might_be_prime(&b) {
            println!("{}", n);
        }
    }

    let now2 = now();
    println!("{}", now2.to_timespec().sec - now1.to_timespec().sec);
}

Paolo Falabella · Accepted Answer

You can remove most of the clones pretty easily. BigUint has all ops traits implemented also for operations with &BigUint, not just working with values. With that, it becomes faster but still about half as fast as Java...

Also (not related to performance, just readability) you don't need to use add, sub, mul and shr explicitly; they override the regular +, -, * and >> operators.

For instance you could rewrite might_be_prime and mod_exp like this, which already gives a good speedup on my machine (from 40 to 24sec on avg):

fn might_be_prime(n: &BigUint) -> bool {
    let one = BigUint::one();
    let nsub1 = n - &one;
    let two = BigUint::new(vec![2]);
    let mut rng = rand::thread_rng();

    let (r, mut d) = find_r_and_d(nsub1.clone());
    let mut x;
    let mut a: BigUint;
    'WitnessLoop: for kk in 0..6u64 {
        a = rng.gen_biguint_range(&two, &nsub1);
        x = mod_exp(&mut a, &mut d, &n);
        if &x == &one || x == nsub1 {
            continue;
        }
        for rr in 1..r {
            x = (&x * &x) % n;
            if &x == &one {
                return false;
            } else if x == nsub1 {
                continue 'WitnessLoop;
            } 
        }
        return false;
    }
    true
}

fn mod_exp(base: &mut BigUint, exponent: &mut BigUint, modulus: &BigUint) -> BigUint {
    let one = BigUint::one();
    let zero = BigUint::zero();
    let mut result = BigUint::one();

    while &*exponent > &zero {
        if &*exponent & &one == one {
           result = (result * &*base) % modulus;
        }
        *base = (&*base * &*base) % modulus;
        *exponent = &*exponent >> 1usize;
    }
    result
}

Note that I've moved the println! out of the timing, so that we're not benchmarking IO.

fn main() {  
    let now1 = now();

    let v = (5u64..1_000_000u64)
        .filter_map(|n| n.to_biguint())
        .filter(|n| might_be_prime(&n))
        .collect::<Vec<BigUint>>();

    let now2 = now();
    for n in v {
        println!("{}", n);
    }
    println!("time spent seconds: {}", now2.to_timespec().sec - now1.to_timespec().sec);
}

Is the big integer implementation in the num crate slow?

Tags:

performance

performance-testing

rust

biginteger

Peter Pei Guo

1 Answers

Paolo Falabella

Recent Activity

Donate For Us

Is the big integer implementation in the num crate slow?

Tags:

performance

performance-testing

rust

biginteger

Peter Pei Guo

1 Answers

Paolo Falabella

Related questions

Recent Activity

Donate For Us