Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Prime Iterator in Julia

Tags:

primes

julia

Is there an (efficient) iterator to generate prime numbers in Julia? The inbuilt function primes[N] generates all primes up to N at once, rather than as required, and may not be usable when N is very large, or unknown.

like image 533
u003f Avatar asked May 10 '16 15:05

u003f


3 Answers

You can filter a counter going through the (big) ints (the Base.Count{BigInt} iterator) using the probabilistic primality test

iterprimes = filter(isprime,countfrom(big(2),1))

Then for example

julia> collect(take(iterprimes, 5))
5-element Array{Any,1}:
  2
  3
  5
  7
 11

This is not so effective in total as a sieve but does not keep a huge structure in memory. I recall that isprime has at least no false positives up to 2^64 with the standard choice of repetitions.

Edit:

A second possibility is to generate (see Generator) chunks of primes(N*(i-1)+1,N*i) and Base.flatten them into a single list:

Base.flatten(primes(1000000*(i-1)+1,1000000*i) for i in countfrom(1))

On this machine this iterator actually beats plain primes for computing the first 10^9 primes.

Edit 2:

An Iterator using gmpz's nextprime.

type 
   PrimeIter
end
function nextprime(y::BigInt)
    x = BigInt()
    ccall((:__gmpz_nextprime,:libgmp), Void, (Ptr{BigInt},Ptr{BigInt}), &x, &y)
    x
end
Base.start(::PrimeIter) = big(2)
Base.next(::PrimeIter, state) = state, nextprime(state)
Base.done(::PrimeIter, _) = false
Base.iteratorsize(::PrimeIter) = Base.IsInfinite()


> first(drop(PrimeIter(), 10^5))
1299721
like image 153
mschauer Avatar answered Nov 19 '22 17:11

mschauer


You can check out Lazy.jl, which gives you prime iteration on demand. It works for an unknown large number. The assumption is that you want to use all prime numbers lesser than an upper bound, and have the space to store them.

Quote from their readme:-

# isprime defined in terms of the prime numbers:
isprime(n) =
  @>> primes begin
    takewhile(x -> x<=sqrt(n))
    map(x -> n % x == 0)
    any; !
  end

# the prime numbers defined in terms of isprime:
primes = filter(isprime, range(2));

take(20, primes)
#> (2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71)

To explain the code, firstly isprime function is defined using the list of all primes primes (which haven't been defined yet at that point in time), by taking all primes lesser than the square root of n, check whether any of them divides n, and negate the result logically.

Then prime is defined as a filter of isprime over all integers from 2 onward.

To get all prime numbers below n, you can just run @>> primes takewhile(p -> p <= n) instead of take.

like image 2
ShaoWei Teo Avatar answered Nov 19 '22 19:11

ShaoWei Teo


You don't say what you consider a reasonable range for your iterator or how long you would like to process it. Such algorithms generally come in two flavors, as in: A) short but slower (a range of a million in the order of a second), and B) more complex but much faster (about a hundred times faster than A). Following are examples of each.

A) a version of an iterator based on the built-in "Primes" package (pkg add "Primes") and specifically the nextprime function:

using Primes: nextprime

mutable struct PrimesGen
    lastprime :: UInt64
    PrimesGen() = new()
end
Base.eltype(::Type{PrimesGen}) = Int64
Base.IteratorSize(::PrimesGen) = Base.IsInfinite()
function Base.iterate(PG::PrimesGen, st::UInt64 = UInt64(1)) # :: Union{Nothing,Tuple{UInt64,UInt64}}
    next = nextprime(st + 1)
    next, next
end

EDIT_ADD: the above code is a slightly faster solution than @mschauer's first solution (updated to current Julia version 1.0) due to his multiple nested iterators, as follows:

using Primes: isprime
PrimesGen() = Iterators.filter(isprime, Iterators.countfrom(UInt64(2)))

but it's short and can be used the same way... END_EDIT_ADD

with which you can do something like the following:

using Printf
@time let sm = 0
          for p in PrimesGen() p >= 2_000_000 && break; sm += p end
          Printf.@printf("%d\n",sm)
      end

to produce the following:

142913828922
  0.651754 seconds (327.05 k allocations: 4.990 MiB)

This is adequate to use for these smaller ranges such as solving Euler Problem 10 above (run on an Intel x5-Z8350 at 1.92 Gigahertz).

The above iterator actually has an "infinite" range of the UInt64 number but that won't be reached for over 300 thousand years so we don't really need to worry about it...

B) For "industrial strength" problems involving ranges of a billion or more, one needs an iterator of (or direct function calls into) an implementation of the page segmented Sieve of Eratosthenes at about a hundred times faster, with an implementation as follows:

const Prime = UInt64
const BasePrime = UInt32
const BasePrimesArray = Array{BasePrime,1}
const SieveBuffer = Array{UInt8,1}

# contains a lazy list of a secondary base primes arrays feed
# NOT thread safe; needs a Mutex gate to make it so...
abstract type BPAS end # stands in for BasePrimesArrays, not defined yet
mutable struct BasePrimesArrays <: BPAS
    thunk :: Union{Nothing,Function} # problem with efficiency - untyped function!!!!!!!!!
    value :: Union{Nothing,Tuple{BasePrimesArray, BPAS}}
    BasePrimesArrays(thunk::Function) = new(thunk)
end
Base.eltype(::Type{BasePrimesArrays}) = BasePrime
Base.IteratorSize(::Type{BasePrimesArrays}) = Base.SizeUnknown() # "infinite"...
function Base.iterate(BPAs::BasePrimesArrays, state::BasePrimesArrays = BPAs)
    if state.thunk !== nothing
        newvalue :: Union{Nothing,Tuple{BasePrimesArray, BasePrimesArrays}} =
            state.thunk() :: Union{Nothing,Tuple{BasePrimesArray
                                                 , BasePrimesArrays}}
        state.value = newvalue
        state.thunk = nothing
        return newvalue
    end
    state.value
end

# count the number of zero bits (primes) in a byte array,
# also works for part arrays/slices, best used as an `@view`...
function countComposites(cmpsts::AbstractArray{UInt8,1})
    foldl((a, b) -> a + count_zeros(b), cmpsts; init = 0)
end

# converts an entire sieved array of bytes into an array of UInt32 primes,
# to be used as a source of base primes...
function composites2BasePrimesArray(low::Prime, cmpsts::SieveBuffer)
    limiti = length(cmpsts) * 8
    len :: Int = countComposites(cmpsts)
    rslt :: BasePrimesArray = BasePrimesArray(undef, len)
    i :: Int = 0
    j :: Int = 1
    @inbounds(
    while i < limiti
        if cmpsts[i >>> 3 + 1] & (1 << (i & 7)) == 0
            rslt[j] = low + i + i
            j += 1
        end
        i += 1
    end)
    rslt
end

# sieving work done, based on low starting value for the given buffer and
# the given lazy list of base prime arrays...
function sieveComposites(low::Prime, buffer::Array{UInt8,1},
                                     bpas::BasePrimesArrays)
    lowi :: Int = (low - 3) ÷ 2
    len :: Int = length(buffer)
    limiti :: Int = len * 8 - 1
    nexti :: Int = lowi + limiti
    for bpa::BasePrimesArray in bpas
        for bp::BasePrime in bpa
            bpint :: Int = bp
            bpi :: Int = (bpint - 3) >>> 1
            starti :: Int = 2 * bpi * (bpi + 3) + 3
            starti >= nexti && return
            if starti >= lowi starti -= lowi
            else
                r :: Int = (lowi - starti) % bpint
                starti = r == 0 ? 0 : bpint - r
            end
            lmti :: Int = limiti - 40 * bpint
            @inbounds(
            if bpint <= (len >>> 2) starti <= lmti
                for i in 1:8
                    if starti > limiti break end
                    mask = convert(UInt8,1) << (starti & 7)
                    c = starti >>> 3 + 1
                    while c <= len
                        buffer[c] |= mask
                        c += bpint
                    end
                    starti += bpint
                end
            else
                c = starti
                while c <= limiti
                    buffer[c >>> 3 + 1] |= convert(UInt8,1) << (c & 7)
                    c += bpint
                end
            end)
        end
    end
    return
end

# starts the secondary base primes feed with minimum size in bits set to 4K...
# thus, for the first buffer primes up to 8293,
# the seeded primes easily cover it as 97 squared is 9409.
function makeBasePrimesArrays() :: BasePrimesArrays
    cmpsts :: SieveBuffer = Array{UInt8,1}(undef, 512)
    function nextelem(low::Prime, bpas::BasePrimesArrays) ::
                                    Tuple{BasePrimesArray, BasePrimesArrays}
        # calculate size so that the bit span is at least as big as the
        # maximum culling prime required, rounded up to minsizebits blocks...
        reqdsize :: Int = 2 + isqrt(1 + low)
        size :: Int = (reqdsize ÷ 4096 + 1) * 4096 ÷ 8 # size in bytes
        if size > length(cmpsts) cmpsts = Array{UInt8,1}(undef, size) end
        fill!(cmpsts, 0)
        sieveComposites(low, cmpsts, bpas)
        arr :: BasePrimesArray = composites2BasePrimesArray(low, cmpsts)
        next :: Prime = low + length(cmpsts) * 8 * 2
        arr, BasePrimesArrays(() -> nextelem(next, bpas))
    end
    # pre-seeding breaks recursive race,
    # as only known base primes used for first page...
    preseedarr :: BasePrimesArray = # pre-seed to 100, can sieve to 10,000...
        [ 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41
        , 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97
        ]
    nextfunc :: Function = () ->
        (nextelem(convert(Prime,101), makeBasePrimesArrays()))
    firstfunc :: Function = () -> (preseedarr, BasePrimesArrays(nextfunc))
    BasePrimesArrays(firstfunc)
end

# an iterator over successive sieved buffer composite arrays,
# returning a tuple of the value represented by the lowest possible prime
# in the sieved composites array and the array itself;
# array has a 16 Kilobytes minimum size (CPU L1 cache), but
# will grow so that the bit span is larger than the
# maximum culling base prime required, possibly making it larger than
# the L1 cache for large ranges, but still reasonably efficient using
# the L2 cache: very efficient up to about 16e9 range;
# reasonably efficient to about 2.56e14 for two Megabyte L2 cache = > 1 week...
struct PrimesPages
    baseprimes :: BasePrimesArrays
    PrimesPages() = new(makeBasePrimesArrays())
end
Base.eltype(::Type{PrimesPages}) = SieveBuffer
Base.IteratorSize(::Type{PrimesPages}) = Base.IsInfinite()
function Base.iterate(PP::PrimesPages,
                      state :: Tuple{Prime,SieveBuffer} =
                            ( convert(Prime,3), Array{UInt8,1}(undef,16384) ))
    (low, cmpsts) = state
    # calculate size so that the bit span is at least as big as the
    # maximum culling prime required, rounded up to minsizebits blocks...
    reqdsize :: Int = 2 + isqrt(1 + low)
    size :: Int = (reqdsize ÷ 131072 + 1) * 131072 ÷ 8 # size in bytes
    if size > length(cmpsts) cmpsts = Array{UInt8,1}(undef, size) end
    fill!(cmpsts, 0)
    sieveComposites(low, cmpsts, PP.baseprimes)
    newlow :: Prime = low + length(cmpsts) * 8 * 2
    ( low, cmpsts ), ( newlow, cmpsts )
end

function countPrimesTo(range::Prime) :: Int64
    range < 3 && ((range < 2 && return 0) || return 1)
    count :: Int64 = 1
    for ( low, cmpsts ) in PrimesPages() # almost never exits!!!
        if low + length(cmpsts) * 8 * 2 > range
            lasti :: Int = (range - low) ÷ 2
            count += countComposites(@view cmpsts[1:lasti >>> 3])
            count += count_zeros(cmpsts[lasti >>> 3 + 1] |
                                 (0xFE << (lasti & 7)))
            return count
        end
        count += countComposites(cmpsts)
    end
    count
end

which can be called as follows:

using Printf
@time let sm = 0
          for p in PrimesPaged() p >= 2_000_000 && break; sm += p end
          Printf.@printf("%d\n",sm)
      end

to produce the following:

142913828922
  0.016245 seconds (60 allocations: 23.891 KiB)

but than is barely enough to "warm it up"; it can be called with an argument a hundred times as large to produce the following:

1075207199997334
  1.381198 seconds (2.35 k allocations: 103.875 KiB)

and can count all the primes to a billion with the following code:

println(@time let count = 0
                  for p in PrimesPaged()
                      p > 1_000_000_000 && break
                      count += 1
                  end; count end)

to produce the following:

  6.802044 seconds (11.51 k allocations: 396.734 KiB)
50847534

However, it takes much longer to iterate over the sieved primes than it does to sieve them in the first place. This can be shown by calling the provided optimized counting function to eliminate most of the enumeration time as follows:

println(@time countPrimesTo(Prime(1_000_000_000)))

to produce the following:

  1.959057 seconds (65 allocations: 39.266 KiB)
50847534

In a similar fashion, one could write a sumPrimesTo function to sum the sieved primes to a billion in just a little more time (say twice)...

All tests here are run on the same x5-Z8350 CPU at 1.92 Gigahertz.

This shows that for truly huge problems, one shouldn't use an iterator and should rather use custom functions that operate directly on the culled page segments as the countPrimesTo function does here. When this is done, it would be worth making further optimizations such as maximum wheel factorization (far a gain in a further four times in speed for the sieving) and multi-threading (for a gain of the number of effective CPU cures used (not including Hyper Threading ones) with the end result not being much slower if any than Kim Walich's "primesieve".

Again, this has the same UInt64 "infinite" limit, but it still will take hundreds of years to get there so we still don't have to worry about it.

like image 1
GordonBGood Avatar answered Nov 19 '22 18:11

GordonBGood