Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficiently generate all composite numbers less than N (with their factorizations)

I'd like to build an efficient Python iterator/generator that yields:

  • All composite numbers less than N
  • Along with their prime factorization

I'll call it "composites_with_factors()"

Assume we already have a list of primes less than N, or a primes generator that can do the same.

Note that I:

  • DO NOT need the numbers to be yielded in numerical order
  • DO NOT care if 1 is yielded at the beginning or not
  • DO NOT care if primes are yielded, too

I figure this can be done with a clever recursive generator...

So, for example, a call to composites_with_factors(16) may yield:

# yields values in form of "composite_value, (factor_tuple)"
2, (2)
4, (2, 2)
8, (2, 2, 2)
6, (2, 3)
12, (2, 2, 3)
10, (2, 5)
14, (2, 7)
3, (3)
9, (3, 3)
15, (3, 5)
5, (5)
7, (7)
11, (11)
13, (13)

As you can see from the order of my output, I conceive of this working by starting with the smallest prime on the available primes generator, and outputting all powers of that prime less than N, then try again through the powers of that prime but at each stage seeing if I can apply powers of additional primes (and still be less than N). When all combinations with THAT prime are done, drop it, and repeat with the next lowest prime number available on the primes generator.

My attempts to do this with "recursive generators" have gotten me very confused on when to pop out of the recursion with "yield ", or "raise StopIteration", or "return", or simply fall out of the recursed function.

Thanks for your wisdom!

ADDITIONAL NOTE:

I do have one way to do this now: I have written a function to factor numbers, so I can factor them down to primes, and yield the results. No problem. I keep this blazingly fast by relying on a cache of "what is the lowest prime factor of number N"... for N up to 10 million.

However, once I'm out of the cache, we'll, it devolves to "naive" factoring. (Yuck.)

The point of this post is:

  • I'm assuming that "generating large composites from their factors" will be faster than "factoring large composites"... especially since I DON'T care about order, and
  • How can you have a Python generator "recursively" call itself, and yield a single stream of generated things?
like image 809
Dan H Avatar asked Apr 11 '12 15:04

Dan H


People also ask

What are the composite numbers less than 21?

prime numbers: 2, 3, 5, 7, 11, 13, 17, 19. composite numbers: 4, 6, 8, 9, 10, 12, 14, 15, 16, 18.

What are composite numbers less than100?

4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 25, 26, 27, 28, 30, 32, 33, 34, 35, 36, 38, 39, 40, 42, 44, 45, 46, 48, 49, 50, 51, 52, 54, 55, 56, 57, 58, 60, 62, 63, 64, 65, 66, 68, 69, 70, 72, 74, 75, 76, 77, 78, 80, 81,82, 84, 85, 86, 87, 88, 90, 91, 92, 93, 94, 95, 96, 98, 99, 100.

What is the composite number less than 4?

The first few composite numbers (sometimes called "composites" for short) are 4, 6, 8, 9, 10, 12, 14, 15, 16, ... (OEIS A002808), whose prime decompositions are summarized in the following table. Note that the number 1 is a special case which is considered to be neither composite nor prime.


2 Answers

Assuming primesiter(n) creates an iterator over all primes up to n (1 should NOT be included in primesiter, or following code well enter inf. loop)

def composite_value(n, min_p = 0):
    for p in primesiter(n):
        # avoid double solutions such as (6, [2,3]), and (6, [3,2])
        if p < min_p: continue
        yield (p, [p])
        for t, r in composite_value(n//p, min_p = p): # uses integer division
            yield (t*p, [p] + r)

Output

>> list(composite_value(16))
[(2, [2]),
 (4, [2, 2]),
 (8, [2, 2, 2]),
 (16, [2, 2, 2, 2]),
 (12, [2, 2, 3]),
 (6, [2, 3]),
 (10, [2, 5]),
 (14, [2, 7]),
 (3, [3]),
 (9, [3, 3]),
 (15, [3, 5]),
 (5, [5]),
 (7, [7]),
 (11, [11]),
 (13, [13])]

NOTE: it includes n (= 16) as well, and I used list instead of tuples. Both can easily be resolved if needed, but I will leave that as an exercise.

like image 59
catchmeifyoutry Avatar answered Nov 11 '22 06:11

catchmeifyoutry


Here is a sieve-based implementation (please excuse the un-pythonic code :) ):

def sieve(n):
    # start each number off with an empty list of factors
    #   note that nums[n] will give the factors of n
    nums = [[] for x in range(n)]
    # start the counter at the first prime
    prime = 2
    while prime < n:
        power = prime
        while power < n:
            multiple = power
            while multiple < n:
                nums[multiple].append(prime)
                multiple += power
            power *= prime
        # find the next prime
        #   the next number with no factors
        k = prime + 1
        if k >= n:    # no primes left!!!
            return nums
        # the prime will have an empty list of factors
        while len(nums[k]) > 0:
            k += 1
            if k >= n:    # no primes left!!!
                return nums
        prime = k
    return nums


def runTests():
    primes = sieve(100)
    if primes[3] == [3]:
        print "passed"
    else:
        print "failed"
    if primes[10] == [2,5]:
        print "passed"
    else:
        print "failed"
    if primes[32] == [2,2,2,2,2]:
        print "passed"
    else:
        print "failed"

Tests:

>>> runTests()
passed
passed
passed

On my machine, this took 56 seconds to run:

primes = sieve(14000000) # 14 million!

Examples:

>>> primes[:10]
[[], [], [2], [3], [2, 2], [5], [2, 3], [7], [2, 2, 2], [3, 3]]

>>> primes[10000]
[2, 2, 2, 2, 5, 5, 5, 5]

>>> primes[65536]
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

>>> primes[6561]
[3, 3, 3, 3, 3, 3, 3, 3]

>>> primes[233223]
[3, 17, 17, 269]

Memory consumption: about 50 million integers, in 14 million lists:

>>> sum(map(len, primes))
53303934
like image 34
Matt Fenwick Avatar answered Nov 11 '22 05:11

Matt Fenwick