Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reducing Integer Fractions Algorithm

(This is derived from a recently completed programming competition)

You are given two arrays of 10^5 ints in the range 1..10^7 inclusive:

int N[100000] = { ... }
int D[100000] = { ... }

Imagine the rational number X be the result of multiplying together all the elements of N and dividing by all the elements of D.

Modify the two arrays without changing the value of X (and without assigning any element out of range) such that the product of N and the product of D have no common factor.

A naive solution (I think) would work would be...

for (int i = 0; i < 100000; i++)
    for (int j = 0; j < 100000; j++)
    {
        int k = gcd(N[i], D[j]); // euclids algorithm

        N[i] /= k;
        D[j] /= k;
    }

...but this is too slow.

What is a solution that takes less than around 10^9 operations?

like image 536
Andrew Tomazos Avatar asked Sep 10 '12 19:09

Andrew Tomazos


3 Answers

Factorise all numbers in the range 1 to 107. Using a modification of a Sieve of Eratosthenes, you can factorise all numbers from 1 to n in O(n*log n) time (I think it's a bit better, O(n*(log log n)²) or so) using O(n*log log n) space. Better than that is probably creating an array of just the smallest prime factors.

// Not very optimised, one could easily leave out the even numbers, or also the multiples of 3
// to reduce space usage and computation time
int *spf_sieve = malloc((limit+1)*sizeof *spf_sieve);
int root = (int)sqrt(limit);
for(i = 1; i <= limit; ++i) {
    spf_sieve[i] = i;
}
for(i = 4; i <= limit; i += 2) {
    spf_sieve[i] = 2;
}
for(i = 3; i <= root; i += 2) {
    if(spf_sieve[i] == i) {
        for(j = i*i, step = 2*i; j <= limit; j += step) {
            if (spf_sieve[j] == j) {
                spf_sieve[j] = i;
            }
        }
    }
}

To factorise a number n > 1 using that sieve, look up its smallest prime factor p, determine its multiplicity in the factorisation of n (either by looking up recursively, or by simply dividing until p doesn't evenly divide the remaining cofactor, which is faster depends) and the cofactor. While the cofactor is larger than 1, look up the next prime factor and repeat.

Create a map from primes to integers

Go through both arrays, for each number in N, add the exponent of each prime in its factorisation to the value in the map, for the numbers in D, subtract.

Go through the map, if the exponent of the prime is positive, enter p^exponent to the array N (you may need to split that across several indices if the exponent is too large, and for small values, combine several primes into one entry - there are 664579 primes less than 107, so the 100,000 slots in the arrays may not be enough to store each appearing prime with the correct power), if the exponent is negative, do the same with the D array, if it's 0, ignore that prime.

Any unused slots in N or D are then set to 1.

like image 70
Daniel Fischer Avatar answered Nov 04 '22 18:11

Daniel Fischer


Factorize each element of either array, sort, cancel. Factorization is constant time for ints of bounded size, sorting is n log n, and cancellation will be linear. The constant factors may be large, though.

If you're trying for lower actual execution time instead of lower asymptotic complexity, it probably wouldn't hurt to preprocess the arrays by manually cancelling small factors, such as powers of 2, 3, 5, and 7. With high probability (i.e. except for pathological inputs), this will speed up most algorithms immensely, at the cost of a few linear-time passes.

One more sophisticated method, integrating the above approaches, would be to start by building a list of primes up to sqrt(10^7) ~= 3162. There should be about 3162/ln(3162) ~= 392 such primes, by the prime number theorem. (In fact, to save running time, you could/should precompute this table.)

Then, for each such integer in N, and for each prime, reduce the integer by that prime until it no longer divides evenly, and each time increment a count for that prime. Do the same for D, decrementing instead. Once you've gone through the table of primes, the current int will be non-1 if and only if it is a prime larger than 3162. This should be about 7% of the total integers in each array. You can keep these in a heap or somesuch. Set them to ones in the array as well, as you go along.

Finally, you iterate over the positive factors and put their product into N. You will probably need to split this across multiple array slots, which is fine. Put the negative factors into D, and you're done!

The runtime on this will take me a minute to work out. Hopefully, it's reasonable.

like image 44
Thom Smith Avatar answered Nov 04 '22 19:11

Thom Smith


Lets prime factorize every element in N & D in O(sqrt(10^7) * 10^5) as

N[i]=p1^kn1 * p2^kn2 ... 
D[i]=p1^kd1 * p2^kd2 ...

Maintain 2 Power arrays where

Power_N[p1]=sum of kn1 for all i's
Power_D[p1]=sum of kd1 for all i's

Divide the N and D by p1^(min(Power_N[p1],Power_D[p2])) in O(10^5) each
like image 31
Sajal Jain Avatar answered Nov 04 '22 18:11

Sajal Jain