Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there an algorithm to find the nearest number with only small factors?

I need to do some real time DFT and the algorithm I am using is the most efficient when the number of samples can be broken down into small factors.

Suppose that I have a number n and factors 2, 3, 5. How to find the nearest number (compared to n) whose prime factorisation consists of no numbers other than 2,3,5?

n is almost always below 50,000 so brute forcing might be a good idea.

like image 671
Henricus V. Avatar asked Aug 19 '16 22:08

Henricus V.


People also ask

How do you find the smallest factor of a number?

The quickest way to find the factors of a number is to divide it by the smallest prime number (bigger than 1) that goes into it evenly with no remainder.

What is the integer factorization problem?

The integer factorization problem is defined as follows: given a composite number N, find two integers x and y such that x · y = N. Factoring is an important problem because if it can be done efficiently, then it can be shown that RSA encryption is insecure.


2 Answers

There are exactly 265 numbers that are divisible only by 2,3,5 in the range 1 to 50000. So you could make a small table, and look up the answer in the table. However, on my computer, it takes an average of 6.5 microseconds to find the nearest 2-3-5-factorable number for a given number, so I'd say brute force is good enough.

int isValid( int n )
{
    while ( n%2 == 0 )
        n /= 2;
    while ( n%3 == 0 )
        n /= 3;
    while ( n%5 == 0 )
        n /= 5;

    return n == 1;
}

int findBest( int n )
{
    for ( int i = 0; i < n; i++ )
    {
        if ( isValid(n+i) )
            return n+i;
        if ( isValid(n-i) )
            return n-i;
    }
    return 0;   // should never get here
}

int main( void )
{
    for ( int n = 1; n <= 50000; n++ )
        printf( "%d->%d\n", n,findBest(n) );
}
like image 83
user3386109 Avatar answered Oct 07 '22 15:10

user3386109


A fast algorithm for pairs of factors

This doesn't quite solve the problem as stated -- given some integer x, it will only find the nearest greater-or-equal number that has no factors besides 2 and 3 (or any other given pair of factors). But I think it's cute and since it runs in O(log x) time and O(1) space it might be useful regardless! It's similar in concept to the Bresenham line algorithm. In pseudocode:

  1. Start with b = y = 2^RoundUp(log2(x)), which ensures that b = y >= x.
  2. If y < x then set y = y * 3 and go to 2.
  3. If y < b then set b = y. (b records the best candidate so far.)
  4. If y is odd then stop and return b.
  5. Otherwise, set y = y / 2 and go to 2.

Correctness

Why does this work? Notice that at all times, we have y = 2^i*3^j for some i, j >= 0, and that over time, i only ever decreases, while j only ever increases. The invariant we maintain on entry to step 2 is "Any value z = 2^a*3^b having a > i or b < j is known to be uninteresting (i.e., invalid or no better than some already-discovered solution), and so doesn't need to be considered". This is clearly true the first time we arrive at step 2, since y is a power of 2 and already >= x, so any number z = 2^a*3^b with a > i would then be at least 2*y (regardless of b) which is worse than y; and no integer z can have fewer than the j = 0 powers of 3 in y.

(Another way to state this invariant is "Either we have already found the optimal solution, or it is some number z = 2^a*3^b with a <= i and b >= j.")

If the condition at step 2, "y < x", is satisfied, then y = 2^i*3^j is not a valid solution as it's too low. More strongly, it's clear that for any a <= i, 2^a*3^j cannot be a valid solution either, as any such solution is at least as low as y. So now we know (from the invariant) that any pair (a, b) satisfying (a > i OR b < j) is uninteresting, and we know from the "y < x" test that just succeeded that any pair (a, b) satisfying (a <= i AND b = j) is uninteresting too. Now consider any pair (a, b) satisfying the slightly different condition (a > i OR b < j+1): if (a, b) doesn't satisfy the first condition (from the invariant) for uninterestingness then we have ((a > i OR b < j+1) AND !(a > i OR b < j)), which simplifies to ((a > i OR b < j+1) AND (a <= i AND b >= j)) via De Morgan's rule, and then to (b < j+1 AND a <= i AND b >= j) (because making (a <= i AND b >= j) true requires (a <= i) to be true, forcing (a > i) to be false and thus allowing its elimination from the OR), which is clearly the same as (a <= i AND b = j) -- but that is exactly the condition for which we have just established a second kind of uninterestingness, via the success of the "y < x" test. So this establishes that any pair (a, b) satisfying (a > i OR b < j+1) is uninteresting -- which, after incrementing j, becomes exactly the invariant we want to preserve.

The justification for decrementing i in step 5 is almost the same, but in reverse, so I won't go into detail on it. The slight difference is that if we get to step 5, instead of having an invalid solution, we simply have a solution that is at least as high as the best solution in b (notice that we updated b if necessary so that this would continue to hold), and so it and every higher solution is (from this point on) uninteresting to us.

Each value of y generated by the algorithm either has one less factor of 2 or one more factor of 3 than any previously generated value, so it's clear that all generated y values are distinct. The reasoning in the earlier paragraphs establishes that the only y values not generated are those that have been proven to be uninteresting. Thus if the algorithm always halts in a finite time, it is correct. The next section will imply that this is indeed the case.

Running time

Step 5 (which has the effect of decreasing i by 1) never executes more than log2(x)+1 times, since i starts at this value or less, nothing else affects i, and when i reaches 0, y will be odd, causing step 4 to terminate the function. But how many times can the condition in step 2 that increases j by 1 fire?

Initially y >= x, so achieving the y < x condition needed for step 2 to fire requires an execution of step 5. But as soon as y < x is achieved by some execution of step 5, it is immediately undone at the next execution of step 2: this is because in order for step 5 to execute at all, we must have had y >= x, and if we divide y by 2 (at that step 5) and then multiply it by 3 (at the next step 2), it will necessarily be at least as large as it was previously, implying y >= x again, in turn implying that step 2 will never fire twice in a row without an execution of step 5 in between. So step 2 will fire at most as many times as step 5 does, i.e. at most log2(x)+1 times. This bounds the overall runtime of the algorithm at O(log x).

Notes

  • You can avoid floating-point arithmetic by replacing the log2() in step 1 with a loop that starts y at 1 and keeps doubling it until it is >= x. This is O(log x) and so doesn't hurt the time complexity.
  • You can use any other pair of factors. The only real change is that, if f is the factor "replacing" 2 in the code, then step 4 should instead halt when y % f != 0.
like image 27
j_random_hacker Avatar answered Oct 07 '22 14:10

j_random_hacker