Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding this polynomial division algorithm in a functional language

**This is an algorithm that is implemented in a rather archaic language, standard ML. I have difficulty understanding this algorithm:

fun polyquotremd ts ((n,b) :: us) = 
    let fun quo [] qs = (rev qs, [])
        | quo ((m,a) :: ts) qs = 
            if m < n then (rev qs, (m,a) :: ts)
            else
                quo (polysum ts
                    (map (termproduct(m-n, ~a/b)) us))
                    ((m-n, a/b) :: qs)
    in
        quo ts []
    end;

Here is the objective. We represent a polynomial densely through using a list of tuples, each tuple containing the power at index 0 and the coefficient at index 1. For example, $x^2 + 1$ is represented as [(2, 1.0), (0, 1.0)]. This is given by the type (int*real)list as seen in the helper functions below. We want to obtain an output (quo, rem), each of which is a list of tuples indicating the quotient and remainder respectively. ts is the polynomial to be divided, while us is the divisor.

Here is how I imagine the algorithm works:

  1. At first, if ts = [] that means we have no polynomials to divide, and so we return qs, our cumulative quotient result as the answer. Since we keep on adding ((m-n, a/b) :: qs) to the qs list's head, we need to reverse the list in order to get the list to start with the greatest exponent.

  2. The first if condition handles the case when the exponent of the divisor, n, is larger than the current power m we are handling. In which case we just not add the element and return (rev qs, (m,a) :: ts), which is (quo, rem)

  3. The else condition is the part I am confused with. I know the second bracket ((m-n, a/b) :: qs) is the cumulative result we want to add, but what is polysum ts (map (termproduct(m-n, ~a/b)) us)) actually doing? What is the mathematics behind this?

Ordinarily, we would divide the polynomial divisors together at once, i.e. $x+1$ together. But this algorithm is dividing using $x$ first, then $1$. How does this even work? Why is there ~a/b (which, for your reference, means -a/b).


Helper functions:

fun take ([], _) = []
| take (x::xs, i) = if i>0
then x :: take(xs, i-1)
else [];

fun drop ([], _) = []
| drop (x::xs, i) = if i>0 then drop(xs,i-1)
else x::xs;

fun termproduct (m,a) (n,b) = (m+n, a*b) : (int*real);

fun polyproduct [] us = []
    | polyproduct [(m,a)] us = map (termproduct(m,a)) us
    | polyproduct ts us =
    let 
        val k = length ts div 2
    in 
        polysum (polyproduct (take(ts,k)) us)
                (polyproduct (drop(ts,k)) us)
    end;

Can someone explain point 3 to me? I am very confused by this, especially by the math of this algorithm. But it works.. FYI this code compiles and you can test it out in standard ML.

like image 877
oldselflearner1959 Avatar asked Jan 29 '26 17:01

oldselflearner1959


1 Answers

I think your current understanding is mostly correct and moreover I'm not sure what exactly you don't understand in the else branch.

Basic idea and the invariant

To avoid confusion over name shadowing let ts0 be the value of the ts passed to the polyquotremd and us0 = ((n,b) :: us). Then for quo ts qs with us0 captured from the outside context on each step of the recursion, except very last i.e. the ones handled by returns quo [] qs = (rev qs, []) and if m < n then (rev qs, (m,a) :: ts) branches, the following invariant holds

ts0 = us0 * rev qs + ts

In other words we iteratively (via recursion) produce next term in the answer (qs) and new ts of a lower highest power that represents the current rest of the division. Obviously if we can reduce the highest power of ts via this process such that its highest power m is less than us0 highest power n, then we have full answer in our qs and ts. So now all we need to do is to actually reduce the highest power of ts. And what we'll get is just the standard long division algorithm.

Long division algorithm details

One step of the long division is following:

  1. For the highest terms in the dividend (current rest) ((m,a) :: ts) and the divisor ((n,b) :: us) i.e. for (m,a) and (n,b) check if m < n. If it is - finish the algorithm.

  2. Divide (m,a) by (n,b) and find their quotient. It is obviously (m-n, -a/b)

  3. Multiply whole divisor by this quotient i.e. (map (termproduct(m-n, a/b)) ((n,b) :: us))). Note that this is not the line you have in your code. Just wait a bit and we'll get there as well.

Sidenote or how this is done in code. Just in case this is not clear: map is a classical higher-order function that takes another function and a collection, and returns new collection of the same size in which each element is a result of the application of that function to the corresponding source element. Now (termproduct(m-n, a/b)) is called partial application: it takes a function of two arguments termproduct and produces a new function of one argument (the second one in the original function) with the first argument fixed with the value of (m-n, a/b). So that whole construct reads as following: multiply each term of the ((n,b) :: us) (or just us in your code) by (m-n, a/b). Obviously this is just a way to multiply polynomial by a single term.

  1. Subtract this product from the current rest. We can substitute this subtraction with sum by additionally multiplying by -1 on the previous step. So now it is
(polysum ((m,a) :: ts)
                (map (termproduct(m-n, ~a/b)) ((n,b) :: us)))

And now we can notice that the highest terms will actually cancel each other. It is not surprise because we calculated the multiplier as their quotient i.e. exactly so they would cancel each other. So now we can remove them and replace that line with the one you have in your code:

(polysum ts
                (map (termproduct(m-n, ~a/b)) us))
  1. And finally we need to add this quotient/multiplier of (m-n, a/b) to our result polynomial. And the only minor issue is that we fill it in the reverse order i.e. we first find the highest term, the next, and so on. This works OK on the paper but lists in ML are filled in the reverse order. This is why both our "exit" branches contain rev qs in them.

If you want an example - you can follow one in the wiki article on Polynomial long division that I referenced above. Just to re-iterate: each step of that algorithm is another recursive all of the quo.

Hope this helps. Let me know if something is still not clear.

Update (answer to the comment)

I don't fully get the part on how mathematically the ~a/b will cancel each other out.

First of all, have you tried to follow the example in the wiki that I referenced above? If so, what exactly is not clear for you there?

Going back to your question there are several way to look at that. From the purely mechanical point of view the question here is: what is the result of termproduct (m-n, ~a/b) (n,b)? Obviously it is (m-n + n, ~a/b*b) = (m, -a). So this is exactly opposite to the highest term in ts.

From the higher level of the algorithm, they will cancel out exactly because we calculated the multiplier (m-n, ~a/b) with the sole purpose that they cancel out. It seems that you don't get what I described in the "Basic idea and the invariant" section and I'm not sure I can restate it differently but I'll try. Division might be seen as a subtraction made multiple time with counter how many times we did it. If we look this way, it is clear that we should try to subtract some multiplies of the divisor in a way to cancel out the highest term of the divisor.

Example 1: assume we are dividing 2*x^2 + 3*x + 4 by x^2 + x + 1. If we want to remove the 2*x^2 term we have to multiply the divisor by 2 and then subtract. So the result is quotient = 2 and the rest is 2*x^2 + 3*x + 4 - 2*(x^2 + x + 1) = x + 2.

Example 2: assume we are dividing 2*x^2 + 3*x + 4 by x^2 + 2*x + 3. Note that although divisor is different from the previos example, we still have to multiply it by 2 to remove the highest term in the dividend. This is because the only thing that determines this multiply is the ration of the highest terms. Obvioulsy other terms affect the final result (rest) but they do not affect the current term of the result. Note that the result is still quotient = 2 but the rest is 2*x^2 + 3*x + 4 - 2*(x^2 + 2*x + 3) = -x - 2

Example 3: Let's divide 2*x^3+3*x^2 by x+1 and x+2 to see the difference. Note that to cancel the x^3, our first multiplier (i.e. the highest term in the quotient) should be the same in both cases 2*x^3/x = 2*x^2 So for x+1 we get that

x+1: 2*x^3+3*x^2 = 2*x^2*(x+1) + (2*x^3+3*x^2) - 2*x^2*(x+1) = 2*x^2*(x+1) + x^2

Similarly for x+2:

x+2: 2*x^3+3*x^2 = 2*x^2*(x+2) + (2*x^3+3*x^2) - 2*x^2*(x+2) = 2*x^2*(x+2) - x^2

From now on the results will be different as we continue with different intermediate rests +x^2 for x+1 and -x^2 for x+2. It means that next multipliers will be +x^2/x = +x and -x^2/x = -x

x+1: 2*x^3+3*x^2 = (2*x^2+x)*(x+1) + x^2 - x*(x+1) = (2*x^2+x)*(x+1) - x
x+2: 2*x^3+3*x^2 = (2*x^2-x)*(x+2) - x^2 - (-x)*(x+2) = (2*x^2-x)*(x+2) + 2*x

And now the final step with rests -x for x+1 => multiplier is -1 and 2*x for x+2 => multiplier is 2

x+1: 2*x^3+3*x^2 = (2*x^2+x-1)*(x+1) - x - (-1)(x+1) = (2*x^2+x-1)*(x+1) + 1
x+2: 2*x^3+3*x^2 = (2*x^2-x+2)*(x+2) + 2*x - (2)*(x+2) = (2*x^2-x+2)*(x+2) - 4

And again you can see that change in a non-highest term of the divisor affects the result but not the highest term of the result.

like image 64
SergGr Avatar answered Jan 31 '26 10:01

SergGr



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!