To reinforce my (weak) functional programming skills, I am studying The NURBS Book by Piegl and Tiller, converting all the algorithms to Haskell. It is a very nice and instructive process, but I got stuck on algorithm 2.2. Here is a C-like version of the pseudo-code, reworked by me:
double[] BasisFuns( int i, double u, int p, double U[] ) {
    double N[p+1];
    double left[p+1];
    double right[p+1];
    N[0] = 1.0;
    for (j = 1; j <= p; j++) {
        left[j]  = u - U[i+1-j];
        right[j] = U[i+j] - u;
        double saved = 0.0;
        for (r = 0; r < j; r++) {
            double temp = N[r] / (right[r+1] + left[j-r]);
            N[r] = saved + right[r+1] * temp;
            saved = left[j-r] * temp;
        }
        N[j] = saved;
    }
    return N;
}
The outer loop looks easy, but the inner one, with all those necessarily-ordered modifications to the elements of N, is giving me a headache.
I started to set it up like this:
baseFunc :: RealFrac a => Int -> Int -> a -> [a] -> [a]
baseFunc i p u knots = [ ??? | j <- [1..p], r <- [0..j] ]
  where
    left  = [ u - knots !! (i+1-j) | j <- [1..p] ]
    right = [ knots !! (i+j) - u   | j <- [1..p] ]
but I feel that I may be completely off track.
I have already written a completely different and inefficient version of this function based on Eq. 2.5 in the book, so here I am looking to match the performance of the imperative version.
Although it is certainly possible to translate numerical algorithms into Haskell starting from the Fortran/Matlab/C "everything is an array" style (and, using unboxed mutable vectors, performance will generally not be much worse), this really misses the point of using a functional language. The underlying math is actually much closer to functional than to imperative programming, so the best thing is to start right there. Specifically, the recurrence formula

\[
N_{i,0}(u) = \begin{cases} 1 & \text{if } u_i \le u < u_{i+1} \\ 0 & \text{otherwise} \end{cases}
\qquad
N_{i,p}(u) = \frac{u - u_i}{u_{i+p} - u_i}\,N_{i,p-1}(u) + \frac{u_{i+p+1} - u}{u_{i+p+1} - u_{i+1}}\,N_{i+1,p-1}(u)
\]

can be translated almost literally into Haskell, much better than into an imperative language:
baseFuncs :: [Double] -- ^ Knots, \(\{u_i\}_i\)
          -> Int      -- ^ Index \(i\) at which to evaluate
          -> Int      -- ^ Spline degree \(p\)
          -> Double   -- ^ Position \(u\) at which to evaluate
          -> Double
baseFuncs us i 0 u
  | u >= us!!i, u < us!!(i+1)  = 1
  | otherwise                  = 0
baseFuncs us i p u
    = (u - us!!i)/(us!!(i+p) - us!!i) * baseFuncs us i (p-1) u
    + (us!!(i+p+1) - u)/(us!!(i+p+1) - us!!(i+1)) * baseFuncs us (i+1) (p-1) u
Unfortunately, this will not actually be efficient, for multiple reasons.
First, lists suck at random access. A simple fix is to switch to unboxed (but pure) vectors. While we're at it, let's wrap them in a newtype, because the uᵢ are supposed to be strictly increasing. Talking about types: the direct accesses are unsafe; we could fix this by bringing p and the number of segments to the type level and only allowing indices i < n-p, but I won't go into that here.
Also, it's awkward to pass us and u around all the way down the recursion; better to bind them once and then use a helper function to go down:
import Data.Vector.Unboxed (Vector, (!))
import qualified Data.Vector.Unboxed as VU

newtype Knots = Knots {getIncreasingKnotsSeq :: Vector Double}

baseFuncs :: Knots  -- ^ \(\{u_i\}_i\)
          -> Int    -- ^ Index \(i\) at which to evaluate
          -> Int    -- ^ Spline degree \(p\)
          -> Double -- ^ Position \(u\) at which to evaluate
          -> Double
baseFuncs (Knots us) i₀ p₀ u = go i₀ p₀
 where go i 0
         | u >= us!i
         , i >= VU.length us - 1 || u < us!(i+1)  = 1
         | otherwise                              = 0
       go i p
           = (u - us!i)/(us!(i+p) - us!i) * go i (p-1)
           + (us!(i+p+1) - u)/(us!(i+p+1) - us!(i+1)) * go (i+1) (p-1)
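(As a side note, the increasing-knots invariant could be checked at construction time. A minimal sketch, with mkKnots as a hypothetical smart constructor that is not used in the code above:)

-- Hypothetical smart constructor: succeeds only if the knot sequence is
-- strictly increasing, which is what the newtype is meant to guarantee.
mkKnots :: Vector Double -> Maybe Knots
mkKnots us
  | VU.and (VU.zipWith (<) us (VU.drop 1 us))  = Just (Knots us)
  | otherwise                                  = Nothing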
The other thing that's not optimal is that we don't share the lower-level evaluations between neighbouring recursive calls. (The evaluation is effectively spanning a directed graph with p²/2 nodes, but we evaluate it as a tree with 2ᵖ nodes.) That's a massive inefficiency for large p, but actually quite harmless for the typical low-degree splines.
The way to avoid this inefficiency is to memoise. The C version does this explicitly with the N array, but – this being Haskell – we can be lazy, saving ourselves the effort of allocating the correct size, by using a generic memoisation library, e.g. MemoTrie:
import Data.MemoTrie (memo2)

baseFuncs (Knots us) i₀ p₀ u = go' i₀ p₀
 where go i 0
         | u >= us!i
         , i >= VU.length us || u < us!(i+1)  = 1
         | otherwise                          = 0
       go i p
           = (u - us!i)/(us!(i+p) - us!i) * go' i (p-1)
           + (us!(i+p+1) - u)/(us!(i+p+1) - us!(i+1)) * go' (i+1) (p-1)
       go' = memo2 go
That was the no-brains version ("just memoise the entire domain of go"). As dfeuer remarks, it is easy enough to explicitly memoise only the region that actually gets evaluated, and then we can again use an efficient unboxed vector:
baseFuncs (Knots us) i₀ p₀ u = VU.unsafeHead $ gol i₀ p₀
 where gol i 0 = VU.generate (p₀+1) $ \j ->
         if u >= us!(i+j) && (i+j >= VU.length us || u < us!(i+j+1))
           then 1
           else 0
       gol i p = case gol i (p-1) of
         res' -> VU.izipWith
                   (\j l r -> let i' = i+j
                              in (u - us!i')/(us!(i'+p) - us!i') * l
                               + (us!(i'+p+1) - u)/(us!(i'+p+1) - us!(i'+1)) * r)
                   res' (VU.unsafeTail res')
(I can safely use unsafeHead and unsafeTail here, because at each recursion level the zipping reduces the length by 1; since gol i 0 produces p₀+1 elements and the recursion zips p₀ times, at the top level I still have (p₀+1) - p₀ = 1 element left.)
This version should, I think, have the same asymptotics as the C version. With some more small improvements, like precomputing the interval lengths and pre-checking that the arguments are in the allowed range so all accesses can be made unsafe, it is probably very close in performance to the C version.
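(A sketch of what such a pre-check could look like; checkedBaseFuncs is a hypothetical wrapper around the baseFuncs version above:)

-- Hypothetical wrapper: validate the index range once up front, so the
-- recursion itself would be free to use only unsafe accesses. The bound
-- i+p+1 is the largest knot index the recursion can touch.
checkedBaseFuncs :: Knots -> Int -> Int -> Double -> Maybe Double
checkedBaseFuncs ks@(Knots us) i p u
  | i >= 0 && p >= 0 && i + p + 1 < VU.length us
               = Just (baseFuncs ks i p u)
  | otherwise  = Nothing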
As – again – dfeuer remarks, it might not even be necessary to use vectors there, because I just zip together the results. For this kind of stuff, GHC can be very good at optimising code even when using plain lists. But I won't investigate the performance any further here.
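(For illustration, the zipping version with plain lists for the intermediate results could be sketched like this; baseFuncsL is a hypothetical name, and I haven't benchmarked it:)

-- Same scheme as the vector version above, but the per-level results are
-- plain lists; only the knot vector itself stays unboxed.
baseFuncsL :: Knots -> Int -> Int -> Double -> Double
baseFuncsL (Knots us) i₀ p₀ u = head $ gol i₀ p₀
 where gol i 0 = [ if u >= us!(i+j) && (i+j >= VU.length us || u < us!(i+j+1))
                     then 1
                     else 0
                 | j <- [0 .. p₀] ]
       gol i p = let res' = gol i (p-1)
                 in zipWith3 (\i' l r ->
                        (u - us!i')/(us!(i'+p) - us!i') * l
                      + (us!(i'+p+1) - u)/(us!(i'+p+1) - us!(i'+1)) * r)
                      [i ..] res' (tail res')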
The test I used to confirm it actually works:
https://gist.github.com/leftaroundabout/4fd6ef8642029607e1b222783b9d1c1e
(Disclaimer: I have zero idea of what is being calculated here.)
The access pattern in the array U seems to be as follows: from the index i outwards, we consume values to the left and to the right. We could imagine having two lists which consisted precisely of those sequences of elements. In fact, we could construct such lists out of a source list like this:
pryAt :: Int -> [a] -> ([a], [a])
pryAt i xs = go i ([], xs)
  where
    go 0 a = a
    go n (us, v : vs) = go (pred n) (v : us, vs)

-- pryAt 5 ['a'..'x']
-- ("edcba","fghijklmnopqrstuvwx")
For random-access containers, we could have specialized versions of pryAt, because traversing the whole list until we reach the i-th element will be inefficient.
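(For instance, a specialised version for boxed vectors might look like the following sketch; pryAtV is a hypothetical name, and it splits by O(1) slicing instead of walking the list:)

import qualified Data.Vector as V

-- Hypothetical vector variant of pryAt: take/drop are cheap slices here,
-- so we avoid the element-by-element traversal of the list version.
pryAtV :: Int -> V.Vector a -> ([a], [a])
pryAtV i xs = ( reverse (V.toList (V.take i xs))
              , V.toList (V.drop i xs) )

-- pryAtV 5 (V.fromList ['a'..'x'])
-- ("edcba","fghijklmnopqrstuvwx")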
In the outer loop, we have arrays N, left and right which grow with each iteration (N seems to be fully reconstructed at each iteration, as well). We could represent them as lists. In each iteration, we also consume a pair of elements of U, one from the left and one from the right.
The following datatype represents the situation at the beginning of an iteration of the outer loop:
data State = State
  { ns    :: [Double]
  , left  :: [Double]
  , right :: [Double]
  , us    :: ([Double], [Double])
  }
Assuming we already had outerStep implemented (it also needs the evaluation position u, which is not stored in the State, so its type becomes Double -> State -> State), we could simply turn the crank p times:
basis :: Int -> Double -> Int -> [Double] -> [Double]
basis i u p us = ns $ iterate (outerStep u) initial !! p
  where
    initial = State
      { ns    = [1.0]
      , left  = []
      , right = []
      -- split after index i, so that the first value consumed on the left
      -- is U[i] (giving left[1] = u - U[i]) and the first on the right is
      -- U[i+1] (giving right[1] = U[i+1] - u), matching the C code
      , us    = pryAt (i+1) us
      }
What is done in outerStep? We add new elements to left and right, then we re-create the whole N list from the beginning, carrying a saved accumulator along the way. This is a mapAccumR: the lists are kept newest-element-first, so the accumulator running from the right end to the head matches the C loop's ascending r. We need some extra info: the right values (in the same direction as N) and the left values (in reverse direction), so we need to zip them beforehand:
-- needs {-# LANGUAGE NamedFieldPuns #-} and:  import Data.List (mapAccumR)
outerStep u (State {ns, left, right, us = (ul : uls, ur : urs)}) =
  let left'  = u - ul : left
      right' = ur - u : right
      (saved', ns') = mapAccumR innerStep 0.0 $ zip3 ns right' (reverse left')
  in State
       { ns    = saved' : ns'
       , left  = left'
       , right = right'
       , us    = (uls, urs)
       }
And here are the computations of the inner step:
innerStep saved (n, r, l) =
  let temp   = n / (r + l)       -- C: temp  = N[r] / (right[r+1] + left[j-r])
      n'     = saved + r * temp  -- C: N[r]  = saved + right[r+1] * temp
      saved' = l * temp          -- C: saved = left[j-r] * temp
  in (saved', n')
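(As a hypothetical smoke test, the data of Example 2.3 from the book can be reused: with U = {0,0,0,1,2,3,4,4,5,5,5}, p = 2 and u = 5/2 we have u ∈ [u₄, u₅), i.e. i = 4, and the book gives the nonzero quadratic basis functions as 1/8, 6/8, 1/8:)

-- Expected, assuming the pryAt (i+1) fix in 'basis' above is right:
--   basisExample == [0.125, 0.75, 0.125]
basisExample :: [Double]
basisExample = basis 4 2.5 2 [0,0,0,1,2,3,4,4,5,5,5]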
In addition to correcting possible bugs, more work would remain, because the basis function in its current form is likely to leak memory (in particular, that mapAccumR will create lots of thunks). Perhaps it could be rewritten to use functions like iterate' or foldl' that keep their accumulators strict.
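(One possible direction, as a sketch: drive the iteration with a strict left fold and fully force each intermediate State via deepseq. This assumes the State record from above additionally derives Generic and NFData, which is not part of the original code:)

{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}
import Control.DeepSeq (NFData, force)
import Data.List (foldl')
import GHC.Generics (Generic)

-- Assumes:  data State = State { ... } deriving (Generic, NFData)
-- 'force' evaluates every field of the intermediate states, so thunks
-- produced by mapAccumR cannot pile up across the p outer iterations.
basisStrict :: Int -> Double -> Int -> [Double] -> [Double]
basisStrict i u p us0 =
  ns $ foldl' (\s _ -> force (outerStep u s)) initial [1 .. p]
  where
    initial = State { ns = [1.0], left = [], right = [], us = pryAt (i+1) us0 }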