As a short exercise in using Haskell arrays I wanted to implement a function giving the first n (odd) prime numbers. The code below (compiled with GHC 7.10.3) produces a loop error at runtime. "A Gentle Introduction to Haskell" uses recursive calls in array creation to compute Fibonacci numbers (https://www.haskell.org/tutorial/arrays.html, 13.2, code below for reference), which works just fine. My question is:
Where is the difference between the two ways of recursive creation? Which recursive calls are generally allowed when creating arrays?
My code:
import Data.Array.Unboxed
main = putStrLn $ show $ (primes 500)!500 --arbitrary example
primes :: Int -> UArray Int Int
primes n = a
where
a = array (1,n) $ primelist 1 [3,5..]
primelist i (m:ms) =
if all (not . divides m) [ a!j | j <- [1..(i-1)]]
then (i ,m) : primelist (succ i) ms
else primelist i ms
divides m k = m `mod` k == 0
Code from "A Gentle Introduction to Haskell":
fibs :: Int -> Array Int Int
fibs n = a where a = array (0,n) ([(0, 1), (1, 1)] ++
[(i, a!(i-2) + a!(i-1)) | i <- [2..n]])
Thanks in advance for any answers!
Update: I think I finally understood what's going on. array is lazy on the list elements, but is unnecessarily strict on its spine!
This causes a <<loop>> exception, for instance
test :: Array Int Int
test = array (1,2) ((1,1) : if test!1 == 1 then [(2,2)] else [(2,100)])
unlike
test :: Array Int Int
test = array (1,2) ((1,1) : [(2, if test!1 == 1 then 2 else 100)])
So, recursion works as long as it only affects the values.
A working version:
main :: IO ()
main = do
putStrLn $ show $ (primes 500)!500 --arbitrary example
-- A spine-lazy version of array
-- Assumes the list carries indices lo..hi
arraySpineLazy :: (Int, Int) -> [(Int, a)] -> Array Int a
arraySpineLazy (lo,hi) xs = array (lo,hi) $ go lo xs
where
go i _ | i > hi = []
go i ~((_,e):ys) = (i, e) : go (succ i) ys
primes :: Int -> Array Int Int
primes n = a
where
a :: Array Int Int
a = arraySpineLazy (1,n) $ primelist 1 (2: [3,5..])
primelist :: Int -> [Int] -> [(Int, Int)]
primelist i _ | i > n = []
primelist _ [] = [] -- remove warnings
primelist i (m:ms) =
if all (not . divides m) [ a!j | j <- [1..(i-1)]]
then (i ,m) : primelist (succ i) ms
else primelist i ms
divides m k = m `mod` k == 0
Arguably, we should instead write a lazier variant of listArray instead, since our array variant discard the first components of the pair.
This is a strictness issue: you can't generate unboxed arrays recursively, only boxed (regular) ones, since only boxed ones have a lazy semantics.
Forget arrays, and consider the following recursive pair definition
let (x,y) = (0,x)
This defines x=0 ; y=0, recursively. However, for the recursion to work, it is necessary that the pair is lazy. Otherwise, it generates an infinite recursion, much as the following would do:
let p = case p of (x,y) -> (0,x)
Above, p evaluates itself before it can expose the (,) pair constructor, so an infinite loop arises. By comparison,
let p = (0, case p of (x,y) -> x)
would work, since p produces the (,) before calling itself. Note however that this relies on the constructor (,) not evaluating the components before returning -- it has to be lazy, and return immediately leaving the components to be evaluated later.
Operationally, a pair is constructed having inside tho thunks: two pointers to code, which will evaluate the result later on. Hence the pair is not really a pair of integers, but a pair of indirections-to-integer. This is called "boxing", and is needed to achieve laziness, even if it carries a little computational cost.
By definition, unboxed data structures, like unboxed arrays, avoid boxing, so they are strict, not lazy, and they can not support the same recursion approaches.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With