Array in Haskell

Question

How is an array created in haskell using the constructor array? I mean, does it create the first element and so on? In that case how does it read the associated list?

For example if we consider the following two programs:-

ar :: Int->(Array Int Int)
ar n = array (0,(n-1)) (((n-1),1):[(i,((ar n)!(i+1))) | i<-[0..(n-2)]])

ar :: Int->(Array Int Int)
ar n = array (0,(n-1)) ((0,1):[(i,((ar n)!(i-1))) | i<-[1..(n-1)]])

Will these two have different time complexity?

Daniel Fischer · Accepted Answer

That depends on the implementation, but in a reasonable implementation, both have the same complexity (linear in the array size).

In GHC's array implementation, if we look at the code

array (l,u) ies
    = let n = safeRangeSize (l,u)
      in unsafeArray' (l,u) n
                      [(safeIndex (l,u) n i, e) | (i, e) <- ies]

{-# INLINE unsafeArray' #-}
unsafeArray' :: Ix i => (i,i) -> Int -> [(Int, e)] -> Array i e
unsafeArray' (l,u) n@(I# n#) ies = runST (ST $ \s1# ->
    case newArray# n# arrEleBottom s1# of
        (# s2#, marr# #) ->
            foldr (fill marr#) (done l u n marr#) ies s2#)

{-# INLINE fill #-}
fill :: MutableArray# s e -> (Int, e) -> STRep s a -> STRep s a
-- NB: put the \s after the "=" so that 'fill' 
--     inlines when applied to three args 
fill marr# (I# i#, e) next 
 = \s1# -> case writeArray# marr# i# e s1# of 
             s2# -> next s2#

we can see that first a new chunk of memory is allocated for the array, that is then sequentially filled with arrEleBottom (which is an error call with message "undefined array element"), and then the elements supplied in the list are written to the respective indices in the order they appear in the list.

In general, since it is a boxed array, what is written to the array on construction is a thunk that specifies how to compute the value when it is needed (explicitly specified values, like the literal 1 in your examples, will result in a direct pointer to that value written to the array).

When the evaluation of such a thunk is forced, it may force also the evaluation of further thunks in the array - if the thunk refers to other array elements, like here. In the specific examples here, forcing any thunk results in forcing all thunks later resp. earlier in the array until the end with the entry that doesn't refer to another array element is reached. In the first example, if the first array element that is forced is the one at index 0, that builds a thunk of size proportional to the array length that is then reduced, so forcing the first array element then has complexity O(n), then all further elements are already evaluated, and forcing them is O(1). In the second example, the situation is symmetric, there forcing the last element first incurs the total evaluation cost. If the elements are demanded in a different order, the cost of evaluating all thunks is spread across the requests for different elements, but the total cost is the same. The cost of evaluating any not-yet-evaluated thunk is proportional to its distance from the next already evaluated thunk, and includes evaluating all thunks in between.

Since array access is constant time (except for cache effects, but those should not make a difference if you fill the array either forward or backward, they could make a big difference if the indices were in random order, but that still wouldn't affect time complexity), both have the same complexity.

Note however, that your using ar n to define the array elements carries the risk of multiple arrays being allocated (if compiled without optimisations, GHC does that - just tested: even with optimisations that can happen). To make sure that only one is constructed, make it

ar n = result
  where
    result = array (0,n-1) (... result!index ...)

Array in Haskell

Tags:

arrays

haskell

Chandan

1 Answers

Daniel Fischer

Recent Activity

Donate For Us

Array in Haskell

Tags:

arrays

haskell

Chandan

1 Answers

Daniel Fischer

Related questions

Recent Activity

Donate For Us