How to store a Haskell data type in an Unboxed Vector in contiguous memory

Tags: haskell

I would like to store a non-parametric, unpacked data type like

data Point3D = Point3D {-# UNPACK #-} !Int {-# UNPACK #-} !Int {-# UNPACK #-} !Int

in an unboxed vector. The Data.Vector.Unboxed documentation says:

In particular, unboxed vectors of pairs are represented as pairs of unboxed vectors.
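To make the quoted sentence concrete, here is a minimal sketch of what it means in practice. The names points, xs, ys and zs are made up for illustration. Since a vector of triples is kept as three separate component vectors, U.unzip3 only has to hand those vectors back:

```haskell
import qualified Data.Vector.Unboxed as U

-- An unboxed vector of triples (illustrative data, not from the question).
points :: U.Vector (Int, Int, Int)
points = U.fromList [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

-- Internally this is one unboxed vector per component, so unzip3 does not
-- copy any elements: it returns the three underlying vectors.
xs, ys, zs :: U.Vector Int
(xs, ys, zs) = U.unzip3 points
```

Here xs holds all first components (1, 4, 7), so the three fields of a single point are not adjacent in memory.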

Why is that? I would prefer to have my Point3D values laid out one after another in memory, so that iterating over them sequentially gets fast, cache-local access - the equivalent of mystruct[1000] in C.

Using Vector.Unboxed or otherwise, how can I achieve that?


By the way, the same happens with vector-th-unbox, since it just maps your data type onto the existing (Unbox a, Unbox b) => Unbox (a, b) instance.

nh2 asked Apr 05 '14 14:04


1 Answer

I don't know why vectors of pairs are stored as pairs of vectors, but you can easily write instances for your datatype to store the elements sequentially.

{-# LANGUAGE TypeFamilies, MultiParamTypeClasses #-}

import qualified Data.Vector.Generic as G 
import qualified Data.Vector.Generic.Mutable as M 
import Control.Monad (liftM, zipWithM_)
import Data.Vector.Unboxed.Base

data Point3D = Point3D {-# UNPACK #-} !Int {-# UNPACK #-} !Int {-# UNPACK #-} !Int

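-- Both the mutable and the immutable vector of points wrap a flat vector of Ints,
-- three Ints per point.
newtype instance MVector s Point3D = MV_Point3D (MVector s Int)
newtype instance Vector    Point3D = V_Point3D  (Vector    Int)
instance Unbox Point3D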

At this point the last line (instance Unbox Point3D) will not compile, because Unbox requires M.MVector and G.Vector instances for Point3D and none exist yet. They can be written as follows:

instance M.MVector MVector Point3D where
  -- Lengths and offsets are counted in points; the backing vector counts Ints.
  basicLength (MV_Point3D v) = M.basicLength v `div` 3
  basicUnsafeSlice a b (MV_Point3D v) = MV_Point3D $ M.basicUnsafeSlice (a*3) (b*3) v
  basicOverlaps (MV_Point3D v0) (MV_Point3D v1) = M.basicOverlaps v0 v1
  basicUnsafeNew n = liftM MV_Point3D (M.basicUnsafeNew (3*n))
  -- Newer versions of vector (0.11 and later, as far as I know) also require
  -- basicInitialize; drop this line for older versions.
  basicInitialize (MV_Point3D v) = M.basicInitialize v
  -- Read and write the three fields one by one, without an intermediate list.
  basicUnsafeRead (MV_Point3D v) n = do
    a <- M.basicUnsafeRead v (3*n)
    b <- M.basicUnsafeRead v (3*n+1)
    c <- M.basicUnsafeRead v (3*n+2)
    return $ Point3D a b c
  basicUnsafeWrite (MV_Point3D v) n (Point3D a b c) = do
    M.basicUnsafeWrite v (3*n) a
    M.basicUnsafeWrite v (3*n+1) b
    M.basicUnsafeWrite v (3*n+2) c

instance G.Vector Vector Point3D where
  basicUnsafeFreeze (MV_Point3D v) = liftM V_Point3D (G.basicUnsafeFreeze v)
  basicUnsafeThaw (V_Point3D v) = liftM MV_Point3D (G.basicUnsafeThaw v)
  basicLength (V_Point3D v) = G.basicLength v `div` 3
  basicUnsafeSlice a b (V_Point3D v) = V_Point3D $ G.basicUnsafeSlice (a*3) (b*3) v
  basicUnsafeIndexM (V_Point3D v) n = do
    a <- G.basicUnsafeIndexM v (3*n)
    b <- G.basicUnsafeIndexM v (3*n+1)
    c <- G.basicUnsafeIndexM v (3*n+2)
    return $ Point3D a b c

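I think most of the function definitions are self-explanatory. The vector of points is stored as a single flat vector of Ints, and the nth point occupies the Ints at indices 3n, 3n+1 and 3n+2. Lengths and slice bounds are therefore multiplied or divided by 3 when translating between points and Ints.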
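As a rough sketch of how such instances might be used, the following self-contained module repeats the definitions in condensed form and builds a small vector on top of them. It assumes a version of vector that has basicInitialize (0.11 or later, as far as I know). The names points, sumX and flatten are hypothetical helpers, not part of any library.

```haskell
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses #-}

import qualified Data.Vector.Generic         as G
import qualified Data.Vector.Generic.Mutable as M
import qualified Data.Vector.Unboxed         as U

data Point3D = Point3D {-# UNPACK #-} !Int {-# UNPACK #-} !Int {-# UNPACK #-} !Int
  deriving (Eq, Show)

-- Flat backing store: three Ints per point.
newtype instance U.MVector s Point3D = MV_Point3D (U.MVector s Int)
newtype instance U.Vector    Point3D = V_Point3D  (U.Vector    Int)
instance U.Unbox Point3D

instance M.MVector U.MVector Point3D where
  basicLength (MV_Point3D v) = M.basicLength v `div` 3
  basicUnsafeSlice i n (MV_Point3D v) = MV_Point3D (M.basicUnsafeSlice (3*i) (3*n) v)
  basicOverlaps (MV_Point3D v0) (MV_Point3D v1) = M.basicOverlaps v0 v1
  basicUnsafeNew n = fmap MV_Point3D (M.basicUnsafeNew (3*n))
  basicInitialize (MV_Point3D v) = M.basicInitialize v  -- assumption: vector >= 0.11
  basicUnsafeRead (MV_Point3D v) n = do
    a <- M.basicUnsafeRead v (3*n)
    b <- M.basicUnsafeRead v (3*n+1)
    c <- M.basicUnsafeRead v (3*n+2)
    return (Point3D a b c)
  basicUnsafeWrite (MV_Point3D v) n (Point3D a b c) = do
    M.basicUnsafeWrite v (3*n) a
    M.basicUnsafeWrite v (3*n+1) b
    M.basicUnsafeWrite v (3*n+2) c

instance G.Vector U.Vector Point3D where
  basicUnsafeFreeze (MV_Point3D v) = fmap V_Point3D (G.basicUnsafeFreeze v)
  basicUnsafeThaw (V_Point3D v) = fmap MV_Point3D (G.basicUnsafeThaw v)
  basicLength (V_Point3D v) = G.basicLength v `div` 3
  basicUnsafeSlice i n (V_Point3D v) = V_Point3D (G.basicUnsafeSlice (3*i) (3*n) v)
  basicUnsafeIndexM (V_Point3D v) n = do
    a <- G.basicUnsafeIndexM v (3*n)
    b <- G.basicUnsafeIndexM v (3*n+1)
    c <- G.basicUnsafeIndexM v (3*n+2)
    return (Point3D a b c)

-- Hypothetical sample data.
points :: U.Vector Point3D
points = U.fromList [Point3D 1 2 3, Point3D 4 5 6, Point3D 7 8 9]

-- Sum of the first coordinate, iterating over the points in order.
sumX :: U.Vector Point3D -> Int
sumX = U.foldl' (\acc (Point3D x _ _) -> acc + x) 0

-- Expose the backing vector to inspect the memory layout.
flatten :: U.Vector Point3D -> U.Vector Int
flatten (V_Point3D v) = v
```

Loaded into GHCi, sumX points should evaluate to 12, and flatten points should contain the Ints 1 through 9 in order, showing that the fields of each point sit next to each other in the backing vector.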

user2407038 answered Sep 28 '22 08:09