I've been learning some Haskell by implementing a feature selection algorithm.
I've gotten the performance from 20s on a benchmark dataset down to 5s, where the C program handles the same dataset in 0.5s. The dataset can be found here. To run, call the compiled binary like so: ./Mrmr 10 test_nci9_s3.csv
The code is here, and I'm interested in optimizing mutualInfoInnerLoop:
mutualInfoInnerLoop :: Double -> Data.Vector.Unboxed.Vector (Int, Int) -> Double -> (Int, Int, Double) -> Double
mutualInfoInnerLoop n xys !acc (!i, !j, !px_py)
    | n == 0 || px_py == 0 || pxy == 0 = acc
    | otherwise = pxy * logBase 2 ( pxy / px_py ) + acc
    where
        pxy = ( fromIntegral . U.foldl' accumEq2 0 $ xys ) / n
        accumEq2 :: Int -> (Int, Int) -> Int
        accumEq2 !acc (!i', !j')
            | i' == i && j' == j = acc + 1
            | otherwise = acc
The profiler says:
COST CENTRE          MODULE      %time  %alloc
mutualInfoInnerLoop  Main         75.0    47.9
mutualInfo           Main         14.7    32.1
parseCsv             Main          5.9    13.1
CAF                  GHC.Float     1.5     0.0
readInt              Main          1.5     1.2
doMrmr               Main          1.5     4.0
This shows mutualInfoInnerLoop making nearly 50% of the allocations and accounting for 75% of the program's runtime. The allocations are disconcerting.
Also, the Core for that function has a signature:
mutualInfoInnerLoop_rXG
:: GHC.Types.Double
-> Data.Vector.Unboxed.Base.Vector (GHC.Types.Int, GHC.Types.Int)
-> GHC.Types.Double
-> (GHC.Types.Int, GHC.Types.Int, GHC.Types.Double)
-> GHC.Types.Double
[GblId,
Arity=4,
Caf=NoCafRefs,
Str=DmdType U(L)LU(L)U(U(L)U(L)U(L))m]
which shows most of the parameters being lazily evaluated and boxed (as opposed to strict and unboxed).
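For context, one shape that GHC's demand analysis handles well is a local tail-recursive worker whose accumulator is its own strict argument, rather than a component of a boxed tuple threaded through a fold. The following is only a sketch on plain lists (countEq is a made-up name, not the author's vector code):

```haskell
{-# LANGUAGE BangPatterns #-}

-- Sketch: a local worker with a separate, bang-annotated accumulator.
-- GHC's worker/wrapper transformation can typically unbox 'acc' here,
-- whereas a tuple-shaped accumulator tends to stay boxed.
countEq :: Int -> Int -> [(Int, Int)] -> Int
countEq i j = go 0
  where
    go !acc [] = acc
    go !acc ((i', j') : rest)
      | i' == i && j' == j = go (acc + 1) rest
      | otherwise          = go acc rest
```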
I've tried BangPatterns, I've tried MagicHash, and I can't seem to make it go faster.
Anyone have any suggestions?
A function is considered strict in one of its arguments if, when the function is applied to a bottom value for that argument, the result is bottom. For example, + for Int is strict in both of its arguments, since undefined + x is bottom and x + undefined is bottom.
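This definition can be checked directly in GHC by catching the exception that forcing bottom raises. The isBottom helper below is a made-up name for this sketch:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical helper: True if forcing the value to WHNF throws,
-- i.e. the value is bottom.
isBottom :: Int -> IO Bool
isBottom x = do
  r <- try (evaluate x) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)

-- (+) is strict: (undefined + 1) is bottom, so isBottom reports True.
-- const, by contrast, is lazy (non-strict) in its second argument,
-- so (const 1 undefined) evaluates to 1 without touching undefined.
```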
Haskell is a non-strict language, and most implementations use a strategy called laziness to run your program.
Haskell computations produce a lot of memory garbage - much more than conventional imperative languages. This is because data is immutable, so the only way to store each operation's result is to create a new value. In particular, every iteration of a recursive computation creates new values.
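A minimal illustration of that immutability, using containers (m0 and m1 are made-up names): an insert never modifies the original structure, it produces a new one, and the superseded parts become garbage once nothing references them.

```haskell
import qualified Data.Map.Strict as M

-- Immutable update: M.insert returns a brand-new map value;
-- the original m0 is untouched.
m0, m1 :: M.Map String Int
m0 = M.fromList [("a", 1)]
m1 = M.insert "b" 2 m0
```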
The Haskell runtime system employs a generational garbage collector (GC) with two generations by default. Generations are numbered starting with the youngest generation at zero.
I'm far from an expert at this, but I do see one small improvement. In your source I see this:
mutualInfo n ... = foldl' (mutualInfoInnerLoop n $ U.zip xs ys) ...
You don't need to check n == 0 every single time the function is invoked, since you never change the n argument when you invoke it. The xys argument doesn't change either. (Note, though, that pxy still depends on i and j from the tuple argument, so it has to be recomputed for each element.) Let's take advantage of this to make sure a closure is created which evaluates the n == 0 test only once.
mutualInfoInnerLoop n xys
    | n == 0 = const
    | otherwise = go
    where
        go !acc (!i, !j, !px_py)
            | px_py == 0 || pxy == 0 = acc
            | otherwise = pxy * logBase 2 ( pxy / px_py ) + acc
            where
                pxy = ( fromIntegral . U.foldl' accumEq2 0 $ xys ) / n
                accumEq2 :: Int -> (Int, Int) -> Int
                accumEq2 !acc' (!i', !j')
                    | i' == i && j' == j = acc' + 1
                    | otherwise = acc'
I'm not sure if GHC is smart enough to perform this optimization on its own, nor am I sure this saves much time/space, but it's the best I've got. With those bang patterns sprinkled all over, I wonder if this is a case of too much strictness.
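The hoisting idea itself is easy to check in isolation. Here is a sketch of the same pattern on plain lists (scaledAdd and total are made-up names, not the author's code): partially applying the function evaluates the guard once, and the returned closure is then reused for every element of the fold.

```haskell
import Data.List (foldl')

-- The n == 0 test runs once, when 'scaledAdd n' is forced to a
-- function; the fold then reuses the resulting closure.
scaledAdd :: Double -> (Double -> Double -> Double)
scaledAdd n
  | n == 0    = \acc _ -> acc
  | otherwise = go
  where
    go acc x = acc + x / n

total :: Double
total = foldl' (scaledAdd 2) 0 [1, 2, 3]  -- (1 + 2 + 3) / 2
```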