Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Changing a single record field to be strict leads to worse performance

Tags:

haskell

I have a program that uses haskell-src-exts, and to improve performance I decided to make some record fields strict. This resulted in much worse performance.

Here's the complete module that I'm changing:

{-# LANGUAGE DeriveDataTypeable, BangPatterns #-}
module Cortex.Hackage.HaskellSrcExts.Language.Haskell.Exts.SrcSpan(
    SrcSpan, srcSpan, srcSpanFilename, srcSpanStartLine,
    srcSpanStartColumn, srcSpanEndLine, srcSpanEndColumn,
    ) where

import Control.DeepSeq
import Data.Data

data SrcSpan = SrcSpanX
    { srcSpanFilename    :: String
    , srcSpanStartLine   :: Int
    , srcSpanStartColumn :: Int
    , srcSpanEndLine     :: Int
    , srcSpanEndColumn   :: Int
    }
  deriving (Eq,Ord,Show,Typeable,Data)

srcSpan :: String -> Int -> Int -> Int -> Int -> SrcSpan
srcSpan fn !sl !sc !el !ec = SrcSpanX fn sl sc el ec

instance NFData SrcSpan where
    rnf (SrcSpanX x1 x2 x3 x4 x5) = rnf x1

Note that the only way to construct a SrcSpan is by using the srcSpan function which is strict in all the Ints.

With this code my program (sorry, I can't share it) runs in 163s.

Now change a single line, e.g.,

    , srcSpanStartLine   :: !Int

I.e., the srcSpanStartLine field is now marked as strict. My program now takes 198s to run. So making that one field strict increases the running time by about 20%.

How is this possible? The code for the srcSpan function should be the same regardless since it is already strict. The code for the srcSpanStartLine selector should be a bit simpler since it no longer has to evaluate.

I've experimented with -funbox-strict-fields and -funbox-small-strict-field on and off. It doesn't make any noticeable difference. I'm using ghc 7.8.3.

Has anyone seen something similar? Any bright ideas what might cause it?

like image 333
augustss Avatar asked Aug 01 '14 22:08

augustss


1 Answers

With some more investigation I can answer my own question. The short answer is uniplate.

Slightly longer answer. In one place I used uniplate to get the children of a Pat (haskell-src-exts type for patterns). The call looked like children p and the type of this instance of children was Pat SrcSpanInfo -> [Pat SrcSpanInfo]. So it's doing no recursion, just returning the immediate children of a node.

Uniplate uses two very different methods depending on if there are strict fields in the type your operating on. Without strict fields it reasonable fast, with strict fields it switches to using gfoldl and is incredibly slow. And even though my use of uniplate didn't directly involve a strict field, it slowed down.

Conclusion: Beware uniplate if you have a strict field anywhere in sight!

like image 135
augustss Avatar answered Nov 17 '22 12:11

augustss