
How long can the name of a type constructor be?


Is there a limit to how long a constructor name can be? What are the consequences of having absurdly long constructor names?

data
J. Abrahamson asked Jun 20 '14 21:06


1 Answer

If we check the GHC source, we can find the type used to represent data constructors. It is named DataCon, and it has the following field:

dcName    :: Name,  -- This is the name of the *source data con* 

Going down the rabbit hole, Name contains an OccName:

n_occ  :: !OccName,     -- Its occurrence name 

An OccName contains a FastString for the name:

data OccName = OccName
    { occNameSpace :: !NameSpace
    , occNameFS :: !FastString
    }
    deriving Typeable

Finally, a FastString is just a ByteString together with a precalculated length and an Int that tags it for quick comparison:

data FastString = FastString {
      uniq :: {-# UNPACK #-} !Int, -- unique id
      n_chars :: {-# UNPACK #-} !Int, -- number of chars
      fs_bs :: {-# UNPACK #-} !ByteString,
      fs_ref :: {-# UNPACK #-} !(IORef (Maybe FastZString))
  } deriving Typeable

Nothing in this data type limits the length of the string, other than maxBound :: Int for the length field. However, that doesn't rule out a bug elsewhere in the compiler that could cause problems with very long names.
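To make the point about the length field concrete, here is a minimal sketch. MiniFastString, miniChars, miniBytes and mkMini are hypothetical names for a simplified stand-in, not GHC's own definitions; the sketch only shows that a strict Int count stored next to a ByteString imposes no limit of its own below maxBound :: Int. It assumes the bytestring package that ships with GHC.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B

-- Hypothetical, simplified stand-in for GHC's FastString:
-- a precomputed character count plus the raw bytes of the name.
data MiniFastString = MiniFastString
  { miniChars :: !Int          -- number of characters, bounded only by maxBound :: Int
  , miniBytes :: !B.ByteString -- the name itself
  }

-- Build the stand-in from an ASCII name.
-- (Char8.pack truncates characters above 255, so this is ASCII/Latin-1 only.)
mkMini :: String -> MiniFastString
mkMini name = MiniFastString (length name) (B.pack name)

main :: IO ()
main = do
  let fs = mkMini (replicate 1000000 'F')
  putStrLn $ "stored length: " ++ show (miniChars fs)
  putStrLn $ "byte length:   " ++ show (B.length (miniBytes fs))
  putStrLn $ "upper bound:   " ++ show (maxBound :: Int)
```

In practice available memory runs out long before the Int bound is reached, which is why an empirical test is still worthwhile.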

So we need a program to test this:

{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Control.Applicative ((<$>))
import Control.Monad (forM_)
import System.IO (hPutStr, hFileSize, hClose)
import System.Exit (ExitCode(..))
import System.IO.Temp (withSystemTempFile)
import Data.Time.Clock.POSIX (getPOSIXTime)
import System.Process (readProcessWithExitCode)

-- timing functions (from criterion)
getTime :: IO Double
getTime = (fromRational . toRational) `fmap` getPOSIXTime

time :: IO a -> IO (Double, a)
time act = do
  start <- getTime
  result <- act
  end <- getTime
  let !delta = end - start
  return (delta, result)

-- make a constructor like
-- data C = FFFFFF
makeConstructor :: Int -> String
makeConstructor size = "data C = " ++ replicate size 'F'

wrapWithMainModule :: String -> String
wrapWithMainModule code = unlines ["module Main where", "main = return ()", code]

data CompileResults = CompileResults {
  timeTaken :: Double,
  success :: Bool,
  outputFileSize :: Integer
  } deriving (Show)

compileHsCode :: String -> IO CompileResults
compileHsCode sourceCode = withSystemTempFile "test.hs" $ \path handle -> do
  withSystemTempFile "output.o" $ \outputPath outputHandle -> do
    hPutStr handle $ wrapWithMainModule sourceCode
    hClose handle
    (timeTaken, (exitCode, _, _)) <- time $ readProcessWithExitCode "ghc" ["-c", "-o", outputPath, path] ""
    let success = exitCode == ExitSuccess
    size <- if success then hFileSize outputHandle else return 0
    return $ CompileResults {
      timeTaken = timeTaken
      , success = success
      , outputFileSize = size
      }

testConstructorSizes :: [Int] -> IO ()
testConstructorSizes sizes = forM_ sizes $ \size -> do
  info <- compileHsCode $ makeConstructor size
  putStrLn $ "For Size " ++ show size ++ "\t: " ++ show info

-- Up to 10 million
sizesToTest :: [Int]
sizesToTest = take 7 (iterate (*10) 10)

main = testConstructorSizes $ sizesToTest

Here is the output of running main:

For Size 10       : CompileResults {timeTaken = 0.1390078067779541, success = True, outputFileSize = 1818}
For Size 100      : CompileResults {timeTaken = 0.14700841903686523, success = True, outputFileSize = 2086}
For Size 1000     : CompileResults {timeTaken = 0.1390080451965332, success = True, outputFileSize = 4786}
For Size 10000    : CompileResults {timeTaken = 0.1520085334777832, success = True, outputFileSize = 31786}
For Size 100000   : CompileResults {timeTaken = 0.31201791763305664, success = True, outputFileSize = 301786}
For Size 1000000  : CompileResults {timeTaken = 2.26712965965271, success = True, outputFileSize = 3001786}
For Size 10000000 : CompileResults {timeTaken = 109.2182469367981, success = True, outputFileSize = 30001786}

A few interesting points:

  1. Note how the time taken increases massively beyond 1 million characters. You would expect an increase of x10 if the growth were linear, but going from 1 million to 10 million is roughly x50 (2.27 s to 109.2 s). If that trend continued, a 100 million character constructor would take about 5000 seconds to compile (which I didn't test).
  2. The file size for every entry from size 100 upward was exactly 1786 + (constructorSize * 3), so each char takes up three bytes when used in a constructor. The size 10 entry is 1818 bytes, 2 bytes more than the 1816 the formula gives.
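The arithmetic behind both points can be checked with a short sketch. The numbers are copied from the output above; results, predictedSize, sizeResiduals and timeRatios are names introduced here purely for illustration.

```haskell
module Main where

import Control.Monad (forM_)

-- (constructor length, compile time in seconds, object file size in bytes),
-- copied from the measured output above.
results :: [(Integer, Double, Integer)]
results =
  [ (10,       0.1390078067779541,  1818)
  , (100,      0.14700841903686523, 2086)
  , (1000,     0.1390080451965332,  4786)
  , (10000,    0.1520085334777832,  31786)
  , (100000,   0.31201791763305664, 301786)
  , (1000000,  2.26712965965271,    3001786)
  , (10000000, 109.2182469367981,   30001786)
  ]

-- The size formula from point 2: a fixed overhead plus three bytes per character.
predictedSize :: Integer -> Integer
predictedSize n = 1786 + 3 * n

-- Difference between the measured size and the formula, per constructor length.
sizeResiduals :: [(Integer, Integer)]
sizeResiduals = [ (n, actual - predictedSize n) | (n, _, actual) <- results ]

-- Compile-time ratio between each run and the previous (10x smaller) one.
timeRatios :: [(Integer, Double)]
timeRatios =
  [ (n2, t2 / t1) | ((_, t1, _), (n2, t2, _)) <- zip results (drop 1 results) ]

main :: IO ()
main = do
  putStrLn "size residuals (actual - predicted):"
  forM_ sizeResiduals $ \(n, d) -> putStrLn ("  " ++ show n ++ "\t" ++ show d)
  putStrLn "time ratio per x10 step:"
  forM_ timeRatios $ \(n, r) -> putStrLn ("  " ++ show n ++ "\t" ++ show r)
```

The residual is 2 bytes for size 10 and 0 for every larger size, and the last time ratio comes out at roughly 48, which is the "x50" jump noted in point 1.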
David Miani answered Oct 07 '22 14:10