I have 8 GB of RAM, but Haskell programs seemingly can only use 1.3 GB.
I'm using this simple program to determine how much memory a GHC program can allocate:
import System.Environment
import Data.Set as Set
main = do
  args <- getArgs
  let n = (read $ args !! 0) :: Int
      s = Set.fromList [0..n]
  do
    putStrLn $ "min: " ++ (show $ findMin s)
    putStrLn $ "max: " ++ (show $ findMax s)
Here's what I'm finding:
./mem.exe 40000000 +RTS -s
succeeds and reports 1113 MB total memory in use
./mem.exe 42000000 +RTS -s
fails with out of memory error
./mem.exe 42000000 +RTS -s -M4G
errors out with -M4G: size outside allowed range
./mem.exe 42000000 +RTS -s -M3.9G
fails with out of memory error
Monitoring the process via the Windows Task Manager shows that the max memory usage is about 1.2 GB.
My system: Win7, 8 GB RAM, Haskell Platform 2011.04.0.0, ghc 7.0.4.
I'm compiling with: ghc -O2 mem.hs -rtsopts
How can I make use of all of my available RAM? Am I missing something obvious?
Haskell's laziness can be a nice way of using very low total memory. Laziness means that an expression is evaluated only when the result is needed.
Haskell computations produce a lot of memory garbage, much more than conventional imperative languages. Because data is immutable, the only way to store each operation's result is to create a new value. In particular, every iteration of a recursive computation creates a new value.
Currently, GHC on Windows is a 32-bit GHC. I think a 64-bit GHC for Windows is supposed to become available with 7.6.
One consequence is that on Windows you can't use more than 4G - 1BLOCK of memory, since the maximum allowed as a size parameter is HS_WORD_MAX:
decodeSize(rts_argv[arg], 2, BLOCK_SIZE, HS_WORD_MAX) / BLOCK_SIZE;
With 32-bit Words, HS_WORD_MAX = 2^32-1.
That explains "running ./mem.exe 42000000 +RTS -s -M4G errors out with -M4G: size outside allowed range", since decodeSize() decodes 4G as 2^32.
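The range check can be illustrated with a small sketch. This is a simplified model written for this answer, not the RTS source: the names wordMax32, sizeInBytes and fitsIn32BitWord are hypothetical, and the only assumption taken from the text is that a size above HS_WORD_MAX is rejected.

```haskell
import Data.Word (Word32)

-- Largest value of a 32-bit machine word: 2^32 - 1.
wordMax32 :: Integer
wordMax32 = fromIntegral (maxBound :: Word32)

-- Hypothetical helper: turn a size such as "4G" or "3.9G" into bytes.
-- Only the upper-case suffixes K, M and G are handled in this sketch.
sizeInBytes :: String -> Maybe Integer
sizeInBytes str =
  case reads str :: [(Double, String)] of
    [(x, [suffix])] -> do
      mult <- lookup suffix [('K', 2 ^ 10), ('M', 2 ^ 20), ('G', 2 ^ 30)]
      return (truncate (x * mult))
    [(x, "")]       -> Just (truncate x)
    _               -> Nothing

-- Model of the range check: the size must not exceed the word maximum.
fitsIn32BitWord :: String -> Bool
fitsIn32BitWord str = maybe False (<= wordMax32) (sizeInBytes str)

main :: IO ()
main = mapM_ report ["4G", "3.9G", "1800M"]
  where
    report s = putStrLn (s ++ ": " ++ show (sizeInBytes s)
                           ++ ", fits: " ++ show (fitsIn32BitWord s))
```

In this model 4G comes out as 2^32, one more than the 32-bit word maximum, so it is rejected, while 3.9G and 1800M pass the check.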
This limitation will remain even after upgrading your GHC, until a 64-bit GHC for Windows is finally released.
For a 32-bit process, the user-mode virtual address space is limited to 2 or 4 GB (depending on the status of the IMAGE_FILE_LARGE_ADDRESS_AWARE flag), cf. Memory limits for Windows Releases.
Now, you are trying to construct a Set containing 42 million 4-byte Ints. A Data.Set.Set has five words of overhead per element (constructor, size, left and right subtree pointer, pointer to element), so the Set will take up about 0.94 GiB of memory (1.008 'metric' GB). But the process uses about twice that or more (it needs space for the garbage collection, at least the size of the live heap).
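The arithmetic behind that estimate can be reproduced with a short sketch. It assumes six words per element (the five words of overhead plus one word for the Int itself), which is the figure that yields the 1.008 GB quoted above; the function names are made up for illustration, and the real footprint depends on the GHC version and heap layout.

```haskell
-- Words per Set element as counted in the text: five words of overhead
-- plus one word for the Int itself (an assumption of this sketch).
wordsPerElement :: Integer
wordsPerElement = 6

-- Estimated live size in bytes for n elements at a given word size in bytes.
estimateBytes :: Integer -> Integer -> Integer
estimateBytes wordSize n = n * wordsPerElement * wordSize

-- Convert a byte count to GiB for display.
toGiB :: Integer -> Double
toGiB bytes = fromIntegral bytes / 2 ^ (30 :: Int)

main :: IO ()
main = do
  let live     = estimateBytes 4 42000000  -- 32-bit words, 42 million elements
      withCopy = 2 * live                  -- a copying GC needs roughly a second copy
  putStrLn ("live Set:     " ++ show live ++ " bytes, "
              ++ show (toGiB live) ++ " GiB")
  putStrLn ("with GC copy: " ++ show withCopy ++ " bytes, "
              ++ show (toGiB withCopy) ++ " GiB")
```

Under these assumptions the live Set is 1,008,000,000 bytes, and doubling it for the collector gives roughly 1.9 GiB, which leaves very little room in a 2 GB address space.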
Running the programme on my 64-bit Linux, with input 21000000 (to make up for the twice as large Ints and pointers), I get
$ ./mem +RTS -s -RTS 21000000
min: 0
max: 21000000
31,330,814,200 bytes allocated in the heap
4,708,535,032 bytes copied during GC
1,157,426,280 bytes maximum residency (12 sample(s))
13,669,312 bytes maximum slop
2261 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 59971 colls, 0 par 2.73s 2.73s 0.0000s 0.0003s
Gen 1 12 colls, 0 par 3.31s 10.38s 0.8654s 8.8131s
INIT time 0.00s ( 0.00s elapsed)
MUT time 12.12s ( 13.33s elapsed)
GC time 6.03s ( 13.12s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 18.15s ( 26.45s elapsed)
%GC time 33.2% (49.6% elapsed)
Alloc rate 2,584,429,494 bytes per MUT second
Productivity 66.8% of total user, 45.8% of total elapsed
but top reports only 1.1g of memory use. top, and presumably the Task Manager, reports only live heap.
So it seems IMAGE_FILE_LARGE_ADDRESS_AWARE is not set, your process is limited to an address space of 2GB, and the Set of 42 million elements needs more than that, unless you specify a maximum or suggested heap size that is smaller:
$ ./mem +RTS -s -M1800M -RTS 21000000
min: 0
max: 21000000
31,330,814,200 bytes allocated in the heap
3,551,201,872 bytes copied during GC
1,157,426,280 bytes maximum residency (12 sample(s))
13,669,312 bytes maximum slop
1154 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 59971 colls, 0 par 2.70s 2.70s 0.0000s 0.0002s
Gen 1 12 colls, 0 par 4.23s 4.85s 0.4043s 3.3144s
INIT time 0.00s ( 0.00s elapsed)
MUT time 11.99s ( 12.00s elapsed)
GC time 6.93s ( 7.55s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 18.93s ( 19.56s elapsed)
%GC time 36.6% (38.6% elapsed)
Alloc rate 2,611,793,025 bytes per MUT second
Productivity 63.4% of total user, 61.3% of total elapsed
Setting the maximal heap size below what the programme would use naturally actually lets it fit in hardly more than the space needed for the Set, at the price of a slightly longer GC time, and suggesting a heap size with -H1800M lets it finish using only
1831 MB total memory in use (0 MB lost due to fragmentation)
So if you specify a maximal heap size below 2GB (but large enough for the Set to fit), it should work.