I type this code into the interpreter and memory is rapidly consumed:
last [1..10^7] `seq` ()
I can't see why this needs more than O(1) space. If i do just (which should be the same, because Show forces weak head normal form, so seq is redundant?):
last [1..10^7]
...it works fine.
I'm unable to reproduce this situation outside the interpreter.
What's going on here?
Here are some test cases: http://hpaste.org/69234
Things to note:
wtf<n>
in ghci.ghc --make wtf.hs && ./wtf
.last
can be substituted for a sum
with strict accumulator or a function that finds the max element in the list, and the space leak still happens$!
instead of seq
.()
parameter because I thought maybe this is a CAF problem. Changes nothing.Enum
, because I can reproduce the behaviour with wtf5
and later, which don't use Enum
at all.Num
, Int
, or Integer
, because I can reproduce the behaviour without them in wtf14
and wtf16
.I tried reproducing the problem with Peano arithmetic to take lists and integers out of the equation (fetching the at the end of of 10^9), but ran into other sharing/space leak problems I don't understand when trying to implement *
.
We need to look at the instance of enumFromTo
for Integer, and last:
last [] = errorEmptyList "last"
last (x:xs) = last' x xs
where last' y [] = y
last' _ (y:ys) = last' y ys
It is defined in GHC.Enum as:
enumFrom x = enumDeltaInteger x 1
enumFromThen x y = enumDeltaInteger x (y-x)
enumFromTo x lim = enumDeltaToInteger x 1 lim
where
enumDeltaInteger :: Integer -> Integer -> [Integer]
enumDeltaInteger x d = x `seq` (x : enumDeltaInteger (x+d) d)
-- strict accumulator, so
-- head (drop 1000000 [1 .. ]
-- works
and
enumDeltaToInteger :: Integer -> Integer -> Integer -> [Integer]
enumDeltaToInteger x delta lim
| delta >= 0 = up_list x delta lim
| otherwise = dn_list x delta lim
up_list :: Integer -> Integer -> Integer -> [Integer]
up_list x0 delta lim = go (x0 :: Integer)
where
go x | x > lim = []
| otherwise = x : go (x+delta)
last
is fully lazy, as expected.
For the Integer Enum class, we have a strict accumulator (explicitly) for enumFrom
. In the bounded case (e.g. [1..n]
), it calls enumDeltaToInteger
and then into up_list
, which uses a worker to unfold a list until its limit is reached.
But up_list
is strict in x
in the go
helper (see the comparison against lim
).
When run in GHCi none of this is optimized, yielding naive calls to enumFromTo, before returning ()
.
let
it_ax6 :: ()
it_ax6 =
case last
@ GHC.Integer.Type.Integer
(GHC.Enum.enumFromTo
@ GHC.Integer.Type.Integer
GHC.Num.$fEnumInteger
(GHC.Integer.smallInteger 1)
(GHC.Real.^
@ GHC.Integer.Type.Integer
@ GHC.Integer.Type.Integer
GHC.Num.$fNumInteger
GHC.Real.$fIntegralInteger
(GHC.Integer.smallInteger 10)
(GHC.Integer.smallInteger 7)))
of _ -> GHC.Unit.()
in
GHC.Base.thenIO
@ ()
@ [()]
(System.IO.print @ () GHC.Show.$fShow() it_ax6)
(GHC.Base.returnIO
@ [()] (GHC.Types.: @ () it_ax6 (GHC.Types.[] @ ())))
So, why are we retaining the list in the seq
case, and not in the regular case? The regular case runs nicely in constrant space, relying on the laziness of enumFromTo
for Integer
and for last
. The GHCi core for that case looks like:
let {
it_aKj :: GHC.Integer.Type.Integer
[LclId,
Unf=Unf{Src=<vanilla>, TopLvl=False, Arity=0, Value=False,
ConLike=False, Cheap=False, Expandable=False,
Guidance=IF_ARGS [] 170 0}]
it_aKj =
GHC.List.last
@ GHC.Integer.Type.Integer
(GHC.Enum.enumFromTo
@ GHC.Integer.Type.Integer
GHC.Num.$fEnumInteger
(GHC.Integer.smallInteger 1)
(GHC.Real.^
@ GHC.Integer.Type.Integer
@ GHC.Integer.Type.Integer
GHC.Num.$fNumInteger
GHC.Real.$fIntegralInteger
(GHC.Integer.smallInteger 10)
(GHC.Integer.smallInteger 7))) } in
GHC.Base.thenIO
@ ()
@ [()]
(System.IO.print
@ GHC.Integer.Type.Integer GHC.Num.$fShowInteger it_aKj)
(GHC.Base.returnIO
@ [()]
(GHC.Types.:
@ ()
(it_aKj
`cast` (UnsafeCo GHC.Integer.Type.Integer ()
:: GHC.Integer.Type.Integer ~ ()))
(GHC.Types.[] @ ())))
So these are almost identical, with the differences being:
seq
version, last (enumFromTo ..)
is forced inside a case
.let
.seq
version, the value is computed then discarded, yielding a ()
-- nothing looks at the resultWhat is odd is that there's nothing magic about:
let x = case last (enumFromTo 1 n) of _ -> ()
that makes it retain values.
As we see, the implementation of up_list
is strict in its accumulator (since it compares against lim
, and the list is unfolded lazily -- so last
should be able to consume it in constant space). Writing the expression by hand confirms this.
Doing a heap profile of the ghci execution shows the entire list being retained:
which tells us at least that it isn't a chain of thunks, but rather, the entire list is being built strictly and held on to, until being discarded.
So the mystery is: what is holding onto the list argument to last
in ghci, and not in ghc?
I suspect some internal (or subtle) detail of ghci now -- I think this is worth a ghci ticket.
I think @n.m is right. Nothing is forcing the value in the list, so the 1+1+1+1+... thunk eventually kills space.
I will whip up a quick test.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With