Scala has an amazing way of converting a collection into another collection using map
construct.
val l = List(1, 2, 3, 4)
l.map(_*_)
will return the squares of the elements in list l
I come across various instances where multiple maps are chained together say,
val l = List(1, 2, 3, 4)
val res = l.map(_ * _).map(_ + 1).filter(_ < 3)
What i believe happens underneath is equivalent to something below.
val l = List(1, 2, 3, 4)
val l1 = l.map(_*_)
val l2 = l1.map(_ + 1)
val res = l2.filter(_ < 3)
creating l1
and l2
might cause memory issues if the collection is too big.
To tackle this problem, does Scala compiler have any optimizations?
val l = List(1, 2, 3, 4)
val res = l1.map( _*_ + 1).filter(_ < 3)
in general if f
, g
, h
are functions
val l = List(/*something*/)
val res = l.map(f(_)).map(g(_)).map(h(_))
can be converted into
val res = l.map(f _ andThen g _ andThen h _)
Scala offers Stream
, which is a lazy ordered collection.
val s = Stream(1, 2, 3, 4)
// note i've changed your sequence of transformations
// a bit, so that it compiles and yields more than one result
val res = s.map(i => i * i).map(_ + 1).filter(_ < 11)
res
is now a Stream
. No actual evaluation has been performed yet, no blocks of memory related to the size of s
have been used.
If you intend to use the elements of res
one at a time, no more work is required. You can use res
in a for statement or comprehension directly, for example.
for ( elem <- res ) println( s"A value is ${elem}" )
If you want res
as a List
, you can just call .toList
at the end of the sequence of transformations. Instead of the above, use
val res = s.map(i => i * i).map(_ + 1).filter(_ < 11).toList
s
will only be traversed once in creating the new List
.
No, because this would require the compiler to know about the semantics of map
and treat the standard library classes which implement it specially (since nobody stops you from writing a class where this doesn't hold). There is a research proposal which might end up implementing this... eventually.
There is also Scala-Blitz which optimizes some collection operations, but fusion and deforestation are listed as future work in this presentation and I don't think they are implemented yet.
As Steve Waldman's answer says, using Stream
(or, better yet, Iterator
) can help, but it won't eliminate the intermediate collections completely.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With