I'm developing a log analysis tool in Kotlin. The volume of incoming logs is too large to load into memory all at once, so I need to process them in a "pipeline" manner. I found two things that disappointed me:
1. Collection functions (filter, map and so on) are not lazy. E.g. I have 1 GB of logs and want to get the lengths of the first ten lines that match a given regexp. If I write it as is, filtering and transforming will be applied to the whole gigabyte of strings in memory.
2. I can't call l.stream(), where l is defined as val l = ArrayList<String>(). The compiler says: "Unresolved reference: stream".

So the questions are: are you going to make collection functions lazy? And why can't I access the stream() method?
Kotlin does not use Java 8 Streams; instead, it provides the lazy Sequence<T>. Its API is mostly unified with that of Iterable<T>, so you can learn more about its usage here.
Sequence<T> is similar to Stream<T>, but it offers more when it comes to sequential data (e.g. takeWhile), though it has no parallel operations support at the moment*.
Another reason for introducing a replacement for Stream API is that Kotlin targets Java 6, which has no Streams, so they were dropped from Kotlin stdlib in favor of Sequence<T>.
A Sequence<T> can be created from an Iterable<T> (which every Collection<T> is) with the asSequence() method:
val l = ArrayList<String>()
val sequence = l.asSequence()
This is equivalent to .stream() in Java and will let you process a collection lazily. Otherwise, transformations are eagerly applied to a collection.
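As a minimal sketch of the use case from the question (lengths of the first ten lines matching a regexp), the pipeline below works on a Sequence<String>. The function name firstMatchingLengths and its parameters are hypothetical and chosen only for illustration. Each element is pulled through filter, map and take one at a time, and the pipeline stops as soon as the limit is reached.

```kotlin
// Hypothetical helper: lengths of the first `limit` lines that contain a match for `pattern`.
fun firstMatchingLengths(
    lines: Sequence<String>,
    pattern: Regex,
    limit: Int = 10
): List<Int> =
    lines
        .filter { pattern.containsMatchIn(it) } // intermediate, lazy
        .map { it.length }                       // intermediate, lazy
        .take(limit)                             // intermediate, lazy
        .toList()                                // terminal: starts pulling elements

fun main() {
    // An infinite source: this only terminates because evaluation is lazy.
    val lines = generateSequence(1) { it + 1 }.map { "line $it" }
    println(firstMatchingLengths(lines, Regex("0$"), limit = 2))
}
```

For a real log file, one option would be File.useLines from kotlin.io, which passes the file's lines to a block as a Sequence<String> and closes the file afterwards, for example File(path).useLines { firstMatchingLengths(it, pattern) }, where path and pattern are placeholders.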
* If you need parallel operations, the workaround is to fall back to Java 8 Streams:
(collection as java.lang.Collection<T>).parallelStream()
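A rough sketch of what that fallback could look like in practice is shown below. It assumes the project targets JVM 8 or later, where parallelStream() can be called on a Kotlin collection directly without the cast. The function name parallelLengths is hypothetical.

```kotlin
import java.util.stream.Collectors

// Hypothetical helper: computes line lengths using a Java 8 parallel stream.
// Assumes a JVM 8+ target; this is an illustration, not a benchmarked recommendation.
fun parallelLengths(lines: List<String>): List<Int> =
    lines.parallelStream()
        .map { it.length }
        .collect(Collectors.toList()) // keeps encounter order for an ordered source such as a List

fun main() {
    println(parallelLengths(listOf("a", "bb", "ccc")))
}
```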