There is a wealth of information and tutorials online regarding streams in Java 8. Most of what I have found does a good job of explaining the how the various elements of a stream work on a conceptual level. However, I have not encountered much material which describes how streams are actually implemented and executed by the JVM under the hood.
Consider comparing an operation on a Collection
between using a stream and doing it the old school pre-Java 8 way. Would the underlying Bytecodes look the same between the two approaches? Would the performance be the same?
To make this concrete, consider the following example, where I have a need to find all fish whose name contains the word "fish", and then capitalize the first letter of each matching fish. (Yes, I know that Hagfish isn't really a fish, but I ran out of matching fish names.)
List<String> fishList = Arrays.asList("catfish", "hagfish", "salmon", "tuna", "blowfish");
// Pre Java-8 solution
List<String> hasFishList = new ArrayList<String>();
for (String fish : fishList) {
if (fish.contains("fish")) {
String fishCap = fish.substring(0, 1).toUpperCase() + fish.substring(1);
hasFishList.add(fishCap);
}
}
// Java-8 solution using streams
List<String> hasFishList = fishList.stream()
.filter(f -> f.contains("fish"))
.map(f -> f.substring(0, 1).toUpperCase() + f.substring(1))
.collect(Collectors.toList());
Any insight you might have into how these two approaches might differ under the hood, at the Bytecode level, would be great. And some actual byte code would be even better.
Introduced in Java 8, the Stream API is used to process collections of objects. A stream is a sequence of objects that supports various methods which can be pipelined to produce the desired result. A stream is not a data structure instead it takes input from the Collections, Arrays or I/O channels.
Java 8 offers the possibility to create streams out of three primitive types: int, long and double. As Stream<T> is a generic interface, and there is no way to use primitives as a type parameter with generics, three new special interfaces were created: IntStream, LongStream, DoubleStream.
In Java8 Streams, performance is achieved by parallelism, laziness, and using short-circuit operations, but there is a downside as well, and we need to be very cautious while choosing Streams, as it may degrade the performance of your application. Let us look at these factors which are meant for Streams' performance.
However, Java 8 provides us many functional interfaces by default for different use cases under the package java. util. function. Many of these functional interfaces provide support for function composition in terms of default and static methods.
The answer has grown quite a bit over time so I will start with a summary:
invokeinterface
instructions - our 2 lambdas and 4 accept()
calls on sinks).invokedynamic
instruction. Rather than creating a new object, it gives a prescription how to create the lambda at runtime. There is nothing special about calling the lambda methods on the created lambda objects afterwards (invokeinterface
instruction).filter()
and map()
wrap their operations in anonymous subclasses of StatelessOp
which in turn extend ReferencePipeline
, AbstractPipeline
and ultimately BaseStream
. The actual evaluation is done when executing collect()
.Spliterator
rather then Iterator
. Notice the many forks checking isParallel()
- the parallel branches would leverage Spliterator
's methods.The pseudocode below captures a trace through the execution of the version using streams. See the bottom of this post for explanation how to read the trace.
Stream stream1 = fishList.stream();
// Collection#stream():
Spliterator spliterator = fishList.spliterator();
return Spliterators.spliterator(fishList.a, 0);
return new ArraySpliterator(fishList, 0);
return StreamSupport.stream(spliterator, false)
return new ReferencePipeline.Head(spliterator, StreamOpFlag.fromCharacteristics(spliterator), false)
Predicate fishPredicate = /* new lambda f -> f.contains("fish") */
Stream stream2 = stream1.filter(fishPredicate);
return new StatelessOp(this, StreamShape.REFERENCE, StreamOpFlag.NOT_SIZED) { /* ... */ }
Function fishFunction = /* new lambda f.substring(0, 1).toUpperCase() + f.substring(1) */
Stream stream3 = stream2.map(fishFunction);
return new StatelessOp(this, StreamShape.REFERENCE, StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT) { /* ... */ }
Collector collector = Collectors.toList();
Supplier supplier = /* new lambda */
BiConsumer accumulator = /* new lambda */
BinaryOperator combiner = /* new lambda */
return new CollectorImpl<>(supplier, accumulator, combiner, CH_ID);
List hasFishList = stream3.collect(collector)
// ReferencePipeline#StatelessOp#collect(Collector):
List container;
if (stream3.isParallel() && /* not executed */) { /* not executed */ }
else {
/*>*/TerminalOp terminalOp = ReduceOps.makeRef(collector)
Supplier supplier = Objects.requireNonNull(collector).supplier();
BiConsumer accumulator = collector.accumulator();
BinaryOperator combiner = collector.combiner();
return new ReduceOp(StreamShape.REFERENCE) { /* ... */ }
/*>*/container = stream3.evaluate(terminalOp);
// AbstractPipeline#evaluate(TerminalOp):
if (linkedOrConsumed) { /* not executed */ }
linkedOrConsumed = true;
if (isParallel()) { /* not executed */ }
else {
/*>*/Spliterator spliterator2 = sourceSpliterator(terminalOp.getOpFlags())
// AbstractPipeline#sourceSpliterator(int):
if (sourceStage.sourceSpliterator != null) { /* not executed */ }
/* ... */
if (isParallel()) { /* not executed */ }
return spliterator;
/*>*/terminalOp.evaluateSequential(stream3, spliterator2);
// ReduceOps#ReduceOp#evaluateSequential(PipelineHelper, Spliterator):
ReducingSink sink = terminalOp.makeSink()
return new ReducingSink()
Sink sink = terminalOp.wrapAndCopyInto(sink, spliterator)
Sink wrappedSink = wrapSink(sink)
// AbstractPipeline#wrapSink(Sink)
for (/* executed twice */) { p.opWrapSink(p.previousStage.combinedFlags, sink) }
return new Sink.ChainedReference(sink)
terminalOp.copyInto(wrappedSink, spliterator);
// AbstractPipeline#copyInto()
if (!StreamOpFlag.SHORT_CIRCUIT.isKnown(getStreamAndOpFlags())) {
/*>*/wrappedSink.begin(spliterator.getExactSizeIfKnown());
/*>*/ /* not important */
/*>*/supplier.get() // initializes ArrayList
/*>*/spliterator.forEachRemaining(wrappedSink)
// Spliterators#ArraySpliterator#foreachRemaining(Consumer):
// ... unimportant code
!! do {
/*>*/action.accept((String)a[i])
} while (++i < hi) // for each fish :)
/*>*/wrappedSink.end() // no-op
} else { /* not executed */}
return sink;
return sink.get()
}
/*>*/if (collector.characteristics().contains(Collector.Characteristics.IDENTITY_FINISH)) { return container; }
/*>*/else { /* not executed */ }
The exclamation marks point to the actual workhorse: a do-while loop in fishList
's Spliterator
. Here is more detailed trace of the do-while loop:
do {
/*>*/action.accept((String)a[i])
if (predicate.test(u)) { downstream.accept(u); } // predicate is our fishPredicate
downstream.accept(mapper.apply(u)); // mapper is our fishFunction
accumulator.accept(u)
// calls add(u) on resulting ArrayList
} while (++i < hi) // for each fish :)
Let's look at how the relevant parts of the executed code look like in the bytecode. The interesting part is how
fishList.stream().filter(f -> f.contains("fish")).map(f -> f.substring(0, 1).toUpperCase() + f.ubstring(1)).collect(Collectors.toList());
is translated. You can find the full version on pastebin. I will focus only on filter(f -> f.contains("fish"))
here:
invokedynamic #26, 0 // InvokeDynamic #0:test:()Ljava/util/function/Predicate; [
java/lang/invoke/LambdaMetafactory.metafactory(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
(Ljava/lang/Object;)Z,
FishTest.lambda$fish8$0(Ljava/lang/String;)Z,
(Ljava/lang/String;)Z
]
invokeinterface #27, 2 // InterfaceMethod java/util/stream/Stream.filter:(Ljava/util/function/Predicate;)Ljava/util/stream/Stream;
There is nothing specific to streams API there, but the new invokedynamic
instruction is used to create lambdas. The Java 7 equivalent of lambdas would be creation of anonymous inner class implementing Predicate
. This would be translated to bytecode as:
new FishTest$1 // create new instance of Predicate
dup
invokespecial FishTest$1.<init>()V // call constructor
Creating a lambda in Java 8 is translated as a single invokedynamic
instruction instead, without creation of a new object. The purpose of invokedynamic
instruction is to defer creation of lambda to runtime (as opposed to compile time). This enables features like caching lambda instances:
The use of invokedynamic lets us defer the selection of a translation strategy until run time. The runtime implementation is free to select a strategy dynamically to evaluate the lambda expression. ... The invokedynamic mechanics allow this to be done without the performance costs that this late binding approach might otherwise impose. ... For example, ... we generate the class the first time a given lambda factory site is called. Thereafter, future calls to that lambda factory site will re-use the class generated on the first call.
Arguments of invokedynamic
give a "recipe" for constructing an instance of the respective functional interface. They represent the metafactory for runtime instance creation, reference to the method it implements (i.e. Predicate.test()
) and implementation of the method.
In our case, the implementation is invocation of static method boolean lambda$fish8$0(String)
which the compiler sneaks in to our class. It contains the actual bytecode for f.contains("fish")
. If you used lambda capturing method references (e.g. list::add
), captured variables from the outer scope, etc., things would get more complex - look for occurences of "indy" in this document for more info.
The other parts of bytecode are less interesting. The do-while loop, apart from the obvious looping, contains an invokeinterface
instruction calling accept()
on the respective Consumer
.
accept()
call propagates down the sinks, calling our lambdas along the way. Nothing special here, both lambda calls, and propagation through sinks are simple invokeinterface
instructions.
Indentation is used to show the unfolded body of call above the indented code. Code beggining with /*>*/
represents continuation of the current call (when needed for better readability). Therefore call
Objects.requireNonNull(new Object());
would be written in the trace pseudocode as:
Object o = new Object(); // extracted variable to improve visibility of new instance creation
Objects.requireNonNull(o);
// this is the body of Objects.requireNonNull():
if (o == null) {
/*>*/throw new NullPointerException(); // this line is still part of requireNonNull() body
}
return o;
I also skipped some unimportant calls like null checks, omitted generic parameters, extracted inlined expressions to variables where appropriate, etc., to improve readability.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With