Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In project reactor or akka streams, what is the conceptual difference between sink and subscriber?

The concepts of sink and subscriber seem similar to me. Also, I don't see the concept of sink being explicitly defined in the reactive streams spec.

like image 287
Jatin Avatar asked Jan 09 '18 06:01

Jatin


People also ask

What is sink in project reactor?

Sinks are constructs through which Reactive Streams signals can be programmatically pushed, with Flux or Mono semantics. These standalone sinks expose tryEmit methods that return an Sinks.

What is sink in Akka?

Source[String,akka. NotUsed] = ... Sink: This is the exit point of your stream. There must be at least one in every stream.The Sink is the last element of our Stream . Basically it's a subscriber of the data sent/processed by a Source .

Can you describe what are 3 main components of Akka streams?

Akka streams consist of 3 major components in it – Source, Flow, Sink – and any non-cyclical stream consist of at least 2 components Source, Sink and any number of Flow element.

What is Akka streams used for?

Akka Streams is a library to process and transfer a sequence of elements using bounded buffer space. This latter property is what we refer to as boundedness, and it is the defining feature of Akka Streams.


1 Answers

I see that Oleh Dokuka, from Project Reactor (missing disclaimer there), posted an answer already, however much of its assumptions about Akka Streams and Reactive Streams are incorrect, so allow me to clarify below.

Disclaimer: I participated in Reactive Streams since it's early days, and authored most of its Technology Compatibility Kit. I also maintain Akka and Akka Streams.

Also note that: Reactive Streams have been included in Java 9, and are there known as java.util.concurrent.Flow.* so all the comments below regarding RS stand exactly the same way about j.u.c.Flow.Subscriber and the other types.


The answer

Reactive Streams is a Service Provider Interface (SPI) Specification

Reactive Streams, and specifically the Publisher / Subscriber / Subscription / Processor types, are a Service Provider Interface. This is confirmed even in the earliest discussions about the specification dating back to 2014.

In the earliest days of the specification even the spec's types attempted to hide Publisher, Subscriber and the other types. Sadly the types would leak regardless in the back then considered API, thus the API(!) was removed and the SPI types are all that remained.

Nowadays you see some implementations of Reactive Streams claim that their direct extending of these types is a benefit for some reason. This is not correct, as such was not, and is not the goal of the Reactive Streams interfaces. It is rather a misunderstanding of what these types are -- strictly the inter-op interfaces that Reactive Streams libraries agree to understand and "speak" (a protocol).

For reference, RxJava 2.0 as well as Reactor do directly extend these types, while Akka Streams remains true to the RS's design and principles by hiding them as a application developer programming interface -- which is why Sink does not extend Subscriber. This has nothing to do with being "native support" how I've seen people claim the direct IS-A relationship is (rather, claiming an inter-op library being your "native" is a misunderstanding of the concept).

Sinks and Subscribers, Sources and Publishers

The concepts of sink and subscriber seem similar to me.

Correct, they are, on purpose and by design, similar.

As a Sink is a lifted representation of something that effectively yields a Subscriber. To simplify, you can think of it as a "Subscriber factory" (more specifically, the Sink is the "blueprint", and the Materializer takes the sink's blueprint and creates the appropriate RS stages, including Publishers for Sources and Subscribers for Sinks. So when you say Sink.ignore it actually is a factory that will end up creating a Subscriber that does all the requesting and ignoring, as according to Reactive Streams. The same with all other methods declared on Sink.

The same applies to Source, which relates 1:1 to a Reactive Streams Publisher. So a Source.single(1) is something that will internally materialize into a Publisher that does it's job - emits that 1 element if it's allowed to do so by it's downstream.

A.K.A. Why there is no Sink in Reactive Streams?

As mentioned above, Akka's Sink does not directly extend a Subscriber. It is however basically a factory for them.

You may ask: "Does the user never see these Publisher/Subscriber types at all though during normal usage?" And the answer is: yes indeed, and this is a feature as well as design goal (in accordance with what Reactive Streams is). If the underlying Publisher and Subscriber instances were exposed to users all the time directly, one may call them incorrectly causing bugs and confusion. If these types are never exposed unless explicitly asked for, there is less chances for accidental mistakes!

Some have misunderstood that design, and claimed that there is no "native" support for it in Akka Streams (which is not true). Let's see through what being detached from the Subscriber in the API gains us:

Also, I don't see the concept of sink being explicitly defined in the reactive streams spec.

Indeed, Sinks are not part of Reactive Streams, and that's absolutely fine.

Benefits from avoiding the "Sink IS-A Subscriber"

Sinks are part of Akka Streams, and their purpose is to provide the fluent DSL, as well as be factories for Subscribers. In other words, if Subscriber is the LEGO blocks, Sink is what builds them (and the Akka Stream Materializer is what puts the various LEGO blocks together in order to "run" them).

In fact, it is beneficial to users that Sink does not carry any definitive IS-A with a Subscriber (sic!) like other libraries do:

This is because since org.reactivestreams.Subscriber has now been included in Java 9, and has become part of Java itself, libraries should migrate to using the java.util.concurrent.Flow.Subscriber instead of org.reactivestreams.Subscriber. Libraries which selected to expose and directly extend the Reactive Streams types will now have a tougher time to adapt the JDK9 types -- all their classes that extend Subscriber and friends will need to be copied or changed to extend the exact same interface, but from a different package. In Akka we simply expose the new type when asked to -- already supporting JDK9 types, from the day JDK9 was released.

With Reactive Streams being an SPI -- a Service Provider Interface -- it is intended for libraries to share such that they can "talk the same types and protocol". All communication that Akka Streams do, and other Reactive Streams libraries do, adhere to those rules, and if you want to connect some other library to Akka Streams, you'd do just that -- give Akka Streams the inter-op type, which is the Subscriber, Processor, or Publisher; not the Sink, since that's the Akka's "Akka specific" DSL (domain specific language), which adds convenience and other niceties on top of it, hiding (on purpose!) the Subscriber type.

Another reason Akka (and to be honest other RS implementations were encouraged to do so as well, but chose not to do so) hides these types is because they are easy to do the wrong thing with. If you pass out a Subscriber anyone could call things on it, and even un-knowingly break rules and guarantees that the Reactive Streams Specification requires from anyone interacting with the type.

In order to avoid mistakes from happening, the Reactive Streams types in Akka Streams are "hidden" and only exposed when explicitly asked for—minimizing the risk of people making mistakes by accidentally calling methods on “raw” Reactive Streams types without following their protocol.

like image 119
Konrad 'ktoso' Malawski Avatar answered Oct 11 '22 10:10

Konrad 'ktoso' Malawski