
What does prefetch mean in Project Reactor?

I am using Project Reactor, specifically Flux.flatMapIterable. (I have a stream of A objects; I flat-map each A into a stream of B objects and build a new Flux from the result.)

I am trying to understand what the prefetch setting does. In my case, each object A could be converted into any number of B objects (between 0 and N, where N is large).

I just want to understand:

  • What is prefetch? Does it only apply to the initial request? (i.e. if I set it to 1, is Project Reactor intelligent enough to increase the request size if it finds out that 1 is too small?)

  • Is it relevant for my situation here? I was thinking about setting the prefetch to 1 in order to be conservative, since a single A object has the potential to be flatmapped into a large stream of B objects.

CowZow asked Jun 19 '19 20:06




1 Answer

Prefetch determines how many items Reactor requests from the upstream Publisher in its first request. It also acts as an upper bound on subsequent requests: a replenishing request is triggered once 75% of the prefetch amount has been emitted, so the request size does not grow automatically, even if a prefetch of 1 turns out to be too small.
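To make the arithmetic concrete, here is a minimal plain-Java model of the request pattern described above. It is not Reactor code: the class and method names (PrefetchModel, replenishThreshold, upstreamRequests) are made up for illustration, and it assumes the 75% threshold is computed with integer arithmetic as prefetch minus a quarter of prefetch.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefetchModel {

    // Assumed threshold: 75% of prefetch, using integer arithmetic.
    // For prefetch = 1 the quarter rounds down to 0, so the threshold is 1.
    static int replenishThreshold(int prefetch) {
        return prefetch - (prefetch >> 2);
    }

    // Returns the sizes of the requests sent upstream while the
    // operator consumes the given number of upstream items.
    static List<Integer> upstreamRequests(int prefetch, int itemsConsumed) {
        List<Integer> requests = new ArrayList<>();
        requests.add(prefetch); // the first request is the prefetch amount

        int limit = replenishThreshold(prefetch);
        int consumedSinceLastRequest = 0;
        for (int i = 0; i < itemsConsumed; i++) {
            consumedSinceLastRequest++;
            if (consumedSinceLastRequest == limit) {
                requests.add(limit); // replenish with a fixed-size request
                consumedSinceLastRequest = 0;
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        System.out.println(upstreamRequests(32, 100)); // [32, 24, 24, 24, 24]
        System.out.println(upstreamRequests(1, 3));    // [1, 1, 1, 1]
    }
}
```

In this model the request size never exceeds the configured prefetch, so a prefetch of 1 means the upstream is asked for one A at a time, however many B objects each A expands into.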

It is typically used when consumers request a large amount of data but the data source behaves better, or can be optimized, with smaller requests (e.g. database paging).

Whether it is relevant for your use case depends on the characteristics of your publisher.

From the documentation on configuring backpressure:

You might also have noticed that some operators have variants that take an int input parameter called prefetch. This is another category of operators that modify the downstream request. These are usually operators that deal with inner sequences, deriving a Publisher from each incoming element (like flatMap).

Prefetch is a way to tune the initial request made on these inner sequences. If unspecified, most of these operators start with a demand of 32.

These operators usually also implement a replenishing optimization: once the operator has seen 25% of the prefetch request fulfilled, it re-requests 25% from upstream. This is a heuristic optimization made so that these operators proactively anticipate the upcoming requests.

McGin answered Oct 13 '22 20:10