
Where to draw the line with reactive programming [closed]

I have been using RxJava in my project for about a year now. With time, I grew to love it very much - now I'm thinking maybe too much...

Most methods I write now have some form of Rx in them, which is great! (until it's not). I now notice that some methods require a lot of work to combine the different observable-producing methods. I get the feeling that although I understand what I write now, the next programmer will have a really hard time understanding my code.

Before I get to the bottom line let me give an example straight from my code in Kotlin (Don't dive too deep into it):

private fun <T : Entity> getCachedEntities(
      getManyFunc: () -> Observable<Timestamped<List<T>>>,
      getFromNetwork: () -> Observable<ListResult<T>>,
      getFunc: (String) -> Observable<Timestamped<T>>,
      insertFunc: (T) -> Unit,
      updateFunc: (T) -> Unit,
      deleteFunc: (String) -> Unit)
      = concat(
      getManyFunc().filter { isNew(it.timestampMillis) }
          .map { ListResult(it.value, "") },
      getFromNetwork().doOnNext {
        syncWithStorage(it.entities, getFunc, insertFunc, updateFunc, deleteFunc)
      }).first()
      .onErrorResumeNext { e ->  // If a network error occurred, return the cached data and the error
        concat(getManyFunc().map { ListResult(it.value, "") }, error(e))
      }

Briefly what this does is:

  • Retrieve some timestamped data from storage
    • If data is not new, fetch data from network
      • Sync network data again with the storage (to update it)
    • If a network error occurred, return the older cached data together with the error

And here comes my actual question: Reactive programming offers some really powerful concepts. But as we know with great power comes great responsibility.

Where do we draw the line? Is it OK to fill our entire programs with awesome reactive one-liners or should we save them only for really mundane operations?

Obviously this is very subjective, but I hope someone with more experience can share their knowledge and pitfalls. Let me phrase it better:

How do I design my code to be reactive yet easy to read?

maxandron asked Mar 13 '16 18:03



2 Answers

When you pick up Rx, it becomes this awesome shiny hammer and everything starts looking like a rusty nail just waiting for you to bang in.

Personally, I think the biggest clue is in the name, reactive framework. Given a requirement, you need to reflect upon whether a reactive solution truly makes sense.

In any Rx proposition, you are looking to introduce one or more event streams and carry out some action in response to an event.

I think there are two key questions to ask:

  • Are you in control of the event stream?
  • To what degree must you complete responses at the rate of the event stream?

If you do not have control of the event stream and you must respond at the rate of the event stream then Rx is a good candidate.

In any other circumstance, it is probably a poor choice.

I have seen many examples where people have jumped through hoops to create the illusion of a lack of control in order to justify Rx - which seems crazy to me. Why give up the control that you have?

Some examples:

  1. You have to extract data from a fixed list of files and store it in a database. You decide to push each file name into a subject and create a reactive pipeline that opens each file and projects the data, then processes the data in some way and finally writes it to the database.

    This fails the control test and the rate test. It would be far easier to iterate over the files and pull them in and process them as fast as you can. The phrase "decide to push" is the giveaway here.

  2. You need to display stock prices from a stock exchange.

    Clearly this is a good choice for Rx. If you can't keep up with the rate of prices in general, you are screwed. It might be the case that you conflate prices (perhaps to provide an update only once every second) - but this still qualifies as keeping up. The one thing you can't do is ask the stock exchange to slow down.
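
To make the first example concrete, here is a minimal sketch of the pull-based version. The `Row` type, the `parse` function and the `store` callback are hypothetical stand-ins, since the example does not say what the files contain or how the database is written to.

```kotlin
import java.io.File

// Hypothetical record type and parser: one non-blank line becomes one row.
data class Row(val source: String, val text: String)

fun parse(file: File): List<Row> =
    file.readLines()
        .filter { it.isNotBlank() }
        .map { Row(file.name, it.trim()) }

// We own the list of files, so we just iterate over it and pull each file in
// as fast as we can process it. No subject and no pipeline are needed.
fun importFiles(files: List<File>, store: (Row) -> Unit): Int {
    var count = 0
    for (file in files) {
        for (row in parse(file)) {
            store(row) // assumed to be the database write
            count++
        }
    }
    return count
}
```

Nothing here reacts to anything: the caller decides which files to read and when, which is exactly the control the reactive version gives up.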

These (real world) examples pretty much fall at opposite ends of the spectrum and don't have much grey area. But there is a lot of grey area out there where control isn't clear.

Sometimes you are wearing the client hat in a client/server system and it can be easy to fall into the trap of sacrificing control, or putting control in the wrong place - which can easily be fixed with correct design. Consider this:

  1. A client application displays news updates from a server.

    • News updates are submitted to the server at any time and are created in high volume.
    • The client should be refreshed at an interval set by the client.
    • Refresh interval can be changed at any time and the user can always request an immediate refresh.
    • The client only shows updates tagged with particular keywords, as specified by the user.
    • The news updates are sometimes lengthy and the client should not store the full content of news updates, but rather display the headline and summary.
    • At user request, the full content of an article can be shown.

Here, the frequency of news updates is not in control of the client. But the desired refresh rate and the tags of interest are.

For the client to receive all the news updates as they arrive and filter them client side isn't going to work. But there are plenty of options:

  • Should the server send a data stream of updates taking into account the client refresh rate? What if the client goes offline?
  • What if there are thousands of clients? What if the client wants an immediate refresh?

There are lots of valid ways to tackle this problem that include more or less reactive elements. But any good solution should take account of the client's control of tags and desired refresh rate, and the lack of control of news update frequency (by client or server). You might want the server to react to changes in client interest by updating the events that it pushes to the client - which it pushes only as long as the client is listening (detected via a heartbeat). When the user wants a full article, then the client would pull the article down.

There is much debate in the Rx community about back-pressure. This is the idea that the client should inform the server when it is overloaded and the server should respond by somehow reducing the event stream. I think this is a misguided approach that can lead to confusing designs.

To my mind, as soon as a client needs to give this feedback, it has failed the response rate test. At this point, you are not in a reactive situation, you are in an async enumerable situation. That is, the client should be saying "I am ready" when it is ready for more and then waiting in a non-blocking fashion for the server to respond.

This would be appropriate if the first scenario were modified to be files arriving in a drop-folder, of varying lengths and complexity to process. The client should make a non-blocking call for the next file, process it, and repeat (adding parallelism as required), rather than responding to a stream of file-arrived events.
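
A rough sketch of that "I am ready" loop, using `CompletableFuture` from the JDK for the non-blocking call. The `DropFolder` class and its `nextFile()` method are hypothetical; a real drop-folder would watch the file system, whereas this one just hands out names from an in-memory queue.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical source of work. nextFile() completes with the next file name,
// or with null when nothing is left.
class DropFolder(names: List<String>) {
    private val pending = ConcurrentLinkedQueue(names)

    fun nextFile(): CompletableFuture<String?> =
        CompletableFuture.supplyAsync { pending.poll() }
}

// The client asks for one file, processes it, and only then asks again.
// The consumer sets the pace; nothing is pushed at it.
// Note: this recursive chaining is fine for a sketch but is not meant for
// very long runs of work.
fun drain(folder: DropFolder, process: (String) -> Unit): CompletableFuture<Int> {
    fun step(done: Int): CompletableFuture<Int> =
        folder.nextFile().thenCompose<Int> { name ->
            if (name == null) {
                CompletableFuture.completedFuture(done)
            } else {
                process(name)
                step(done + 1)
            }
        }
    return step(0)
}
```

Because the next request is only issued after `process` returns, there is no need for back-pressure signalling: a slow client simply asks less often.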

Wrap up

I've deliberately avoided other valid concerns such as maintainability of code, performance of Rx itself etc. Mostly because they are addressed elsewhere and, more importantly, because I think the ideas here are more divisive than those concerns.

So if you reflect on the elements of control and response rate in your scenario, you will probably stay on the right track.

The response rate issue can be subtle - and the degree aspect is important. Arrival rate can fluctuate, and there is going to be some acceptable degree of fluctuation in response rate - clearly, if you don't ultimately have a way to "catch up" then at some point the client will blow up.

James World answered Sep 21 '22 16:09



I find that there are two things I keep in mind when writing Rx (or any mildly sophisticated/new technology):

  1. Can I test it?
  2. Can I easily hire someone who can maintain it? Not someone who will struggle to maintain it, but someone who will be fine left alone to maintain it.

To this end, I also find that just because you can doesn't always mean you should. As a guide I try to avoid creating queries that are over, say, 7 lines of code. Queries bigger than this, I try to separate into sub queries that I compose.
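
As an illustration only, here is how the "fresh cache, else network" part of the query in the question might be split into named sub queries. To keep the sketch runnable without dependencies it uses Kotlin's lazy `Sequence` as a stand-in for `Observable`; `Timestamped` and `ListResult` are guessed from how the question uses them, and the error fallback is left out.

```kotlin
// Hypothetical stand-ins for the types used in the question's code.
data class Timestamped<T>(val value: T, val timestampMillis: Long)
data class ListResult<T>(val entities: List<T>, val error: String)

// Sub query 1: cached data, but only if it is still fresh.
fun <T> freshCache(
    cache: Sequence<Timestamped<List<T>>>,
    isNew: (Long) -> Boolean
): Sequence<ListResult<T>> =
    cache.filter { isNew(it.timestampMillis) }
        .map { ListResult(it.value, "") }

// Sub query 2: network data, written back to storage as it passes through.
fun <T> networkWithSync(
    network: Sequence<ListResult<T>>,
    sync: (List<T>) -> Unit
): Sequence<ListResult<T>> =
    network.onEach { sync(it.entities) }

// The composed query: take the fresh cache if there is one, else the network.
// Sequences are lazy, so the network is only touched when the cache is stale.
fun <T> cachedOrNetwork(
    cache: Sequence<Timestamped<List<T>>>,
    network: Sequence<ListResult<T>>,
    isNew: (Long) -> Boolean,
    sync: (List<T>) -> Unit
): ListResult<T> =
    (freshCache(cache, isNew) + networkWithSync(network, sync)).first()
```

Each sub query is short enough to test on its own, and the composed query is a single line that names what it does.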

If the code you have provided is at the core of the code base, and is at the extreme end of the complexity, then it may be fine. However, if you find all of your Rx code carries that much complexity, you may be creating a code base that is difficult to work with.

Lee Campbell answered Sep 20 '22 16:09
