 

How do we configure Shiny Server Open Source to support concurrent users?

I have an R Shiny app that I want to host for around 50 concurrent users, using an open source solution. I came across Shiny Server by RStudio, which can be used to deploy Shiny apps to the web. I want to use the open source version of Shiny Server.

The documentation says that we can use the simple scheduler to define the number of concurrent connections.

The Simple Scheduler is the only scheduler available in the Open Source edition of Shiny Server. It associates a single R process with a single Shiny application. This scheduler accepts a single parameter which specifies the maximum number of concurrent sessions. Once this number is reached, users attempting to create a new session on this application will receive a 503 error page.

The documentation for the simple scheduler says,

simple_scheduler A basic scheduler which will spawn one single-threaded R worker for each application. If no scheduler is specified, this is the default scheduler.

It says that open source Shiny Server supports a single R process, but at the same time it mentions that there will be one single-threaded R worker for each application. So if I want to support 50 concurrent users for one application, how do I achieve it? Do I need to create 50 instances of the application on the same server, or will one instance of the application be serviced by 50 worker threads?

Also, the default number of concurrent connections mentioned is 100. What is the maximum?

Can someone explain how do we go about this?

Avi asked Dec 04 '19
1 Answer

You have several options, with pros and cons:

a) Classic simple shiny app with shinyserver

What you have now. As you have read, in Shiny Server, a Shiny app runs in a single-threaded R worker. This means that if you have more than one user, all concurrent users interact with the app through that one R process. If the app has some slow calculations (raster calculations, forecasting with heavy data, downloading a big file...) and one user triggers any of them, the other users will experience a drop in the app's responsiveness, as they have to wait for the big calculation (which none of them is aware of) to finish.
In summary, with this option you can find your users hitting the inputs or interactive outputs multiple times while the app seems stuck. And simple_scheduler does not help you here, as it sets the number of concurrent sessions allowed, not the number of R processes.
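To make the scheduler setting concrete, here is a minimal sketch of a shiny-server.conf for this option; the paths and port are the common defaults, but treat them as illustrative for your own setup:

```
# /etc/shiny-server/shiny-server.conf (paths and port are illustrative)
run_as shiny;

server {
  listen 3838;

  location / {
    site_dir /srv/shiny-server;
    log_dir /var/log/shiny-server;

    # One single-threaded R process per app; allow up to 50
    # concurrent sessions before new users get a 503 page.
    simple_scheduler 50;
  }
}
```

Note that this only raises the session cap on the single R process; it does not add any parallelism.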

b) Async shiny app with shinyserver

Here you have a nice RStudio article explaining how to scale your Shiny app using promises and futures. Depending on the complexity of your app, implementing this can range from "really easy" to "really hard". With this solution more R processes are spawned, but only for the specified calculations. But again, this has its limitations:

Async programming is mainly effective when your app has one, or two, or a few spots where a lot of time is spent.
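The promises/futures pattern from that article can be sketched roughly as follows. This is only a minimal sketch: the `n` input, the `Sys.sleep()` stand-in for a slow calculation, and the output name are all hypothetical.

```r
library(shiny)
library(promises)
library(future)
plan(multisession)  # run futures in background R processes

server <- function(input, output, session) {
  output$result <- renderText({
    n <- input$n  # read reactive values before entering the future
    future_promise({
      Sys.sleep(5)  # stand-in for a slow calculation
      sqrt(n)
    }) %...>%
      paste("Square root:", .)
  })
}
```

While the future runs, the main R process is free to serve other sessions, which is exactly the responsiveness problem described in option a).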

If your app is even more complex and you have to support a high number of concurrent users, you have to explore different options.

c) ShinyProxy

As the ShinyProxy website says:

ShinyProxy is your favourite way to deploy Shiny apps in an enterprise context. It has built-in functionality for LDAP authentication and authorization, makes securing Shiny traffic (over TLS) a breeze and has no limits on concurrent usage of a Shiny app

ShinyProxy uses Java (Spring Boot) to launch a Docker container of the app for each user visiting it. This results in one app instance per user, something you cannot do in Shiny Server.
It is a little more complex to set up than Shiny Server, but it can be an option (in fact, I'm already using it in production and it works really well).
But keep in mind that even if they state that "... (ShinyProxy) has no limits on concurrent usage", that is not completely true. The limit is what your server supports. Each app instance consumes RAM and CPU; if your app consumes a lot of these, the limit on users is set by your server resources, and believe me, you don't want to trigger an OOM kill on your production server (assuming a Linux server here).
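For orientation, a minimal ShinyProxy application.yml looks roughly like this; the Docker image name, user credentials and app id are purely illustrative:

```yaml
# application.yml -- minimal ShinyProxy sketch; image, users and ids are illustrative
proxy:
  port: 8080
  authentication: simple
  users:
    - name: demo
      password: demo
      groups: users
  specs:
    - id: my-shiny-app
      display-name: My Shiny App
      container-image: myorg/my-shiny-app:latest  # hypothetical Docker image of your app
      port: 3838                                  # port the Shiny app listens on inside the container
```

Each visitor then gets their own container started from `container-image`, so no user blocks another.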

d) Docker swarm, kubernetes...

Assuming you have access to a cluster of servers, you can use containerized solutions like Docker Swarm and Kubernetes, as they offer load balancing out of the box (after some complex configuration). I'm not yet proficient in these methods, so I cannot go deep into them. But lately I've been testing the combination of ShinyProxy and Docker Swarm to create a service stack with ShinyProxy, a web server and a database server (Postgres), and it is really promising.
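A stack like the one I'm testing can be described for `docker stack deploy` with a compose file along these lines; service names, images and the password are placeholders:

```yaml
# docker-compose.yml for `docker stack deploy` -- names, images and secrets are illustrative
version: "3.8"
services:
  shinyproxy:
    image: myorg/shinyproxy:latest  # hypothetical image with application.yml baked in
    ports:
      - "8080:8080"
    deploy:
      replicas: 1
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example    # use Docker secrets in real deployments
    deploy:
      replicas: 1
```

Swarm then handles scheduling the services across the cluster nodes and routing traffic to them.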

Summary

You need to know approximately how many concurrent users you will have (~50; you have already done this).
You need to know your app well. Identify bottlenecks and slow steps. Fix/optimize them, if possible, and check whether the app is responsive enough in a single process for your expected number of users.
If no further optimization is possible, try async and check again. If none of this helps, you need to dive into the more complex solutions described above.

Hope this helps. Of course, there must be other options not covered here, but I don't know them well enough to comment on them ;)

MalditoBarbudo answered Nov 16 '22