I've read all of the articles I can find on Heroku about Puma and dyno types and I can't get a straight answer.
I see some mentions that the number of Puma workers should be determined by the number of cores. I can't find anywhere that Heroku reveals how many cores a performance-M or performance-L dyno has.
In this article, Heroku hinted at an approach: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server
I think they're suggesting to set the threads to 1 and increase the number of Puma workers until you start to see R14 (memory) errors, then back off. And then increase the number of threads until the CPU maxes out, although I don't think Heroku reports on CPU utilization.
Can anyone provide guidance?
(I also want to decide whether I should use one performance-L or multiple performance-M dynos, but I think that will be clear once I figure out how to set the workers & threads)
Puma is a multi-threaded web server and our replacement for Unicorn. Unlike other Ruby Webservers, Puma was built for speed and parallelism. Puma is a small library that provides a very fast and concurrent HTTP 1.1 server for Ruby web applications. It is designed for running Rack apps only.
The maximum number of processes/threads that can exist in a dyno at any one time depends on dyno type: free, hobby, and standard-1x dynos support no more than 256; standard-2x and private-s dynos support no more than 512; performance-m and private-m dynos support no more than 16384.
Heroku provides enough free hours to run a single dyno continuously for a month, but if we need a worker dyno for background processing (most apps do), we will not have enough free dyno hours. Free dynos are great for demos, experimentation, and perhaps a staging app.
The roadmap I have figured out so far looks like this:
heroku run "cat /proc/cpuinfo" --size performance-m --app yourapp
heroku run "cat /proc/cpuinfo" --size performance-l --app yourapp
Experiment on a standard-2X / standard-1X baseline dyno to determine the PUMA_WORKER value.
Calculate the starting `PUMA_WORKER` value for the target dyno with: (max threads your desired dyno type supports) / (max threads the baseline dyno supports) * (your experimental `PUMA_WORKER` value on the baseline dyno) - (number of CPU cores)
For example, if PUMA_WORKER is 3 on my standard-2X dyno as the baseline, then the PUMA_WORKER number I would start testing with on performance-m would be:
16384 / 512 * 3 - 4 = 92
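The same arithmetic can be sketched as a small Ruby helper, in case you want to try other baselines. The method name `starting_puma_workers` and the `MAX_THREADS` table are made up for illustration; the limits are the ones quoted earlier in this thread, and the core count of 4 is just the value used in the example above.

```ruby
# Process/thread limits per dyno type, as quoted earlier in this thread.
# Hypothetical lookup table; verify against Heroku's current documentation.
MAX_THREADS = {
  "standard-1x"   => 256,
  "standard-2x"   => 512,
  "performance-m" => 16_384
}.freeze

# Starting PUMA_WORKER estimate for the target dyno:
#   (target limit / baseline limit) * baseline workers - CPU cores
# Multiplying before dividing avoids losing precision to integer division.
def starting_puma_workers(target:, baseline:, baseline_workers:, cpu_cores:)
  scaled = MAX_THREADS.fetch(target) * baseline_workers / MAX_THREADS.fetch(baseline)
  scaled - cpu_cores
end

puts starting_puma_workers(
  target: "performance-m",
  baseline: "standard-2x",
  baseline_workers: 3,
  cpu_cores: 4
)
# prints 92
```

Treat the result as a number to start testing from, not a recommended setting.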
You should also consider how much memory your app consumes, estimate how many workers fit in the dyno's memory, and pick the lower of the two numbers.
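A rough sketch of that memory check in Ruby follows. Everything numeric here is an assumption for illustration: 2560 MB as the dyno's memory, 300 MB per Puma worker, and the optional `reserve_mb` headroom. Measure your own app's per-worker footprint and check your dyno type's memory limit before relying on it. The method names are hypothetical.

```ruby
# How many workers fit in the dyno's memory, rounding down.
# reserve_mb is optional headroom kept free for the master process, spikes, etc.
def memory_bound_workers(dyno_memory_mb:, per_worker_mb:, reserve_mb: 0)
  ((dyno_memory_mb - reserve_mb) / per_worker_mb).floor
end

# Pick the lower of the thread-limit estimate and the memory estimate.
def workers_to_test(thread_based:, memory_based:)
  [thread_based, memory_based].min
end

# Assumed numbers, for illustration only.
memory_based = memory_bound_workers(dyno_memory_mb: 2560, per_worker_mb: 300)
puts workers_to_test(thread_based: 92, memory_based: memory_based)
# prints 8
```

With numbers like these, memory rather than the thread limit is what caps the worker count.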
EDIT: My answer was originally written before ps:exec was available. You can now read the official documentation and learn how to SSH into running dynos, which should be much easier than before.
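For the `cat /proc/cpuinfo` step in the roadmap, here is a minimal Ruby sketch that summarizes the output instead of reading it by eye. It assumes the usual Linux `/proc/cpuinfo` layout with `processor` and `model name` fields, and the helper names are made up. Note that it counts logical processors as the kernel reports them, which is not necessarily the number of physical cores or the CPU share your dyno actually gets.

```ruby
# Count logical processors in /proc/cpuinfo-style text:
# one "processor : N" line per logical CPU.
def logical_processor_count(cpuinfo_text)
  cpuinfo_text.each_line.count { |line| line.match?(/\Aprocessor\s*:/) }
end

# Collect the distinct CPU model names, useful for looking up core counts.
def cpu_model_names(cpuinfo_text)
  cpuinfo_text.each_line
              .map { |line| line[/\Amodel name\s*:\s*(.+)/, 1] }
              .compact
              .map(&:strip)
              .uniq
end

# Only runs where /proc/cpuinfo exists (Linux, e.g. inside a dyno).
if File.readable?("/proc/cpuinfo")
  text = File.read("/proc/cpuinfo")
  puts "logical processors: #{logical_processor_count(text)}"
  puts "models: #{cpu_model_names(text).join(', ')}"
end
```

Ruby's standard library also has `Etc.nprocessors` (after `require "etc"`) if you only need the count.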