I want to know if there are any good solutions for autoscaling dynos AND workers on Heroku in a production environment (probably a different solution for each of those, as they are pretty unrelated). What are you/companies using, regarding this?
I found lots of options, but none of them seem really mature for a production environment. There is Heroscale, which seem to introduce some latency as it does not run locally, and I also heard of some downtime. There are modifications of delayed_jobs, which have not been updated for a long time, and there are some issues with current bundlers. There is also some alternatives related to reque, which seem not to handle very well some HTTP exceptions, which results in app crashing, and others which seem to need an always-running worker to schedule other workers, and may also suffer from some HTTP exceptions problems.
Well. In the end. What is being used, nowadays, for autoscaling Heroku's dynos and workers on a production environment with Rails3?
Thanks in advance.
We ran into this a while ago and I spent quite a bit of time on this to my great frustration. I'll try to stick to the salient point. There are several Heroku autoscaling solution that seem decent at first glance.
The example that has already been given heroku-autoscaler is actually for autoscaling dynos and is pretty much the only solution out there that claims to do this (and it certainly doesn't do it well). Most others will only claim to autoscale workers for you. So, let's focus on that first. The autoscalers you'll look at for workers depend on what you're actually using for you background workers e.g. delayed_job, resque. Those are the most common background processing libs that people use, so the autoscalers will try to hook into one of them. You can use things like:
Some of these work on the Cedar stack some might need a bit of tweaking. The problem with all of them is that it's like trying to pull yourself out of the swamp by your own hair. Let's take hirefire as an example (it's probably the best one of the lot). It modifies delayed_job so that the workers themselves can look at the queue and spin up more workers if necessary, if there are no more jobs in the queue, the workers will all shut each other down. There are several problems:
The thing that solves this problem is to have one worker running continuously it can therefore monitor the queue periodically and can execute jobs when necessary or even spin up more workers. But if you do that, you're not saving any money (you have a worker running continuously 24/7 and have to pay for that) and that's the whole premise behind autoscalers on heroku. In essence, if you only have occasional background processing to do, or you have background jobs that are likely to fail but succeed on retry, or you have background jobs that don't need to be executed instantly, there is no autoscaling library you can use that will work for you.
Here is one alternative. The guy who wrote Hirefire, later spun it off into a webapp (Hirefire app), the essence of which is to externally monitor your Heroku workers/dynos for you and spin up/shut down workers dynos as necessary. This was free in beta but it now costs money, less than what you'd pay to run a worker 24/7 but still not insignificant if you only need a few background jobs once in a while. Either way this is the only viable way to make sure your background job infrastructure does what you want (well that and rolling your own solution which means having a machine like an EC2 instance where you can put some scripts which will ping your heroku app and spin up/shut down workers as needed - a non-trivial amount of effort).
Now Hirefire app does offer to autoscale your dynos for you as well, it does this based on hooking in to the latency of your heroku request queue. However I found that this didn't work well, perhaps if you're close to the Amazon datacenter where your heroku app actually lives (we weren't), you might have a different experience. But, for us it unnecessarily spun up a whole bunch of dynos and would never spin them down no matter how much I tweaked the settings. You can put it down to the fact that it was a beta it may have improved since then, but that's the experience that I had.
Long story short, if you want to autoscale your workers, use Hirefire app, you'll be saving a lot less money than you thought, but it is still the cheapest option. If you want to autoscale dynos you're basically out of luck. This is just one of those limitations you live with for having the convenience of a platform like Heroku.
Heroku is offering a new add-on called AdeptScale which is now just out of Beta.
Here is the add-on page for AdeptScale
Here is the more detailed documentation for AdeptScale
Here is the form to sign up for Heroku's Beta Program
Hopefully this will be a robust solution for autoscaling Heroku Dynos, as I'm not still not happy with the current options.
Update (2/4/13): I signed up for Heroku's Beta program to try out this add-on, and its worked really well for me. Occasionally scaling up with traffic, but mostly sitting on the minimum number of dynos I've set of 2. It's greatly reduced my bill, and eliminated worry that I might be slow during peak usage times.
Update (3/6/13): Added link to Heroku's Sign up page for their beta program.
Update (4/14/13): Looks like auto-scaling is out of Beta. It's still working really well for me.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With