Deploying Flask app that uses Celery and Redis to AWS: Elastic Beanstalk or EC2 directly?

I'm new to web development and I wrote a small Flask API that uses Celery as the task queue and Redis as the broker. On my local machine I start Redis with redis-server and Celery with celery -A application.celery worker --loglevel=info, and the app runs with no problem.

However, I couldn't get it to work on AWS. Right now I'm deploying the app following the docs, but when I send requests to my API I get internal server errors, which are probably related to Redis and Celery not running. I SSH'ed into the EC2 instance but, since I'm new, couldn't figure out what to do to get the app working.

My questions are:

1) What do I do to start my application, Redis and Celery after deploying to AWS? Does Elastic Beanstalk do it automatically, or do I need to do something myself?

2) Where do I find my app files? I think I'll need to install all the requirements manually from requirements.txt, and set up a virtualenv in the EC2 instance, is that right?

3) If I set up and install all the requirements in a virtualenv, will they persist if the EC2 instance changes? The Elastic Beanstalk command line tool deployed the application automatically and created a Load Balancer and an Auto Scaling Group. Will the installations I make over SSH be available when new instances are created, or do I need to redo them manually every time, or is there some other way?

4) I heard some people say that creating an EC2 instance and deploying manually is better than using Elastic Beanstalk. What does Elastic Beanstalk do for me? Is it better if I use Elastic Beanstalk or deploy manually?

Thanks for any help!

akaralar asked Dec 25 '22 19:12


1 Answer

For the past week I was trying to do the exact same thing, so I thought I'd share everything I've learned. These answers are scattered across Stack Overflow and Google, but I can help all the same by collecting them here.

1) Getting a Flask app running is easy: you can use the Elastic Beanstalk CLI. Generally, just follow the AWS documentation; it's fairly straightforward. Redis and Celery are where you start to get multiple things going on.

Before your initial deploy, you'll probably want to set up the Celery worker. You can follow the Stack Overflow answer on how to set up Celery as a daemon. Be sure to read the script, because you'll need to set your application name properly. The thing to note when you deploy to production via Elastic Beanstalk is that your application will be hosted by Apache, which means strange things can happen if you call your tasks via "some_task.delay", as the name of the Celery app will be unknown. As far as I know, the only way to work around this properly is to use:

my_celery_object.send_task("flask_application_name.the_task", [param1, param2], ...)

Wherever you need to call tasks.
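As a rough sketch of that workaround, the helper below sends a task by its fully qualified name rather than calling .delay() on an imported task. The helper name enqueue_by_name and its parameters are hypothetical; the only Celery call it relies on is send_task(name, args=..., kwargs=...), and it assumes the worker has registered the task under "<flask_application_name>.<task_name>".

```python
def enqueue_by_name(celery_app, app_name, task_name, *args, **kwargs):
    """Queue a task by name via send_task instead of some_task.delay().

    celery_app is assumed to be a configured celery.Celery instance;
    app_name and task_name are hypothetical placeholders.
    """
    full_name = f"{app_name}.{task_name}"
    # send_task only needs the registered name, so the web process does not
    # have to import the task function itself.
    return celery_app.send_task(full_name, args=list(args), kwargs=kwargs)
```

With the names from the line above, a call would look like enqueue_by_name(my_celery_object, "flask_application_name", "the_task", param1, param2).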

You can now prepare your Redis cache. You can use anything; for this I'll just assume you want to use AWS ElastiCache. In ElastiCache, deploy a cache cluster with however many nodes you want. Once deployed, it will appear in the list under "Cache Clusters". Next, click the "X node" link in the table and copy the endpoint URL (and port!) into your Celery application's broker configuration, which the Celery documentation covers.
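To make the endpoint step concrete, here is a minimal sketch of turning the copied endpoint and port into a Redis broker URL. The function name redis_broker_url and the endpoint string are made up for illustration; substitute the values shown for your own node.

```python
def redis_broker_url(endpoint, port=6379, db=0):
    """Build a redis:// broker URL from a cache endpoint, port and database number."""
    return f"redis://{endpoint}:{int(port)}/{int(db)}"


# Hypothetical endpoint; copy the real one from the node details page.
BROKER_URL = redis_broker_url("my-cache.example.cache.amazonaws.com", 6379)
```

The resulting string is what you would hand to Celery as its broker, for example Celery("flask_application_name", broker=BROKER_URL), assuming your app creates its Celery object that way.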

So now that you have everything ready to deploy, you'll be sad to hear that a security group issue will cause your application to fail on any task request, because the ElastiCache cluster will initially be part of the wrong security group. Go ahead and deploy; this will create the security group you need along with your app and everything else. You can then find that security group in the EC2 dashboard, under Network & Security -> Security Groups. The name of the group should be the name of your environment; something like "myapp-env" is the default. Now modify the inbound rules: add a custom TCP rule, set the port number to your Redis port and the source to "Anywhere", and save. At this point, note the group name and go to ElastiCache. Click Cache Clusters on the left, modify the cache cluster (not the node) for the app, update the VPC security group to the one you just noted, and save.
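If you want to confirm that the security group change took effect, one option is a plain TCP check run from the EC2 instance. This is only a sketch using the standard library; the function name can_reach is hypothetical, and a successful connection shows the port is reachable, not that Celery is configured correctly.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens the socket in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
```

For example, can_reach("<your cache endpoint>", 6379) should return False before the security group is updated and True afterwards, assuming the cache listens on port 6379.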

Celery should now connect to the Redis cache automatically, since it keeps retrying the connection for a while. Otherwise you can always redeploy.

Hopefully you now have a functioning Flask/Celery app utilizing redis.

2) You shouldn't need to know the location of your app files on the Elastic Beanstalk EC2 instance, as it will automatically use a virtual environment and requirements.txt, assuming you followed the AWS deployment instructions. However, at the time of writing, you can always SSH into your EC2 instance and find your application files at:

/opt/python/current/app

3) I'm not sure exactly what you mean by asking whether your installs will persist if the EC2 instance changes. As I said previously, if you followed the instructions on how to deploy an Elastic Beanstalk environment for Flask, then new instances that are deployed will automatically set up their environment based on your requirements.txt.

4) This is a matter of opinion. I have definitely heard that not using Elastic Beanstalk could be a better way to go, but I personally have no opinion here, as I have only used Elastic Beanstalk. There have been some severe struggles (including trying to set up this application). What I hear some people do is deploy via Elastic Beanstalk to get a pre-configured, ready-to-go EC2 machine, make an AMI from that machine, tear the Elastic Beanstalk environment down, and then launch an EC2 instance from the AMI. Regardless of the route you go, if you are planning to have a database-backed server, I have learned (the hard way) that you shouldn't have Elastic Beanstalk automatically attach the RDS instance. The RDS instance is then associated with the Elastic Beanstalk application, so if you have to replace the resources, terminate the environment, etc., you'll lose the RDS instance (you can work around this of course, it's just a pain).

DeusExMachina25 answered Dec 28 '22 09:12