 

How do you run a worker with AWS Elastic Beanstalk?

I am launching a Django application on AWS Elastic Beanstalk. I'd like to run a background task or worker in order to run Celery.

I can't find out whether this is possible or not. If it is, how could it be achieved?

Here is what I am doing right now, but it produces an error event every time.

container_commands:
  01_syncdb:
    command: "django-admin.py syncdb --noinput"
    leader_only: true
  50_sqs_email:
    command: "./manage.py celery worker --loglevel=info"
    leader_only: true
asked Feb 07 '13 by Maxime P


People also ask

What does AWS worker do?

You can use a worker tier to do background processing of images, audio, documents and so on, as well as offload long-running processes from the web tier.
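For illustration, here is a minimal sketch of the kind of long-running job a web request could hand off to such a worker; the task name and body are hypothetical and not part of the original question.

# tasks.py -- a hypothetical long-running job offloaded to the worker
from celery import shared_task

@shared_task
def generate_thumbnails(image_id):
    # Heavy image processing runs here, off the web tier,
    # so the HTTP request that queued it can return immediately.
    ...

A view would then call generate_thumbnails.delay(image_id) instead of doing the work inline.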

What user does Elastic Beanstalk run as?

The default user on Elastic Beanstalk instances is ec2-user, with the same group, ec2-user. Use commands to manipulate files/directories on the EC2 instance, and container commands to manipulate your application files/directories.

Which AWS services can be used with Elastic Beanstalk?

Elastic Beanstalk uses core AWS services such as Amazon Elastic Compute Cloud (EC2), Amazon Elastic Container Service (ECS), AWS Auto Scaling, and Elastic Load Balancing (ELB) to easily support applications that need to scale to serve millions of users.


2 Answers

As @chris-wheadon suggested in his comment, you should try to run celery as a daemon in the background. AWS Elastic Beanstalk already uses supervisord to run some daemon processes, so you can leverage that to run celeryd and avoid creating a custom AMI for this. It works nicely for me.

What I do is programmatically add a celeryd config file to the instance after the app has been deployed to it by EB. The tricky part is that the file needs to set the required environment variables for the daemon (such as AWS access keys if you use S3 or other services in your app).

Below is a copy of the script that I use; add it to the .ebextensions folder that configures your EB environment.

The setup script creates a file in the /opt/elasticbeanstalk/hooks/appdeploy/post/ folder (documentation) that lives on all EB instances. Any shell script in there will be executed post deployment. The shell script that is placed there works as follows:

  1. In the celeryenv variable, the virtualenv environment is stored in a format that follows the supervisord notation: a comma-separated list of env variables.
  2. Then the script creates a variable celeryconf that contains the configuration file as a string, which includes the previously parsed env variables.
  3. This variable is then piped into a file called celery.conf, a supervisord configuration file for the celery daemon.
  4. Finally, the path to the newly created config file is added to the main supervisord.conf file, if it is not already there.

Here is a copy of the script:

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}

      # Create celery configuration script
      celeryconf="[program:celeryd]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A myappname --loglevel=INFO

      directory=/opt/python/current/app
      user=nobody
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
          then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
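Note that the worker command above (-A myappname) assumes your project defines a Celery application that the -A flag can import. Here is a minimal sketch of such a module; myappname and the settings path are placeholders, not part of the original answer.

# myappname/celery.py -- a minimal Celery application that "-A myappname" can load
import os

from celery import Celery

# Point Celery at the Django settings before anything reads them.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myappname.settings")

from django.conf import settings  # noqa: E402

app = Celery("myappname")
# Pull BROKER_URL / CELERY_* options from the Django settings module.
app.config_from_object("django.conf:settings")
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

The usual convention is to also import this app in myappname/__init__.py (from .celery import app as celery_app) so it is loaded when Django starts.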
answered by yellowcap


I was trying to do something similar in PHP, but for whatever reason I couldn't keep the worker running. I switched to an AMI on an EC2 server and have had success ever since.

answered by Michael J. Calkins