Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Supervisor workers stop overtime and do not restart (used with laravel)

I finally managed to get supervisord service to run anytime server has been restarted using this tutorial and that part works great. I set it up so that it runs up 10 workers. The problem is when I check the next day, some of those workers have stopped and did not restart. The server wasn't restarted either.

This is how my program is setup in supervisord.conf file:

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command= php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
directory=/home/magic/_app/_laravel
numprocs=10
redirect_stderr=true
autostart=true
autorestart=true
startsecs=0
startretries=10
exitcodes=0,2
stopsignal=TERM
stopwaitsecs=10
stdout_logfile=/home/magic/_app/_laravel/storage/logs/supervisor/supervisor.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
stdout_capture_maxbytes=0
stdout_events_enabled=false
user=magic
environment=PATH="/usr/local/bin"

The first time I start the service using service supervisord start, and then if I do service supervisord status, it shows up the processes below. I don't fully understand them, but it looks right because there's 10 instances of my artisan command.

|-17991 /usr/bin/python /usr//bin/supervisord --pidfile /tmp/supervisord.pid
|-17992 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17993 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17994 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17995 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17996 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17997 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17998 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17999 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-18000 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-18001 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-18002 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18003 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18004 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18005 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18006 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18007 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18008 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18009 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-18010 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
`-18011 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production

But then after some time, usually the next day, I see things like this. And this time it looks like only 6 processes are running, so 4 of them have stopped.

|- 3984 /usr/bin/python /usr//bin/supervisord --pidfile /tmp/supervisord.pid
|- 3985 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|- 3986 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|- 3988 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|- 3989 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-12224 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-13610 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-14263 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-16417 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
|-17424 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-17425 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-17427 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-17428 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-17429 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-17430 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
|-17856 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
`-18356 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600

The supervisord log shows pretty normal stuff. There's a few 500 errors in there and I thought maybe that was a lead to investigate but they don't match the number of missing workers nor the timestamps.

The laravel log itself has a few maximum execution time exceeded errors, actually it has exactly 10 of them from last time I cleared the log. So I don't know if that's 1 for each worker. It looks something like this. Again the timestamps don't really match up with supervisor log so it's hard to know if that could be causing it. I can increase the execution time allowed by PHP I just don't know if that can cause the workers not to restart.

[2016-04-14 02:56:20] production.ERROR: exception 'Symfony\Component\Debug\Exception\FatalErrorException' with message 'Maximum execution time of 30 seconds exceeded' in /home/magic/_app/_laravel/vendor/symfony/process/Symfony/Component/Process/Pipes/UnixPipes.php:130

If anyone has any advice or suggestion I would appreciate it. Would really like to fix this issue.

So after fixing supervisorctl to work properly, now when I do supervisorctl status, it shows all 10 workers as running. But the problem is they're not running the commands they're supposed to.

Here's what supervisorctl status shows now:

laravel-worker:laravel-worker_00   RUNNING   pid 11566, uptime 4:38:36
laravel-worker:laravel-worker_01   RUNNING   pid 18358, uptime 2 days, 18:40:16
laravel-worker:laravel-worker_02   RUNNING   pid 8102, uptime 1 day, 20:57:15
laravel-worker:laravel-worker_03   RUNNING   pid 10691, uptime 11:42:54
laravel-worker:laravel-worker_04   RUNNING   pid 32543, uptime 1 day, 22:07:53
laravel-worker:laravel-worker_05   RUNNING   pid 30215, uptime 20:40:02
laravel-worker:laravel-worker_06   RUNNING   pid 32732, uptime 1 day, 22:03:05
laravel-worker:laravel-worker_07   RUNNING   pid 24625, uptime 21:44:41
laravel-worker:laravel-worker_08   RUNNING   pid 19807, uptime 2:36:21
laravel-worker:laravel-worker_09   RUNNING   pid 32332, uptime 1 day, 3:07:34

And here's what service supervisord status shows. As you can see it's showing 10 workers plus 3 tasks those workers are performing. But it should be 10 tasks to match the 10 workers. So it seems like overtime workers stop performing the tasks for some reason even though they are still running:

* supervisord.service - SYSV: Supervisor Server
   Loaded: loaded (/etc/rc.d/init.d/supervisord)
   Active: active (running) since Thu 2016-04-21 16:44:43 EDT; 2 days ago
     Docs: man:systemd-sysv-generator(8)
   CGroup: /system.slice/supervisord.service
           |- 8102 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-10691 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-11566 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-18356 /usr/bin/python /usr//bin/supervisord --pidfile /tmp/supervisord.pid
           |-18358 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-19807 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-24625 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-28557 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
           |-28711 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
           |-28712 /usr/local/bin/php artisan queue:work --queue=high,default --delay=0 --memory=128 --sleep=5 --tries=5 --env=production
           |-30215 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-32332 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           |-32543 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600
           `-32732 php /home/magic/_app/_laravel/artisan queue:listen --queue=high,default --tries=5 --sleep=5 --timeout=600

Apr 21 16:44:43 host systemd[1]: Starting SYSV: Supervisor Server...
Apr 21 16:44:43 host supervisord[18351]: Starting supervisord: /usr/lib/python2.7/site-packages/supervisor-3.2.3-py2.7.egg/supervisor/option...
Apr 21 16:44:43 host supervisord[18351]: 'Supervisord is running as root and it is searching '
Apr 21 16:44:43 host supervisord[18351]: [17B blob data]
Apr 21 16:44:43 host systemd[1]: Started SYSV: Supervisor Server.
like image 797
zen Avatar asked Apr 14 '16 14:04

zen


1 Answers

I think I might have fixed this issue. It might have been caused by using incorrect php in command attribute in supervisord.conf file. I changed it from php to /usr/local/bin/php and it's been running for 2 days now and none of the tasks have had to restart or have stopped processing queues.

I will update this question later if I see the problem repeat but it seems like it's running much more stable now than it ever has.

like image 120
zen Avatar answered Oct 21 '22 19:10

zen