I'm testing the limits of my Python Flask web application running on an Apache web server by making a request that takes over 30 minutes to complete. The request makes thousands of sequential queries to a MySQL database. I understand this should ideally run as a separate asynchronous process outside the Apache server, but let's ignore that for now.
The problem is that although the request runs to completion when I test it on my Mac, it dies abruptly on a Linux server (Amazon Linux on AWS EC2). I haven't been able to figure out exactly what is killing it. I've checked that the server isn't running out of memory; the process uses very little RAM. I also haven't found any Apache config parameter or error message that makes sense to me, even after setting the Apache LogLevel to debug. I'd appreciate any help on where to look. Here are more details about my setup:
Run Time
Server: It died after 8 mins, 27 mins, 21 mins and 22 mins respectively. Note that most of these runs were on a UAT server, and this was the only request the server was processing.
Mac: It ran much slower than it runs on the server. The process completed successfully and took 2 hours 47 mins.
Linux Server details:
2 virtual CPUs and 4GB RAM
OS (output of uname -a):
Linux ip-172-31-63-211 3.14.44-32.39.amzn1.x86_64 #1 SMP Thu Jun 11 20:33:38 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Apache error_log: https://drive.google.com/file/d/0B3XXZfJyzJYsNkFDU3hJekRRUlU/view?usp=sharing
Apache config file: https://drive.google.com/file/d/0B3XXZfJyzJYsM2lhSmxfVVRNNjQ/view?usp=sharing
Apache version (output of apachectl -V):
Server version: Apache/2.4.23 (Amazon)
Server built: Jul 29 2016 21:42:17
Server's Module Magic Number: 20120211:61
Server loaded: APR 1.5.1, APR-UTIL 1.4.1
Compiled using: APR 1.5.1, APR-UTIL 1.4.1
Architecture: 64-bit
Server MPM: prefork
threaded: no
forked: yes (variable process count)
Server compiled with....
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=256
-D HTTPD_ROOT="/etc/httpd"
-D SUEXEC_BIN="/usr/sbin/suexec"
-D DEFAULT_PIDLOG="/var/run/httpd/httpd.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="conf/mime.types"
-D SERVER_CONFIG_FILE="conf/httpd.conf"
Mac details:
Apache config file: https://drive.google.com/file/d/0B3XXZfJyzJYsRUd6NW5NY3lON1U/view?usp=sharing
Apache version (output of apachectl -V):
Server version: Apache/2.4.18 (Unix)
Server built: Feb 20 2016 20:03:19
Server's Module Magic Number: 20120211:52
Server loaded: APR 1.4.8, APR-UTIL 1.5.2
Compiled using: APR 1.4.8, APR-UTIL 1.5.2
Architecture: 64-bit
Server MPM: prefork
threaded: no
forked: yes (variable process count)
Server compiled with....
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_FLOCK_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=256
-D HTTPD_ROOT="/usr"
-D SUEXEC_BIN="/usr/bin/suexec"
-D DEFAULT_PIDLOG="/private/var/run/httpd.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="/private/etc/apache2/mime.types"
-D SERVER_CONFIG_FILE="/private/etc/apache2/httpd.conf"
If you are using mod_wsgi's embedded mode, this can happen because Apache controls the lifetime of the processes and can recycle them when it thinks a process is no longer required due to insufficient traffic.
You might be thinking 'but I am using daemon mode, not embedded mode', but in reality you aren't, because your configuration is wrong. You have:
<VirtualHost *:5010>
ServerName localhost
WSGIDaemonProcess entry user=kesiena group=staff threads=5
WSGIScriptAlias "/" "/Users/kesiena/Dropbox (MIT)/Sites/onetext/onetext.local.wsgi"
<directory "/Users/kesiena/Dropbox (MIT)/Sites/onetext/app">
WSGIProcessGroup start
WSGIApplicationGroup %{GLOBAL}
WSGIScriptReloading On
Order deny,allow
Allow from all
</directory>
</virtualhost>
That Directory block doesn't use a directory which matches the path in WSGIScriptAlias, so none of it applies.
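To make the mismatch concrete, here is a minimal Python sketch of the containment rule involved. It is a simplification, not Apache's actual matching logic (which also supports wildcards and regular expressions), and the helper name `directory_covers` is made up for illustration. The paths are the ones from the configuration quoted above.

```python
from pathlib import PurePosixPath


def directory_covers(directory, script):
    """Return True if `script` lives in `directory` or any subdirectory of it.

    Hypothetical helper: a plain path-prefix check that ignores the
    wildcard and regex forms a real <Directory> block can use.
    """
    d = PurePosixPath(directory)
    s = PurePosixPath(script)
    return d == s or d in s.parents


script = "/Users/kesiena/Dropbox (MIT)/Sites/onetext/onetext.local.wsgi"

# The block from the question points at the "app" subdirectory,
# which does not contain the WSGI script file.
print(directory_covers("/Users/kesiena/Dropbox (MIT)/Sites/onetext/app", script))

# The directory that actually holds the script file does cover it.
print(directory_covers("/Users/kesiena/Dropbox (MIT)/Sites/onetext", script))
```

The first call reports no match, which is why the WSGIProcessGroup and WSGIApplicationGroup directives inside that block were never applied to the script.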
Use:
<VirtualHost *:5010>
ServerName localhost
WSGIDaemonProcess entry user=kesiena group=staff threads=5
WSGIScriptAlias "/" "/Users/kesiena/Dropbox (MIT)/Sites/onetext/onetext.local.wsgi"
<directory "/Users/kesiena/Dropbox (MIT)/Sites/onetext">
WSGIProcessGroup entry
WSGIApplicationGroup %{GLOBAL}
Order deny,allow
Allow from all
</directory>
</virtualhost>
The only reason it worked at all without a matching Directory block is that you had allowed Apache to serve files from the parent directory by having:
<Directory "/Users/kesiena/Dropbox (MIT)/Sites">
Require all granted
</Directory>
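One way to check which mode a request is really being handled in is to look at the `mod_wsgi.process_group` key that mod_wsgi places in the WSGI environ: it is an empty string in embedded mode and holds the daemon process group name in daemon mode. The sketch below is a standalone test WSGI script, not part of the application in the question, and the helper name `describe_mode` is made up for illustration.

```python
def describe_mode(environ):
    """Describe how mod_wsgi is running, based on the WSGI environ."""
    group = environ.get("mod_wsgi.process_group")
    if group is None:
        # The key is absent when the app is not served by mod_wsgi at all.
        return "not running under mod_wsgi"
    if group == "":
        return "embedded mode"
    return "daemon mode (process group %r)" % group


def application(environ, start_response):
    """Minimal WSGI app that reports the mode as plain text."""
    body = describe_mode(environ).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Temporarily pointing WSGIScriptAlias at a script like this and requesting the URL should show whether the Directory block is taking effect.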
It is also bad practice to set DocumentRoot to a parent directory of where your application source code lives. With the way it is written, there is a risk I could come in on a different port or VirtualHost and download all your application code.
Do not put your application code under the directory listed against DocumentRoot.
BTW, even when you have the WSGI application running in daemon mode, Apache can still recycle the worker processes it will use to proxy requests to mod_wsgi. So even if your very long running request keeps running in the WSGI application process, it could fail as soon as it starts to send a response if the worker process had been recycled in the interim because it had been running too long.
You should definitely farm out the long running operation to a back end Celery task queue or similar.
You might be hitting forced socket closures, though with the times you gave that does not look too likely. For a project I had on Azure, any connection that was idle for about 3 minutes would get closed by the system. I believe these closures were done ahead of the server in the network routing, so there was no way to disable them or increase the timeout.