I have a recurring issue with Nginx about once a day, and about twice a day under high traffic. The fix is easy: restart the server. But when the error happens, Nginx stops working completely. I run an Nginx + PHP-FPM setup.
The issue starts with:
2020/09/27 09:57:27 [error] 38#38: *430982 upstream timed out (110: Connection timed out) while connecting to upstream, client: x.x.x.x, server: example.com, request: "POST /api/sessions/wri
And then it progresses into:
2020/09/27 10:03:22 [alert] 40#40: *431277 open socket #18 left in connection 51
2020/09/27 10:03:22 [alert] 38#38: *431298 open socket #34 left in connection 166
2020/09/27 10:03:22 [alert] 40#40: *431288 open socket #28 left in connection 59
2020/09/27 10:03:22 [alert] 38#38: *431296 open socket #32 left in connection 169
2020/09/27 10:03:22 [alert] 38#38: *431257 open socket #36 left in connection 177
2020/09/27 10:03:22 [alert] 38#38: *431291 open socket #23 left in connection 178
2020/09/27 10:03:22 [alert] 38#38: *431253 open socket #27 left in connection 188
2020/09/27 10:03:22 [alert] 38#38: *431300 open socket #31 left in connection 197
2020/09/27 10:03:22 [alert] 38#38: *431312 open socket #12 left in connection 204
2020/09/27 10:03:22 [alert] 38#38: *431259 open socket #38 left in connection 206
2020/09/27 10:03:22 [alert] 37#37: aborting
2020/09/27 10:03:22 [alert] 38#38: aborting
2020/09/27 10:03:22 [alert] 40#40: aborting
2020/09/27 10:03:23 [warn] 21568#21568: 8096 worker_connections exceed open file resource limit: 1024
2020/09/27 10:08:24 [warn] 21574#21574: *636 upstream server temporarily disabled while connecting to upstream,
GET requests still work: if I go to the website, it loads. But anything that is a POST, PUT, or DELETE fails, so ultimately users can't do anything but browse.
Why is this happening? And is there a health check that can be used to detect these issues?
Regarding the first timeout error: if you are using a FastCGI configuration, you may need to review this directive:
fastcgi_read_timeout 600s;
Make sure that value is longer than the longest processing time your PHP scripts will need.
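For context, the directive belongs in the location block that passes requests to PHP-FPM. A minimal sketch (the socket path is an assumption; adjust it to your setup):

```nginx
location ~ \.php$ {
    include fastcgi_params;
    # Socket path is an assumption; point this at your PHP-FPM listen address
    fastcgi_pass unix:/run/php/php-fpm.sock;

    # Allow long-running PHP scripts up to 10 minutes
    # before nginx gives up on the upstream response
    fastcgi_read_timeout 600s;
}
```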
You should also increase the worker_connections setting to the number of simultaneous connections you expect (say, 10000):
worker_connections 10000;
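Note that worker_connections must go inside the events block, not the http block:

```nginx
events {
    # Maximum simultaneous connections each worker process may handle
    worker_connections 10000;
}
```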
Also, as mentioned in Sang Lu's answer, you need to make sure nginx can open enough file handles (which includes network sockets as well).
If you start the master nginx process as root, you can simply set this to at least twice the worker_connections value above:
worker_rlimit_nofile 21000;
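worker_rlimit_nofile is a main-context (top-level) directive. Combined with the events block, the relevant pieces might look like this (values are the examples from above; tune them to your traffic):

```nginx
# Main context: raise the open-file limit for each worker process.
# At least twice worker_connections, because a proxied request uses
# one socket to the client and one to the upstream.
worker_rlimit_nofile 21000;

events {
    worker_connections 10000;
}
```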
The other way is to use ulimit or /etc/security/limits.conf for the user that starts the master nginx process.
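A quick sketch of that approach (the user name www-data and the limit values are assumptions; use whatever user runs your nginx master):

```shell
# Check the current open-file limit for this shell/user
ulimit -n

# Raise the soft limit for the current session (fails if the
# hard limit is lower, in which case limits.conf must be edited)
ulimit -n 21000 2>/dev/null || echo "hard limit too low; raise it in limits.conf"

# To make it persistent, add entries like these to /etc/security/limits.conf:
#   www-data  soft  nofile  21000
#   www-data  hard  nofile  21000
```

Remember that limits.conf changes take effect on the next login session, and nginx must be restarted to pick them up.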