Nginx and php-fpm: cannot get rid of 502 and 504 errors

Question

I have an ubuntu-server and a pretty high loaded website. Server is:

Dedicated to nginx, uses php-fpm (no apache), mysql is located on different machine
Has 8 GB of RAM
Gets about 2000 requests per second.

Each php-fpm process consumes about 65MB of RAM, according to top command:

top command

Free memory:

admin@myserver:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          7910       7156        753          0        284       2502
-/+ buffers/cache:       4369       3540
Swap:         8099          0       8099

PROBLEM

Lately, I'm experiencing big performance problems. Very big response times, very many Gateway Timeouts and in evenings, when load gets high, 90% of the users just see "Server not found" instead of the website (I cannot seem to reproduce this)

LOGS

My Nginx error log is full of the fallowing messages:

2012/07/18 20:36:48 [error] 3451#0: *241904 upstream prematurely closed connection while reading response header from upstream, client: 178.49.30.245, server: example.net, request: request: "GET /readarticle/121430 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9001", host: "example.net", referrer: "http://example.net/articles"

I've tried switching to unix socket, but still get those errors:

2012/07/18 19:27:30 [crit] 2275#0: *12334 connect() to unix:/tmp/fastcgi.sock failed (2: No such file or directory) while connecting to upstream, client: 84.
237.189.45, server: example.net, request: "GET /readarticle/121430 HTTP/1.1", upstream: "fastcgi://unix:/tmp/fastcgi.sock:", host: "example.net", referrer: "http
://example.net/articles"

And php-fpm log is full of these:

[18-Jul-2012 19:23:34] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there  are 0 idle, and 75 total children

I've tried to increase given parameters up to 100, but it still seems not enough.

CONFIGS

Here is my current configuration

php-fpm

listen = 127.0.0.1:9001
listen.backlog = 4096
pm = dynamic
pm.max_children = 130
pm.start_servers = 40
pm.min_spare_servers = 10
pm.max_spare_servers = 40
pm.max_requests = 100

nginx

worker_processes  4;
worker_rlimit_nofile 8192;
worker_priority 0;
worker_cpu_affinity 0001 0010 0100 1000;

error_log  /var/log/nginx_errors.log;

events {
    multi_accept off;
    worker_connections  4096;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    access_log off;
    sendfile        on;
    keepalive_timeout  65;
    gzip  on;

    # fastcgi parameters
    fastcgi_connect_timeout 120;
    fastcgi_send_timeout 180;
    fastcgi_read_timeout 1000;
    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 256k;
    fastcgi_busy_buffers_size 256k;
    fastcgi_temp_file_write_size 256k;
    fastcgi_intercept_errors on;

    client_max_body_size 128M;

    server {
        server_name example.net;
        root /var/www/example/httpdocs;
        index index.php;
        charset utf-8;
        error_log /var/www/example/nginx_error.log;

        error_page 502 504 = /gateway_timeout.html;

        # rewrite rule
        location / {
            if (!-e $request_filename) {
                rewrite ^(.*)$ /index.php?path=$1 last;
            }
        }
        location ~* \.php {
            fastcgi_pass 127.0.0.1:9001;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_param PATH_INFO $fastcgi_script_name;
            include fastcgi_params;
        }
    }
}

I would be very grateful for any advice on how to identify the problem and what parameters I can adjust to fix this. Or maybe 8GB of RAM is just not enough for this kind of load?

Chuan Ma · Accepted Answer

A number of issues. Still worth to fix them with such a busy site. MySQL may be the root cause for now. But longer term you need to do more work.

Caching

I see one of your error msg showing a get request to the php upstream. This doesn't look good with such a high traffic site (2000 r/s as you mentioned). This page (/readarticle/121430) seems a perfectly cacheable page. For one, you can use nginx for caching such pages. Check out fastcgi cache

GET /readarticle/121430

php-fpm

pm.max_requests = 100

The value means that a process will be killed by php-fpm master after serving 100 requests. php-fpm uses that value to fight against 3rd party memory leaks. Your site is very busy, with 2000r/s. Your max child processes is 130, each can only serve at most 100 requests. That means after 13000/2000 = 6.5 seconds all of them are going to be recycled. This is way too much (20 processes killed every second). You should at least start with a value of 1000 and increase that number as long as you don't see memory leak. Someone uses 10,000 in production.

nginx.conf

Issue 1:

    if (!-e $request_filename) {
        rewrite ^(.*)$ /index.php?path=$1 last;
    }

should be replaced by more efficient try_files:

    try_files $uri /index.php?path=$uri;

You save an extra if location block and a regex rewrite rule match.

Issue 2: using unix socket will save you more time than using ip (around 10-20% from my experience). That's why php-fpm is using it as default.
Issue 3: You may be interested in setting up keepalive connections between nginx and php-fpm. An example is given here in nginx official site.

Ligemer · Answer

I need to see your php.ini settings and I don't think this is related to MySQL since you're getting socket errors it looks like. Also, is this something that begins to start happening after a period of time or does it immediately happen when the server restarts?

Try restarting the php5-fpm daemon and see what happens while tailing your error log.

Check your php.ini file and also all your fastcgi_params typically located in /etc/nginx/fastcgi_params. There are a ton of examples for what you're trying to do.

Also, do you have the apc php caching extension enabled?

It will look like this in your php.ini file if your on a lamp stack:

extension=apc.so
....
apc.enabled=0

Probably wouldn't hurt to do some mysql connection load testing from the command line as well and see what the results are.

Nginx and php-fpm: cannot get rid of 502 and 504 errors

Tags:

performance

linux

php

nginx

Silver Light

2 Answers

Chuan Ma

Ligemer

Recent Activity

Donate For Us

Nginx and php-fpm: cannot get rid of 502 and 504 errors

Tags:

performance

linux

php

nginx

Silver Light

2 Answers

Chuan Ma

Ligemer

Related questions

Recent Activity

Donate For Us