3

EDIT: Added some configs and a clarification on how many is "many", as requested by anthonysomerset.

EDIT 2: Added fastcgi_cache to nginx config, as suggested by SleighBoy.

I run a server for a friend's site that now and then gets big spikes in traffic, with about 200-300 concurrent users. Between the spikes there are about 70-80 concurrent users, and the server handles that traffic without trouble.

The site is running WordPress with W3 Total Cache on a server with Debian Squeeze, nginx, PHP5-FPM+APC (128MB), MySQL 5, memcached (128MB) and Varnish (1GB). The amounts in parentheses are how much I've allocated to each cache. Memory usage never goes beyond 1.8 GB afaik, but it might be a bit overbooked. Doesn't usually cause any problems though...

The problem is always that PHP5-FPM uses 100% CPU for a while and then just crashes, resulting in nginx spewing out 502 errors. The log suggests increasing the maximum number of children, but I think I've reached the limit of how many children the server can handle. I've been running with pm.max_requests at 0 (unlimited), but have now set it to 1000 to see if respawning children now and then helps.

/etc/php5/fpm/pool.d/www.conf

[www]
listen = /var/run/php5-fpm.sock

user = www-data
group = www-data

pm = dynamic
pm.max_children = 200
pm.start_servers = 20
pm.min_spare_servers = 20
pm.max_spare_servers = 60
pm.max_requests = 1000

/etc/nginx/nginx.conf

user www-data;
worker_processes 8;
pid /var/run/nginx.pid;

events {
        worker_connections 1024;
}

http {
        sendfile on;
        tcp_nopush on;
        tcp_nodelay on;
        types_hash_max_size 2048;

        include /etc/nginx/mime.types;
        default_type application/octet-stream;

        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;

        gzip on;
        gzip_disable "msie6";

        gzip_vary on;
        gzip_comp_level 9;
        gzip_buffers 16 8k;
        gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;

        fastcgi_cache_path /var/cache/nginx levels=1:2
        keys_zone=PHP5FPMCACHE:10m
        inactive=5m;

        fastcgi_cache_key "$scheme$request_method$host$request_uri";

        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;
}

/etc/nginx/sites-available/website.com

upstream php5-fpm {
    server unix:/var/run/php5-fpm.sock;
}

server {
        server_name website.com *.website.com;
        server_name_in_redirect off;
        root /var/www/website.com;
        listen 8080;
        client_max_body_size 64M;
        access_log /var/log/nginx/website.com.access.log;
        error_log /var/log/nginx/website.com.error.log;

        keepalive_timeout 75;

        location / {
                index index.php;
                rewrite ^.*/files/(.*) /wp-includes/ms-files.php?file=$1 last;
                if (!-e $request_filename) {
                        rewrite ^(.+)$ /index.php?q=$1 last;
                }

        }

        location ~* ^.+\.(jpg|jpeg|gif|css|png|js|ico|xml)$ {
                expires 30d;
                access_log off;
        }

        location ~ \.php$ {
                fastcgi_pass php5-fpm;
                fastcgi_cache   PHP5FPMCACHE;
                fastcgi_cache_valid   200 302  1h;
                fastcgi_cache_valid   301      1d;
                fastcgi_cache_valid   any      1m;
                fastcgi_cache_min_uses  1;
                fastcgi_cache_use_stale error  timeout invalid_header http_500;
                fastcgi_index index.php;
                fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                include /etc/nginx/fastcgi_params;
        }

}

/etc/varnish/website.vcl

backend default {
.host = "127.0.0.1";
.port = "8080";
}

sub vcl_recv {
# Normalize Accept-Encoding
    if (req.http.Accept-Encoding) {
        if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|lzma|tbz)(\?.*|)$") {
            remove req.http.Accept-Encoding;
        } elsif (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            remove req.http.Accept-Encoding;
        }
    }
# Remove cookies and query string for real static files
    if (req.url ~ "^/[^?]+\.(jpeg|jpg|png|gif|ico|js|css|txt|gz|zip|lzma|bz2|tgz|tbz|html|htm)(\?.*|)$") {
       unset req.http.cookie;
       set req.url = regsub(req.url, "\?.*$", "");
    }
# Remove cookies from front page
    if (req.url ~ "^/$") {
       unset req.http.cookie;
    }
}

It's a VPS with 4 cores and hyperthreading (so "8 cores") and 2GB of RAM. I know that the hardware node (physical server) is far from overbooked and barely utilized (since I worked for the hosting company until a month ago or so), so it's pretty much a dedicated server.

If you need any specifications on anything, just ask.

  • Can you add your php-fpm and nginx configs, both global and vhost? Also, how many concurrent users is "many"? :) Your Varnish VCL would be useful too, but I doubt it's your problem unless it's not actually caching. Commented Oct 13, 2011 at 10:15
  • Thanks! All configs are added and a clarification on how many is "many" is added to the first section. Hitrate for memcached and Varnish is around 70% at all times. I'm sure that can be improved upon, and I'll look into that later.
    – jgabor
    Commented Oct 14, 2011 at 9:38
  • @jgabor did you resolve your issue?
    – PeterB
    Commented Oct 19, 2011 at 7:52
  • @PeterB Don't know yet, waiting for the next spike to occur. :) But page loading time has dropped significantly (3sec to 1.1sec) after introducing fastcgi_cache to the mix and the server "feels" more stable. But only time will tell...
    – jgabor
    Commented Oct 19, 2011 at 11:32

2 Answers

1

I suspect your Varnish cache is not caching anywhere near enough of the hits.

Here's what I would do in your situation:

Lower PHP max children to 100 or even 50 (if Varnish does its job properly you don't need them). Also remove the max requests line so the PHP processes don't respawn too quickly, which prevents APC from being cleared too quickly, which is also bad.
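A rough way to sanity-check the children count is to divide the memory left after the caches by the size of one PHP child. The Python sketch below uses the 2GB of RAM and the cache sizes from the question; the 256MB reserve for MySQL and the OS and the 30MB per PHP child are pure assumptions, so measure your own processes before trusting the result. `estimate_max_children` is a hypothetical helper, not part of any tool.

```python
def estimate_max_children(total_mb, reserved_mb, per_child_mb):
    """Return how many PHP-FPM children fit in the memory left after reservations."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available = total_mb - sum(reserved_mb.values())
    # Never report a negative count if the reservations already exceed RAM.
    return max(available // per_child_mb, 0)


# Cache sizes come from the question; the last entry is an assumed reserve.
reserved = {
    "varnish": 1024,
    "memcached": 128,
    "apc": 128,
    "mysql_and_os": 256,  # assumption, not a measured value
}

# 30 MB per child is an assumption; check the real resident size per process.
print(estimate_max_children(2048, reserved, per_child_mb=30))
```

With these assumed figures the budget comes out far below 200 children, which would fit the suspicion that the box is overbooked on memory. Varnish may not actually fill its 1GB, so treat this as a lower bound rather than a recommendation.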

Also, "if" is not good according to the nginx wiki: http://wiki.nginx.org/IfIsEvil

I would change this line:

            if (!-e $request_filename) {
                    rewrite ^(.+)$ /index.php?q=$1 last;
            }

to:

try_files $uri $uri/ /index.php?$args;

This works if your version of nginx supports try_files (I'm pretty certain it does if your nginx version is > 0.7.51).

You should also look at inserting the W3TC nginx rules directly into your vhost file to enable proper disk enhanced caching of pages (which is faster than APC caching with nginx).

Take a look at the following Varnish VCL, which I use for sites. You will need to read through it and edit a few things for your website. It assumes that there is only one site on the server and that it is a WP site, but it can easily be modified for more sites (take a look at the cookie section).

generic vcl: https://gist.github.com/b7332971a848bcb7ecef

With this config I would argue for removing fastcgi_cache, to prevent any possible issues with a cache chain in which trying to locate stray stale cache entries becomes more difficult.
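If you do keep fastcgi_cache and need to track down a single stale entry, it helps to know where nginx stores it. As far as I know, the file name is the MD5 of the cache key, and with levels=1:2 the first directory is the last hex character of that hash and the second directory is the two characters before it. A minimal Python sketch, assuming the fastcgi_cache_key and fastcgi_cache_path from the question; `fastcgi_cache_file` is a hypothetical helper name:

```python
import hashlib
import posixpath


def fastcgi_cache_file(cache_root, scheme, method, host, request_uri):
    """Build the expected on-disk path of a cached response (levels=1:2)."""
    # Mirrors fastcgi_cache_key "$scheme$request_method$host$request_uri".
    key = f"{scheme}{method}{host}{request_uri}"
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    level1 = digest[-1]      # one character: the last of the hash
    level2 = digest[-3:-1]   # two characters: the ones just before it
    return posixpath.join(cache_root, level1, level2, digest)


# Front page of the site from the question, as nginx on port 8080 would see it.
print(fastcgi_cache_file("/var/cache/nginx", "http", "GET", "website.com", "/"))
```

Deleting that one file should be enough to drop the entry, but remember that Varnish sits in front and may still hold its own copy.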

Also tell W3TC that Varnish is at 127.0.0.1 and it will purge it for you ;)

I deployed this to a server on Wednesday evening (with a few domain-specific modifications) that was handling 2500 active site visitors. It reduced load to less than 1, and the number of running PHP children was around 10-20 (this number depends on the number of logged-in users and other factors like cookies). That server did have much more RAM, but the principle is the same: you should easily be able to handle the number of visitors you get at peaks.
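To put the hit-rate point in rough numbers: every request Varnish misses lands on nginx and, most likely, PHP. The Python sketch below treats each concurrent user as one in-flight request, which is a simplification, and uses the roughly 70% hit rate mentioned in the question's comments plus two higher rates for comparison. `backend_requests` is a hypothetical helper.

```python
def backend_requests(total_requests, hit_rate):
    """Return how many requests miss the cache and reach the backend."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return round(total_requests * (1.0 - hit_rate))


# 300 is the upper end of the spike size given in the question.
for rate in (0.70, 0.80, 0.95):
    print(f"hit rate {rate:.2f}: {backend_requests(300, rate)} requests reach the backend")
```

Going from a 0.70 to a 0.95 hit rate cuts the backend share from 90 to 15 of those 300 requests, which is why fixing the cookie handling in the VCL tends to matter more than adding PHP children.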

  • According to varnishstat the hitrate is about 0.80. Max children was earlier at 100 but was raised to 200 to see if it helped against the spikes. And as I stated, I had max requests at 0 (unlimited) before setting it to 1000 as a test to see if it could help with the crashes. But I'll remove it again! I don't do disc enhanced caching... Everything W3TC caches is cached to memcached. (And yes, W3TC is allowed to purge. :) I'll definitely take a look at your vcl. Thanks! That IfIsEvil I actually already knew... No idea why I haven't spotted it before. Stupid me! Thanks for pointing it out.
    – jgabor
    Commented Oct 14, 2011 at 20:59
1

APC and nginx fastcgi_cache are going to help you a lot.

  • APC is already installed with a 128 MB cache. Thanks for fastcgi_cache, I'll look into it. :)
    – jgabor
    Commented Oct 14, 2011 at 9:25
