How much (cache) is too much?

One of the best rules of thumb I know is the 80/20 rule. I can’t think of a more practical rule in almost any situation. Combined with the law of diminishing returns, it pretty much sums up how the universe works. One case study that illustrates both, if only a little, is a short optimization experiment I carried out recently. I had been reading many posts about optimizing wordpress using nginx, varnish, W3-Total-Cache and php-fpm. The improvements reported in some of them were staggering, and I was inspired to come up with a similar setup that would really push the boundary of how fast I can serve a wordpress site.

Spoiler – Conclusion

So I know there isn’t such a thing as too much cash, but does the same apply to cache?

The results were rather disappointing for me. It turns out that my existing W3TC configuration already closely matched the performance of a much more complex set-up with two or three proxy layers of varnish and nginx. The W3TC disk-enhanced page caching is simple but powerful enough to give nearly the same performance. How? The basic principle is that it creates a static version of every page, and hooks the webserver (Apache or Nginx) to serve this file directly if it exists. On most Linux platforms, the operating system’s file cache is good enough to serve those files from memory. This means that it does in fact match the performance characteristics of an in-memory cache. Simple solutions are not necessarily weaker. With 20% of the effort, I reached 80% of my performance goal. In actual fact, I believe the results are more like 10/90… More on that (and some caveats) at the end.
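
To make the principle concrete, here is a minimal shell sketch of the decision the webserver makes on every request. The function name and the cache layout (a `_index.html` file under a folder named after the URL path) are my own assumptions for illustration. The real rules are generated by W3TC and live in .htaccess or the nginx configuration.

```shell
#!/bin/bash
# Sketch of "serve the static copy if it exists, otherwise run PHP".
# The directory layout used here is a hypothetical example, not W3TC's exact one.
resolve_request() {
    local cache_root="$1"   # folder holding the static copies (assumed)
    local uri_path="$2"     # e.g. /category/dinosaurs/
    local cached="${cache_root}${uri_path%/}/_index.html"
    if [ -f "$cached" ]; then
        echo "static $cached"   # the webserver sends the file, PHP never runs
    else
        echo "dynamic"          # fall through to WordPress via PHP
    fi
}
```

The point is that a cache hit costs one file-existence check and one file read, both of which the kernel can answer from memory.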

Hosting platform

Unlike most guides and optimization benchmarks, I decided to try this “on a budget”. Not on a high-performance dedicated server, not even on a VPS, but rather on a shared hosting account with webfaction. As shared hosting providers go, webfaction is quite unusual in that it allows, perhaps even encourages, you to compile your own tools and run them. It doesn’t give you as much freedom as a VPS, but it’s good enough. This post doubles up as a mini-guide on how to set up php-fpm, nginx and varnish specifically on webfaction (unlike most guides, which assume you have root access and can use a package manager).

Original Setup

The baseline was a (real!) wordpress site running on webfaction, already using W3TC with disk-enhanced page caching via .htaccess rules. W3TC is clever enough to manage those rewrite rules for you and also to detect whether your instance is running behind Apache or Nginx. My initial thinking was that since the static pages generated by W3TC are served by Apache and are saved on disk (rather than in memory), I could benefit by introducing some in-memory caching (Varnish) and taking Apache out of the equation. It’s important to note that webfaction already uses nginx as a front-end proxy. However, with a wordpress app, it forwards all requests to Apache.

BASELINE:

{Internet} -> Webfaction Nginx -> Webfaction Apache -> W3TC / WordPress

This is what the baseline setup looks like (pretty, isn’t it?)

Nginx Setup

The first variation was to install my own nginx instance, and serve the wordpress pages via it, instead of Apache. In order to also serve php pages effectively, I opted for php-fpm, which seems to be the most recommended option with nginx. php-fpm (FastCGI Process Manager) runs as a small server that listens for requests to execute php files; nginx hands a request over to it whenever it needs to serve a php page. I include some installation instructions for php-fpm here too, since I couldn’t find any guide online on how to do it with webfaction.

NGINX:

{Internet} -> Webfaction Nginx -> My Nginx / php-fpm -> W3TC / WordPress

and this is what the Nginx setup will look like (once we install it, it’s not THAT easy)

Installing Nginx on webfaction

The easiest option to run your own nginx instance on webfaction is to use the ready-made passenger application. This is usually used to host a ruby-on-rails app, but there’s nothing stopping us from removing all the passenger stuff, and just using it as an nginx instance. In the webfaction control panel go to Applications, add a new app (let’s call it engine), then choose Passenger for the App category and Passenger 3.0.11 (nginx 1.0.10/Ruby 1.9.3) for the App type. This will install it under ~/webapps/engine.

Installing PHP-FPM

Next up is getting php-fpm compiled, installed and running. This is done by compiling php with the right option to include php-fpm. SSH to your webfaction account and run these commands

    mkdir -p ~/src
    cd ~/src
    # -O makes sure the download is saved under the name the next step expects
    wget -O php-5.3.10.tar.gz http://www.php.net/get/php-5.3.10.tar.gz/from/sg2.php.net/mirror
    tar -zxvf php-5.3.10.tar.gz
    cd php-5.3.10
    ./configure --prefix=$HOME --with-pdo-mysql --with-pdo-pgsql=/usr/pgsql-9.1 --enable-fpm --enable-bcmath --enable-calendar --enable-exif --enable-ftp --enable-mbstring --enable-soap --enable-zip --with-curl --with-freetype-dir --with-gd --with-gettext --with-gmp --with-iconv --with-jpeg-dir --with-kerberos --with-mhash --with-mysql --with-mysqli --with-openssl --with-pgsql=/usr/pgsql-9.1 --with-png-dir --with-regex --with-xmlrpc --with-xsl --with-zlib-dir --without-pear --enable-sockets --enable-intl --with-mysql-sock=/var/lib/mysql/mysql.sock
    make
    make install
    

We should now have php-fpm installed under ~/sbin and a default configuration file created in ~/etc/php-fpm.conf.default.

PHP-FPM configuration

Follow these steps to copy the configuration file and create a folder where our Unix socket will live

    cp ~/etc/php-fpm.conf.default ~/etc/php-fpm.conf
    mkdir -p ~/var/spool
    

Then edit our php-fpm.conf file. We only need to replace the line that says listen = 127.0.0.1:9000 with

    listen = /home/WEBFACTION_USER/var/spool/phpfpm.sock
    

Replace WEBFACTION_USER with your webfaction username.
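
If you prefer not to edit the file by hand, the same change can be sketched with sed. This is only an illustration: the function name is mine, it assumes GNU sed (for `-i`), and it assumes the default file still contains the line `listen = 127.0.0.1:9000` exactly as described above.

```shell
#!/bin/bash
# Point php-fpm at a Unix socket instead of the default TCP port.
# Usage: use_unix_socket PATH_TO_PHP_FPM_CONF PATH_TO_SOCKET
use_unix_socket() {
    local conf="$1" socket="$2"
    # Keep a backup, then rewrite only the active "listen" line
    # (commented-out lines start with ";" and are left alone).
    cp "$conf" "$conf.bak"
    sed -i "s|^listen = 127\.0\.0\.1:9000|listen = ${socket}|" "$conf"
}
```

It would be called as `use_unix_socket ~/etc/php-fpm.conf "$HOME/var/spool/phpfpm.sock"`.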

Now we can launch our php-fpm and it will listen on the unix socket. To launch it, simply run

    ~/sbin/php-fpm
    

Note that if you plan to use this for your server you will need to create a cron job that checks whether it’s running and launches it. Follow this webfaction guide about custom apps for more info.
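
As a rough idea of what such a cron job could run, here is a small watchdog sketch. It assumes you enabled the `pid` setting in php-fpm.conf so that php-fpm writes its process id to a file. The pid file path and the function names are my own choices, not something webfaction prescribes.

```shell
#!/bin/bash
# Watchdog sketch: start a daemon only if its pid file does not name a live process.

# Succeeds if the pid file exists, is non-empty, and the process it names is alive.
is_alive() {
    local pidfile="$1"
    [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null
}

# Usage: ensure_running PIDFILE COMMAND [ARGS...]
ensure_running() {
    local pidfile="$1"
    shift
    if is_alive "$pidfile"; then
        echo "already running"
    else
        "$@" && echo "started"
    fi
}
```

A script that ends with `ensure_running "$HOME/var/run/php-fpm.pid" "$HOME/sbin/php-fpm"` could then be scheduled from your crontab every few minutes.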

One last thing is to create the fastcgi_params file that nginx requires, and place it in the nginx conf folder (~/webapps/engine/nginx/conf/fastcgi_params)

    fastcgi_param  QUERY_STRING       $query_string;
    fastcgi_param  REQUEST_METHOD     $request_method;
    fastcgi_param  CONTENT_TYPE       $content_type;
    fastcgi_param  CONTENT_LENGTH     $content_length;

    fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
    fastcgi_param  REQUEST_URI        $request_uri;
    fastcgi_param  DOCUMENT_URI       $document_uri;
    fastcgi_param  DOCUMENT_ROOT      $document_root;
    fastcgi_param  SERVER_PROTOCOL    $server_protocol;

    fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
    fastcgi_param  SERVER_SOFTWARE    nginx/$nginx_version;

    fastcgi_param  REMOTE_ADDR        $remote_addr;
    fastcgi_param  REMOTE_PORT        $remote_port;
    fastcgi_param  SERVER_ADDR        $server_addr;
    fastcgi_param  SERVER_PORT        $server_port;
    fastcgi_param  SERVER_NAME        $server_name;

    # PHP only, required if PHP was built with --enable-force-cgi-redirect
    fastcgi_param  REDIRECT_STATUS    200;

Nginx Configuration

This part got a little tricky, since I wasn’t sure which guide to follow to configure nginx in the best way for use with W3TC. It also seemed to me that all the online guides miss an important part. W3TC actually generates an nginx configuration for you. This is not a complete configuration, but you need to make sure it is included within your nginx.conf in order to really get the most out of the W3TC plugin.

The first step was to create a folder to hold our W3TC-generated nginx.conf file

    mkdir -p ~/webapps/engine/nginx/conf/sites-enabled
    

Then edit your ~/webapps/engine/nginx/conf/nginx.conf file, so it looks something like this:

    worker_processes  1;

    events {
        worker_connections  1024;
    }

    http {
        access_log  /home/WEBFACTION_USER/logs/user/access_engine.log  combined;
        error_log   /home/WEBFACTION_USER/logs/user/error_engine.log   crit;

        include         mime.types;
        sendfile        on;
        tcp_nodelay on;
        tcp_nopush on;
        port_in_redirect off;

        server {
            listen             99999; # make sure the port is the same as configured on your webfaction `engine` app
            server_name        localhost;
            root               /home/WEBFACTION_USER/webapps/wptst;  # point this to your wordpress folder
            index              index.php;
            include /home/WEBFACTION_USER/webapps/engine/nginx/conf/sites-enabled/*;

            location / {
                try_files $uri $uri/ /index.php?q=$uri&$args;
            }
            # Deny access to hidden files
            location ~* /\.ht {
                deny            all;
                access_log      off;
                log_not_found   off;
            }

            # Pass PHP scripts on to PHP-FPM
            location ~* \.php$ {
                try_files       $uri /index.php;
                fastcgi_index   index.php;
                fastcgi_pass    unix:/home/WEBFACTION_USER/var/spool/phpfpm.sock;
                include         fastcgi_params;
                fastcgi_param   SCRIPT_FILENAME    $document_root$fastcgi_script_name;
                fastcgi_param   SCRIPT_NAME        $fastcgi_script_name;
            }

        }
    }
    

Make sure to replace WEBFACTION_USER with your username, so the folders are correct. Also update the listen port from 99999 to the port that was given to your app.
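
Both placeholders can also be filled in with a couple of sed substitutions. The sketch below assumes GNU sed and assumes the file uses exactly the placeholders shown above (WEBFACTION_USER and port 99999). The function name is made up for this example.

```shell
#!/bin/bash
# Fill in the two placeholders used in the sample nginx.conf.
# Usage: fill_placeholders FILE USERNAME PORT
fill_placeholders() {
    local file="$1" user="$2" port="$3"
    sed -i \
        -e "s|WEBFACTION_USER|${user}|g" \
        -e "s|listen\( *\)99999;|listen\1${port};|" \
        "$file"
}
```

For example, `fill_placeholders ~/webapps/engine/nginx/conf/nginx.conf "$USER" 12345`, where 12345 stands in for the port shown in your control panel.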

(Re)start your nginx process by running

    ~/webapps/engine/bin/restart
    

Configuring wordpress via nginx

Now that we have both our php-fpm and nginx processes running, let’s switch our site over to this new configuration. This is quite easy for those familiar with the webfaction control panel. Either change your existing website instance to point to the engine app, or create a new website, and point it to engine. Note that the domain name should match that of your wordpress instance. This is all done under the Domains / websites -> Websites menu.

Give webfaction a couple of minutes to sync, and you should be ready to access your wordpress, now running under nginx. The next step is to configure W3TC to generate the configuration for nginx.

W3TC configuration

Login to your wordpress admin and go to the W3TC Performance menu. Ignore any errors or warnings you might see for now.

Scroll down to the Miscellaneous section. You should see an Nginx server configuration file path option. Enter this path:

    /home/WEBFACTION_USER/webapps/engine/nginx/conf/sites-enabled/nginx.conf
    

(replace with your own webfaction username).

Then click Save all settings.

Now you can click auto-install on all the warning messages that W3TC spits out. This will generate a custom nginx.conf file for you in the right folder. W3TC will probably keep complaining with an error that says

It appears Page Cache URL rewriting is not working. If using apache, verify that the server configuration allows .htaccess or if using nginx verify all configuration files are included in the configuration..

You now need to restart your nginx to pick up the new configuration file generated by W3TC.

    ~/webapps/engine/bin/restart

Varnish

The second variation was to use the previous nginx configuration, but also place a Varnish cache in front of it.

VARNISH:

{Internet} -> Webfaction Nginx -> My Varnish -> My Nginx / php-fpm -> W3TC / WordPress

and this is what it would look like with Varnish. It’s like the cherry on the cake. Probably not as sweet though.

Installing Varnish on Webfaction

Compiling and installing varnish is quite similar to what we did for php-fpm

    cd ~/src
    wget http://repo.varnish-cache.org/source/varnish-3.0.2.tar.gz
    tar -zxvf varnish-3.0.2.tar.gz
    cd varnish-3.0.2
    ./autogen.sh
    ./configure --prefix=$HOME
    make
    # this small hack was also required... (from http://community.webfaction.com/questions/5470/easy-reverse-proxy-cache-with-webfaction/5514)
    mv ./libtool ./libtool_old
    ln -s /usr/bin/libtool ./libtool
    make install
    

Unlike php-fpm, which can use unix sockets, varnish needs a TCP port to listen on. Luckily, webfaction makes this relatively easy: create a varnish app on webfaction. We’ll call it varnish, and choose Custom for the App category and Custom app (listening on port) for the App type. Click Create and note the port number. This is the port our varnish instance will listen on.

The configuration lives under ~/etc/varnish/default.vcl. You can simply overwrite the sample file already created there with this configuration:

    backend default {
        .host = "localhost";
        # Replace this with your NGINX (engine app) port!
        .port = "99999";
    }

    acl purge {
        "localhost";
    }

    sub vcl_recv {
        if (req.request == "PURGE") {
            if (!client.ip ~ purge) {
                error 405 "Not allowed.";
            }
            return(lookup);
        }
        if (req.url ~ "^/$") {
            unset req.http.cookie;
        }
    }

    sub vcl_hit {
        if (req.request == "PURGE") {
            set obj.ttl = 0s;
            error 200 "Purged.";
        }
    }

    sub vcl_miss {
        if (req.request == "PURGE") {
            error 404 "Not in cache.";
        }
        if (!(req.url ~ "wp-(login|admin)")) {
            unset req.http.cookie;
        }
        # Static assets: drop cookies and strip any query string
        if (req.url ~ "^/[^?]+\.(jpeg|jpg|png|gif|ico|js|css|txt|gz|zip|lzma|bz2|tgz|tbz|html|htm)(\?.*|)$") {
            unset req.http.cookie;
            set req.url = regsub(req.url, "\?.*$", "");
        }
        if (req.url ~ "^/$") {
            unset req.http.cookie;
        }
    }

    sub vcl_fetch {
        if (req.url ~ "^/$") {
            unset beresp.http.set-cookie;
        }
        if (!(req.url ~ "wp-(login|admin)")) {
            unset beresp.http.set-cookie;
        }
    }

Notice that the port number in this configuration file is that of the nginx server (the app we created earlier and called engine)!

Now to run varnish:

    ~/sbin/varnishd -f ~/etc/varnish/default.vcl -s malloc,64M -a 127.0.0.1:55555
    

The port here is that of the newly created varnish app (replace 55555 with your own port). You can tweak the malloc value to use more or less memory. webfaction gives each account 256MB of application-usable memory, so it depends on how much stuff you have running already.
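
To get a feel for how much memory your processes already use before picking a cache size, something like the following sketch can help. It simply adds up the resident memory (RSS) that ps reports in kilobytes. The function name is mine, and the way webfaction itself accounts for memory may differ from this simple sum.

```shell
#!/bin/bash
# Sum resident memory values (in KB, one per line on stdin) and print the total in MB.
sum_rss_mb() {
    awk '{ total += $1 } END { printf "%d\n", total / 1024 }'
}
```

On the server it would be fed by ps, for example `ps -u "$USER" -o rss= | sum_rss_mb`.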

Benchmark results

To carry out these tests, I used ApacheBench from a nearby location with a good internet connection and low latency (a ping test showed around 4ms). I repeated each test several times and checked that there were no configuration problems or issues that might skew the results. The tests were carried out against the exact same wordpress site, testing both static and dynamic pages against each of the configurations (BASELINE, NGINX, VARNISH). I also used httperf for sanity-testing, to make sure the ApacheBench results were accurate, and during the varnish tests I made sure varnishhist showed realistic information about cache hits/misses.
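
Another quick way to check for cache hits from the outside is to look at the response headers. The sketch below relies on Varnish listing two request ids in the X-Varnish header when a response comes from its cache, and only one on a miss. Whether that header survives the webfaction front-end proxy is something I am assuming here, and the function name is made up.

```shell
#!/bin/bash
# Reads HTTP response headers on stdin and prints "hit" or "miss".
# A cached response from Varnish carries two ids in X-Varnish, a fresh one carries one.
varnish_verdict() {
    awk '
        { sub(/\r$/, "") }                              # strip the CR that ends header lines
        tolower($1) == "x-varnish:" { ids = NF - 1 }    # count the ids after the header name
        END {
            verdict = (ids >= 2) ? "hit" : "miss"
            print verdict
        }'
}
```

Usage would be along the lines of `curl -sI http://example.com/ | varnish_verdict`, with your own URL in place of example.com.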

The command I used was

    ab -kc 10 -n 1000 {url}
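
Since each test was repeated several times, a small wrapper saves some typing. This is a sketch rather than the exact script I used. The function names are made up, and it only extracts the "Requests per second" line from the ab output.

```shell
#!/bin/bash
# Pull the "Requests per second" figure out of ApacheBench output (read from stdin).
extract_rps() {
    awk -F'[: ]+' '/^Requests per second/ { print $4 }'
}

# Run the same benchmark several times against one URL and print each result.
# Usage: bench_url URL [RUNS]
bench_url() {
    local url="$1" runs="${2:-3}" i
    for i in $(seq 1 "$runs"); do
        ab -kc 10 -n 1000 "$url" 2>/dev/null | extract_rps
    done
}
```

Calling `bench_url {url} 5` then prints five requests-per-second figures, one per run, which makes outliers easy to spot.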
    

Static Pages

Static pages are not just plain html, but rather pages that W3TC caches and converts into static files. Ideally most, if not all, of the pages on a typical wordpress setup can be cached this way. I tested three pages of different sizes.

Pretty Big Page (~100kb)

VARNISH

Document Path:          /category/dinosaurs/
Document Length:        113742 bytes

Concurrency Level:      10
Time taken for tests:   9.779 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      114047000 bytes
HTML transferred:       113742000 bytes
Requests per second:    102.26 [#/sec] (mean)
Time per request:       97.793 [ms] (mean)
Time per request:       9.779 [ms] (mean, across all concurrent requests)
Transfer rate:          11388.76 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2   10   2.1     10      33
Processing:    57   87   6.2     87     112
Waiting:        3   11   2.3     11      34
Total:         67   97   6.4     97     124

Percentage of the requests served within a certain time (ms)
  50%     97
  66%    100
  75%    101
  80%    103
  90%    105
  95%    107
  98%    112
  99%    118
 100%    124 (longest request)

NGINX

Document Path:          /category/dinosaurs/
Document Length:        113742 bytes

Concurrency Level:      10
Time taken for tests:   9.723 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    993
Total transferred:      114059965 bytes
HTML transferred:       113742000 bytes
Requests per second:    102.85 [#/sec] (mean)
Time per request:       97.232 [ms] (mean)
Time per request:       9.723 [ms] (mean, across all concurrent requests)
Transfer rate:          11455.74 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   5.5      0      87
Processing:    43   96  17.5     97     310
Waiting:        3   66  22.5     74      88
Total:         43   97  20.9     97     397

Percentage of the requests served within a certain time (ms)
  50%     97
  66%     97
  75%     97
  80%     98
  90%    105
  95%    117
  98%    129
  99%    155
 100%    397 (longest request)

BASELINE

Document Path:          /category/dinosaurs/
Document Length:        113753 bytes

Concurrency Level:      10
Time taken for tests:   9.736 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    994
Total transferred:      114099970 bytes
HTML transferred:       113753000 bytes
Requests per second:    102.72 [#/sec] (mean)
Time per request:       97.356 [ms] (mean)
Time per request:       9.736 [ms] (mean, across all concurrent requests)
Transfer rate:          11445.21 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   3.6      0      63
Processing:    29   97  51.3     91     643
Waiting:        3   50  14.8     50     116
Total:         29   97  51.8     92     643

Percentage of the requests served within a certain time (ms)
  50%     92
  66%     98
  75%    101
  80%    106
  90%    126
  95%    149
  98%    213
  99%    407
 100%    643 (longest request)

Notice that nginx and our baseline setup were able to use HTTP keep-alive, whereas varnish wasn’t. I’m not sure why this happens.

Smaller Page (~35kb)

VARNISH

Document Path:          /category/elephants/
Document Length:        38304 bytes

Concurrency Level:      10
Time taken for tests:   3.317 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      38607989 bytes
HTML transferred:       38304000 bytes
Requests per second:    301.50 [#/sec] (mean)
Time per request:       33.167 [ms] (mean)
Time per request:       3.317 [ms] (mean, across all concurrent requests)
Transfer rate:          11367.56 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    8   1.5      8      11
Processing:    17   25   2.2     26      35
Waiting:        2    8   1.5      8      13
Total:         23   33   2.2     33      40

Percentage of the requests served within a certain time (ms)
  50%     33
  66%     34
  75%     34
  80%     35
  90%     36
  95%     37
  98%     38
  99%     38
 100%     40 (longest request)

NGINX

Document Path:          /category/elephants/
Document Length:        38304 bytes

Concurrency Level:      10
Time taken for tests:   3.308 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    993
Total transferred:      38620965 bytes
HTML transferred:       38304000 bytes
Requests per second:    302.30 [#/sec] (mean)
Time per request:       33.080 [ms] (mean)
Time per request:       3.308 [ms] (mean, across all concurrent requests)
Transfer rate:          11401.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.4      0      23
Processing:    10   33  13.5     31     117
Waiting:        3   18   4.2     18      28
Total:         10   33  13.7     31     117

Percentage of the requests served within a certain time (ms)
  50%     31
  66%     36
  75%     40
  80%     42
  90%     46
  95%     51
  98%     79
  99%     95
 100%    117 (longest request)

BASELINE

Document Path:          /category/elephants/
Document Length:        38315 bytes

Concurrency Level:      10
Time taken for tests:   3.333 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    994
Total transferred:      38660970 bytes
HTML transferred:       38315000 bytes
Requests per second:    300.05 [#/sec] (mean)
Time per request:       33.327 [ms] (mean)
Time per request:       3.333 [ms] (mean, across all concurrent requests)
Transfer rate:          11328.49 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.5      0      23
Processing:     9   33  13.8     29     131
Waiting:        3   20   4.6     20      45
Total:          9   33  14.1     29     131

Percentage of the requests served within a certain time (ms)
  50%     29
  66%     34
  75%     39
  80%     43
  90%     50
  95%     52
  98%     76
  99%     93
 100%    131 (longest request)

Even smaller (~15kb)

VARNISH

Document Path:          /cateogry/kittens/
Document Length:        16986 bytes

Concurrency Level:      10
Time taken for tests:   1.594 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      17290000 bytes
HTML transferred:       16986000 bytes
Requests per second:    627.16 [#/sec] (mean)
Time per request:       15.945 [ms] (mean)
Time per request:       1.594 [ms] (mean, across all concurrent requests)
Transfer rate:          10589.52 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    5   1.7      4      11
Processing:     5   11   2.2     11      17
Waiting:        2    6   1.7      5      13
Total:          9   16   2.2     16      25

Percentage of the requests served within a certain time (ms)
  50%     16
  66%     17
  75%     17
  80%     18
  90%     19
  95%     19
  98%     21
  99%     21
 100%     25 (longest request)

NGINX

Document Path:          /cateogry/kittens/
Document Length:        16986 bytes

Concurrency Level:      10
Time taken for tests:   1.482 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    990
Total transferred:      17302950 bytes
HTML transferred:       16986000 bytes
Requests per second:    674.94 [#/sec] (mean)
Time per request:       14.816 [ms] (mean)
Time per request:       1.482 [ms] (mean, across all concurrent requests)
Transfer rate:          11404.67 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.3      0       3
Processing:     5   15   1.0     15      24
Waiting:        3   13   1.1     13      14
Total:          5   15   1.0     15      24

Percentage of the requests served within a certain time (ms)
  50%     15
  66%     15
  75%     15
  80%     15
  90%     15
  95%     15
  98%     16
  99%     18
 100%     24 (longest request)

BASELINE

Document Path:          /cateogry/kittens/
Document Length:        16997 bytes

Concurrency Level:      10
Time taken for tests:   1.488 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    992
Total transferred:      17342960 bytes
HTML transferred:       16997000 bytes
Requests per second:    671.83 [#/sec] (mean)
Time per request:       14.885 [ms] (mean)
Time per request:       1.488 [ms] (mean, across all concurrent requests)
Transfer rate:          11378.42 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0      13
Processing:     6   15   0.9     15      26
Waiting:        3   13   1.2     13      15
Total:          6   15   1.1     15      26

Percentage of the requests served within a certain time (ms)
  50%     15
  66%     15
  75%     15
  80%     15
  90%     15
  95%     15
  98%     16
  99%     19
 100%     26 (longest request)

Dynamic Pages

I chose wp-login.php for the test. Ideally most pages will be cached anyway. Nevertheless, it’s interesting to see what overhead (if any) is added by varnish or nginx, and also to compare PHP-FPM with the standard php-cgi provided by default.

VARNISH

Document Path:          /wp-login.php
Document Length:        3643 bytes

Concurrency Level:      10
Time taken for tests:   32.551 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    993
Total transferred:      4309921 bytes
HTML transferred:       3643000 bytes
Requests per second:    30.72 [#/sec] (mean)
Time per request:       325.514 [ms] (mean)
Time per request:       32.551 [ms] (mean, across all concurrent requests)
Transfer rate:          129.30 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.3      0       3
Processing:   137  324  42.2    316     545
Waiting:      137  323  42.1    314     544
Total:        139  324  42.3    316     548

Percentage of the requests served within a certain time (ms)
  50%    316
  66%    339
  75%    355
  80%    360
  90%    380
  95%    396
  98%    408
  99%    413
 100%    548 (longest request)

NGINX

Document Path:          /wp-login.php
Document Length:        3643 bytes

Concurrency Level:      10
Time taken for tests:   32.085 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      4213000 bytes
HTML transferred:       3643000 bytes
Requests per second:    31.17 [#/sec] (mean)
Time per request:       320.852 [ms] (mean)
Time per request:       32.085 [ms] (mean, across all concurrent requests)
Transfer rate:          128.23 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    3   0.5      3       6
Processing:   142  317  40.9    309     427
Waiting:      141  315  41.0    307     427
Total:        144  319  40.9    312     431

Percentage of the requests served within a certain time (ms)
  50%    312
  66%    332
  75%    348
  80%    357
  90%    380
  95%    397
  98%    412
  99%    416
 100%    431 (longest request)

BASELINE

Document Path:          /wp-login.php
Document Length:        3654 bytes

Concurrency Level:      10
Time taken for tests:   41.740 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      4201000 bytes
HTML transferred:       3654000 bytes
Requests per second:    23.96 [#/sec] (mean)
Time per request:       417.396 [ms] (mean)
Time per request:       41.740 [ms] (mean, across all concurrent requests)
Transfer rate:          98.29 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    3   0.5      3       4
Processing:   236  414  56.0    404     607
Waiting:      215  380  53.0    371     555
Total:        238  416  56.0    407     610

Percentage of the requests served within a certain time (ms)
  50%    407
  66%    434
  75%    452
  80%    465
  90%    494
  95%    519
  98%    554
  99%    577
 100%    610 (longest request)

Analysis

If I’m reading the numbers from ApacheBench correctly, then nginx does offer some (relatively modest) improvement in performance. This is actually a little more noticeable on dynamic php pages. This is not a huge surprise, since my nginx setup was configured with php-fpm, whereas the baseline setup had to use php-cgi. Adding Varnish on top does not boost performance much further though. Perhaps my configuration wasn’t optimal, or maybe you only really start to benefit when you have more memory. I don’t know. Considering the added complexity of two proxying layers, I’d say it’s not really worth it for me. I even wonder if managing my own nginx is worth the hassle, because with nginx there’s a bigger risk of some weirdness in compatibility with wordpress and the myriad of plugins it supports, most of which are developed and tested on Apache.

It’s worth remembering that these benchmarks don’t really simulate real-life scenarios. They hit the server with 1,000 requests in a short span. Sure, it’s great for simulating your site getting slashdotted, but it’s not that realistic. Furthermore, browsing experience is affected by many other factors and elements on the page: images, javascript, browser caching. The location of the user in relation to the server also makes a huge difference of course. These tests were run from a nearby location with low latency. Doing the same from across the globe might produce very different results.

And of course, I might also have made some blatant configuration mistakes or used sub-optimal settings. I’d be happy to hear some ideas on how to make things work even better!
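
To put a number on "modest", the requests-per-second figures from the ab runs above can be compared directly. The helper below is just arithmetic, and its name is my own.

```shell
#!/bin/bash
# Percentage change in requests per second between two configurations.
# Usage: pct_gain BASELINE_RPS NEW_RPS
pct_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}
```

For wp-login.php, `pct_gain 23.96 31.17` comes out at about 30%, whereas for the ~100kb static page `pct_gain 102.72 102.85` is only about 0.1%.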

Insights

I was reading a fascinating post recently, talking about why the web is slow. I know it’s not strictly related to my optimization, but there are many things to consider when trying to improve your site’s performance. Oddly, one thing really hit me hard, something I had ignored completely before starting this process: how much the actual size of the page matters. The few performance benchmarks which I came across online failed to even mention which pages were tested, or used really tiny out-of-the-box wordpress pages. If you really want to boost your site’s performance – make your web pages as small as possible. I am now starting to experiment with minifying my html using W3TC (a feature which isn’t enabled by default), as well as trying to reduce page size by moving non-essential content so that it is fetched via ajax.
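
One way to see why size matters so much: in all the static-page runs above, ab reported roughly the same transfer rate (around 11,400 Kbytes/sec) regardless of the configuration. My reading, which is an interpretation rather than something I measured directly, is that the connection was the limit, so the request rate is roughly that transfer rate divided by the size of each response. A tiny sketch of that arithmetic (the function name is made up):

```shell
#!/bin/bash
# Rough ceiling on requests per second when the link, not the server, is the limit.
# Usage: max_rps TRANSFER_RATE_KBYTES_PER_SEC BYTES_PER_RESPONSE
max_rps() {
    awk -v rate="$1" -v size="$2" 'BEGIN { printf "%.1f\n", rate * 1024 / size }'
}
```

With the varnish figures for the big page, `max_rps 11388.76 114047` gives roughly 102 requests per second, in line with the measured 102.26. Halve the bytes per response and the ceiling doubles.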

Thanks

Special thanks to webfaction support. It seems to me that they went beyond the call of duty to help me install stuff on the host on a couple of occasions. Considering it’s a shared-hosting provider, they really do a fantastic job. I hope this post gives some pointers to people who want to install nginx, php-fpm, varnish or anything else on webfaction. Perhaps it’s not as easy as using apt-get install, but it’s definitely possible.

Update

Looking at the results I realised that ApacheBench does not request gzip compression by default. However, it can be told to, which makes page sizes considerably smaller and hence the results much faster. I only did a very quick, cursory test, and the pretty big page became rather skinny, dropping from around 100kb to only around 15kb (with response times dropping accordingly). To test your site with gzip switched on, use

    ab -kc 10 -n 1000 -H 'Accept-Encoding: gzip' {url}
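
To estimate how much a given page shrinks, you can also compress a saved copy locally. This sketch only compares file sizes, and the function name is my own. The server’s gzip level may differ from the gzip default, so treat the result as a ballpark figure.

```shell
#!/bin/bash
# Print the raw and gzip-compressed size of a file in bytes, separated by a space.
# Usage: gzip_sizes FILE
gzip_sizes() {
    local file="$1" raw packed
    raw=$(wc -c < "$file")
    packed=$(gzip -c "$file" | wc -c)
    echo "$((raw)) $((packed))"
}
```

For example, save a page with `curl -s {url} -o page.html` and then run `gzip_sizes page.html`.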

10 Responses to “How much (cache) is too much?”

  1. Daniel Miessler

    Greetings. I think you’re somewhat confused about something here. You keep mentioning the addition of multiple layers when the approach I recommend is actually to *remove* layers.

    A WordPress plugin is extra crap to run. Removing stuff like that is generally a good idea.

    I suggest running Nginx –> WordPress, and then putting Varnish in front of that if you’re looking for the ability to handle major load. This is without running any WordPress plugins.

    You can also run this config without Varnish at all–so just Nginx -> WordPress (with no caching plugin).

    At any rate, using my system I’m getting load times of around ~200ms for my pages, which is pretty damn fast. Here’s how I did it, in case you didn’t read this post:

    http://danielmiessler.com/blog/10-ways-to-improve-your-website-performance

    Cheers,
    Daniel

  2. Yoav Aner

    Hi Daniel,

    Thanks for your comment. I hadn’t seen the post you linked to in your comment, only the one with the performance benchmarks. I’ll definitely read through it and see if I can pick up some ideas or tips. Thanks!

    I think we pretty much agree, just use a different approach. I think the W3TC page caching gives you good-enough performance, comparable to that of a much more complicated setup involving nginx + varnish. For a typical user, installing and configuring the W3TC plugin is so much easier than installing and configuring all those extra layers…

  3. Eber

    Hello, do you do Varnish and Nginx installs for a fee? If yes, please contact me through my email address.

  4. Resende

    Awesome post, Yoav! The only thing missing in your equation is an opcode cache, like APC or eAccelerator. That would make a difference.

    By the way – is there any way to restart/stop PHP-FPM thru the socket path? ‘~/sbin/php-fpm restart’ or ‘~/sbin/php-fpm stop’ don’t seem to work. Or do we need to always kill the PHP-FPM master process and hit ‘~/sbin/php-fpm’ to restart PHP?

    Thanks!

  5. Yoav Aner

    Thanks Resende,

    I don’t remember precisely, but when I played around with APC I didn’t notice any major improvements. That said, I didn’t do much comparative testing with it. Perhaps it’s worth trying one day. Thanks for the suggestion.

    As for restarting php-fpm: I’m not sure, but have you tried sending a signal to it? I did a quick search and found this link http://forum.nginx.org/read.php?3,3485 – looks like SIGUSR2 should restart it, and SIGTERM / SIGQUIT would stop it.

  6. Jake

    Interesting benchmark, however I think that your number of requests and concurrency significantly skewed the results.

    I would love to see the results if you upped the concurrency to, say, 1,000-5,000, and the number of requests to >25,000. This would provide a much better illustration as to why nginx outperforms Apache in high-traffic environments.

    Cheers,

    Jake @ Avvo.com

  7. Yoav Aner

    Hi Jake,

    Yes, perhaps that would be an interesting experiment. Those numbers seem to reach the point where Apache itself is struggling to keep up and hence nginx will likely shine.

    However, don’t forget this testing was done on a budget shared-hosting platform. I wasn’t aiming at the mega-busy site with 1,000-5,000 concurrent requests, which I somehow doubt a typical shared-hosting account can deal with. My little experiment was for the low-end side of the scale, and for that I believe the results are still valid.

    In any case, if you are able to run a similar benchmark, I’d be happy to see the results. As things stand now, I believe running this kind of test on a shared hosting account will risk getting my account blocked.

  8. Criação

    This was an excellent experiment, trying to install nginx and varnish on a shared hosting!
    I was researching on this subject and your post made me rethink the whole matter. I won’t bother anymore with installing nginx + varnish on the site I was considering (which already runs W3TC). Also, this is priceless information:
    “I even wonder if managing my own nginx is worth the hassle. This is because with nginx, there’s a bigger risk of some weirdness in compatibility with wordpress and the myriad of plugins it supports, most of which are running on Apache.”

  9. Greg

    This is an interesting post. What are you running your blog on now? I noticed Cloud Front in your source.

    Did you confirm that Varnish (or Nginx) were properly caching the page during the benchmark? If Varnish is passing the request through to your PHP back end then you’ll see no change by adding Varnish. If instead, Varnish handles the requests, you should see huge boost. Lastly, did you use a benchmark server that was faster than your test target? You can spin up a nice 4 gig VPS, use it for benchmarking and then shut it down for all of $0.20.

    Here’s a Varnish benchmark on my $5 Digital Ocean server, which should qualify as “budget”, yes? My baseline benchmark of Nginx serving dynamic content directly, with W3 Total Cache out of APC, was about the same as yours: around 120 requests per second when things were good, and it would crap out after 40 – 50 concurrent users.

    Now with Varnish:
    ab -rkc 3000 -n 10000 -H 'Accept-Encoding: gzip' http://www.gregboggs.com/
    Server Software: nginx/1.4.2
    Server Hostname: www.gregboggs.com
    Server Port: 80

    Document Path: /
    Document Length: 5178 bytes

    Concurrency Level: 3000
    Time taken for tests: 1.739 seconds
    Complete requests: 8859
    Failed requests: 0
    Write errors: 0
    Keep-Alive requests: 8859
    Total transferred: 49840563 bytes
    HTML transferred: 45879696 bytes
    Requests per second: 5095.65 [#/sec] (mean)
    Time per request: 588.738 [ms] (mean)
    Time per request: 0.196 [ms] (mean, across all concurrent requests)
    Transfer rate: 27996.12 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0   22   42.1      0    120
    Processing:    71  261  131.5    248   1020
    Waiting:       71  261  131.5    248   1020
    Total:         71  283  143.5    254   1114

    Percentage of the requests served within a certain time (ms)
    50% 254
    66% 265
    75% 287
    80% 289
    90% 332
    95% 468
    98% 1033
    99% 1039
    100% 1114 (longest request)

  10. Yoav Aner

    Hi Greg,

    Thanks for the feedback. A few comments:

    * I wasn’t aware of digitalocean when I ran those tests, and the price gap between VPS and shared hosting at the time was way bigger.
    * Nevertheless, even though I manage a few dozen servers on a regular basis on AWS, Linode, Rackspace, DO and others – I would still prefer to run simple blogs and wordpress sites on a shared hosting, primarily for the simplicity of management.
    * As I wrote, even though I could increase performance, the gap I observed wasn’t big enough to be worth it (for me).
    * You are using concurrency level of 3000. This strikes me as silly for most sites and I wouldn’t test so high. If your response time is, say, roughly 200ms, and (if) you can serve 3000 concurrently, that’s 15,000 requests per second! I see that your tests reached around 5,000 req/s.
    * The AB test you posted shows a rather slow response time, with a 95th percentile of 468ms.
    * Your document size was 5178 bytes. Smaller than my smallest test. On my tests, even just with W3TC – response time was around 19ms even for the 99th percentile.
    * I also see that out of 10,000 requests, 8,859 completed on your test.

    So overall, without knowing your exact setup and details – it feels like you’re stressing it too much to produce meaningful results. In any case, I think it’s great to see people trying to get the most out of their systems and setting up different caching layers along the way!
