
bootstrap shooting at the clouds

One of my primary aims when building a resilient cloud architecture is being able to spawn instances quickly. Many cloud providers give you tools to create images or snapshots of existing cloud instances and launch them. This is great, but not particularly portable: if I have an instance on Linode and want to clone it to Rackspace, I can’t easily do that.

That’s one of the reasons I am creating bootstrap scripts that completely automate the server (re)build process. Given an IP address and root password, the script should connect to the instance, install all necessary packages, pull the code from the repository, initialize the database, configure the web server and get the server ready for a restore of user data.

I’m primarily using Fabric to automate this process, and I use a standard operating system across the different cloud providers. This allows fairly consistent deployments from one provider to the next. It also means the architecture is not dependent on a single provider, which in my opinion is a huge benefit. Not only can my architecture run in different data centres or geographic locations, but I can also be flexible in my choice of hosting providers.
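
As a rough illustration of the structure described above, here is a minimal Python sketch of a provider-independent bootstrap driver. It deliberately does not call Fabric itself: the `run` callable is a stand-in for whatever remote-execution function the real script uses, and the `Target` class, the `STEPS` list and every command string in it are assumptions made for illustration, not the actual script.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Target:
    """A freshly created instance (hypothetical type, for illustration only)."""
    provider: str  # e.g. "linode"; used only for labelling errors
    address: str   # IP address of the instance


# Placeholder steps: the real package list, repository URL and paths are
# not given in the post, so these strings are assumptions.
STEPS: List[Tuple[str, str]] = [
    ("install packages", "apt-get update && apt-get -y upgrade"),
    ("pull code", "git clone <repository-url> <checkout-path>"),
    ("initialise database", "<create databases and users>"),
    ("configure web server", "<write web server configuration>"),
]


def bootstrap(target: Target, run: Callable[[str, str], bool]) -> List[str]:
    """Run every step in order on the target and stop at the first failure.

    `run(address, command)` must return True on success. Returns the names
    of the steps that completed.
    """
    done: List[str] = []
    for name, command in STEPS:
        if not run(target.address, command):
            raise RuntimeError(
                f"{name} failed on {target.provider} ({target.address})"
            )
        done.append(name)
    return done


def dry_run(address: str, command: str) -> bool:
    """Print the command instead of executing it remotely."""
    print(f"[{address}] {command}")
    return True


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address, not a real server.
    bootstrap(Target("linode", "192.0.2.10"), dry_run)
```

Because the steps are the same for every `Target`, only the address and credentials change when moving from one provider to another, which is the portability argument made above.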

All that aside, building and refining this bootstrapping process allowed me to run it across different cloud providers, namely Rackspace, Linode and EC2. After running the bootstrapping process many times, I thought it might be a great opportunity to compare the performance of those providers side by side. My bootstrap process runs the same commands in the same order and covers quite a variety of operations. This should give an interesting indication of how each of the cloud providers performs.

Tested platforms

The tests were carried out using the default Debian 6 (Squeeze) image on the lowest-end cloud instances of all three providers:

  • Rackspace 256MB and 512MB – using the London data centre.
  • Linode 512 – using the London data centre.
  • EC2 micro instance (EBS volume) – using the Ireland data centre.

Bootstrap process

The bootstrap process executes the following tasks:

  • Running apt-get update && apt-get upgrade, then installing a list of prerequisite packages
  • Installing PostgreSQL from backports
  • Downloading, compiling and installing Ruby and Sphinx from source
  • Setting up SSH keys
  • Pulling code from a remote git repository
  • Creating a couple of small (empty) databases and user accounts
  • Tweaking some configuration files
  • Performing bundle install on a Rails project
  • Performing rake tasks to set the database schema and seed the database

These are relatively I/O-intensive operations, but they also involve CPU work (compiling code) and network access (downloading sources and packages), so they should provide a reasonable benchmark for comparing the performance of those cloud providers.
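
Since the steps stress I/O, CPU and network differently, it can help to time each one separately as well as the whole run. The following is a minimal Python sketch of such a timer, not the measurement code used for this post (which only recorded the total). The `time_steps` name and the injectable `clock` parameter are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def time_steps(
    steps: Iterable[Tuple[str, Callable[[], None]]],
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, float]:
    """Run each step in order and return elapsed seconds per step.

    The returned dict also contains a "total" entry covering the whole run.
    `clock` defaults to time.monotonic, which is not affected by system
    clock adjustments during a long run.
    """
    timings: Dict[str, float] = {}
    start = clock()
    for name, step in steps:
        began = clock()
        step()
        timings[name] = clock() - began
    timings["total"] = clock() - start
    return timings


if __name__ == "__main__":
    # Stand-in steps; a real run would call the remote bootstrap tasks here.
    demo = [
        ("sleep briefly", lambda: time.sleep(0.01)),
        ("sum numbers", lambda: sum(range(100_000))),
    ]
    for name, seconds in time_steps(demo).items():
        print(f"{name}: {seconds:.3f}s")
```

Per-step timings would show whether a slow provider is slow everywhere or only during, say, the compile steps.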

Results

These highly unscientific results are quite basic. No fancy charts or anything. All I measured was how long the entire bootstrap operation took on each of the cloud providers.

  • Rackspace 256: 1269 seconds (~21 minutes)
  • Rackspace 512: 1144 seconds (~19 minutes)
  • Linode 512: 1053 seconds (~17.5 minutes)
  • EC2 micro: 4090 seconds (1 hour and 8 minutes!!??)

Linode seems to be the winner: Rackspace 256 took about 20% longer and Rackspace 512 about 9% longer. What’s much more surprising however (for me anyway) is how slow EC2 is in comparison, taking nearly four times as long as Linode (about 288% longer)… I am guessing this is down to EBS storage. Quite a big performance hit for persistent storage though.
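
The relative figures can be recomputed directly from the four timings listed above. This short Python sketch does so, using Linode as the baseline. The `extra_time_percent` helper is a name introduced here for illustration.

```python
# Total bootstrap times in seconds, as listed in the results above.
RESULTS = {
    "Rackspace 256": 1269,
    "Rackspace 512": 1144,
    "Linode 512": 1053,
    "EC2 micro": 4090,
}


def extra_time_percent(seconds: float, baseline: float) -> float:
    """How much longer `seconds` is than `baseline`, as a % of baseline."""
    return (seconds - baseline) / baseline * 100


if __name__ == "__main__":
    baseline = RESULTS["Linode 512"]
    for name, seconds in RESULTS.items():
        ratio = seconds / baseline
        extra = extra_time_percent(seconds, baseline)
        print(f"{name}: {ratio:.2f}x Linode ({extra:+.1f}% time)")
```

Measured this way, the EC2 micro run comes out at a little under four times the Linode time.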

5 replies on “bootstrap shooting at the clouds”

Perhaps I’m mistaken, but I seem to remember that the EC2 Micro CPUs are a non-constant, availability-based, computing resource. If that’s true, it might explain your findings.

I’ve run these a couple of times and had a similar experience each time. I didn’t measure every execution, but EC2 certainly felt much slower than the rest. Even if my tests aren’t really indicative, they still show that at times you can get really bad performance on EC2, and it seems performance is better and more stable on other virtual hosting providers – at least at the smaller end of the virtual server scale.

EC2 micro instances will be throttled on any extended CPU usage. After 20 seconds or so you could lose up to 99% of CPU processing.

However, if your usage comes in bursts of less than a hundred milliseconds or so, with average CPU load below 15%, then micro instances are great. We use them to run webapps with 40-50 organisations on each instance, with a highly cached architecture (Java, NOT PHP) and a NoSQL data backend. The trick is to design the system to minimise CPU load, resulting in a win-win – faster processing and no throttling.

If you use SQL and PHP, you’re probably entering a lose-lose scenario.

A further recommendation: if configuration requires a lot of processing, use a “small instance” type while configuring, and then restart as a “micro instance”.

Thanks for your comment Martyn!

It made me curious to try it out, so I re-ran the test, this time comparing the EC2 small with the equivalent platform on Linode and Rackspace (+ also to EC2 micro).

You can read the full post here

Martyn, I don’t see why running Java with some NoSQL DB would be less CPU-intensive than running PHP and MySQL, especially on a micro instance where most of the RAM is already consumed by the OS and various daemons, leaving no RAM for any kind of decent cache. Could you please explain your set-up?
