Categories
django monitoring python Technology

Fabric Installer for Graphite

fabric-graphite is a fabric script to install Graphite, Nginx, uwsgi and all dependencies on a debian-based host.

Why?

I was reading a few interesting posts about graphite. When I tried to install it however, I couldn’t find anything that really covered all the steps. Some covered it well for Apache, others covered Nginx, but had steps missing or assumed the reader knows about them etc.

I’m a big fan of fabric, and try to do all deployments and installations using it. This way I can re-run the process, and also better document what needs to be done. So instead of writing another guide, I created this fabric script.

Categories
linux Performance Technology

bootstrap shooting at the clouds

One of my primary aims when building a resillient cloud architecture, is being able to spawn instances quickly. Many cloud providers give you tools to create images or snapshots of existing cloud instances and launch them. This is great, but not particularly portable. If I have one instance on Linode and I want to clone it to Rackspace, I can’t easily do that.

That’s one of the reasons I am creating bootstrap scripts that completely automate a server (re)build process. Given an IP address and root password, the script should connect to the instance, install all necessary packages, pull the code from the repository, initialize the database, configure the web server and get the server ready for restore of user-data.

I’m primarily using fabric for automating this process, and use a standard operating system across different cloud providers. This allows a fairly consistent deployments across different providers. This also means the architecture is not dependent on a single provider, which in my opinion gives a huge benefit. Not only can my architecture run on different data centres or geographic locations, but I can also be flxeible in the choice of hosting providers.

All that aside however, building and refining this bootstrapping process allowed me to run it across different cloud providers, namely: Rackspace, Linode, and EC2. Whilst running the bootrstrapping process many times, I thought it might be a great opportunity to compare performance of those providers side-by-side. My bootstrap process runs the same commands in order, and covers quite a variety of operations. This should give an interesting indication on how each of the cloud providers performs.

Categories
optimization Technology wordpress

How much (cache) is too much?

One of the best rules of thumb I know is the 80/20 rule. I can’t think of a more practical rule in almost any situation. Combined with the law of diminishing returns, it pretty much sums up how the universe works. One case-study that hopes to illustrate both of these, if only a little, is a short experiment in optimization I carried out recently. I was reading so many posts about optimizing wordpress using nginx, varnish, W3-Total-Cache and php-fpm. The results on some of them were staggering in terms of improvements, and I was inspired to try to come up with a similar setup that will really push the boundary of how fast I can serve a wordpress site.

Spoiler – Conclusion

So I know there isn’t such a thing as too much cash, but does the same apply to cache?

Categories
Security Technology

encryption is not the right solution

When talking about security, the first thing that usually comes to mind is encryption. Spies secretly coding (or de-coding) some secret message that should not be revealed to the enemy. Encryption is this mysterious thing that turns all text into a part of the matrix. Developers generally like encryption. It’s kinda cool. You pass stuff into a function, get some completely scrambled output. Nobody can tell what’s in there. You pass it back through another function – the text is clear again. Magic.

Encryption is cool. It is fundamental to doing lots of things on the Internet. How could you pay with your credit card on Amazon without encryption? How can you check your bank balance? How can MI5 pass their secret messages without Al-Qaida intercepting it?

But encryption is actually not as useful as people think. It is often used in the wrong place. It can easily give a false sense of security. Why? People forget that encryption, by itself, is usually not sufficient. You cannot read the encrypted data. But nothing stops you from changing it. In many cases, it is very easy to change encrypted data, without knowledge of the encryption key.

Categories
Technology troubleshooting

dynamic goal values in google analytics

Scoring a goal against google is never easy. Google analytics allows you to do some strange and wonderful things, but not without some teeth grinding. I was struggling with this for a little while, and it was a great source of frustration, since there’s hardly any info out there about it. Or maybe there is lots of info, but no solution to this particular problem. I think I finally nailed it.

Dynamic Goal Conversion Values

I was trying to get some dynamic goal conversion values into Analytics. I ended up reading about Ecommerce tracking and it seemed like the way to go. Not only would I be able to pick the goal conversion value dynamically, it gives you a breakdown of each and every transaction. Very nice. After implementing it, I was quite impressed to see each transaction, product, sku etc appear neatly on the ecommerce reports. So far so good. But somehow, goals – which were set on the very same page as the ecommerce tracking code – failed to add the transaction value. The goals were tracked just fine, I could see them adding up, but not the goal value. grrrr…

Categories
Technology wordpress

unicode url double-encoding 404 redirect trick

I’ve come across a small nuisance that seemed to appear occasionally with unicode urls. Some websites seem to encode/escape/quote urls as soon as they see any symbol (particularly % sign). They appear to assume it needs to be encoded, and convert any such character to its URL-Encoded form. For example, percent (%) symbol will convert to %25, ampersand (&) to %26 and so on.

This is not normally a problem, unless the URL is already encoded. Since all unicode-based urls use this encoding, they are more prone to these errors. What happens then is that a URL that looks like this:
http://www.frau-vintage.com/2011/%E3%81%95%E3%81%8F%E3%82%89 …

will be encoded again to this:
http://www.frau-vintage.com/2011/%25E3%2581%2595%25E3%25 …

So clicking on such a double-encoded link will unfortunately lead to a 404 page (don’t try it with the links above, because the workaround was already applied there).

A workaround

This workaround is specific to wordpress 404.php, but can be applied quite easily in other frameworks like django, drupal, and maybe even using apache htaccess rule(?).


<?php 
/* detecting 'double-encoded' urls
 *  if the request uri contain %25 (the urlncoded form of '%' symbol)
 *  within the first few characeters, we try to decode the url and redirect
 */
$pos = strpos($_SERVER&#91;'REQUEST_URI'&#93;,'%25');
if ($pos!==false && $pos < 10) :
    header("Status: 301 Moved Permanently");
    header("Location:" . urldecode($_SERVER&#91;'REQUEST_URI'&#93;)); 
else:
    get_header(); ?>
    <h2>Error 404 - Page Not Found</h2>
    <?php get_sidebar(); ?>
    <?php get_footer(); 
endif; ?>

This is placed only in the 404 page. It then grabs the request URI and checks if it contains the string ‘%25’ within the first 10 characters (you can modify the check to suit your needs). If it finds it, it redirects to a urldecoded version of the same page…

Categories
django monitoring optimization python Technology

django memory leaks, part II

On my previous post I talked about django memory management, the little-known maxrequests parameter in particular, and how it can help ‘pop’ some balloons, i.e. kill and restart some django processes in order to release some memory. On this post I’m going to cover some of the things to do or avoid in order to keep memory usage low from within your code. In addition, I am going to show at least one method to monitor (and act automatically!) when memory usage shoots through the roof.

Categories
django monitoring optimization python Technology

django memory leaks, part I

A while ago I was working on optimizing memory use for some django instances. During that process, I managed to better understand memory management within django, and thought it would be nice to share some of those insights. This is by no means a definitive guide. It’s likely to have some mistakes, but I think it helped me grasp the configuration options better, and allowed easier optimization.

Does django leak memory?

In actual fact, No. It doesn’t. The title is therefore misleading. I know. However, if you’re not careful, your memory usage or configuration can easily lead to exhausting all memory and crashing django. So whilst django itself doesn’t leak memory, the end result is very similar.

Memory management in Django – with (bad) illustrations

Lets start with the basics. Lets look at a django process. A django process is a basic unit that handles requests from users. We have several of those on the server, to allow handling more than one request at the time. Each process however handles one request at any given time.

But lets look at just one.

cute, isn’t it? it’s a little like a balloon actually (and balloons are generally cute). The balloon has a certain initial size to allow the process to do all the stuff it needs to. Lets say this is balloon size 1.

Categories
Security Technology wordpress

timthumb vulnerability

About a month ago I posted about tweaking timthumb to work with CDN. Timthumb is a great script, but great scripts also have bugs. A recently discovered one is a rather serious bug. It can allow attackers to inject arbitrary php code onto your site, and from there onwards, pretty much take control over it.

Luckily no websites I know or maintain were affected, possibly since the htaccess change I used shouldn’t allow using remote URLs in the first place (and also it renamed timthumb.php from the url string, making it slightly obfuscated). I still very strongly advise anybody using timthumb to upgrade to the latest version to avoid risks.

Categories
Security Technology wordpress

ajaxizing

Following from my previous post, I’ve come across another issue related to caching in wordpress: dynamic content. There’s a constant trade-off between caching and dynamic content. If you want your content to be truly dynamic, you can’t cache it properly. If you cache the whole page, it won’t show the latest update. W3 Total Cache, WP Super Cache and others have some workarounds for this. For example, W3TC has something called fragment caching. So if you have a widget that displays dynamic content, you can use fragment caching to prevent caching. However, from what I worked out, all it does is essentially prevent the page with the fragment from being fully cached, which defeats the purpose of caching (especially if this widget is on the sidebar of all pages).

The best solution for these cases is using ajax, to asynchronously pull dynamic content from the server using Javascript. So whilst many plugins already support ajax, and can load data dynamically for you, many others don’t. So what can you do if you have a plugin that you use, and you want to ‘ajaxize’ it?? Well, there are a few solutions out there. For example this post shows you how to do it, and works quite well.

The thing is, I wanted to take it a step further. If I can do it by following this manual process, why can’t I use a plugin that, erm, ‘ajaxizes’ other plugins?? I tried to search for solutions, but found none. So I decided to write one myself. It’s my first ‘proper’ plugin, but I think it works pretty well.