When talking about security, the first thing that usually comes to mind is encryption. Spies secretly coding (or de-coding) some secret message that should not be revealed to the enemy. Encryption is this mysterious thing that turns all text into a part of the matrix. Developers generally like encryption. It’s kinda cool. You pass stuff into a function, get some completely scrambled output. Nobody can tell what’s in there. You pass it back through another function – the text is clear again. Magic.
Encryption is cool. It is fundamental to doing lots of things on the Internet. How could you pay with your credit card on Amazon without encryption? How can you check your bank balance? How can MI5 pass their secret messages without Al-Qaida intercepting them?
But encryption is actually not as useful as people think. It is often used in the wrong place. It can easily give a false sense of security. Why? People forget that encryption, by itself, is usually not sufficient. You cannot read the encrypted data. But nothing stops you from changing it. In many cases, it is very easy to change encrypted data without knowing the encryption key. This seemingly small ‘misconception’ can lead to some really serious security holes. I think it is primarily a perception issue. As soon as something becomes obfuscated, scrambled, illegible (as encrypted data does), it’s hard to intuitively figure out the risks to it. People tend to make assumptions about it that are not always valid.
There must be plenty of examples of encryption being used incorrectly, and of how it led to unforeseen consequences. The most famous example I am aware of is WEP. I won’t go into the details. There are plenty of resources online describing the flaws in great detail. The essence of it, however, was that WEP used relatively good ciphers incorrectly, or insufficiently. The encryption algorithm it used (RC4) was not ‘broken’. The algorithm itself is quite robust (no algorithm is perfect, but this wasn’t the major flaw in WEP). It’s the design of WEP, not the core algorithm, that was broken.
Let’s walk through a simplified, imaginary scenario, to illustrate when NOT to use encryption. It goes something like this:
Developer: “We need to allow access to the X section of the website to more people”
Me: “Ok, then ask them to login before they can have access”
Developer: “No. We want to send them a secure link with expiry date and not all users will have an account”
Me: “In that case, we should use something like HMAC or OAuth. It’s designed for this purpose”
Developer: “No. I don’t have time to read about OAuth, it’s too confusing. But I’ve written a real cool function that encrypts part of the URL, and only if we can decrypt it properly, we allow access… It uses Triple-DES, so it’s super-secure. Wikipedia says that ‘the algorithm is believed to be practically secure in the form of Triple DES, although there are theoretical attacks’”
Me: “Let’s have a look at your ‘secure’ solution then…”
The solution was something along those lines (hugely simplified for the sake of illustration):
where encrypted_string contained the expiry date of the page, e.g. 20120315 (15th March 2012).
Can you see where this is flawed?
There are at least a couple of possible attacks here that don’t involve breaking the encryption or getting hold of the key:
- Randomly change the encrypted string. Perhaps not very sophisticated, but since all we need is a date later than today, there is still a fair chance we might get lucky. Even without calculating the entire search space, it’s easy to see that it is quite narrow.
- Replace it with another encrypted string. This is even easier. All you need is another URL with a known-to-work encrypted string. Copy this string and paste it, and you’re good to go!
Another flaw the developer hadn’t spotted was that he was using Triple-DES in ECB mode. This is the most basic mode, in which identical plaintext blocks always encrypt to identical ciphertext blocks, and it makes the attacks I described much easier. So even if the algorithm itself is robust, the way it is applied can be much more important. In addition to that, under some circumstances, if the decryption process failed (which it easily could, since nothing stopped anyone from manipulating the encrypted string), the code was kind enough to output the decrypted string inside the error message… back to the user (or wannabe hacker, as the case may be).
Of course, the solution was a little more complicated. It contained not just an encrypted date, but some other data. Nevertheless, the same principles of attack apply. There are many bit-flipping attacks that can alter encrypted data and generate a predictable output. The bottom line is this: Encrypting data does not prevent modification.
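To make the bit-flipping point concrete, here is a minimal sketch in Python. It is not the developer’s code: it uses a toy stream cipher built from SHA-256 rather than Triple-DES in ECB mode (where the practical attack is more about swapping or replaying whole blocks), and the `keystream` and `xor_cipher` helpers, the key and the dates are all made up for illustration. It also assumes the attacker knows or can guess the original plaintext, which is not a stretch for a date.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over the key plus a counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts by XOR-ing the data with the keystream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


key = b"server-side secret"            # hypothetical key, never seen by the attacker
token = xor_cipher(key, b"20120315")   # the encrypted expiry date placed in the URL

# The attacker guesses the plaintext and wants a later expiry date.
# XOR-ing the ciphertext with (known XOR wanted) changes the plaintext
# in exactly the same way, and the key is never needed.
known = b"20120315"
wanted = b"20990315"
forged = bytes(c ^ a ^ b for c, a, b in zip(token, known, wanted))

print(xor_cipher(key, forged))  # b'20990315'
```

The server decrypts the forged token without any error and sees a perfectly valid date in 2099. Nothing in the encryption step tells it that the token was tampered with.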
In this case, all we want to achieve is authentication. We want to verify that the request was legitimate, and that nobody else can fake such a request. There’s nothing in particular that needs hiding. The expiry date of the access is not such a big secret. All we need is some kind of a signature, to validate the legitimacy of the request. This is where hash functions, and HMAC / OAuth, come into play. Those mechanisms, for some strange reason, are less appealing to many developers. I’m not entirely sure why, but maybe it’s just not as fun to see an extra hash at the end of the URL as it is to encrypt a string. These mechanisms are much more effective in this case. This is what they’re designed to do.
So how does this work? Again, for the sake of simplicity I’m not going to cover the detailed aspects of those algorithms. But the principle is quite simple: You take a string you want to ‘sign’, plus a secret key, and generate a hash value from both of them. This value will be totally different even with the slightest modification to the original string (or the key). The key will never be published or shown, only the string with the generated ‘signature’ (the hash value we produced). How is this more secure? As I mentioned earlier, even the slightest modification of the string will produce a completely different signature, so no modification of the URL can go undetected. Producing a valid signature is virtually impossible without knowledge of the secret key. Voilà! Simple. Secure. Elegant. Of course, as always, god is in the details, so even these algorithms can be used wrongly, producing an insecure outcome. However, from my experience, following the guidelines and using the OAuth/HMAC libraries is far easier and less error-prone than using any encryption algorithm.
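As a rough sketch of the signing principle, here is how an expiring link could be signed and checked with Python’s standard `hmac` module. The function names, the `SECRET_KEY` value, the `/section-x` path and the `expires`/`sig` parameter names are my own assumptions for illustration, not a prescribed format. A real deployment would also want proper URL parsing and key management.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical; stays on the server


def sign(path: str, expires: str) -> str:
    """Returns a hex HMAC-SHA256 signature over the path and expiry date."""
    message = f"{path}|{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_link(path: str, expires: str) -> str:
    """Builds the link: the expiry date is in the clear, the signature protects it."""
    return f"{path}?expires={expires}&sig={sign(path, expires)}"


def is_valid(path: str, expires: str, sig: str, today: str) -> bool:
    """Accepts the request only if the signature matches and the link has not expired."""
    expected = sign(path, expires)
    # compare_digest avoids leaking, through timing, how much of the signature matched
    if not hmac.compare_digest(expected.encode(), sig.encode()):
        return False
    return today <= expires  # YYYYMMDD strings sort chronologically


link = make_link("/section-x", "20120315")
sig = link.rsplit("sig=", 1)[1]

print(is_valid("/section-x", "20120315", sig, today="20120301"))  # True
print(is_valid("/section-x", "20991231", sig, today="20120301"))  # False: date was changed
print(is_valid("/section-x", "20120315", sig, today="20120401"))  # False: link expired
```

Notice that the expiry date travels in plain sight. Changing it, or pasting in a signature from a link to a different path, simply makes the check fail, which is all we wanted in the first place.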
Size doesn’t matter, it’s how you use it
The Security of the WEP algorithm page sums it up quite nicely: “The [WEP] protocol’s problems are a result of misunderstanding of some cryptographic primitives and therefore combining them in insecure ways”. Even if you pick the best and most secure encryption algorithm, it might not be enough to make your solution secure. In fact, as I tried to illustrate, encryption might not be necessary at all.