unicode url double-encoding 404 redirect trick

I’ve come across a small nuisance that crops up occasionally with unicode urls. Some websites encode (escape, quote) urls as soon as they see any special symbol, the % sign in particular. They assume it needs to be encoded, and convert every such character to its URL-encoded form. For example, the percent symbol (%) becomes %25, the ampersand (&) becomes %26, and so on.

This is not normally a problem, unless the URL is already encoded. Since all unicode-based urls use this encoding, they are more prone to this error. What happens then is that a URL that looks like this:
http://www.frau-vintage.com/2011/%E3%81%95%E3%81%8F%E3%82%89 …

will be encoded again to this:
http://www.frau-vintage.com/2011/%25E3%2581%2595%25E3%25 …

So clicking on such a double-encoded link will unfortunately lead to a 404 page (don’t try it with the links above, because the workaround was already applied there).
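To see the double-encoding happen, here is a minimal Python sketch. It assumes the Japanese path segment さくら, which is what the encoded bytes in the first URL above decode to; the variable names are only for illustration.

```python
from urllib.parse import quote, unquote

# Assumed example segment: the bytes %E3%81%95%E3%81%8F%E3%82%89 decode to this.
original = "さくら"

# Encoding once gives the normal, valid form of a unicode URL segment.
encoded_once = quote(original)

# Encoding the already-encoded string escapes every '%' as '%25'.
encoded_twice = quote(encoded_once)

print(encoded_once)   # %E3%81%95%E3%81%8F%E3%82%89
print(encoded_twice)  # %25E3%2581%2595%25E3%2581%258F%25E3%2582%2589

# A single decode of the double-encoded form restores the valid URL segment.
assert unquote(encoded_twice) == encoded_once
```

Decoding once is enough to undo the damage, which is what the workaround below relies on.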

A workaround

This workaround is specific to the wordpress 404.php template, but can be applied quite easily in other frameworks like django or drupal, and maybe even with an apache htaccess rule(?).


<?php
/* Detecting 'double-encoded' urls:
 * if the request uri contains %25 (the url-encoded form of the '%' symbol)
 * within the first few characters, we decode the url once and redirect.
 */
$pos = strpos($_SERVER['REQUEST_URI'], '%25');
if ($pos !== false && $pos < 10) :
    // Passing 301 as the third argument sets the status code directly;
    // a "Status:" header is only honoured under CGI/FastCGI.
    header('Location: ' . urldecode($_SERVER['REQUEST_URI']), true, 301);
    exit; // nothing else should be sent after the redirect
else:
    get_header(); ?>
    <h2>Error 404 - Page Not Found</h2>
    <?php get_sidebar(); ?>
    <?php get_footer();
endif; ?>

This is placed only in the 404 page. It grabs the request URI and checks whether it contains the string ‘%25’ within the first 10 characters (you can modify the check to suit your needs). If it does, it redirects to the urldecoded version of the same URL…
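For other frameworks, the same check is easy to port. Below is a rough, framework-agnostic Python sketch of the logic only; the function name `redirect_target` and the `max_pos` parameter are made up for illustration, and wiring it into a django 404 handler is left out.

```python
from typing import Optional
from urllib.parse import unquote


def redirect_target(request_uri: str, max_pos: int = 10) -> Optional[str]:
    """Return the decoded URI to redirect to, or None to show the normal 404.

    Mirrors the check above: '%25' must appear within the first `max_pos`
    characters of the request URI for it to count as double-encoded.
    """
    pos = request_uri.find("%25")
    if pos != -1 and pos < max_pos:
        # Decode exactly once, turning %25E3... back into %E3...
        return unquote(request_uri)
    return None


print(redirect_target("/2011/%25E3%2581%2595"))  # /2011/%E3%81%95
print(redirect_target("/2011/%E3%81%95"))        # None
```

A 404 handler would issue a permanent redirect whenever the function returns a string, and render the usual 404 page otherwise.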

django memory leaks, part II

In my previous post I talked about django memory management, the little-known maxrequests parameter in particular, and how it can help ‘pop’ some balloons, i.e. kill and restart some django processes in order to release some memory. In this post I’m going to cover some of the things to do or avoid in order to keep memory usage low from within your code. In addition, I am going to show at least one method to monitor memory usage (and act automatically!) when it shoots through the roof.

django memory leaks, part I

A while ago I was working on optimizing memory use for some django instances. During that process, I managed to better understand memory management within django, and thought it would be nice to share some of those insights. This is by no means a definitive guide. It’s likely to have some mistakes, but I think it helped me grasp the configuration options better, and allowed easier optimization.

Does django leak memory?

In actual fact, no. It doesn’t. The title is therefore misleading, I know. However, if you’re not careful, your memory usage or configuration can easily lead to exhausting all memory and crashing django. So whilst django itself doesn’t leak memory, the end result is very similar.

Memory management in Django – with (bad) illustrations

Let’s start with the basics. Let’s look at a django process. A django process is a basic unit that handles requests from users. We have several of those on the server, to allow handling more than one request at a time. Each process, however, handles only one request at any given time.

But let’s look at just one.

Cute, isn’t it? It’s a little like a balloon actually (and balloons are generally cute). The balloon has a certain initial size to allow the process to do all the stuff it needs to. Let’s say this is balloon size 1.

timthumb vulnerability

About a month ago I posted about tweaking timthumb to work with CDN. Timthumb is a great script, but great scripts also have bugs. A recently discovered one is a rather serious bug. It can allow attackers to inject arbitrary php code onto your site, and from there onwards, pretty much take control over it.

Luckily no websites I know or maintain were affected, possibly because the htaccess change I used shouldn’t allow remote URLs in the first place (it also renamed timthumb.php in the url string, making it slightly obfuscated). I still very strongly advise anybody using timthumb to upgrade to the latest version to avoid risks.

ajaxizing

Following from my previous post, I’ve come across another issue related to caching in wordpress: dynamic content. There’s a constant trade-off between caching and dynamic content. If you want your content to be truly dynamic, you can’t cache it properly. If you cache the whole page, it won’t show the latest update. W3 Total Cache, WP Super Cache and others have some workarounds for this. For example, W3TC has something called fragment caching. So if you have a widget that displays dynamic content, you can use fragment caching to prevent caching. However, from what I worked out, all it does is essentially prevent the page with the fragment from being fully cached, which defeats the purpose of caching (especially if this widget is on the sidebar of all pages).

The best solution for these cases is using ajax, to asynchronously pull dynamic content from the server using JavaScript. Whilst many plugins already support ajax, and can load data dynamically for you, many others don’t. So what can you do if you have a plugin that you use, and you want to ‘ajaxize’ it?? Well, there are a few solutions out there. For example this post shows you how to do it, and works quite well.

The thing is, I wanted to take it a step further. If I can do it by following this manual process, why can’t I use a plugin that, erm, ‘ajaxizes’ other plugins?? I tried to search for solutions, but found none. So I decided to write one myself. It’s my first ‘proper’ plugin, but I think it works pretty well.

thumbs up

[IMPORTANT: please check that you have the latest version of timthumb! older versions might have a serious security vulnerability. A little more about it here]

I’ve been recently trying to optimize a wordpress based site. It was running fine, but I wanted to run it even faster, and make the best use of resources. So I ended up picking W3 Total Cache (W3TC). It’s very robust and highly configurable, if perhaps a bit complicated to fully figure out. So eventually things were running fine, and my next task was to boost it even further by using a Content Delivery Network (CDN). In this case, the choice was Amazon Cloudfront. The recent release allowed managing custom origin from the console, which made things even easier. One of the remaining issues however, was trying to optimize timthumb.

Timthumb was already included with the theme, and I liked the way it works. It allowed some neat features, like fitting screenshots nicely, and also fitting company logos well within a fixed size (with the zc=2 option). A Google search led me to a couple of sources. However, for some reason none of them worked, so I ended up using a slightly different solution…

timing is everything

A quick tip on the importance of timestamps and making sure your time zone is set correctly.

I was recently playing around with fail2ban. It’s a really cool little tool that monitors your log files, matches certain patterns, and can act on them. Fail2ban would typically monitor your authentication log file, and if for example it spots 5 or more failures from the same IP address within a set time window, it would simply add a rule to your iptables to block this IP address for a certain amount of time. I like fail2ban because it’s simple and effective. It does not try to be too sophisticated, or have too many features. It does one thing, and does it very well.

I was trying to build a custom rule to watch a specific application log file. I had a reasonably simple regular expression and I was able to test it successfully using fail2ban-regex. It matched the lines in the log file, and gave me a successful result:

Success, the total number of match is 6

However, when running fail2ban, even though it loaded the configuration file correctly, and detected changes in the log files, fail2ban, erm, failed to ban… I couldn’t work out what the problem was.

As it turns out, the timestamps in my log file were in a different time zone, so fail2ban treated those log entries as too old and did not take action. Make sure your timestamps are correct and in the same time zone as your system! Once the time zone was set, fail2ban was working just fine.
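The effect is easy to reproduce on paper. The sketch below is not fail2ban’s own code, just an illustration of the arithmetic: it assumes a 10-minute window (fail2ban calls this setting `findtime`), a system clock at UTC+2, and an application that writes its log in UTC. The function name and the dates are invented for the example.

```python
from datetime import datetime, timedelta, timezone

FINDTIME = timedelta(minutes=10)  # assumed window; hypothetical value


def is_recent(log_stamp: str, log_tz: timezone, now: datetime) -> bool:
    """True if the log entry falls inside the window that ends at `now`."""
    entry = datetime.strptime(log_stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=log_tz)
    return timedelta(0) <= now - entry <= FINDTIME


system_tz = timezone(timedelta(hours=2))                        # assumed: system runs at UTC+2
now = datetime(2011, 6, 1, 14, 0, 30, tzinfo=system_tz)
stamp = "2011-06-01 12:00:00"                                   # written by the app in UTC, 30s ago

# Read with the correct zone, the entry is 30 seconds old and counts.
print(is_recent(stamp, timezone.utc, now))   # True

# Read as if it were system time, it looks two hours old and is ignored.
print(is_recent(stamp, system_tz, now))      # False
```

The regular expression matches in both cases, which is why fail2ban-regex reports success while nothing ever gets banned.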

passwordless password manager

[Also published on testuff.com]

Most people I know tend to simply use the same password on ALL websites. Email, Paypal, Amazon, Ebay, Facebook, Twitter. This is obviously a very bad idea.

Passwords are always a problem. Difficult to remember, hard to think of a good one when you need a new one, tricky to keep safe. For the moderately-paranoid and the sufficiently-techie there are many good solutions out there. Password managers. Online, offline, commercial, free. So I usually suggest to my friends and colleagues to use a password manager.

smile

This Saturday, 8th January 2011, I’m running a small geeky arts project at Madame Lillie’s gallery in Stoke Newington.


SMILE – a temporary exhibition
The smile project attempts to capture snapshots within the exhibition space. The audience takes an active role as part of the work and passively or actively affects it. The exhibition space holds a number of webcams, each capturing still-image snapshots at random. Some cameras are hidden, whilst others are visible. Those snapshots are then randomly laid out and printed onto a photograph every few minutes. The audience is invited to take those snapshots home, as a souvenir and a piece of the artwork. Each snapshot is unique and cannot be reproduced. The images are deleted immediately after being processed and printed out.

Influenced by thoughts about the London surveillance network, the smile project looks at the proliferation of cameras that capture parts of our lives, and the knowledge that we all, willingly or unknowingly appear in images captured by others. With the advances in technology it is becoming increasingly easy to take photos and videos. It is also cheap and easy to keep those on file for a long period of time, perhaps indefinitely. Photos and videos that we take these days are instant and perishable: they appear briefly on our facebook page and get immediate attention until quickly replaced by others. Yet at the same time we cannot truly delete them. Once posted online, they are beyond our control. They are stored electronically, archived and backed-up. They are searchable and indexed. Whether we are the subjects of the images or those who create them, we have little control over them.

smile is attempting to both make use of and question the technology that dominates our modern lives. It is meant to be a fun and humorous experience, involving the audience and rewarding it. It uses digital imaging technology, but produces a tangible, unique output. The creation process involves programming in various scripting languages and using a mix of digital tools, primarily open-source, all of which form a part of a random montage.

Once upon a time

One-Time-Passwords have always fascinated me. A long, long time ago in a land far, far away I suddenly had this idea. The idea was simple and in today’s terms pretty common, perhaps trivial: a One-Time-Password without the need for an extra token. After the user keys in their username and password, they get sent a random password via SMS. Ten years ago there wasn’t anything that did that. I created a basic RADIUS implementation with support for different SMS gateways, all in Java. Sadly however, with no funding, no clue how to turn it into a business, and just finishing my computer science degree, it had to be abandoned for an easier day job.

I was recently pulled into looking at two-factor-authentication (2FA) solutions. I used SecurID at a previous job, and know of several solutions in this area. I was quite pleased to discover many soft-token solutions working on mobile phones (iphone, blackberry, HTC, Nokia) and USB-based ideas like Yubikey. I was even more pleased to discover open source initiatives in this area, and OATH HOTP in particular.
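For a flavour of what OATH HOTP involves, here is a minimal Python sketch of the counter-based algorithm as specified in RFC 4226. It is an illustration only, not the code of any particular token product; the secret used in the example is the RFC’s published test key.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP value (RFC 4226) for a shared secret and a counter."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    # Take 4 bytes from that offset and clear the top bit.
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the last `digits` decimal digits, zero-padded.
    return str(code % 10 ** digits).zfill(digits)


# Test key and first expected value from RFC 4226, Appendix D.
print(hotp(b"12345678901234567890", 0))  # 755224
```

Both the token and the server hold the same secret and counter, so each side can compute the next password independently, with no SMS needed.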

iphone running late

I recently noticed my iphone clock wasn’t accurate. I’m not exactly sure why. It was only a few minutes behind, but it still annoyed me. Why couldn’t my iphone sync its time with an internet time server?? I know it is supposed to sync with my mobile network operator, but I think mine doesn’t sync… Maybe it’s my operator?

For jailbroken iphones, there’s a neat app on cydia called NTPDate. It’s a great app and I recommend installing it. All you need to do is specify the ntp server and click ‘set’, and it will sync your clock for you. However, I wanted to go a step further. I wanted my iphone to sync itself automatically for me, using a cron job. Well, not quite using cron, but it can be done automatically.

iphone asterisk sync

In my last post I described how I get my asterisk box to know the caller name from a csv data file. The thing is, my address book keeps changing on my iphone. People change their phone numbers, I meet new people (can you believe it? I don’t let it happen too often though)… I wanted to be able to sync it automatically to my asterisk. This synchronisation also doubles up as a backup for my address book.

who’s calling?

Caller ID is a wonderful feature. Don’t we love getting a call from someone we like, and perhaps more importantly, ignoring those annoying callers who we really don’t want to talk to?

But this is yesterday’s news. We all have caller IDs. It just works. Well, yes. It does. But what if we get a call on our landline? We get the caller ID there too, but do we know who it is?? All our contacts are on our mobile phones. Standard phones don’t usually have the capacity to hold more than 10 names on average. And even if they did, who’s got the energy to key in those numbers?

Get in shape

ISPs are a strange breed. They’re supposed to give a very straight-forward service – plug me in to the Internet please. That’s all. Plain and simple. It seems like some ISPs have different ideas as to their roles and responsibilities. Traffic shaping is one of those. Port/Service blocking is its ugly cousin. I don’t like either. You’re not my Big Brother. If I wanted one I’d move to China.


Guilty Pleasures

Perhaps yet another misleading title for this post, but bear with me. When I was a child we used to play outside a lot. I always remember the neighbours complaining if we made too much noise. Such is life. There was one period of time that I knew I would get in trouble though. We would get told off big time!! When?? Every day between 2 and 4 in the afternoon. There was even a sign in big red letters telling us all to keep quiet at ‘rest time’ (loosely translated). There was no sign about making noise after 11pm, but there was one for the afternoon nap. It was THAT important.

These days seem long gone now. Does anybody have time for an afternoon nap?? I certainly don’t recall seeing any such signs around.

GoDaddy taking european domains hostage prior to expiration

This is something I never thought would concern me. All those domain ownership issues, I thought, were with people who just aren’t organised enough to renew their domains, forget their passwords, or pick domains that others try to steal.

I’m currently managing a couple of .DE and one .CO.UK domains on GoDaddy. The .co.uk domain was bought for a fairly long period, so no problems there, but recently, both the .de domains were up for renewal. I received a couple of email reminders from GoDaddy, but when I logged into my account I noticed they were only due to expire on 26th March. More than a month away. When I received the last email on the 20th, I decided it was probably time to renew, so I clicked on the link and got my credit card ready.

To my surprise, the GoDaddy domain portal did not allow me to renew them, and marked them as ‘pending expiration’. Hmmm… weird. Oh well, I emailed GoDaddy and asked to renew, not even worrying too much. Maybe a system glitch of some sort.

Postcode, Barcode and python code

I had a strange thing happen a while ago. I sent a CD in a padded envelope to someone, and it was returned to me. Well, it didn’t look like it was returned, more like they actually delivered it to me instead of to the person I sent it to.

Then I noticed something. I was re-using an old envelope. For environmental reasons of course (read: being so tight-assed, saving money on padded envelopes). I did write the destination address on the right side, so my address was nowhere on the envelope. What was left there however was this tiny printed barcode used by the previous sender with my address.

sniffing some fresh tomatoes

Perhaps the title is a little misleading, but it’s an opportunity to combine two of my greatest loves: food and computers. I suppose even this intro is misleading. Oh. Forget it. Let’s get down to business. And this time our business is rather short (and sweet).

Running tcpdump on my Linksys router (well, Buffalo WHR-54GS to be precise, but the same famous WRT54G clone that runs open source firmware).


JDK 5 Debian etch Virtuozzo installation oddities

Not a particularly interesting post, unless you happen to be running Debian 4.0 etch on a Virtuozzo host and are trying to install JDK5. I saw a post with a user reporting the same error on the parallels forum, but couldn’t find a clear solution, until I bumped into a weird forum in German, which luckily linked to a related bug report (in English).

Turbogears, Lighttpd and WSGI for real dummies

Just to set the record straight, I’m the real dummy here, as you will obviously see.

I have absolutely no experience with python, lighttpd, turbogears, wsgi or fcgi, but I was given the task of setting it all up on a server. All I used was some online posts and common sense. Normally I wouldn’t post a blog entry about it, since most of the stuff is already out there. On this occasion however, I felt there was some missing documentation, or perhaps it was simply trickier than usual because I don’t know anything about any of these. Perhaps it’s one indicator of how this all looks from the sysadmin perspective, as opposed to that of a turbogears developer.

iCal on Mac with Apache on Windows

[Migrated from tuzig.com]

I’ve been struggling with getting the iCal app on Mac OS X to use a shared calendar on Apache installed on Windows and using domain authentication (SSPI). It is all supposed to be so simple, yet it didn’t work. Authentication failed with this message:


Access is denied at http://server.com/file.ics with this login and password

When using Safari to access the same WebDAV url, authentication works just fine. I could even download the calendar file from Safari and it would open in iCal, but for some bizarre reason neither subscribing nor publishing worked.

I’ll leave out all my trials and errors. Eventually it came down to SSPI authentication not working well with iCal, even with SSPIOfferBasic On and SSPIBasicPreferred On. Only when I added SSPIOfferSSPI Off did it start working…

I’m still slightly confused why no modification is allowed to subscribed calendars in iCal, as Sunbird does let you do it, but at least you can publish your own calendar and subscribe to others’.

Rsnapshot Server on Windows

[Migrated from tuzig.com]

2009 Aug 14 Update: Looks like rsnapshot is now packaged in cygwin! Thanks to pseudo-anonymous coward for the comment! Some of the information below may still be of interest, so it’s left unchanged.

Rsnapshot is a great tool for backing up data, keeping versions and using relatively minimal disk space. From the rsnapshot website:

“rsnapshot is a filesystem snapshot utility for making backups of local and remote systems. …”