Nobody likes re-inventing the wheel, and I would genuinely doubt the need for “yet another” framework. However, after spending considerable time looking for alternatives, building one seemed like the most sensible option. That being said, the Internet is a big place, so I could have misjudged what some of the tools offer, or missed some other framework that could do the same thing, or do it even better. I’d love to discover more of what’s out there.
Is it even a framework?
I’m using the term framework very loosely here, and throughout this post. Especially if you consider that AlephBet is 5kb after being minified and gzipped. It’s more like a nano-framework, or a wrapper, or a very small library. I also tag it as “for developers”. This means that it’s not providing any shiny interface, just some common-sense patterns for tying together your A/B testing.
I’ve been advocating and using A/B testing regularly over the last few years. I firmly believe that most changes should be tested if possible, rather than blindly assuming they will bear positive results. It’s not always easy to define the hypothesis and build the experiment in a way that produces meaningful, statistically significant results (especially within a reasonable timeframe), but it’s worth trying.
- Caching – running experiments with server-side caching in place is a real pain. I don’t want to give up on caching, nor play around with it too much. (It’s tricky enough even without adding multivariate experiments into the mix.)
- The experiments are de-coupled from the code. This doesn’t seem like a big deal, but when you’re running several experiments and want to keep things clean, you see why this is better. Server-side A/B test frameworks force you to litter your code with unrelated concerns, and it’s hard to figure out which experiments are running where.
A/B testing services
Our search led us to the two leading providers: Visual Website Optimizer (VWO), and the then new-kid-on-the-block: Optimizely.
Signing up for a trial was easy with both VWO and Optimizely, but we were having some strange issues with VWO that couldn’t be resolved with their support. Optimizely worked out of the box without a hitch. The choice was easy. Optimizely offered a Silver package at 65 EUR per month. There was a clear upgrade path to the Gold package as we grew. What more could you ask for?
We were very happy with Optimizely, despite some road bumps along the way (their new dashboard was horrific at first, but luckily improved over time). We were recommending them to whomever we spoke to. It “just worked”. We didn’t really need any of the WYSIWYG stuff though. We normally just opened the code section, added a few lines to tweak the variation, and off we went to run the experiment.
Our experiments were running along fine. Traffic increased. Conversion improved. What could go wrong?
By the time we were ready to upgrade from Silver to Gold, we had a rather odd surprise. There were no longer any Gold or Silver packages. Suddenly, there was ‘Free’ or ‘Enterprise’, and no pricing info in sight. The free option wasn’t suitable, because it lacked some key features, like launching experiments based on cookies or browser-type. We were told the Enterprise plan started at around 2000 EUR per month, including some heavy hand-holding, red-carpet onboarding and some other perks we didn’t need. We were not at a point where we could justify this spend.
We asked, we begged, we tried. Optimizely wouldn’t have us as customers any more. From loyal customers, we became a liability. Dead-weight on Optimizely’s plan to rule the world. They kinda grandfathered us, by allowing us to keep the Silver plan. But it wasn’t enough, and the overage charges were killing us.
So surely there are other service providers out there, eager to take away business from Optimizely. What about our friends at VWO? They too seemed to have upped the ante. To use cookie-based targeting, you’re looking at the Enterprise plan starting at $999+ per month (when paid annually). Ouch.
Open-Source to the rescue?
So what about some lighter-weight JS-based frameworks? I did a search for ‘A/B test’ on github, and there are several worth mentioning. Most were using Google Analytics as the backend. This seems like a sensible choice. Plugging in another backend, like keen.io or mixpanel, was either available or easy enough to customize. None of them ticked all the boxes, though:
- pifantastic/ab [8 stars on github; last updated May 2014] – seemed to have a slightly odd API based on events. Some good ideas, but felt rough around the edges. No obvious concept of goals.
- scenario.js [15 stars; last updated Aug 2013] – good name. Using mixpanel for event storage. The API was nice and clean. However, it didn’t offer any local persistence, so it seems like every visit could produce a different variant for the user. It also seemed too tightly-coupled to mixpanel, not offering a different backend. It also didn’t support multiple goals.
- abba [1162 stars; last updated Aug 2013] – looked very promising. Apparently it was created by Stripe, which drastically increased its credibility. Backend was Ruby / Mongo. I personally wasn’t too keen on Mongo, but could probably live with it if it hit the right marks in other aspects. It didn’t support multiple goals however, and felt a bit ‘heavy’, with tight-coupling to the backend.
- labrats [6 stars; updated Jun 2013] – apart from the rather awkward name, I didn’t particularly like its dependency on jQuery and found the API rather confusing.
- Cohorts [211 stars; last updated Jul 2010] – deserves its own section. See below.
Despite having no updates since 2010, Cohorts appeared to be a good find. In fact, AlephBet is mostly inspired by Cohorts and uses a lot of its core structure. The code structure and API were just the thing I was looking for, and the whole codebase was small and easy to grasp. I particularly liked the pluggable storage adapter, supporting Google Analytics out of the box, but being easily extendable with other providers. I was able to write a small adapter for keen.io in a matter of minutes and get it working. This was a good start. There were still a few important things missing or implemented in a way that didn’t feel right. Visitors (or experiment participants) were counted more than once, and the same happened with events (goals). I could hack around it, or fork it and update some elements. But I wanted to do more. For example, the terminology (cohorts, tests, events) felt a bit off to me. Perhaps being used to Optimizely, I’d like to look at those things as visitors, experiments and goals, respectively. The storage adapter felt like it should be called a tracking adapter (tracking events to analytics or any other provider). So rather than a fork, it felt more like a re-write.
So that’s how things got started. I decided to re-write Cohorts. I prefer using coffeescript, so that was another reason I couldn’t easily just fork the project. I wanted to use browserify and split things into different files for better organization. I didn’t like inlining some code where a library can be used, localStorage seemed like a better approach than cookies, and so on… Those things added up and it became clear that I needed to create a different tool, albeit one very much based on Cohorts.
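To make the persistence point concrete, here is a rough sketch of what “sticky” variant assignment can look like when it is backed by something shaped like localStorage. This is not AlephBet’s (or Cohorts’) actual code: `KeyValueStore`, `MemoryStore`, `assignVariant` and the key format are all names I made up for illustration, and the in-memory store is only there so the snippet runs outside a browser.

```typescript
// Hypothetical sketch: none of these names come from AlephBet or Cohorts.

// The small subset of the browser Storage interface this sketch relies on,
// so window.localStorage could be passed in directly.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for localStorage, handy outside the browser.
class MemoryStore implements KeyValueStore {
  private items: { [key: string]: string } = {};

  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.items, key)
      ? this.items[key]
      : null;
  }

  setItem(key: string, value: string): void {
    this.items[key] = value;
  }
}

// Returns the variant already stored for this experiment. If there is none
// (or the stored one is no longer valid), picks one at random and remembers
// it, so later visits get the same variant.
function assignVariant(
  experiment: string,
  variants: string[],
  store: KeyValueStore,
  random: () => number = Math.random
): string {
  const key = "experiment:" + experiment + ":variant"; // assumed key format
  const saved = store.getItem(key);
  if (saved !== null && variants.indexOf(saved) !== -1) {
    return saved;
  }
  const chosen = variants[Math.floor(random() * variants.length)];
  store.setItem(key, chosen);
  return chosen;
}
```

In a browser you would pass `window.localStorage` as the store. The `random` parameter is only there to make the behaviour easy to test.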
So there I was with a new project. It’s quite small, but should fit exactly what I need. It has a clear definition of Experiments, Variants and Goals. You can assign goals to experiments (or experiments to goals). Just like Cohorts, you can create your own tracking adapter extremely easily. By default it only tracks unique visitors or goals, but if you want to, you can still track non-unique ones. The API feels simple and approachable (thanks in large part to Cohorts, although I’m happy to get feedback and improve, since this is very much a matter of taste).
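As a rough illustration of the tracking adapter idea and of unique vs. non-unique tracking, here is a minimal sketch. To be clear, this is not AlephBet’s real API: `TrackingAdapter`, `RecordingAdapter`, `Experiment` and their methods are hypothetical names, and the “already seen” bookkeeping lives in memory here, whereas in a browser it would presumably be persisted.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not AlephBet's API.

// Anything that can receive experiment events: Google Analytics, keen.io, etc.
interface TrackingAdapter {
  experimentStart(experiment: string, variant: string): void;
  goalComplete(experiment: string, variant: string, goal: string): void;
}

// Example adapter that only records events in memory. A real adapter would
// forward them to an analytics provider instead.
class RecordingAdapter implements TrackingAdapter {
  events: string[] = [];

  experimentStart(experiment: string, variant: string): void {
    this.events.push("start|" + experiment + "|" + variant);
  }

  goalComplete(experiment: string, variant: string, goal: string): void {
    this.events.push("goal|" + experiment + "|" + variant + "|" + goal);
  }
}

class Experiment {
  name: string;
  variant: string;
  private tracker: TrackingAdapter;
  // Keys already reported for this visitor (kept in memory in this sketch).
  private seen: { [key: string]: boolean } = {};

  constructor(name: string, variant: string, tracker: TrackingAdapter) {
    this.name = name;
    this.variant = variant;
    this.tracker = tracker;
  }

  // Reports participation at most once per visitor.
  start(): void {
    if (this.markSeen("start")) {
      this.tracker.experimentStart(this.name, this.variant);
    }
  }

  // Reports a goal; unique goals are reported only the first time.
  complete(goal: string, unique: boolean = true): void {
    if (!unique || this.markSeen("goal:" + goal)) {
      this.tracker.goalComplete(this.name, this.variant, goal);
    }
  }

  // Returns true the first time a key is seen, false afterwards.
  private markSeen(key: string): boolean {
    if (this.seen[key]) {
      return false;
    }
    this.seen[key] = true;
    return true;
  }
}
```

Swapping providers then just means passing a different object that implements the same two methods, which is the part I liked so much about the Cohorts design.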
Through the creation process of AlephBet, I discovered npm, browserify, lodash custom modules, lodash-modularize, bower, keen.io and a few other bits and pieces. AlephBet is still pretty small and I hope it will stay this way. I’d love to get feedback, code contributions or suggestions. I’m hoping it will hit the right spot for others as well, but even if it doesn’t – it should at least scratch my own itch and replace Optimizely for us.
Please check it out on github, feel free to leave a comment, open an issue or a pull request.