AlephBet – javascript A/B Test framework for developers

I recently created AlephBet: a new javascript A/B Test framework, built for developers. This post tries to capture the motivation and some background for creating it in the first place, especially with so many commercial and open-source frameworks and services available for A/B testing.

Another Framework?

Nobody likes re-inventing the wheel, and I would genuinely doubt the need for “yet another” framework. However, after spending considerable time looking for alternatives, building my own seemed like the most sensible option. That being said, the Internet is a big place, so I could have misjudged what some of the tools offer, or missed another framework that does the same thing, or even better. I’d love to discover more of what’s out there.

Is it even a framework?

I’m using the term framework very loosely here, and throughout this post, especially considering that AlephBet is 5kb minified and gzipped. It’s more like a nano-framework, a wrapper, or a very small library. I also tag it as “for developers”. This means it doesn’t provide any shiny interface, just some common-sense patterns for tying your A/B testing together.

Some background

I’ve been advocating and using A/B testing regularly over the last few years. I firmly believe that most changes should be tested where possible, rather than blindly assuming they will bear positive results. It’s not always easy to define the hypothesis and build the experiment in a way that produces meaningful, statistically-significant results (especially within a reasonable timeframe), but it’s worth trying.

I started off using the rails-based Split gem. It’s a solid framework with a nice and clean API. However, it has one major limitation: it’s a server-side framework. I’m primarily a back-end guy; that’s where I feel “at home”. Using the best tool for the job trumps personal preferences, however. Javascript-based frameworks offer a few key advantages in my opinion:

  • Caching – running experiments with server-side caching in place is a real pain. I don’t want to give up on caching, nor play around with it too much. (It’s tricky enough even without adding multivariate experiments into the mix.)
  • The experiments are de-coupled from the code. This doesn’t seem like a big deal, but when you’re running several experiments and want to keep things clean, you see why this is better. Server-side A/B test frameworks force you to litter your code with unrelated concerns, and it’s hard to figure out which experiments are running where.
  • Some stuff is already javascript-based. If you want to set a goal on a button click that doesn’t necessarily trigger a request to the server, you have to capture it client-side anyway. What do you do then?
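That last point is worth a tiny illustration. A goal like “clicked the signup button” never reaches the server as a page request, so it has to be captured in the browser. A minimal sketch (the function names here are hypothetical, not any specific framework’s API):

```javascript
// Hypothetical sketch: wiring a client-side goal to a button click.
// "report" stands in for whatever sends the event to your analytics
// backend (e.g. a Google Analytics event or a keen.io call).
function bindGoal(element, goalName, report) {
  element.addEventListener('click', function () {
    report(goalName);
  });
}

// In the browser you'd call something like:
// bindGoal(document.querySelector('#signup'), 'signup clicked', sendEvent);
```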

A/B testing services

Given my preference for javascript-based frameworks, and the desire to de-couple experiments from the main codebase, a 3rd party service offered one further advantage: maintenance. Storing event data on our own servers and/or adding another component to the mix wasn’t appealing, and there were plenty of commercial tools specializing in exactly this. It therefore made perfect sense to pay a few bucks and let them take care of the heavy lifting.

My search led us to the two leading providers: Visual Website Optimizer (VWO), and the then new kid on the block: Optimizely.

Honeymoon period

Signing up for a trial was easy with both VWO and Optimizely, but we ran into some strange issues with VWO that couldn’t be resolved with their support. Optimizely worked out of the box without a hitch. The choice was easy. Optimizely offered a Silver package at 65 EUR per month, with a clear upgrade path to the Gold package as we grew. What more could you ask for?

We were very happy with Optimizely, despite some road bumps along the way (their new dashboard was horrific at first, but luckily improved over time). We recommended them to everyone we spoke to. It “just worked”. We didn’t really need any of the WYSIWYG stuff though. We normally just opened the code section, added a few lines to tweak the variation, and off we went to run the experiment.

Our experiments were running along fine. Traffic increased. Conversion improved. What could go wrong?

By the time we were ready to upgrade from Silver to Gold, we had a rather odd surprise. There were no longer any Gold or Silver packages. Suddenly, there was ‘Free’ or ‘Enterprise’, and no pricing info in sight. The free option wasn’t suitable, because it lacked some key features, like launching experiments based on cookies or browser type. The Enterprise plan, we were told, started at around 2000 EUR per month, including some heavy hand-holding, red-carpet onboarding and other perks we didn’t need. And we were not at a point where we could justify that spend.

We asked, we begged, we tried. Optimizely wouldn’t have us as customers any more. From loyal customers, we became a liability. Dead weight on Optimizely’s plan to rule the world. They kinda grandfathered us in, by allowing us to keep the Silver plan. But it wasn’t enough, and the overage charges were killing us.

Competition?

So surely there are other service providers out there, eager to take business away from Optimizely. What about our friends at VWO? They too seemed to have upped the ante. To use cookie-based targeting, you’re looking at the Enterprise plan, starting at $999+ per month (when paid annually). Ouch.

We sampled a few others, but they all fell short. Things were clunky or didn’t work at all. Or the pricing was out of our league. Are we really asking for too much? We don’t need fancy dashboards or WYSIWYG editors. Just give us a way to build an experiment with a js snippet for each variation, define a goal which we can also trigger via javascript, and a way to track those numbers. We can even do all the statistical-significance calculations ourselves. Why do we suddenly have to pay that much more, just for a feature that’s the equivalent of one if condition in javascript?
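To put that “one if condition” into perspective, here’s roughly what cookie-based targeting boils down to. A hedged sketch (in a browser, cookieString would be document.cookie; the function name is mine, not any vendor’s API):

```javascript
// Hypothetical sketch: gate an experiment on a cookie value.
// document.cookie looks like "a=1; b=2", so we check for an
// exact name=value pair among the cookie entries.
function shouldRunExperiment(cookieString, name, value) {
  return cookieString.split(/;\s*/).indexOf(name + '=' + value) !== -1;
}

// e.g. only run the experiment for visitors in the 'beta' group:
// if (shouldRunExperiment(document.cookie, 'ab_group', 'beta')) { ... }
```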

Open-Source to the rescue?

So we turned back to Open-Source. There must be dozens and dozens of frameworks, libraries and tools out there. Well, there are. But none of them hit the spot. Many are server-based. One in particular grabbed our attention: SixPack. It is language-agnostic and even comes with a javascript library. Unfortunately though, the javascript library waits for the server to pick the test variant. This was a big no-go for us. It could drastically increase the latency of our experiments. We could perhaps force a variant, but it felt like a hack. It also meant hosting the backend, which we weren’t so keen on.

So what about some lighter-weight JS-based frameworks? I searched github for ‘A/B test’, and several are worth mentioning. Most were using Google Analytics as the backend, which seems like a sensible choice. Plugging in another backend, like keen.io or mixpanel, was either available or easy enough to customize. None of them ticked all the boxes though:

  • pifantastic/ab [8 stars on github; last updated May 2014] – seemed to have a slightly odd API based on events. Some good ideas, but felt rough around the edges. No obvious concept of goals.
  • scenario.js [15 stars; last updated Aug 2013] – good name. Uses mixpanel for event storage. The API was nice and clean. However, it didn’t offer any local persistence, so every visit could present a different variant to the user. It also seemed too tightly-coupled to mixpanel, with no option for a different backend, and it didn’t support multiple goals.
  • abba [1162 stars; last updated Aug 2013] – looked very promising. Apparently it was created by Stripe, which drastically increased its credibility. Backend was Ruby / Mongo. I personally wasn’t too keen on Mongo, but could probably live with it if it hit the right marks in other aspects. It didn’t support multiple goals however, and felt a bit ‘heavy’, with tight-coupling to the backend.
  • labrats [6 stars; updated Jun 2013] – apart from the rather awkward name, I didn’t particularly like its dependency on jQuery, and found the API rather confusing.
  • Cohorts [211 stars; last updated Jul 2010] – deserves its own section. See below.

Cohorts

Despite having no updates since 2010, Cohorts appeared to be a good find. In fact, AlephBet is mostly inspired by Cohorts and uses a lot of its core structure. The code structure and API were just what I was looking for, and the whole codebase was small and easy to grasp. I particularly liked the pluggable storage adapter, which supports Google Analytics out of the box, but is easily extendable with other providers. I was able to write a small adapter for keen.io in a matter of minutes and get it working. This was a good start.

There were still a few important things missing or implemented in a way that didn’t feel right. Visitors (or experiment participants) were counted more than once, and the same happened with events (goals). I could hack around it, or fork it and update some elements. But I wanted to do more. For example, the terminology (cohorts, tests, events) felt a bit off to me. Perhaps being used to Optimizely, I’d rather look at those things as visitors, experiments and goals, respectively. The storage adapter felt like it should be called a tracking adapter (tracking events to analytics or any other provider). So rather than a fork, it felt more like a re-write.
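To give a feel for the pluggable adapter idea: an adapter is essentially an object with a couple of hooks the framework calls. The method names below are illustrative, not Cohorts’ or AlephBet’s exact API:

```javascript
// Hypothetical sketch of a pluggable tracking adapter: the framework calls
// these hooks, and the adapter decides where the events go (Google
// Analytics, keen.io, or - here - a plain array, handy for testing).
function makeArrayAdapter(events) {
  return {
    experimentStart: function (experiment, variant) {
      events.push({ type: 'start', experiment: experiment, variant: variant });
    },
    goalComplete: function (experiment, variant, goal) {
      events.push({ type: 'goal', experiment: experiment, variant: variant, goal: goal });
    }
  };
}
```

Swapping providers then means swapping one small object, with no changes to the experiment code itself.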

So that’s how things got started. I decided to re-write Cohorts. I prefer using coffeescript, so that was another reason I couldn’t easily just fork the project. I wanted to use browserify and split things into different files for better organization. I didn’t like inlining code where a library could be used, localStorage seemed like a better approach than cookies, and so on… Those things added up, and it became clear that I needed to create a different tool, albeit one very much based on Cohorts.

AlephBet

So there I was with a new project. It’s quite small, but fits exactly what I need. It has a clear definition of Experiments, Variants and Goals. You can assign goals to experiments (or experiments to goals). Just like Cohorts, you can create your own tracking adapter extremely easily. It only tracks unique visitors and goal completions, but you can still track non-unique ones if you want to. The API feels simple and approachable (thanks largely to Cohorts, although I’m happy to get feedback and improve, since this is very much a matter of taste).
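The unique-tracking behaviour can be sketched roughly like this. This is an illustration of the idea, not AlephBet’s actual internals; “storage” stands in for localStorage:

```javascript
// Hypothetical sketch of unique goal tracking: a goal is reported at most
// once per visitor, unless uniqueness is explicitly switched off.
function makeGoalTracker(storage, report) {
  return function complete(goalName, options) {
    var unique = !(options && options.unique === false);
    var key = 'goal:' + goalName;
    if (unique && storage[key]) return false; // already counted this visitor
    storage[key] = true;
    report(goalName);
    return true;
  };
}
```

Because the "seen" flag lives client-side, repeat visits and repeat clicks don’t inflate the conversion numbers.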

While creating AlephBet, I discovered npm, browserify, lodash custom modules, lodash-modularize, bower, keen.io and a few other bits and pieces. AlephBet is still pretty small, and I hope it stays this way. I’d love to get feedback, code contributions or suggestions. I’m hoping it will hit the right spot for others as well, but even if it doesn’t, it should at least scratch my own itch and replace Optimizely for us.

Please check it out on github, and feel free to leave a comment, open an issue or a pull request.

2 Responses to “AlephBet – javascript A/B Test framework for developers”

  1. Ryan Harmon

    Wow, this post hit the spot for me.

    I was beginning to feel alone in my search for a solution which did not rely on Google Analytics, or much server-side ‘decision-making’. I began from scratch as well.

    Thank you for writing this up! I feel much better :)

  2. Yoav Aner

    Glad you feel better, Ryan :) – and also to know I’m probably not the only one!

    AlephBet has evolved a little since I wrote it, so if you haven’t already, please take a look at https://github.com/Alephbet/gimel – the AWS Lambda backend for AlephBet.

    Cheers,
    Yoav
