Invisible reCAPTCHA v3 with Rails and Devise

We’ve recently been hit by more and more bots.

Some of them are crawling our site and hitting valid or invalid endpoints. We’ve seen plenty of credential stuffing attacks as well. Most of them are distributed across different IPs, with each IP hitting us at a low frequency.

And most recently, someone abused our registration form to spam their recipients via our system.

It was quite clever actually. When you register, you enter your name, email and password. We then send a confirmation email saying something like

“Hey Roberta, thanks for joining. Please click here to confirm your account”.

Now those guys used their victim’s email address, and used the name field to link to a URL. So those users would get an email

“Hey lottery tickets http://some.link, thanks for joining. Please click here to confirm your account”.

Slimy. Naturally our own email system took the hit of sending spam. Double ouch.

Luckily, we had some anomaly detection in place, and we blocked those guys quickly. They used some browser automation from a fixed set of IPs, so it was easy to block. At least until the next wave…

I’ve been dealing with those types of scenarios with fail2ban, and it’s really quite effective. We define regular expressions to inspect our log files matching certain patterns, and then ban if we see repeated offensive behaviour. fail2ban is limited though in some aspects.
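To make the idea concrete, here is a rough Ruby sketch of what such a rule boils down to: match log lines against a regular expression, count hits per IP within a time window, and flag the IP once it crosses a limit. The log format, the FAILED_LOGIN pattern, the offending_ips helper and the max_retry / find_time values are all made up for illustration. Real fail2ban filters and jails are configured differently, and this sketch assumes the lines arrive in chronological order.

```ruby
require "time"

# Hypothetical log line format: "<ISO 8601 time> <ip> <message>"
FAILED_LOGIN = /\A(?<time>\S+) (?<ip>\d{1,3}(?:\.\d{1,3}){3}) Failed login/

# Returns the IPs that matched the pattern at least max_retry times
# within a find_time-second window.
def offending_ips(lines, max_retry: 3, find_time: 600)
  hits = Hash.new { |hash, ip| hash[ip] = [] }
  banned = []

  lines.each do |line|
    match = FAILED_LOGIN.match(line)
    next unless match

    ip = match[:ip]
    now = Time.iso8601(match[:time])
    # keep only the earlier hits that are still inside the window
    hits[ip] = hits[ip].select { |seen_at| now - seen_at <= find_time }
    hits[ip] << now
    banned << ip if hits[ip].size >= max_retry && !banned.include?(ip)
  end

  banned
end
```

Every server running something like this only sees its own slice of the traffic, which is exactly the scaling problem described next.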

First of all, those rules are a bit of a pain to create and maintain, and you need to make sure the offending IP appears on the application log record you want to capture. In some cases it’s easy, but not always. The bigger problem however is that fail2ban doesn’t scale. The more servers you have — let’s say in a load-balanced setup — the less accurate fail2ban becomes. Or you need to aggregate all your logs on a single fail2ban host, creating a single point of failure or a bottleneck…

So I was searching for a better solution. Sadly there aren’t many. Cloudflare, which we also use, offers some degree of protection. But it’s not as flexible. And of course there’s reCAPTCHA. You know, those annoying things asking you to pick traffic signs, or even just click “I’m not a robot”?

Now I was initially hesitating to use it. I’m not sure why, but the fact that it doesn’t really have any real competition bothers me. Plus, as a user, I’m frequently annoyed by those challenges, and I hate this experience.

Luckily, the latest version of reCAPTCHA (v3) doesn’t present any user-facing challenges. It’s completely invisible. The no-competition problem is not something I can solve. I discovered that even Cloudflare itself uses reCAPTCHA in some cases! And these guys have their own JavaScript challenge and whatnot… So I decided to bite the bullet, and give it a shot.

Setting it up is surprisingly simple, and from my limited experience, quite effective. That is, the scores it produced were surprisingly accurate, although my ability to test different scenarios was limited.

I’ll try to give some pointers for implementing reCAPTCHA v3 with Rails 5.1 and Devise 4. The implementation can work on any form or controller however, and not just with Devise.

How does it work exactly?

There are two components for using reCAPTCHA v3: the client, and the server.

The client runs the reCAPTCHA JavaScript code, which tests whether you’re a human or a bot. The client code doesn’t return the score though, since that wouldn’t be secure. Instead, the client-side JS gets a token from Google. We take this token and add it to our form. When the form is submitted (POSTed to the server), the server uses the token to retrieve the score and decide whether it’s a human or a bot.

The score ranges from 1.0 (definitely human) to 0.0 (definitely bot). How do you decide on the threshold below which you block the request? That’s up to you…

What if the server gets no token? Well, it should block the request. The JavaScript code wasn’t run, which suggests it was a bot (but maybe not? see below).

And if the token is invalid? Well, either something went wrong, or it was a bot. We’ll block it as well.

What if the Google API times out? That’s something you can consider the trade-offs for. You might be ok with some requests going untested, and therefore fail open. Or decide that you must validate, and fail closed.
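The rules above fit into a few lines of plain Ruby. This is only a sketch of the decision logic, not the actual controller code: allow_request?, the :timeout marker and the fail_open flag are names made up for illustration, and result is assumed to be the parsed JSON response from Google (with "success" and "score" keys).

```ruby
# result is one of:
#   nil       - the form carried no token at all
#   :timeout  - the call to the Google API timed out
#   a Hash    - the parsed JSON response from Google
def allow_request?(result, threshold: 0.3, fail_open: true)
  return false if result.nil?             # no token: the JS never ran
  return fail_open if result == :timeout  # your call: fail open or closed
  return false unless result["success"]   # invalid or expired token
  result["score"].to_f >= threshold       # block anything below the threshold
end
```

For example, allow_request?(:timeout, fail_open: false) blocks the request when Google can’t be reached, while the default lets it through untested.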

Use a Gem?

There’s actually a reCAPTCHA Gem available. So if you’re looking for something quick and easy, go and use it. It doesn’t support reCAPTCHA v3 yet (but I’m sure it will soon).

The reason I opted out of using a gem, besides the lack of v3 support, was that it felt a bit cumbersome. As you can see, the code is fairly simple, and I prefer to have it explicitly in my codebase. It was also a good exercise to make sure I understand how things work.

The frontend code

This isn’t the simplest implementation, but it’s async and makes sure reCAPTCHA is loaded. The code below is in CoffeeScript.


recaptcha_site_key = "your reCAPTCHA site key"
window.recaptcha_callback = ->
  grecaptcha.execute(
    recaptcha_site_key, {action: "registration"}
  ).then (token) ->
    $("form").prepend(
      "<input type='hidden' name='recaptcha_token' value='#{token}'>"
    )
( ->
  resource = document.createElement("script")
  resource.type = "text/javascript"
  resource.async = true
  resource.crossorigin = "Anonymous"
  resource.src = "https://www.google.com/recaptcha/api.js?onload=recaptcha_callback&render=#{recaptcha_site_key}"
  script = document.getElementsByTagName("script")[0]
  script.parentNode.insertBefore(resource, script)
)()

Let’s break it down a little.

Let’s start from the end of the snippet. The self-invoking function at the bottom simply loads the reCAPTCHA engine from Google. It does so asynchronously, and when it’s finished loading, it calls our callback, window.recaptcha_callback, defined near the top.

This is where our code lives. I used a simple example, but you can extend it to fit your needs.

In this example, we’re calling grecaptcha.execute with our site key and the action name. The action name is important, and you should define it based on the type of page you have. You can read more about actions here. This call returns a promise that resolves to a unique token. We then add this token to our form using jQuery. When the user submits the form, the token is passed to our Rails app. We called this parameter recaptcha_token.

If you plan to use reCAPTCHA on more than one page or form (for example, your registration and login pages, or for any other form that might be hit by bots), then you should adapt the code to call grecaptcha.execute with a different action, based on the page you’re on.
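One reason the action name matters on the server side: the verification response from Google echoes back the action the token was issued for, so the backend can check that a token obtained on, say, the login page isn’t being reused against the registration form. A minimal sketch, assuming the parsed response is a hash with "success" and "action" keys as described in Google’s v3 documentation. The helper name expected_action? is hypothetical.

```ruby
# Hypothetical helper: true only if Google accepted the token and it was
# issued for the action we expect on this endpoint.
def expected_action?(result, expected)
  result["success"] == true && result["action"] == expected
end
```

In the registration flow you would call it with "registration", the same action name passed to grecaptcha.execute on that page.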

Backend (with devise)

Now let’s take a look at our backend. You can use reCAPTCHA with your own custom controllers, but in this example, I’m going to use Devise specifically. It’s the most common authentication gem for Rails. Parts of the code however are generic, and re-usable.

I’m going to define a controller concern, which we can use in any controller, to make it more re-usable. I created a new module: controllers/concerns/recaptcha.rb


module Recaptcha
  RECAPTCHA_URL = "https://www.google.com/recaptcha/api/siteverify".freeze
  def validate_recaptcha(threshold=0.3)
    # params.require raises an exception if recaptcha_token is missing
    token = params.require(:recaptcha_token)
    result = JSON.parse(
      # using RestClient to keep things clean and simple
      RestClient.post(RECAPTCHA_URL,
                      :secret => ENV["recaptcha_secret_key"],
                      :response => token,
                      :remoteip => request.remote_ip)
    )
    # if the Google API results aren't a success, we return false
    return false unless result["success"]
    # if the score is below our threshold, we also return false
    return false if result["score"] < threshold
    # otherwise, the request was validated and is above our threshold
    return true
  rescue ActionController::ParameterMissing
    return false
  rescue RestClient::Exceptions::Timeout
    # you can decide if you want to fail-open or fail-close here.
    return true
  end
end

This module includes a single method, so when we include it in any of our controllers, we can call validate_recaptcha, and it returns true or false depending on whether the request validated correctly.

What’s happening here:

  • If our recaptcha_token is missing, we raise an exception, then catch it and return false
  • We use the token to call the Google reCAPTCHA api and give us a score
  • Our threshold is configurable. I’ve set it to 0.3 by default, but you might want to tweak it, and you can set a higher or lower threshold differently for each form or controller
  • If the result we get is not a "success", or if the score is lower than our threshold, we return false
  • Otherwise, the score is there, the token is valid, and above the threshold, so we return true
  • If the request to the Google API times out, we return true (fail open in this case, but you can decide what works best for you)
  • I think it’s useful to log an error for troubleshooting. You can also use reCAPTCHA in log-only mode, and always return true if you want to gather some typical score values to decide on the right threshold. For simplicity, I didn’t include any logging though
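As a rough idea of what the logging and log-only mode from the last bullet could look like, here is a small sketch. It is not part of the module above: check_score and its log_only flag are hypothetical names, and result is assumed to be the parsed response from Google, where "error-codes" is the field Google uses to report problems with the token.

```ruby
require "logger"

# Hypothetical wrapper around the score check. In log-only mode it records
# what would have been blocked, but never blocks.
def check_score(result, threshold:, logger:, log_only: false)
  score = result["score"].to_f
  passed = result["success"] == true && score >= threshold
  unless passed
    logger.warn("recaptcha: score=#{score} threshold=#{threshold} " \
                "errors=#{result['error-codes'].inspect} log_only=#{log_only}")
  end
  # in log-only mode every request goes through, whatever the score
  log_only || passed
end
```

In a Rails app you would most likely pass Rails.logger as the logger, run with log_only: true for a while, and then pick a threshold based on the scores that show up in the logs.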

Now let’s use our little module inside our devise controller


class Users::RegistrationsController < Devise::RegistrationsController
  include Recaptcha
  def create
    if validate_recaptcha
      super
    else
      recaptcha_error
    end
  end
  # ...
  private
  # see https://github.com/plataformatec/devise/wiki/How-To:-Use-Recaptcha-with-Devise
  def recaptcha_error
    self.resource = resource_class.new sign_up_params
    resource.validate
    resource.errors.add(:base, "Registration Error. Please try again")
    # this part is different from the devise wiki, since we don't use a create template, only new
    respond_with_navigational(resource) { render :new }
  end
end

The devise create action is called when a new user registers. We validate the captcha before calling the create action itself. So if our captcha validation fails, we don’t create a new user. We return a generic error to the user. You can decide whether you want to return a more specific error. Legitimate users who are caught by this would be confused anyway, and I’m not sure we want to help bots by giving them too much information either, so I kept the error a bit vague.

As you can imagine, it should be very easy to include this module and use the validation on any controller you might need, not just with Devise. All we have to make sure is that we pass the recaptcha_token to the controller, and that we handle validation failures.

Other thoughts

reCAPTCHA has the potential to kill off those bots from the get go, so in that sense it’s a great solution. But in some environments, it’s not suitable.

For example, if you want to serve browsers or clients without JavaScript, or if you have privacy concerns. Dealing with false positives is another concern. You really don’t want to block legitimate users, and how do you know who gets blocked and based on what behaviour? You really have to put a lot of faith in Google’s machine-learning algorithms.

I was therefore wondering. I think fail2ban is neat, but I’m surprised there isn’t really a distributed version of it. I guess something that would plug into your log aggregation. I’ve been using Logentries, Scalyr, and most recently Datadog for logging, and all of them have some degree of pattern-matching and triggers, mostly to send an alert. It would be great if it was easier to plug those alerts back into the system and put a block in place.
