
Fixing multithreading for Rails in development mode

Preface

In Rails, automatic code reloading in development mode is based on autoload,
which is not threadsafe. This means you can’t rely on it while developing multithreaded Rails applications (especially on JRuby). But several monkeypatches can do the trick.

Workaround

When we discovered this limitation almost a year ago, we had the following options:

  • develop without code reloading and restart the server after every change
  • fix the autoload threadsafety issue
  • create some workaround

Like any RoR developer, I can’t imagine my life without automatic code
reloading, so restarting the server was not an option.
Fixing autoload is a nontrivial task. The corresponding bug was reported 4
years ago and still wasn’t fixed.

So the only choice we had was to create a workaround. A solution
quickly came to mind: simply don’t use multithreading in development mode. We created a fallback that made our application logic run in single-threaded mode in development and in multithreaded mode in production.
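
To give an idea of what such a fallback can look like, here is a minimal sketch. It is not our actual code: JobRunner and the threaded: flag are hypothetical names invented for this illustration, and in a Rails app the flag would presumably be derived from the environment.

```ruby
# Hypothetical helper (not from our app): runs jobs either in threads or inline.
module JobRunner
  # threaded: true  -> one thread per job (production-like behaviour)
  # threaded: false -> jobs run sequentially in the calling thread (development fallback)
  def self.run(jobs, threaded:)
    if threaded
      # Thread#value joins the thread and returns the block's result.
      jobs.map { |job| Thread.new { job.call } }.map(&:value)
    else
      jobs.map(&:call)
    end
  end
end

jobs = [-> { 1 + 1 }, -> { 2 * 3 }]

# Assumption: in a Rails app the flag might be something like !Rails.env.development?
sequential = JobRunner.run(jobs, threaded: false)
parallel   = JobRunner.run(jobs, threaded: true)
```

Both calls return the same results in the same order; only the threads doing the work differ, which is why such a switch can hide threading bugs until production.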

Fix of autoload

After 4 years of struggling, autoload was fixed in ruby-trunk. The fix is
scheduled to be delivered with Ruby 1.9.4 at the beginning of 2013. Luckily for us, the patch was ported to JRuby 1.6.6, so we don’t have to wait till Christmas.

rvm install jruby-1.6.7

Then I disabled our fallback for development mode, and … found out that the server crashed repeatedly in different places. Switching to production mode instantly fixed all the exceptions. It looks like the threadsafety of autoload was not the only issue.

Digging into Rails guts

Every exception was different, but in most cases it was a database-related
issue: some objects were missing from the database, some were not updated, SQL queries deadlocked, and so on.

I started by grepping the database query logs. Very soon I observed some quite weird behavior: a bunch of SQL queries from one controller action were handled by different database connections. But they should all have been issued over a single AR connection!
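
If plain grep gets tedious, the same check can be scripted. The sketch below is only an illustration: the log format (a req= and conn= tag on every line) and the sample lines are made up, so the regexp would have to be adapted to whatever your database or Rails log really prints.

```ruby
# Assumed, hypothetical log format: "[req=<request id> conn=<connection id>] <SQL>"
LOG_LINE = /\[req=(?<req>\w+) conn=(?<conn>\w+)\]/

# Returns { request id => [connection ids] } for requests that touched
# more than one database connection.
def requests_with_mixed_connections(lines)
  by_request = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    match = LOG_LINE.match(line)
    next unless match # skip lines that do not follow the assumed format
    by_request[match[:req]] << match[:conn]
  end
  by_request.transform_values(&:uniq).select { |_req, conns| conns.size > 1 }
end

# Invented sample data, for illustration only.
sample = [
  "[req=a1 conn=3] SELECT * FROM users",
  "[req=b2 conn=4] SELECT * FROM posts",
  "[req=a1 conn=5] UPDATE users SET name = 'x'",
  "[req=b2 conn=4] COMMIT"
]

suspicious = requests_with_mixed_connections(sample)
```

With this sample, request a1 is the only one reported, because its queries went through connections 3 and 5.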

After this finding, the main suspect became the infamous ActiveRecord::ConnectionAdapters::ConnectionPool. Again! In case you
don’t know, ConnectionPool is not threadsafe in Rails < 3.2.4, so we already had
a monkeypatch for it.
After adding some traces to the ConnectionPool checkin and checkout
methods, we got the following flow:

But why was the connection checked in in the middle of job processing?
ConnectionPool has 3 working strategies, and
we used the simplest one: a connection is checked out on demand, and when the controller action is done, all connections used by
the current thread are checked in by the clear_active_connections! method.

But in development mode something checks in connections while they are actively being used. Such behavior may not affect some applications, but with heavy use of database transactions, which are bound to db connections, disaster is guaranteed.

There are just a few methods in ActiveRecord that check connections in or out. After adding several more traces, I got the following flow:

Connections were checked in by clear_reloadable_connections!, which is
called after every request is processed in development mode. Here is how it looks:

def clear_reloadable_connections!
  # This part checks in all checked-out connections, even those used by other threads!
  @reserved_connections.each do |name, conn|
    checkin conn
  end
  @reserved_connections = {}
  # This part disconnects all connections that require reloading.
  @connections.each do |conn|
    conn.disconnect! if conn.requires_reloading?
  end
  @connections.delete_if do |conn|
    conn.requires_reloading?
  end
end

It can be split into two parts. The first one is the root of our issue: it simply checks in all active connections from all threads, no matter whether they are in use or not. So why is it so lame? Git blame shows that this method was added 6 years ago. The commit message says “Only reload connections in development mode that supports (and requires that) — in other words, only do it for SQLite”.
And here is how it looked 6 years ago:

def clear_reloadable_connections!
  @@active_connections.each do |name, conn|
    conn.disconnect! if conn.supports_reloading?
    @@active_connections.delete(name)
  end
end

A bit simpler, but it does two things:

  • disconnects SQLite connections (the supports_reloading? method returns true only for SQLite connections)
  • checks in ALL connections without any regard for other threads

Before this commit, another method was called instead of
clear_reloadable_connections!: clear_active_connections!, which looked like this:

def clear_active_connections!
  clear_cache!(@@active_connections) do |name, conn|
    conn.disconnect!
  end
end

def clear_cache!(cache, thread_id = nil, &block)
  if cache
    if @@allow_concurrency
      # In a multithreaded environment this works only with the current thread's
      # cache, leaving other threads' caches intact.
      thread_id ||= Thread.current.object_id
      thread_cache, cache = cache, cache[thread_id]
      return unless cache
    end

    cache.each(&block) if block_given?
    cache.clear
  end
ensure
  if thread_cache && @@allow_concurrency
    thread_cache.delete(thread_id)
  end
end

Unlike clear_reloadable_connections!, this one respects other threads and clears only the connections used by the current thread. It looks like clear_reloadable_connections! was de-threadsafed unintentionally, and this should be fixed.
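
The difference between the two clearing strategies is easy to reproduce outside of Rails. Here is a toy model, assuming nothing about ActiveRecord internals beyond what is shown above: ToyPool and survivors_after are hypothetical names, and the pool only records which thread has reserved which connection.

```ruby
require "thread"

# Toy model, NOT ActiveRecord: remembers one reserved connection per thread.
class ToyPool
  def initialize
    @reserved = {} # thread object_id => connection name
    @lock = Mutex.new
  end

  def checkout(conn)
    @lock.synchronize { @reserved[Thread.current.object_id] = conn }
  end

  # Same shape as clear_reloadable_connections!: releases every reservation.
  def clear_all!
    @lock.synchronize { @reserved.clear }
  end

  # Same idea as the old clear_cache!: releases the calling thread's entry only.
  def clear_current_thread!
    @lock.synchronize { @reserved.delete(Thread.current.object_id) }
  end

  def reserved_count
    @lock.synchronize { @reserved.size }
  end
end

# Counts the reservations left after the main thread "finishes its request"
# while a worker thread is still in the middle of its own.
def survivors_after(clear_method)
  pool = ToyPool.new
  started = Queue.new
  finish = Queue.new
  worker = Thread.new do
    pool.checkout(:worker_conn)
    started << true # signal that the worker holds a connection
    finish.pop      # stay "mid-request" until told to finish
  end
  started.pop
  pool.checkout(:main_conn)
  pool.public_send(clear_method)
  count = pool.reserved_count
  finish << true
  worker.join
  count
end

global_result = survivors_after(:clear_all!)            # worker loses its connection too
scoped_result = survivors_after(:clear_current_thread!) # worker keeps its connection
```

In the first run nothing is left reserved even though the worker is still busy, which is the situation our traces showed. In the second run the worker's reservation survives.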

Solution

If we don’t use SQLite, why bother with a fix for clear_reloadable_connections!? Instead we can just disable it :). Here is the dumbest possible monkeypatch:

class ActiveRecord::ConnectionAdapters::ConnectionPool
  # Disable it, since in development mode it checks in ALL connections
  # from ALL threads after every request.
  def clear_reloadable_connections!; end
end

We had about 5 minutes to celebrate the victory before ActiveRecord::ConnectionTimeoutError, “could not obtain a database connection within 5 seconds”, brought us back to earth. It looks like database connections leak from ConnectionPool.

Further digging showed that WEBrick leaks connections when an exception (such as RoutingError) is raised. Nobody had noticed this because clear_reloadable_connections! checks in all connections after every request.

ActiveRecord has a special method for this: clear_stale_cached_connections!.
It checks in all leaked connections from dead threads, leaving active ones intact. The easiest place to call it is at the beginning of the checkout method:

class ActiveRecord::ConnectionAdapters::ConnectionPool
  def checkout
    clear_stale_cached_connections! if Rails.env.development? #hack for WEBrick
    ...
  end
end
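
The idea behind reclaiming stale connections can be sketched with another toy model. This is not the ActiveRecord implementation: LeakyPool and clear_stale! are hypothetical names, and the only real API used is Thread#alive?.

```ruby
# Toy model, NOT ActiveRecord: reservations are keyed by the owning Thread.
class LeakyPool
  def initialize
    @reserved = {} # Thread => connection name
    @lock = Mutex.new
  end

  def checkout(conn)
    @lock.synchronize { @reserved[Thread.current] = conn }
  end

  # Releases reservations whose owning thread has finished;
  # threads that are still alive keep theirs.
  def clear_stale!
    @lock.synchronize { @reserved.delete_if { |thread, _conn| !thread.alive? } }
  end

  def reserved
    @lock.synchronize { @reserved.values }
  end
end

pool = LeakyPool.new

# This thread exits without checking its connection in, e.g. because of an exception.
Thread.new { pool.checkout(:leaked_conn) }.join

# The main thread is alive and still using its connection.
pool.checkout(:active_conn)

before = pool.reserved
pool.clear_stale!
after = pool.reserved
```

Before the cleanup both connections are reserved. Afterwards only the one owned by the live thread remains, so the leaked one is free to be handed out again.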

Voila! We have a multithreaded application running in development mode with code reloading!

Summary

To make code reloading work in multithreaded mode you need:

  • JRuby >= 1.6.6 or MRI >= 1.9.4, where autoload threadsafety is fixed
  • a ConnectionPool monkeypatch that disables clear_reloadable_connections!
  • a ConnectionPool monkeypatch that deals with connection leakage in WEBrick
  • Rails >= 3.2.4, where the threadsafety issues with ConnectionPool are fixed