Replication with Rails on Heroku

Ryan Law edited this page Aug 6, 2014 · 26 revisions

Required Configuration

First, install the gem per instructions in the 'Install' section in the README.

Note: master HEAD may not currently work with this configuration. If you have problems, use the rocketmobile fork HEAD until this issue is resolved (this message will be removed):

gem "ar-octopus", github: "rocketmobile/octopus", require: "octopus"

Next, place this dynamic Octopus configuration in config/shards.yml. This configures your Heroku follower databases as Octopus slaves for read-only replication usage.

Finally, mark the appropriate AR models by using the replicated_model class method:

class StaticThing < ActiveRecord::Base
  replicated_model
end

For that model, read queries will go to your followers and write queries to master. This is appropriate for models that won't yield unexpected behavior when read queries come from a slave that may be a few seconds behind the master it follows.

That's it. You're now using horizontal scaling with follower databases to increase your application performance.

You can read more about Octopus in the repo README to learn how the using methods allow you to use your read-only followers in a more granular fashion in controllers, models and Active Record queries, if needed. Use the using methods with the Octopus.followers initializer monkey-patch to ensure your code is ready for future horizontal database scaling without any code changes.

Initializer (Recommended)

Place this initializer code in config/initializers/octopus.rb for:

  • Convenient logging of the slaves configured at app initialization:
=> Booting Thin
=> Rails 3.2.12 application starting in production
=> Ctrl-C to shutdown server
=> 2 databases enabled as read-only slaves
  * CRIMSON follower
  * PINK follower
>> Thin web server (codename Knife)
>> Listening on 0.0.0.0:----, CTRL+C to stop
  • Use of Octopus.followers to retrieve configured followers
    • Example: StaticThing.using(Octopus.followers).all
    • Necessary until using_group functionality is more robust.

More Information and Options

This configuration uses the environmental variables that Heroku sets up when you create your heroku-postgresql add-on databases to automatically set up any slaves that are present. This assumes that you desire all non-primary databases to be used as read-only slaves. If you only want certain followers to be used, see the 'Enabling/Disabling Followers as Slaves' section below.

Your primary database will still be configured as usual by the configuration that Heroku injects into database.yml.
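To illustrate the idea of treating every non-primary database as a follower, here is a minimal sketch in plain Ruby. It is not the actual configuration code: the helper name `follower_urls` and the sample URLs are hypothetical, and it assumes Heroku exposes each database as `HEROKU_POSTGRESQL_<COLOR>_URL`, with `DATABASE_URL` holding the same value as the primary's color variable.

```ruby
# Hypothetical helper (not part of Octopus or the configuration above):
# returns { "pink_follower" => "postgres://..." } for every
# HEROKU_POSTGRESQL_<COLOR>_URL entry whose value differs from DATABASE_URL.
def follower_urls(env)
  primary = env["DATABASE_URL"]
  env.each_with_object({}) do |(key, url), followers|
    match = key.match(/\AHEROKU_POSTGRESQL_([A-Z]+)_URL\z/)
    next unless match        # not a heroku-postgresql variable
    next if url == primary   # this color is the primary database
    followers["#{match[1].downcase}_follower"] = url
  end
end

# Sample environment with made-up URLs; GREEN plays the primary here.
sample_env = {
  "DATABASE_URL"                  => "postgres://u:p@primary.example.com:5432/app",
  "HEROKU_POSTGRESQL_GREEN_URL"   => "postgres://u:p@primary.example.com:5432/app",
  "HEROKU_POSTGRESQL_PINK_URL"    => "postgres://u:p@pink.example.com:5432/app",
  "HEROKU_POSTGRESQL_CRIMSON_URL" => "postgres://u:p@crimson.example.com:5432/app",
  "RAILS_ENV"                     => "production"
}

followers = follower_urls(sample_env)
followers.each_key { |name| puts name }
```

The `<color>_follower` naming mirrors the shard names that appear in the log output on this page; the real configuration may derive them differently.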

How closely your followers (slaves) follow master is application specific. As such, this recommended configuration does not configure your application to automatically send all read queries to your followers by default.

In Octopus lingo, your slaves will not be marked as 'fully replicated' with this configuration, and you are required to set the appropriate models with replicated_model, or explicitly call the using methods to have queries sent to your followers. See 'Full Replication' below if this isn't desired for your application.

Logging

Octopus integrates with Rails logging and prefixes debug SQL log lines with the shard name of the follower that was queried:

Master:

DynamicThing Load (0.2ms)  SELECT "dynamic_things".* FROM "dynamic_things"

Follower(s):

[Shard: orange_follower]  StaticThing Load (0.3ms)  SELECT "static_things".* FROM "static_things" 
[Shard: pink_follower]  StaticThing Load (0.2ms)  SELECT "static_things".* FROM "static_things" 

Enabling/Disabling Followers as Slaves

If using the above configuration, you can choose the followers you want to have slave duties with environmental variables. Whitelist followers you want or blacklist the followers you don't want with SLAVE_ENABLED_FOLLOWERS and/or SLAVE_DISABLED_FOLLOWERS. Use a comma+space separated list of (case insensitive) follower colors:

heroku config:add SLAVE_ENABLED_FOLLOWERS="PINK, CRIMSON"

Without these variables set (the default), all followers will be used as slaves. This may be undesirable for you. One example of usage is when adding an additional follower to a live application, where the above ENV vars are used to ensure the new follower is excluded until it is sufficiently caught up to master for duty.

Note: the blacklist config isn't currently written to work in development or test environments. The blacklist var is effectively ignored in non-Heroku environments.
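The filtering described above can be sketched roughly as follows. This is an illustration only, not the code in config/shards.yml: `parse_colors` and `enabled_followers` are hypothetical names, and the sketch assumes that when both variables are set the whitelist is applied first and the blacklist then removes entries from it.

```ruby
# Hypothetical helper: splits "PINK, CRIMSON" into ["pink", "crimson"].
def parse_colors(value)
  value.to_s.split(",").map { |color| color.strip.downcase }.reject(&:empty?)
end

# Hypothetical filter: keeps the follower colors allowed by the two lists.
# An empty whitelist means "all followers", matching the default behavior.
def enabled_followers(all_colors, env)
  enabled  = parse_colors(env["SLAVE_ENABLED_FOLLOWERS"])
  disabled = parse_colors(env["SLAVE_DISABLED_FOLLOWERS"])
  all_colors.map(&:downcase).select do |color|
    (enabled.empty? || enabled.include?(color)) && !disabled.include?(color)
  end
end

colors = %w[PINK CRIMSON ORANGE]
p enabled_followers(colors, {})
p enabled_followers(colors, { "SLAVE_ENABLED_FOLLOWERS" => "PINK, CRIMSON" })
p enabled_followers(colors, { "SLAVE_DISABLED_FOLLOWERS" => "orange, pink" })
```

Lowercasing both sides is what makes the color names case insensitive, and stripping whitespace is what tolerates the space after each comma.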

Development and Test configuration

By default, 2 followers will be simulated (pointing at your development or test database) in development or test environments. The number of dev/test followers is easily changed with the following code in the configuration file:

Array.new(2){YAML::load_file(File.open("config/database.yml"))[Rails.env]}
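As a rough sketch of what that line does, the snippet below builds a configurable number of simulated followers that all point at the same database. It is runnable outside Rails, so it reads from an inline YAML string instead of config/database.yml; the helper `simulated_followers` and the `follower_N` names are hypothetical.

```ruby
require "yaml"

# Stand-in for config/database.yml so the sketch runs outside a Rails app.
DATABASE_YML = <<~YAML
  development:
    adapter: postgresql
    database: app_development
  test:
    adapter: postgresql
    database: app_test
YAML

# Hypothetical helper: builds `count` simulated followers for one environment.
# The block passed to Array.new runs once per element, so every follower
# gets its own copy of the settings hash.
def simulated_followers(count, rails_env)
  configs = Array.new(count) { YAML.safe_load(DATABASE_YML)[rails_env] }
  configs.each_with_index.map { |config, i| ["follower_#{i + 1}", config] }.to_h
end

followers = simulated_followers(3, "development")
followers.each { |name, config| puts "#{name}: #{config["database"]}" }
```

Changing the first argument is the equivalent of changing the `2` in the configuration line above.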

Additional Params in DB URL

Heroku's dynamic configuration for your primary database allows you to add additional configuration params as URI params on the DB URL configuration variable (cross-reference: SO thread).

This same method of adding params via the (SLAVE_DATABASE)_URL ENV key is allowed for slaves when using this dynamic Octopus configuration. However, per issue #167, this is not recommended by Heroku; setting additional parameters via an after-initialize application callback is the recommended way to modify the configuration of your heroku-postgresql databases.
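For readers unfamiliar with the URI-param approach, here is a minimal sketch of how query-string parameters on a database URL could be merged into a connection config hash. The helper `config_from_url`, the sample URL, and the `pool` and `reaping_frequency` parameters are illustrative assumptions, not the code Heroku or this configuration actually uses.

```ruby
require "uri"

# Hypothetical helper: turns a database URL into a config hash and merges
# any query-string parameters (e.g. ?pool=15) on top of it.
def config_from_url(url)
  uri = URI.parse(url)
  base = {
    "adapter"  => "postgresql",
    "host"     => uri.host,
    "port"     => uri.port,
    "username" => uri.user,
    "password" => uri.password,
    "database" => uri.path.sub(%r{\A/}, "")
  }
  # decode_www_form returns [["pool", "15"], ...]; values stay strings.
  extras = URI.decode_www_form(uri.query.to_s).to_h
  base.merge(extras)
end

config = config_from_url(
  "postgres://user:secret@pink.example.com:5432/app?pool=15&reaping_frequency=10"
)
config.each { |key, value| puts "#{key}: #{value}" }
```

Note that values parsed from the URL arrive as strings, which is one reason setting such options in application code can be the more predictable route.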

Full Replication

If you understand the implications and want all application reads to go to your slaves, simply change false to true in the following line in config/shards.yml:

fully_replicated: false

Caveats

SQL Caching (as of version 0.7.0, Octopus provides SQL caching: https://github.com/tchandy/octopus/issues/81)

ActiveRecord Query Caching currently does not work with Octopus (see open issue). This means that if N duplicate ActiveRecord queries are sent to slaves during the same request, N queries will hit your database. By default, ActiveRecord serves duplicate queries made during the same request from a cache.

If your application heavily utilizes this caching behavior (search your logs for CACHE (0.0ms)), you could be unknowingly hurting your performance by switching to replication reads with Octopus. To work around this issue, identify the source of critical duplicated queries and avoid them with additional application code.
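One way to gauge how much your application relies on the query cache is to count the CACHE lines per SQL statement. The sketch below does this on a few sample log lines; the helper `cache_hits` is hypothetical, the sample lines are made up in the style of the log output shown earlier, and in practice you would read the text from your own log file.

```ruby
# Made-up sample lines; in practice, read these from your Rails log.
SAMPLE_LOG = <<~LOG
  StaticThing Load (0.3ms)  SELECT "static_things".* FROM "static_things"
  CACHE (0.0ms)  SELECT "static_things".* FROM "static_things"
  CACHE (0.0ms)  SELECT "static_things".* FROM "static_things"
  DynamicThing Load (0.2ms)  SELECT "dynamic_things".* FROM "dynamic_things"
  CACHE (0.0ms)  SELECT "dynamic_things".* FROM "dynamic_things"
LOG

# Hypothetical helper: counts query-cache hits per SQL statement.
def cache_hits(log_text)
  log_text.each_line.with_object(Hash.new(0)) do |line, counts|
    match = line.match(/CACHE \(0\.0ms\)\s+(.+)/)
    counts[match[1].strip] += 1 if match
  end
end

hits = cache_hits(SAMPLE_LOG)
# Print the most frequently cached statements first.
hits.sort_by { |_sql, count| -count }.each { |sql, count| puts "#{count}x #{sql}" }
```

Statements with high counts are the duplicated queries most worth eliminating before sending those reads to followers.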
