December 3, 2013

Pub/sub with Ruby and Postgres asynchronous triggers

On a recent project at work we were trying to synchronize different client websites, each storing user information in its own database, with a central accounts system that maintained user names and emails. We wanted every client app to be updated when a user changed their information in the central app. After trying a few approaches with RabbitMQ pub/sub I realized there was a simpler solution: since all the apps involved used Postgres, I could leverage PostgreSQL's built-in messaging capabilities.

Initially I tried using foreign tables pointing to the client databases: a trigger on the central update events would simply update the foreign tables. The trouble with that approach is that the connection to the remote database and the remote update happen inside the same transaction, making updates to the central database very slow. And this is why triggers have a bad rep: putting too much logic inside them.
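For reference, the foreign-table setup looked roughly like this. The server names, credentials, and column list are illustrative, and it assumes the postgres_fdw extension that shipped with Postgres 9.3:

```sql
-- Illustrative names and credentials; assumes postgres_fdw (Postgres 9.3+).
CREATE EXTENSION postgres_fdw;

CREATE SERVER client_app FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'client.example.com', dbname 'client_db');

CREATE USER MAPPING FOR CURRENT_USER SERVER client_app
  OPTIONS (user 'sync', password 'secret');

-- A local alias for the client's users table. Updating it goes over the
-- wire -- inside the same transaction as the triggering central update.
CREATE FOREIGN TABLE client_users (id integer, name text, email text)
  SERVER client_app OPTIONS (table_name 'users');
```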

The only way I could think of to make the remote updates asynchronous was LISTEN/NOTIFY. This solution works very well, giving near real-time updates between the environments. The only drawback is that the notification queue is not persistent, so any messages sent while the listener is down are lost. To catch these there's a cron job that makes sure all the records are synchronized.
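The mechanism itself is easy to try out in two psql sessions (the channel name is arbitrary):

```sql
-- Session 1: subscribe to a channel
LISTEN user_changes;

-- Session 2: publish to it, with an optional string payload
NOTIFY user_changes, '{"id": 1, "email": "new@example.com"}';

-- Back in session 1, psql prints something like:
--   Asynchronous notification "user_changes" with payload "..." received
```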

The two main parts of asynchronous triggers are:

1. Create a plpgsql function on the central database that sends out a notification. The notification payload is a JSON string of the key-value pairs we're interested in propagating. Set up a trigger on the central users table to call the function when certain fields are updated. Keeping the triggers this simple only adds a negligible 2 milliseconds to the update transaction.
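A minimal sketch of such a function and trigger. The table, column, and channel names here are illustrative, not the real ones:

```sql
-- Illustrative names throughout; row_to_json works on Postgres 9.2+.
CREATE OR REPLACE FUNCTION notify_user_changed() RETURNS trigger AS $$
DECLARE
  payload text;
BEGIN
  -- Serialize only the fields we want to propagate.
  SELECT row_to_json(t)::text INTO payload
  FROM (SELECT NEW.id, NEW.name, NEW.email) t;
  PERFORM pg_notify('users_changed', payload);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fire only when the fields we care about change.
CREATE TRIGGER users_changed_trigger
  AFTER UPDATE OF name, email ON users
  FOR EACH ROW
  EXECUTE PROCEDURE notify_user_changed();
```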


2. Run a continuous Ruby script that opens a SQL connection to listen for the notifications and updates the client databases. I chose the sequel gem, which provides a nice interface to Postgres LISTEN. The script parses the incoming JSON message; in a production environment it should also make sure that the JSON is valid and that all the connections work.