Red Green Repeat Adventures of a Spec Driven Junkie

Creating and Maintaining Custom Database Functions in Ruby on Rails applications

tl;dr

Do you want to keep custom database functions in a Ruby on Rails application, but find that the schema.rb file doesn’t preserve them? And you’ve also discovered that running all migrations from the beginning of the project is not best practice.

How do you maintain custom database functions in a Ruby on Rails project?

Switch to using structure.sql as the authoritative description of the database instead of schema.rb by adding this configuration to config/application.rb:

config.active_record.schema_format = :sql

Run $ rake db:migrate and this creates the db/structure.sql file. Commit the new file, delete db/schema.rb, and from now on $ rake db:schema:load will recreate any custom functions introduced in your migrations.

Here is a link to a repository for all the code in this project that you can follow along with.

Read on for additional details.

Winslow Homer - Fishing Boats, Key West source and more information

I want to record how to manage custom database functions in a Ruby on Rails application through a project’s lifecycle: development, continuous integration, and production.

I will share a situation where I wanted a custom Postgres trigger incorporated into the application and kept losing the custom trigger unless I ran all the migrations from the beginning.

By reading this article, I hope you learn how to manage custom database functions within your Ruby on Rails application and understand the implications of using custom database triggers.

This article will take you around six minutes to read, probably less because there’s a lot of code. ;-)

Introduction

Sometimes an application calls for custom database functions, such as PostgreSQL’s trigger or windowing functions or MySQL’s stored functions.

The best way to introduce them is through a Ruby on Rails migration, such as:

Ruby on Rails Migration with postgres trigger

class AddTsvectorColumns < ActiveRecord::Migration[5.1]
  def up
    add_column :products, :tsv, :tsvector
    add_index :products, :tsv, using: "gin"

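    # create a trigger that recomputes tsv from the text columns on every insert or update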
    execute <<-SQL
      CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
      ON products FOR EACH ROW EXECUTE PROCEDURE
      tsvector_update_trigger(
        tsv, 'pg_catalog.english', description, manufacturer_name, name
      );
    SQL

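    # touch every row so the new trigger runs and backfills tsv for existing data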
    now = Time.current.to_s(:db)
    update("UPDATE products SET updated_at = '#{now}'")
  end

  def down
    execute <<-SQL
      DROP TRIGGER tsvectorupdate
      ON products
    SQL

    remove_index :products, :tsv
    remove_column :products, :tsv
  end
end

source
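For context, the trigger keeps the tsv column populated so full-text searches against it stay current. A hypothetical query (not part of the original article) over the products table might look like:

SELECT name
FROM products
WHERE tsv @@ plainto_tsquery('pg_catalog.english', 'fishing boats');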

The migration will work fine on the implementing developer’s machine. When the change is pushed up to a continuous integration service or another developer’s machine, any tests based on this function will fail.

Why?

The reason is the implementing developer used: rake db:migrate to set up the database function, while the continuous integration service instantiates the database using: rake db:schema:load, which loads from schema.rb. This file does not contain the database functions introduced by the migration.
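For example, a test that checks for the trigger would pass after rake db:migrate but fail on a machine whose database was loaded from schema.rb. A hypothetical RSpec example (the spec file and its wording are assumptions, not from the project):

require 'rails_helper'

RSpec.describe 'products full-text search trigger' do
  it 'is installed in the database' do
    trigger_names = ActiveRecord::Base.connection
      .select_values("SELECT tgname FROM pg_trigger WHERE tgname = 'tsvectorupdate'")

    # present after rake db:migrate, missing after rake db:schema:load from schema.rb
    expect(trigger_names).to include('tsvectorupdate')
  end
end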

The following is the schema.rb file generated after running the migration above with the custom database function:

# This file is auto-generated from the current state of the database. Instead
# of editing this file, please use the migrations feature of Active Record to
# incrementally modify your database, and then regenerate this schema definition.
#
# Note that this schema.rb definition is the authoritative source for your
# database schema. If you need to create the application database on another
# system, you should be using db:schema:load, not running all the migrations
# from scratch. The latter is a flawed and unsustainable approach (the more migrations
# you'll amass, the slower it'll run and the greater likelihood for issues).
#
# It's strongly recommended that you check this file into your version control system.

ActiveRecord::Schema.define(version: 20191024121431) do

  # These are extensions that must be enabled in order to support this database
  enable_extension "plpgsql"

  create_table "products", force: :cascade do |t|
    t.text "description"
    t.text "manufacturer"
    t.text "name"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    t.tsvector "tsv"
    t.index ["tsv"], name: "index_products_on_tsv", using: :gin
  end

end

The only indication of the tsvector feature is the t.tsvector "tsv" column, which is now just a plain column: it has lost the trigger created in the migration.

How can we preserve this function between systems such as continuous integration servers and even production environments?

Database Authority

An obvious way to solve this: force every environment to run all migrations from the beginning and skip the schema.rb file entirely, since running all the migrations and loading schema.rb should produce the same result.

In a Ruby on Rails project, however, the schema.rb file is the authoritative source for the database. As stated in the header of the schema.rb file itself:

# Note that this schema.rb definition is the authoritative source for your
# database schema. If you need to create the application database on another
# system, you should be using db:schema:load, not running all the migrations
# from scratch. The latter is a flawed and unsustainable approach (the more migrations
# you'll amass, the slower it'll run and the greater likelihood for issues).

If schema.rb is the authoritative source for the database and running all the migrations from scratch is discouraged, how can one keep the database consistent across Ruby on Rails environments?

The database function is codified in the project’s migrations, so there has to be a way.

Bye schema.rb, hello structure.sql

Ruby on Rails supports a second format for describing the database: structure.sql. This file recreates the database with SQL code instead of Ruby code. It contains the database functions created by migrations and will set up the database when using rake db:schema:load, without running the migrations from scratch.

Creating structure.sql

To set up a Ruby on Rails project to start using structure.sql, add the following configuration in config/application.rb:

config.active_record.schema_format = :sql

In a sample project, this is what the configuration file will look like:

require_relative 'boot'

require 'rails/all'

Bundler.require(*Rails.groups)

module PgTriggers
  class Application < Rails::Application
    config.load_defaults 5.1
    config.active_record.schema_format = :sql
  end
end

To create the structure.sql file, run:

$ rake db:schema:dump

(on Rails versions before 6.1, the SQL-format dump task is $ rake db:structure:dump), OR just run a migration:

$ rake db:migrate

and this creates a new structure.sql file.
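With structure.sql in place, remove schema.rb so there aren’t two competing descriptions of the database. A quick housekeeping sketch, assuming git:

$ git rm db/schema.rb
$ git add db/structure.sql config/application.rb
$ git commit -m "Switch schema format to SQL"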

Verifying Preservation of Custom Database Functions

Nothing worse than reading an article on the Internet and accepting it works, until it doesn’t in production, right?! :-)

Let’s test out what the structure.sql changes will be when adding a custom postgres trigger function into a migration.

1. Create the structure.sql file as described in the previous section.

2. Generate the tsvector migration: $ rails generate migration AddTsvectorColumns

3. Modify the new migration file so its contents match the migration above.

4. Run the migration with $ rake db:migrate and check the diff with $ git diff db/structure.sql:

diff --git a/db/structure.sql b/db/structure.sql
index 6cc96a5..b6b5c1f 100644
--- a/db/structure.sql
+++ b/db/structure.sql
@@ -49,7 +49,8 @@ CREATE TABLE public.products (
     manufacturer text,
     name text,
     created_at timestamp without time zone NOT NULL,
-    updated_at timestamp without time zone NOT NULL
+    updated_at timestamp without time zone NOT NULL,
+    tsv tsvector
 );


@@ -112,6 +113,20 @@ ALTER TABLE ONLY public.schema_migrations
     ADD CONSTRAINT schema_migrations_pkey PRIMARY KEY (version);



+--
+-- Name: index_products_on_tsv; Type: INDEX; Schema: public; Owner: -
+--
+
+CREATE INDEX index_products_on_tsv ON public.products USING gin (tsv);
+
+
+--
+-- Name: products tsvectorupdate; Type: TRIGGER; Schema: public; Owner: -
+--
+
+CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON public.products FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('tsv', 'pg_catalog.english', 'description', 'manufacturer_name', 'name');
+
+
 --
 -- PostgreSQL database dump complete
 --

The structure.sql file now contains the trigger created in the migration file:

CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON products FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(
  tsv, 'pg_catalog.english', description, manufacturer_name, name
);

This validates that the structure.sql file contains everything from the migration, unlike the schema.rb file.
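To double-check against a completely fresh database, rebuild it from structure.sql and inspect the table in psql. A verification sketch (the database name depends on your config/database.yml):

$ rake db:drop db:create db:schema:load
$ psql your_app_development -c '\d products'

The Triggers section of the psql output should list tsvectorupdate.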

Caveats

One thing to note with this approach: custom database functions tie the application to the database system they are written for. If there is a need to change the database system in the future, these functions will require significant rework before switching.

In the event of changing database systems, the course of action I would take is to go back to using schema.rb as the database source, create a new migration containing the equivalent functions for the new database system, migrate, then create a structure.sql file for the new database system.

Conclusion

Using a structure.sql file preserves any custom functions introduced through the Ruby on Rails database migration, even when loading through $ rake db:schema:load on another system.

The reason schema.rb will never contain these custom functions is that the file is intended to work with any database system, and custom database functions are not guaranteed to be compatible across systems.

Just adjust the Ruby on Rails application configuration, run the migrations, and commit the new structure.sql file.