Creating and Maintaining Custom Database Functions in Ruby on Rails applications
tl;dr
Do you want to keep custom database functions in a Ruby on Rails
application, but find that the schema.rb file doesn't preserve them?
You may also have discovered that running all migrations from the
beginning of the project is not best practice.
How do you maintain custom database functions in a Ruby on Rails
project? Switch to using structure.sql as the authoritative
description of the database instead of schema.rb by adding this
configuration to config/application.rb:

config.active_record.schema_format = :sql

Run $ rake db:migrate and this creates the db/structure.sql file.
Commit the new file, delete db/schema.rb, and now $ rake
db:schema:load will replicate any custom functions created in the
migration.
Here is a link to a repository with all the code in this project so you can follow along.
Read on for additional details.
I want to record how to manage custom database functions in a Ruby on Rails application through a project's lifecycle: development, continuous integration, and production.
I will share a situation where I wanted a custom Postgres trigger incorporated into the application and lost the trigger unless I ran all the migrations from the beginning.
By reading this article, I hope you learn how to manage custom database functions within your Ruby on Rails application and understand implications of using custom database triggers.
This article will take you around six minutes to read, probably less because there’s a lot of code. ;-)
Introduction
Custom database functions, such as PostgreSQL's trigger or windowing functions or MySQL's stored functions, extend what the database can do for your application.
The best way to introduce them is through a Ruby on Rails migration, such as:
Ruby on Rails migration with a Postgres trigger

class AddTsvectorColumns < ActiveRecord::Migration[5.1]
  def up
    add_column :products, :tsv, :tsvector
    add_index :products, :tsv, using: "gin"

    execute <<-SQL
      CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
      ON products FOR EACH ROW EXECUTE PROCEDURE
      tsvector_update_trigger(
        tsv, 'pg_catalog.english', description, manufacturer_name, name
      );
    SQL

    now = Time.current.to_s(:db)
    update("UPDATE products SET updated_at = '#{now}'")
  end

  def down
    execute <<-SQL
      DROP TRIGGER tsvectorupdate
      ON products
    SQL

    remove_index :products, :tsv
    remove_column :products, :tsv
  end
end
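Once the trigger keeps the tsv column up to date, the application can query it with Postgres full-text operators. Here is a minimal, framework-free sketch of the SQL fragment such a query would use; the tsv_search_sql helper and the scope shown in the comment are my own illustration, not part of the original project:

```ruby
# Hypothetical helper (not from the original project): builds the SQL
# fragment a full-text search scope would bind a user's query into.
def tsv_search_sql(column: "tsv", config: "pg_catalog.english")
  # `@@` matches a tsvector against a tsquery; plainto_tsquery parses
  # plain text into a tsquery using the given text search configuration.
  "#{column} @@ plainto_tsquery('#{config}', ?)"
end

# In an ActiveRecord model this might back a scope such as:
#   scope :search, ->(query) { where(tsv_search_sql, query) }
puts tsv_search_sql
```

Because the trigger maintains tsv on every insert and update, the scope never has to recompute the search vector itself.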
This will work fine on the implementing developer’s machine. When pushing this change up to a continuous integration service or another developer’s machine, any tests based on this function will fail.
Why?
The implementing developer used rake db:migrate to set up the
database function, while the continuous integration service
instantiates the database using rake db:schema:load, which loads from
schema.rb. This file does not contain the database functions
introduced by the migration.
The following is the schema.rb file for the above migration with the custom database function:
# This file is auto-generated from the current state of the database. Instead
# of editing this file, please use the migrations feature of Active Record to
# incrementally modify your database, and then regenerate this schema definition.
#
# Note that this schema.rb definition is the authoritative source for your
# database schema. If you need to create the application database on another
# system, you should be using db:schema:load, not running all the migrations
# from scratch. The latter is a flawed and unsustainable approach (the more migrations
# you'll amass, the slower it'll run and the greater likelihood for issues).
#
# It's strongly recommended that you check this file into your version control system.

ActiveRecord::Schema.define(version: 20191024121431) do

  # These are extensions that must be enabled in order to support this database
  enable_extension "plpgsql"

  create_table "products", force: :cascade do |t|
    t.text "description"
    t.text "manufacturer"
    t.text "name"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    t.tsvector "tsv"
    t.index ["tsv"], name: "index_products_on_tsv", using: :gin
  end

end
The only indication of the tsvector feature is the t.tsvector "tsv" declaration, which is now just an ordinary column; the trigger created in the migration is lost.
How can we preserve this function between systems such as continuous integration servers and even production environments?
Database Authority
An obvious way to solve this: force every project to run all
migrations from the beginning and skip using the schema.rb file,
since running all the migrations and loading the schema.rb file
should produce the same results.
In a Ruby on Rails project, however, the schema.rb file is the
authoritative source for the database, as stated in its header
comment:
# Note that this schema.rb definition is the authoritative source for your
# database schema. If you need to create the application database on another
# system, you should be using db:schema:load, not running all the migrations
# from scratch. The latter is a flawed and unsustainable approach (the more migrations
# you'll amass, the slower it'll run and the greater likelihood for issues).
If schema.rb is the authoritative source for the database and running
all the migrations from scratch is discouraged, how can one maintain
database consistency between environments? The database function is
codified in the project, so there has to be a way.
Bye schema.rb, hello structure.sql
Ruby on Rails has a second format that describes the database:
structure.sql. This file is SQL code instead of Ruby code. It
contains the database functions created by migrations and will
configure the database when using rake db:schema:load, without
requiring running migrations from scratch.
Creating structure.sql
To set up a Ruby on Rails project to start using structure.sql, add
the following configuration in config/application.rb:
config.active_record.schema_format = :sql
In a sample project, this is what the configuration file will look like:
require_relative 'boot'

require 'rails/all'

Bundler.require(*Rails.groups)

module PgTriggers
  class Application < Rails::Application
    config.load_defaults 5.1
    config.active_record.schema_format = :sql
  end
end
To create the structure.sql file, run:
$ rake db:schema:dump
or just run a migration:
$ rake db:migrate
and this creates a new structure.sql file.
Verifying Preservation of Custom Database Functions
There's nothing worse than reading an article on the Internet and assuming it works, until it doesn't in production, right?! :-)
Let's test out what the structure.sql changes will be when adding a
custom Postgres trigger function in a migration.

1. Create the structure.sql file as described in the previous section.
2. Generate the tsvector migration: $ rails generate migration AddTsvectorColumns
3. Modify the new migration file so its contents match the migration above.
4. Run the migration: $ rake db:migrate
5. Inspect the changes: $ git diff db/structure.sql
diff --git a/db/structure.sql b/db/structure.sql
index 6cc96a5..b6b5c1f 100644
--- a/db/structure.sql
+++ b/db/structure.sql
@@ -49,7 +49,8 @@ CREATE TABLE public.products (
manufacturer text,
name text,
created_at timestamp without time zone NOT NULL,
- updated_at timestamp without time zone NOT NULL
+ updated_at timestamp without time zone NOT NULL,
+ tsv tsvector
);
@@ -112,6 +113,20 @@ ALTER TABLE ONLY public.schema_migrations
ADD CONSTRAINT schema_migrations_pkey PRIMARY KEY (version);
+--
+-- Name: index_products_on_tsv; Type: INDEX; Schema: public; Owner: -
+--
+
+CREATE INDEX index_products_on_tsv ON public.products USING gin (tsv);
+
+
+--
+-- Name: products tsvectorupdate; Type: TRIGGER; Schema: public; Owner: -
+--
+
+CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON public.products FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('tsv', 'pg_catalog.english', 'description', 'manufacturer_name', 'name');
+
+
--
-- PostgreSQL database dump complete
--
The structure.sql file now contains the trigger function created by
the migration:
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON products FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(
tsv, 'pg_catalog.english', description, manufacturer_name, name
);
This validates that the structure.sql file contains everything from
the migration, unlike the schema.rb file.
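To keep this guarantee from silently regressing, one could add a tiny check that fails fast if a future schema dump ever drops the trigger. This is my own suggestion, not part of the original project; the trigger_preserved? helper and the simulated dump below are illustrative:

```ruby
# Hypothetical CI guard (my own suggestion, not from the original
# project): fail fast if db/structure.sql ever loses the trigger.
def trigger_preserved?(structure_sql)
  structure_sql.include?("CREATE TRIGGER tsvectorupdate")
end

# Simulated excerpt of a dumped db/structure.sql; a real check would
# read the file with File.read("db/structure.sql").
dump = <<~SQL
  CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON public.products
  FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('tsv',
  'pg_catalog.english', 'description', 'manufacturer_name', 'name');
SQL

puts trigger_preserved?(dump)
```

A check like this, wired into the test suite, would have caught the original schema.rb problem the moment it reached continuous integration.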
Caveats
One thing to note with this approach: using custom database functions ties the application to the specific database system those functions are written for. If there is a need to change database systems in the future, these functions will require significant rework before switching.
In the event of changing database systems, the course of action I
would take: go back to using schema.rb as the database source, create
a new migration containing the equivalent functions for the new
database system, migrate, then create a structure.sql file for the
new database system.
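If portability matters up front, the trigger's work can instead live in application code, at the cost of raw SQL writes bypassing it. Here is a framework-free sketch of that idea; the Product class is a stand-in for an ActiveRecord model, not code from the original project:

```ruby
# Stand-in for an ActiveRecord model (my own illustration): maintain
# the searchable text from application code instead of a database
# trigger, so the logic is not tied to Postgres. Note that writes
# issued outside the model (raw SQL, bulk imports) would skip this.
class Product
  attr_accessor :description, :manufacturer_name, :name, :tsv

  def save
    # Roughly what a before_save callback would do; a real port would
    # feed this text into the new database's full-text search feature.
    self.tsv = [description, manufacturer_name, name].compact.join(" ")
  end
end

product = Product.new
product.name = "Widget"
product.description = "A fine widget"
product.save
puts product.tsv  # prints "A fine widget Widget"
```

The trade-off is the mirror image of the caveat above: the callback survives a database switch, but unlike a trigger it cannot guarantee the column stays consistent for every write path.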
Conclusion
Using a structure.sql
file preserves any custom functions introduced
through the Ruby on Rails database migration, even when loading
through $ rake db:schema:load
on another system.
The reason schema.rb will never contain these custom functions is
that the file is intended to work with any database system, and
custom database functions are not guaranteed to be compatible across
systems.
To preserve them, adjust the Ruby on Rails application configuration,
run the migrations, and commit the new structure.sql file.