Why Does schema.rb (or structure.sql) Have Unexpected Changes?

Ryan Sandridge · April 27, 2023

For years I’ve seen people new to Rails get confused about why their schema.rb or structure.sql files have unexpected changes in them, particularly when the engineer is new to working on a Rails project in a team environment. Let’s put this mystery to rest. In this post, I’ll explain reasons why this happens, what to do when it happens, and recommended practices to avoid commiting (or forgetting to commit) the necessary changes to these files.

For the purposes of this post, anything I say about schema.rb file applies also to structure.sql file, unless explicitly stated otherwise.

Why Does This Happen?

In short, this happens because when you migrate your database, Rails will do more than simply apply new migrations to your database. Rails rewrites schema.rb to match the existing database schema everytime you migrate (including if there are no migrations to run). You don’t have to be working on a team to run into this scenario, though it happens more often and is more problematic if you are collaborating on a team.

In a freshly created Rails app with a clean git history, lets switch to a feature branch, generate a migration, and migrate the database:

$ git checkout -b create_products_table
$ bin/rails generate migration CreateProducts name:string
Running via Spring preloader in process 94823
      invoke  active_record
      create    db/migrate/20220202222222_create_products.rb
      
$ bin/rails db:migrate
== 20220202222222 CreateProducts: migrating ===================================
-- create_table(:products)
   -> 0.0018s
== 20220202222222 CreateProducts: migrated (0.0019s) ==========================

As a result of the above, you should now have a new migration file in the db/migrate directory, and either a new db/schema.rb file, or if you’d previously migrated, then your db/schema.rb file should have changes in it reflecting the new products table. Here is the current contents of schema.rb:

# This file is auto-generated from the current state of the database. Instead
# of editing this file, please use the migrations feature of Active Record to
# incrementally modify your database, and then regenerate this schema definition.
#
# This file is the source Rails uses to define your schema when running `rails
# db:schema:load`. When creating a new database, `rails db:schema:load` tends to
# be faster and is potentially less error prone than running all of your
# migrations from scratch. Old migrations may fail to apply correctly if those
# migrations use external dependencies or application code.
#
# It's strongly recommended that you check this file into your version control system.

ActiveRecord::Schema.define(version: 2022_02_08_195608) do

  create_table "products", force: :cascade do |t|
    t.string "name"
  end

end

In fact, pay attention to the autogenerated comment in that file. It clearly states that the file reflects the current state of the database. If you want to prove to yourself that Rails will just recreate this from the contents of the existing database, delete it and run migrations again.

$ rm db/schema.rb
$ bin/rails db:migrate

You’ll notice that this time you don’t get any output from Rails because no migrations needed to run, but you’ll see that db/schema.rb once again exists and has the same contents as it had before you deleted it. Go ahead and stage and commit these changes:

$ git add db/migrate/20220202222222_create_products.rb db/schema.rb
$ git commit -m "adding products table"

Now, imagine your collegue asks for your help debugging an issue that is currently happening in production, and you check out the latest version of the main branch to run your app locally. But there have been other migrations added since you last checked out. You might do something like this:

$ git checkout main
$ git pull origin main
From github.com:dissolved/unicorn
remote: Enumerating objects: 822, done.
remote: Counting objects: 100% (822/822), done.
remote: Compressing objects: 100% (275/275), done.
remote: Total 822 (delta 563), reused 749 (delta 529), pack-reused 0
Receiving objects: 100% (822/822), 254.31 KiB | 5.09 MiB/s, done.
Resolving deltas: 100% (563/563), completed with 138 local objects.
   8f60eba6f9..ec110b491e  main                                                               -> origin/main
$ bin/rails db:migrate
== 20220208080808 CreateRatings: migrating ===================================
-- create_table(:ratings)
   -> 0.0018s
== 20220208080808 CreateRatings: migrated (0.0019s) ==========================

You quickly find the bug, but you need to change the type for the value attribute on the ratings table from string to integer. That should be easy enough, but before we do anything else, let’s do something that we often might not do; let’s check the git status:

$ git status
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   db/schema.rb

no changes added to commit (use "git add" and/or "git commit -a")

Wait, what?!? We have uncommitted changes! Of course I’ve already told you why this is happening, it is happening because when we ran bin/rails db:migrate it made schema.rb reflect the actual current schema in our database, and since we were working on our own feature that added the products table, our database has something in it that other developers’ databases do not have. Let’s look at the diff.

$ git diff
diff --git a/db/schema.rb b/db/schema.rb
index 48142da..2cf67f2 100644
--- a/db/schema.rb
+++ b/db/schema.rb
@@ -12,6 +12,10 @@

 ActiveRecord::Schema.define(version: 2022_02_08_220908) do

+  create_table "products", force: :cascade do |t|
+    t.string "name"
+  end
+
   create_table "ratings", force: :cascade do |t|
     t.string "value"
   end

Yep, the change is from the migration in the feature branch we were previously working on. It does not matter that we changed branches, our database retained the changes from that branch.

So this should explain the “why does this happen” sufficiently. It isn’t always your fault either. Sometimes people commit migrations but don’t commit the schema.rb file, so when you run migrations and their migration runs on your local environment, their changes will appear in your schema.rb as uncommitted changes. At which point a friendly reminder that the schema.rb file should be committed alongside the commit (or PR) that introduces the actual migration file is in order, along with a link to this post!

What Should I Do Now?

In most cases you’ll want your schema.rb file to have only the things in it that were added by migrations present in the current branch. One way to fix the issue is to manually remove the lines from schema.rb that were introduced by migrations not present on the current branch, but this can sometimes be hard to identify, or at least subject to human error. But for some simple situations, by all means edit it manually.

Alternatively, you if you haven’t yet made any other changes to the current branch, you could simply clean the directory, load the committed schema, and move on.

$ git restore .
$ bin/rails db:schema:load

If you’ve already been hacking away, and already added a new migration file for your current work, you could explicitly restore just the schema file, load that schema, and run your migrations.

$ git restore db/schema.rb
$ bin/rails db:schema:load
$ bin/rails db:migrate

And now both your database and your schema.rb file should be in sync and properly reflect the state you want on your current branch. If you’re using structure.sql instead, you’d use bin/rails db:structure:load instead to load the schema into your database.

How To Prevent This?

Ok, but that is kind of a hassle. Could we have prevented this unexpected change to the schema file to begin with? Yes, we could have. We could have rolled back the migration before leaving our feature branch. Though remembering to do that might be hard, and even if you do remember, some migrations aren’t written in such a way that they can be rolled back (easily). Without jumping through more hoops, this is about the best we can do. I may write more about this topic and how to make clever use of git hooks to minimize this problem.

Twitter, Facebook