There are a lot of people who have a lot of opinions about ORMs. This person and this other person both suggest that ORM is an anti-pattern. I can't really empathize with these people at all and I assume they cry themselves to sleep at night. Unlike them, I have things to do, which is why I'm a fan of tools like Rails to accelerate and simplify my development efforts.
99% of the time, I feel like Rails' ActiveRecord (and other ORMs) positively contributes to my code. It keeps things simple, readable, and maintainable, which matters a lot when you're onboarding new engineers. Furthermore, it provides pretty much all of the DB functionality I need at a fraction of the development effort. In the case of Rails, it's easy to build composable pieces of logic using scopes, which leads to extremely expressive and elegant code.
Having said that, the one problem with ORMs is that, as developers, we almost completely stop paying attention to the SQL that's being generated. Again, 99% of the time this is totally fine and is, of course, the intention of the design pattern's abstraction. Once in a while, though, the wrong query will burn your ORM paradise to the ground. In my experience, this very frequently involves locking.
Let's say that we have a company where we sell some stuff to people, and then we have to ship that stuff to them from one of our warehouses. Here's what our models look like:
```ruby
class User < ApplicationRecord
  has_many :orders
end

class Order < ApplicationRecord
  belongs_to :user
  belongs_to :warehouse
  has_many :order_items
end

class OrderItem < ApplicationRecord
  belongs_to :order
  belongs_to :product
end

class Product < ApplicationRecord
  # Has a uniquely identifying sku attribute
end

class Warehouse < ApplicationRecord
  # Has a uniquely identifying code attribute
end
```
Let's say for some reason we want to fulfill orders for different warehouses on different days, and we do this per user since we need to charge their credit card or something. We do all of this in a transaction because we don't want to charge the user unless all of the orders successfully commit. Now we add some scopes, write some functions, and our code looks like this:
```ruby
class User < ApplicationRecord
  has_many :orders

  def commit_orders_for_warehouse(warehouse_code)
    transaction do
      orders.with_status('pending')
            .shipping_from(warehouse_code)
            .lock
            .each(&:send_to_fulfillment!)

      amount_to_charge = orders.sum(:total_price)
      charge!(amount_to_charge)
    end
  end

  def charge!(amount)
    # Make a network call to a payment provider
    sleep 1
  end
end
```
```ruby
class Order < ApplicationRecord
  belongs_to :user
  belongs_to :warehouse
  has_many :order_items

  scope :shipping_from, ->(code) { joins(:warehouse).where(warehouses: { code: code }) }
  scope :with_status, ->(status) { where(status: status) }

  def send_to_fulfillment!
    # Make a network call to a 3PL or something
    update_attributes!(status: 'fulfilled')
  end
end
```
We're making a few fatal mistakes here, the first of which I see constantly: making network calls in a transaction. This makes sense in theory: "I don't want to do the thing unless the other thing works." In practice, long-running transactions are bad, and the time spent on a network call (especially one we forgot to set a timeout on) is a lifetime in the methamphetamine nightmare that is your database engine. This is particularly damaging when we notice the other serious flaw in this code...
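To see why holding locks across a network call hurts so much, here's a toy simulation in plain Ruby (no Rails, no database): a Mutex stands in for the row locks held by an open transaction, and a sleep stands in for the payment provider's latency. Note that moving the charge after the commit changes the failure semantics — you'd need idempotency or retry handling for a charge that fails after the orders commit — so this only illustrates the effect on lock hold time.

```ruby
require 'benchmark'

# A Mutex stands in for the row locks held by an open transaction.
DB_LOCK = Mutex.new

def slow_network_charge!
  sleep 0.2 # stand-in for a payment-provider network call
end

def charge_inside_transaction
  DB_LOCK.synchronize do
    # ... UPDATE orders ... (lock is held for the whole network call)
    slow_network_charge!
  end
end

def charge_after_commit
  DB_LOCK.synchronize do
    # ... UPDATE orders ... (lock released before the network call)
  end
  slow_network_charge!
end

# Run the same work in n threads and time the whole batch.
def timed_run(n, &work)
  Benchmark.realtime do
    n.times.map { Thread.new(&work) }.each(&:join)
  end
end

inside  = timed_run(3) { charge_inside_transaction }
outside = timed_run(3) { charge_after_commit }
puts format('charge inside txn: %.2fs, charge after commit: %.2fs', inside, outside)
```

With the charge inside the critical section, the three threads serialize (roughly 0.6s total); with the charge outside, they overlap (roughly 0.2s).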
Let's take a look at the SQL generated for one user:
```sql
BEGIN

SELECT "orders".* FROM "orders"
INNER JOIN "warehouses" ON "warehouses"."id" = "orders"."warehouse_id"
WHERE "orders"."user_id" = $1 AND "orders"."status" = $2 AND "warehouses"."code" = $3
FOR UPDATE  [["user_id", 1], ["status", "pending"], ["code", "NYC"]]

SELECT "warehouses".* FROM "warehouses" WHERE "warehouses"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
UPDATE "orders" SET "status" = $1 WHERE "orders"."id" = $2  [["status", "fulfilled"], ["id", 1]]

SELECT "warehouses".* FROM "warehouses" WHERE "warehouses"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
UPDATE "orders" SET "status" = $1 WHERE "orders"."id" = $2  [["status", "fulfilled"], ["id", 2]]

SELECT "warehouses".* FROM "warehouses" WHERE "warehouses"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
UPDATE "orders" SET "status" = $1 WHERE "orders"."id" = $2  [["status", "fulfilled"], ["id", 3]]

SELECT SUM("orders"."total_price") FROM "orders" WHERE "orders"."user_id" = $1  [["user_id", 1]]

COMMIT
```
If we run these statements one at a time, we can analyze the locking behavior of our code (for reference, here's a good post on examining locks in Postgres):
```
test=# BEGIN;
BEGIN
test=# SELECT "orders".* FROM "orders"
       INNER JOIN "warehouses" ON "warehouses"."id" = "orders"."warehouse_id"
       WHERE "orders"."user_id" = 1 AND "orders"."status" = 'pending' AND "warehouses"."code" = 'NYC'
       FOR UPDATE;
 id | user_id | warehouse_id | status  | total_price
----+---------+--------------+---------+-------------
  1 |       1 |            1 | pending |          10
  2 |       1 |            1 | pending |          10
  3 |       1 |            1 | pending |          10
(3 rows)

test=# SELECT locktype, relation::regclass, mode, transactionid AS tid,
       virtualtransaction AS vtid, pid, granted FROM pg_locks;
   locktype    |    relation     |      mode       |  tid   |   vtid   |  pid  | granted
---------------+-----------------+-----------------+--------+----------+-------+---------
 relation      | pg_locks        | AccessShareLock |        | 2/298107 | 93051 | t
 relation      | warehouses_pkey | AccessShareLock |        | 2/298107 | 93051 | t
 relation      | orders_pkey     | AccessShareLock |        | 2/298107 | 93051 | t
 relation      | warehouses      | RowShareLock    |        | 2/298107 | 93051 | t
 relation      | orders          | RowShareLock    |        | 2/298107 | 93051 | t
 virtualxid    |                 | ExclusiveLock   |        | 2/298107 | 93051 | t
 transactionid |                 | ExclusiveLock   | 947996 | 2/298107 | 93051 | t
(7 rows)
```
Notice the line that mentions a RowShareLock on warehouses? Yeah...that's really bad. In Postgres, FOR UPDATE without an OF clause locks the matched rows in every table referenced by the query. Because our Order scope joins to Warehouse, both the order rows and the warehouse row are locked; our code has effectively turned the warehouse into a mutex, which makes absolutely no sense. The side effect is that we can only process one user per warehouse at a time. To illustrate, if we open another transaction in a second session and try to run the same statement for user 2, we halt completely:
```
test=# BEGIN;
BEGIN
test=# SELECT "orders".* FROM "orders"
       INNER JOIN "warehouses" ON "warehouses"."id" = "orders"."warehouse_id"
       WHERE "orders"."user_id" = 2 AND "orders"."status" = 'pending' AND "warehouses"."code" = 'NYC'
       FOR UPDATE;
```
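As an aside, Postgres can also fix this at the SQL level: the locking clause accepts an OF list that restricts row locking to specific tables in the join. A sketch of what that would look like for our query (ActiveRecord's lock method accepts a raw clause like `lock("FOR UPDATE OF orders")`, though verify the behavior on your Rails version):

```sql
-- Only the matched order rows are locked; the warehouse row is not.
SELECT "orders".*
FROM "orders"
INNER JOIN "warehouses" ON "warehouses"."id" = "orders"."warehouse_id"
WHERE "orders"."user_id" = 1
  AND "orders"."status" = 'pending'
  AND "warehouses"."code" = 'NYC'
FOR UPDATE OF "orders";
```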
An easy solution is to split this into two queries: the first selects the ids of all pending orders for the specific warehouse (without locking), and the second locks those orders by id — no join, so no warehouse lock. This is what the resulting code would look like:
```ruby
def commit_orders_for_warehouse(warehouse_code)
  transaction do
    order_ids = orders.with_status('pending')
                      .shipping_from(warehouse_code)
                      .pluck('orders.id')

    orders.where(id: order_ids)
          .lock
          .each(&:send_to_fulfillment!)

    amount_to_charge = orders.sum(:total_price)
    charge!(amount_to_charge)
  end
end
```
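For contrast, the locking statement the second query generates now looks something like this (a sketch with hypothetical ids; the exact SQL depends on your Rails version), and crucially it never touches warehouses:

```sql
SELECT "orders".* FROM "orders"
WHERE "orders"."user_id" = 1 AND "orders"."id" IN (1, 2, 3)
FOR UPDATE;
```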
With the sleep statement in our charge! method, we can easily see how the before and after code compare. I created five users with orders shipping from NYC. Here's some code to time the processing for all of those users:
```ruby
def self.run_all_users(warehouse_code)
  Benchmark.ms do
    threads = []
    User.all.each do |user|
      threads << Thread.new { user.commit_orders_for_warehouse(warehouse_code) }
    end
    threads.each(&:join)
  end
end
```
```
# Before the code change
irb(main):050:0> User.run_all_users('NYC')
=> 5274.9072830192745

# After
irb(main):005:0> User.run_all_users('NYC')
=> 1099.306590971537
```
The results are pretty stark. The former takes about 5s (1s per user, since our threads block each other), while the latter takes just over 1s (each thread actually runs concurrently). This also presupposes that you even have a big enough connection pool for all of these threads to sit around and wait (I set mine to 10). More likely, many of these threads will start raising an ActiveRecord::ConnectionTimeoutError.
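The pool-exhaustion failure mode is easy to demonstrate with a toy pool in plain Ruby (this is not Rails' actual pool implementation — just a Queue of fake connections and a checkout timeout, analogous to the checkout_timeout that triggers ActiveRecord::ConnectionTimeoutError):

```ruby
require 'timeout'

# A toy connection pool: two fake connections in a thread-safe queue.
POOL = Queue.new
2.times { |i| POOL << i }

# Check out a connection, failing if none frees up within the timeout.
def with_connection(checkout_timeout)
  conn = Timeout.timeout(checkout_timeout) { POOL.pop }
  yield conn
ensure
  POOL << conn if conn
end

timeouts = Queue.new
threads = 5.times.map do
  Thread.new do
    begin
      # Hold the connection, like a query blocked on someone else's lock.
      with_connection(0.1) { sleep 0.5 }
    rescue Timeout::Error
      timeouts << 1
    end
  end
end
threads.each(&:join)
puts "threads that timed out: #{timeouts.size}"
```

Two threads get connections immediately and hold them; the other three give up after the 0.1s checkout timeout, so this prints `threads that timed out: 3`.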
Caught up in your intense love affair with Rails, you thought you were safe from these horrors. Locking all of the orders was the right thing to do, but because of your beautiful yet naive code, you created a huge bottleneck and brought your site (or more likely your job queue) to a screeching halt.
Perhaps this is an argument against ORMs, but more likely it's a testament to the complexity of our systems and the need for developers to continuously improve their understanding of the technologies therein. Yes, the goal of an ORM is to abstract away the interactions with the database, but that doesn't mean you can pretend it isn't there. As DHH eloquently put it, Ruby, Rails, and other ORMs are "sharp knives", and developers must be empowered to make better decisions "by convention, by nudges, and through education. Not by banning sharp knives from the kitchen and insisting everyone use spoons to slice tomatoes."