Question

I noticed that Rails can have concurrency issues with multiple servers and would like to force my model to always lock. Is this possible in Rails, similar to unique constraints to force data integrity? Or does it just require careful programming?

Terminal One

irb(main):033:0* Vote.transaction do
irb(main):034:1* v = Vote.lock.first
irb(main):035:1> v.vote += 1
irb(main):036:1> sleep 60
irb(main):037:1> v.save
irb(main):038:1> end

Terminal Two, while sleeping

irb(main):240:0* Vote.transaction do
irb(main):241:1* v = Vote.first
irb(main):242:1> v.vote += 1
irb(main):243:1> v.save
irb(main):244:1> end

DB Start

 select * from votes where id = 1;
 id | vote |         created_at         |         updated_at         
----+------+----------------------------+----------------------------
  1 |    0 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:42:58.875973 

After execution

Terminal One

irb(main):040:0> v.vote
=> 1

Terminal Two

irb(main):245:0> v.vote
=> 1

DB End

select * from votes where id = 1;
 id | vote |         created_at         |         updated_at         
----+------+----------------------------+----------------------------
  1 |    1 | 2013-09-30 02:29:28.740377 | 2013-12-28 20:44:10.276601 

Other Example

http://rhnh.net/2010/06/30/acts-as-list-will-break-in-production


Solution

You are correct that transactions by themselves don't protect against many common concurrency scenarios, incrementing a counter being one of them. There isn't a general way to force a lock; you have to ensure you use it everywhere necessary in your code.
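To see why a plain transaction is not enough, here is the lost update from the question sketched in plain Ruby (the hash stands in for the votes table; this is an illustration, not Rails code):

```ruby
# Both sessions read the counter before either one writes,
# so one of the two increments disappears.
db = { vote: 0 }

a = db[:vote]       # terminal one reads 0
b = db[:vote]       # terminal two reads 0 before terminal one commits
db[:vote] = a + 1   # terminal one writes 1
db[:vote] = b + 1   # terminal two also writes 1

db[:vote]  # => 1, not 2: the first increment was lost
```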

For the simple counter incrementing scenario there are two mechanisms that will work well:

Row Locking

Row locking will work as long as you do it everywhere in your code where it matters. Knowing where it matters takes some experience to develop an instinct for. If, as in your code above, you have two places where a resource needs concurrency protection and you only lock in one of them, you will have concurrency issues.

You want to use the with_lock form; this does a transaction and takes a row-level lock (table locks obviously scale much more poorly than row locks, although for tables with few rows there is no difference, since PostgreSQL (not sure about MySQL) will use a table lock anyway). It looks like this:

    v = Vote.first
    v.with_lock do
      v.vote += 1
      sleep 10
      v.save
    end

The with_lock call creates a transaction, locks the row the object represents, and reloads the object's attributes, all in one step, minimizing the opportunity for bugs in your code. However, this does not necessarily help you with concurrency issues involving the interaction of multiple objects. It can work if a) all possible interactions depend on one object, and you always lock that object, and b) the other objects each only interact with one instance of that object, e.g. locking a user row and doing stuff with objects which all belong_to (possibly indirectly) that user object.

Serializable Transactions

The other possibility is to use serializable transactions. Since 9.1, PostgreSQL has had "real" serializable transactions. These can perform much better than row locking (though it is unlikely to matter in the simple counter-incrementing use case).

The best way to understand what serializable transactions give you is this: if you take all the possible orderings of all the (isolation: :serializable) transactions in your app, what happens when your app is running is guaranteed to always correspond with one of those orderings. With ordinary transactions this is not guaranteed to be true.

However, what you have to do in exchange is to take care of what happens when a transaction fails because the database is unable to guarantee that it was serializable. In the case of the counter increment, all we need to do is retry:

    begin
      Vote.transaction(isolation: :serializable) do
        v = Vote.first
        v.vote += 1
        sleep 10 # this is to simulate concurrency 
        v.save
      end
    rescue ActiveRecord::StatementInvalid => e
      sleep rand/100 # this is NECESSARY in scalable real-world code, 
                     # although the amount of sleep is something you can tune.
      retry
    end

Note the random sleep before the retry. This is necessary because failed serializable transactions have a non-trivial cost, so if we don't sleep, multiple processes contending for the same resource can swamp the db. In a heavily concurrent app you may need to gradually increase the sleep with each retry. The random is VERY important to avoid harmonic deadlocks -- if all the processes sleep the same amount of time they can get into a rhythm with each other, where they all are sleeping and the system is idle and then they all try for the lock at the same time and the system deadlocks causing all but one to sleep again.
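The retry-with-random-sleep pattern can be factored into a small plain-Ruby helper. The max_retries cap and the base delay here are illustrative choices, not part of the original answer:

```ruby
# Generic retry with jittered, growing sleep. The rand factor keeps
# contending processes from retrying in lockstep; the retry cap keeps
# a persistently failing block from looping forever.
def with_retries(max_retries: 5, base: 0.01)
  attempts = 0
  begin
    yield
  rescue StandardError
    attempts += 1
    raise if attempts > max_retries
    sleep(base * attempts * rand)  # random, growing backoff
    retry
  end
end

# Usage with the serializable transaction above would look like:
#   with_retries { Vote.transaction(isolation: :serializable) { ... } }
```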

When the transaction that needs to be serializable involves interaction with a source of concurrency other than the database, you may still have to use row-level locks to accomplish what you need. An example of this would be when a state machine transition determines what state to transition to based on a query to something other than the db, like a third-party API. In this case you need to lock the row representing the object with the state machine while the third party API is queried. You cannot nest transactions inside serializable transactions, so you would have to use object.lock! instead of with_lock.

Another thing to be aware of is that any objects fetched outside the transaction(isolation: :serializable) should have reload called on them before use inside the transaction.

OTHER TIPS

ActiveRecord always wraps save operations in a transaction.

For your simple case it might be best to just use a SQL update instead of performing logic in Ruby and then saving. Here is an example which adds a model method to do this:

class Vote < ActiveRecord::Base
  def vote!
    self.class.update_all("vote = vote + 1", :id => id)
  end
end

This method avoids the need for locking in your example. If you need more general database locking, see David's suggestion.

You can do the following in your model:

class Vote < ActiveRecord::Base

  validate :handle_conflict, on: :update
  attr_accessible :original_updated_at
  attr_writer :original_updated_at

  def original_updated_at
    @original_updated_at || updated_at
  end

  def handle_conflict
    # If we want to use this across multiple models
    # then extract this to a module
    if @conflict || updated_at.to_f > original_updated_at.to_f
      @conflict = true
      @original_updated_at = nil
      # If two updates are made at the same time, a validation error
      # is displayed and the conflicting fields are listed
      errors.add :base, 'This record changed while you were editing'
      changes.each do |attribute, values|
        errors.add attribute, "was #{values.first}"
      end
    end
  end
end

The original_updated_at is a virtual attribute set from the form. handle_conflict runs when the record is updated and checks whether the updated_at value in the database is later than the one submitted via the hidden field on your page. You should therefore define the following in your app/views/votes/_form.html.erb:

<%= f.hidden_field :original_updated_at %>

If there is a conflict, a validation error is raised.
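The staleness check at the heart of handle_conflict boils down to a timestamp comparison, sketched here in plain Ruby (the timestamps are the ones from the question's table):

```ruby
require "time"

# A record is stale when the database's updated_at is newer than the
# timestamp the form was rendered with (the hidden original_updated_at).
def stale?(db_updated_at, form_updated_at)
  db_updated_at.to_f > form_updated_at.to_f
end

rendered_at = Time.parse("2013-12-28 20:42:58 UTC")
saved_later = Time.parse("2013-12-28 20:44:10 UTC")

stale?(saved_later, rendered_at)   # => true: someone saved after the form loaded
stale?(rendered_at, rendered_at)   # => false: no intervening update
```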

And if you are using Rails 4 you won't have attr_accessible and will need to add :original_updated_at to your vote_params method in your controller.

Hopefully this sheds some light.

For simple +1

Vote.increment_counter :vote, Vote.first.id

Because vote is used here for both the table name and the field name, the general form is:

TableName.increment_counter :field_name, id_of_the_row
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow