Question

thanks for your time first...after all the searching on google, github and here, and got more confused about the big words(partition/shard/fedorate),I figure that I have to describe the specific problem I met and ask around.

My company's databases deals with massive users and orders, so we split databases and tables in various ways, some are described below:

way             database and table name      shard by (maybe it's should be called partitioned by?)
YZ.X            db_YZ.tb_X                   order serial number last three digits
YYYYMMDD.       db_YYYYMMDD.tb               date
YYYYMM.DD       db_YYYYMM.tb_ DD             date too

The basic concept is that databases and tables are seperated acording to a field(not nessissarily the primary key), and there are too many databases and too many tables, so that writing or magically generate one database.yml config for each database and one model for each table isn't possible or at least not the best solution.

I looked into drnic's magic solutions, and datafabric, and even the source code of active record, maybe I could use ERB to generate database.yml and do database connection in around filter, and maybe I could use named_scope to dynamically decide the table name for find, but update/create opertions are bounded to "self.class.quoted_table_name" so that I couldn't easily get my problem solved. And even I could generate one model for each table, because its amount is up to 30 most.

But this is just not DRY!

What I need is a clean solution like the following DSL:

class Order < ActiveRecord::Base
   shard_by :order_serialno do |key|
      [get_db_config_by(key), #because some or all of the databaes might share the same machine in a regular way or can be configed by a hash of regex, and it can also be a const
       get_db_name_by(key), 
       get_tb_name_by(key),        
      ]
   end
end

Can anybody enlight me? Any help would be greatly appreciated~~~~

Was it helpful?

Solution

Case two (where only db name changes) is pretty easy to implement with DbCharmer. You need to create your own sharding method in DbCharmer, that would return a connection parameters hash based on the key.

Other two cases are not supported right away, but could be easily added to your system:

  1. You implement sharding method that knows how to deal with database names in your sharded dabatase, this would give you an ability to do shard_for(key) calls to your model to switch db connection.

  2. You add a method like this:

    class MyModel < ActiveRecord::Base
      db_magic :sharded => { :sharded_connection => :my_sharding_method }
    
      def switch_shard(key)
        set_table_name(table_for_key(key))  # switch table
        shard_for(key)                      # switch connection
      end
    end
    
  3. Now you could use your model like this:

    MyModel.switch_shard(key).first
    MyModel.switch_shard(key).count
    

    and, considering you have shard_for(key) call results returned from the switch_shard method, you could use it like this:

    m = MyModel.switch_shard(key) # Switch connection and get a connection proxy
    m.first                       # Call any AR methods on the proxy
    m.count 
    

OTHER TIPS

If you want that particular DSL, or something that matches the logic behind the legacy sharding you are going to need to dig into ActiveRecord and write a gem to give you that kind of capability. All the existing solutions that you mention were not necessarily written with your situation in mind. You may be able to bend any number of solutions to your will, but in the end you're gonna have to probably write custom code to get what you are looking for.

Sounds like, in this case, you should consider not use SQL.

If the data sets are that big and can be expressed as key/value pairs (with a little de-normalization), you should look into couchDB or other noSQL solutions. These solutions are fast, fully scalable, and is REST based, so it is easy to grow and backup and replicate.

We all have gotten into solving all our problems with the same tool (Believe me, I try to too).

It would be much easier to switch to a noSQL solution then to rewrite activeRecord.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top