Question

I'm new to Ruby, but I have a lot of experience in other programming languages. I need to iterate over a large number of records (from a database or any other persistent storage). The storage engine lets me retrieve records in ranges. In PHP I usually write a custom iterator that loads a range of records, iterates over them, and, when needed, loads the next range and discards the previous one. It is a trade-off between the script's memory usage and the number of requests to the storage. Something like this (copied from comments here):

class Database_Result_Iterator {
...
private $_db_resource = null;
private $_loaded = false;
private $_valid = false;

function rewind() {
    if ($this->_db_resource) {
        mysql_free_result($this->_db_resource);
        $this->_db_resource = null;
    }
    $this->_loaded = false;
    $this->_valid = false;
}

function valid() {
    if (!$this->_loaded) { // load lazily on first access
        $this->load();
    }
    return $this->_valid;
}

private function load() {
    $this->_db_resource = mysql_query(...);
    $this->_loaded = true;
    $this->next(); // Sets _valid
}

}

How does this approach translate to Ruby? For example, I have a class Voter with a method get_votes that returns all votes belonging to the current voter object. Is it possible to return not an array of all votes but a collection of votes that can be iterated over? How should I implement it?

UPDATE

Please don't treat ActiveRecord and an RDBMS as the only possible storage. What about Redis as the storage, with commands like LRANGE? I'm interested in a common code pattern for solving this kind of problem in Ruby.

Solution 2

I don't really see the point of this question. ActiveRecord (AR) is an API for querying an RDBMS, and that's how you do it in AR.

If you want to use Redis, you'll have to either write it yourself at the driver level or find an AR-like abstraction for Redis. I think DataMapper had a Redis adapter. If there is a universal way to do this for any data store, it is likely in DataMapper, but the basic pattern when writing your own is to look at how AR implements find_each/find_in_batches and do the same for your store of choice.
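
For a store that only exposes range reads, the same pattern can be hand-rolled in plain Ruby by including Enumerable, which is roughly Ruby's counterpart to PHP's Iterator interface. The sketch below is one possible shape, not an existing library: BatchedCollection, FakeListStore, and the "votes:42" key are hypothetical names, and FakeListStore#lrange only imitates the inclusive start/stop semantics of Redis LRANGE for non-negative indexes so that the example runs without a server.

```ruby
# Hypothetical in-memory stand-in for a list store with LRANGE-like reads.
class FakeListStore
  attr_reader :calls

  def initialize(lists)
    @lists = lists
    @calls = 0 # counts round trips, to show how many fetches happen
  end

  # Inclusive start/stop, like Redis LRANGE (non-negative indexes only here).
  def lrange(key, start, stop)
    @calls += 1
    @lists.fetch(key, [])[start..stop] || []
  end
end

# Loads one batch at a time and yields its records one by one.
class BatchedCollection
  include Enumerable

  def initialize(batch_size: 100, &fetch_range)
    @batch_size = batch_size
    @fetch_range = fetch_range # called with (start, stop), returns an Array
  end

  def each
    return enum_for(:each) unless block_given?

    start = 0
    loop do
      batch = @fetch_range.call(start, start + @batch_size - 1)
      batch.each { |record| yield record }
      break if batch.size < @batch_size # a short batch means we hit the end

      start += @batch_size
    end
  end
end

class Voter
  def initialize(id, store)
    @id = id
    @store = store
  end

  # Returns an iterable collection instead of a fully loaded Array.
  def get_votes(batch_size: 100)
    key = "votes:#{@id}"
    BatchedCollection.new(batch_size: batch_size) do |start, stop|
      @store.lrange(key, start, stop)
    end
  end
end

store = FakeListStore.new("votes:42" => (1..7).map { |n| "vote-#{n}" })
votes = Voter.new(42, store).get_votes(batch_size: 3)
votes.each { |vote| puts vote }
```

Because the collection includes Enumerable, calls such as votes.first(2) or votes.lazy.select { ... } work as well, and first(2) stops after fetching a single batch. Only the current batch is referenced at any time, so earlier batches can be garbage collected.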

Other answers

From the guides on Ruby on Rails:

User.all.each do |user|
  NewsLetter.weekly_deliver(user)
end

This is very inefficient, because it loads every user into memory at once. You probably want to do most of the filtering in the database to start with. ActiveRecord offers a method called find_each for this:

User.find_each(:batch_size => 5000) do |user|
  NewsLetter.weekly_deliver(user)
end

The :batch_size option fetches the data in slices instead of loading the entire result set. This is extremely helpful in most cases.
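
Roughly speaking, find_each walks the table in primary-key order and asks for the rows whose id is greater than the last one it saw, instead of using an offset. A plain-Ruby simulation of that idea follows. It is not ActiveRecord's actual code: ROWS, query_after, and each_by_id are made-up names, and the array stands in for a users table.

```ruby
# Made-up in-memory "table"; ids have gaps, as they would after deletes.
ROWS = [1, 2, 5, 8, 9, 13, 21].map do |id|
  { id: id, email: "user#{id}@example.com" }
end

# Stand-in for: SELECT * FROM users WHERE id > ? ORDER BY id LIMIT ?
def query_after(last_id, limit)
  ROWS.select { |row| row[:id] > last_id }
      .sort_by { |row| row[:id] }
      .first(limit)
end

# Yields every row while holding at most batch_size rows at a time.
def each_by_id(batch_size)
  last_id = 0 # assumes ids are positive integers
  loop do
    batch = query_after(last_id, batch_size)
    break if batch.empty?

    batch.each { |row| yield row }
    last_id = batch.last[:id] # the next query resumes after this id
  end
end

each_by_id(3) { |row| puts row[:email] }
```

Resuming from the last seen id keeps the iteration stable when rows are inserted or deleted while it runs, which an offset-based range read does not guarantee.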

But, you probably don't want to operate on all records in the first place:

User.with_newsletter.each do |user|
  NewsLetter.weekly_deliver(user)
end

Here, with_newsletter is a so-called scope: a named, reusable query condition defined on the model.

It sounds like you want to use find_each (http://apidock.com/rails/ActiveRecord/Batches/ClassMethods/find_each). This lets you iterate through a large dataset by loading in a small number, iterating over them, then loading in another batch and so on.

User.find_each do |user|
  user.do_some_stuff
end

will iterate through all users without loading a bajillion of them into memory at once.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow