Question

What I'm doing

I'm using the twitter gem (a Ruby wrapper for the Twitter API) in my app, which is run on Heroku. I use Heroku's Scheduler to periodically run caching tasks that use the twitter gem to, for example, update the list of retweets for a particular user. I'm also using delayed_job so scheduler calls a rake task, which calls a method that is 'delayed' (see scheduler.rake below). The method loops through "authentications" (for users who have authenticated twitter through my app) to update each authorized user's retweet cache in the app.

My question

What am I doing wrong? For example, since I'm using Heroku's Scheduler, is delayed_job redundant? Also, you can see I'm not catching (rescuing) any errors. So, if Twitter is unreachable, or if a user's auth token has expired, everything chokes. This is obviously dumb and terrible because if there's an error, the entire thing chokes and ends up creating a failed delayed_job, which causes ripple effects for my app. I can see this is bad, but I'm not sure what the best solution is. How/where should I be catching errors?

I'll put all my code (from the scheduler down to the method being called) for one of my cache methods. I'm really just hoping for a bulleted list (and maybe some code or pseudo-code) berating me for poor coding practice and telling me where I can improve things.

I have seen this SO question, which helps me a little with the begin/rescue block, but I could use more guidance on catching errors, and one the higher-level "is this a good way to do this?" plane.

Code

Heroku Scheduler job:

rake update_retweet_cache

scheduler.rake (in my app)

task :update_retweet_cache => :environment do
  Tweet.delay.cache_retweets_for_all_auths
end

Tweet.rb, update_retweet_cache method:

def self.cache_retweets_for_all_auths
  @authentications = Authentication.find_all_by_provider("twitter")

  @authentications.each do |authentication|
    authentication.user.twitter.retweeted_to_me(include_entities: true, count: 200).each do |tweet|
      # Actually build the cache - this is good - removing to keep this short
    end
  end
end

User.rb, twitter method:

def twitter
  authentication = Authentication.find_by_user_id_and_provider(self.id, "twitter")
  if authentication
    @twitter ||= Twitter::Client.new(:oauth_token => authentication.oauth_token, :oauth_token_secret => authentication.oauth_secret)
  end
end

Note: As I was posting this, I noticed that I'm finding all "twitter" authentications in the "cache_retweets_for_all_auths" method, then calling the "User.twitter" method, which specifically limits to "twitter" authentications. This is obviously redundant, and I'll fix it.

Was it helpful?

Solution

First what is the exact error you are getting, and what do you want to happen when there is an error?

Edit:

If you just want to catch the errors and log them then the following should work.

def self.cache_retweets_for_all_auths
  @authentications = Authentication.find_all_by_provider("twitter")

  @authentications.each do |authentication|
    being
      authentication.user.twitter.retweeted_to_me(include_entities: true, count: 200).each do |tweet|
        # Actually build the cache - this is good - removing to keep this short
      end
     rescue => e
       #Either create an object where the error is log, or output it to what ever log you wish.
     end
  end
end

This way when it fails it will keep moving on to the next user but will still making a note of the error. Most of the time with twitter its just better to do something like this then try to do with each error on its own. I have seen so many weird things out of the twitter API, and random errors, that trying to track down every error almost always turns into a wild goose chase, though it is still good to keep track just in case.


Next for when you should use what.

You should use a scheduler when you need something to happen based on time only, delayed jobs for when its based on an user action, but the 'action' you are going to delay would take to long for a normal response. Sometimes you can just put the thing plainly in the controller also.

So in other words

The scheduler will be fine as long as the time between updates X is less then the time it will take for the update to happen, time Y.

If X < Y then you might want to look at calling the logic from the controller when each indvidual entry is accessed, isntead of trying to do them all at once. The idea being you would only update it after a certain time as passed so. You could store the last time update either on the model itself in a field like twitter_udpate_time or in a redis or memecache instance at a unquie key for the user/auth.

But if the individual update itself is still too long, then thats when you should do the above, but instead of doing the actually update, call a delayed job.

You could even set it up that it only updates or calls the delayed job after a certain number of views, to further limit stuff.

Possible Fancy Pants

Or if you want to get really fancy you could still do it as a cron job, but have a point system based on views that weights which entries should be updated. The idea being certain actions would add points to certain users, and if their points are over a certain amount you update them, and then remove their points. That way you could target the ones you think are the most important, or have the most traffic or show up in the most search results etc etc.


Next off a nick picky thing.

http://api.rubyonrails.org/classes/ActiveRecord/Batches.html

You should be using

@authentications.find_each do |authentication|

instead of

@authentications.each do |authentication|

find_each pulls in only 1000 entries at a time so if you end up with a lof of Authentications you don't end up pulling a crazy amount of entries into memory.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top