Question

I have the following code (using the PublicActivity gem and Squeel):

  def index
    @activities = Activity.limit(20).order { created_at.desc }
    @one = @activities.where{trackable_type == 'Post'}.includes(trackable: [:author, :project])
    @two = @activities.where{trackable_type == 'Project'}.includes trackable: [:owner]
    @activities = @one + @two
  end

But it generates 8 SQL queries:

 SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Post' ORDER BY "activities"."created_at" DESC LIMIT 20

      SELECT "posts".* FROM "posts" WHERE "posts"."id" IN (800, 799, 798, 797, 796, 795, 794, 793, 792, 791, 790, 789, 788, 787, 786, 785, 784, 783, 782, 781)

      SELECT "users".* FROM "users" WHERE "users"."id" IN (880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 869, 868, 867, 866, 865, 864, 863, 862, 861, 860)

      SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79)

      SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Project' ORDER BY "activities"."created_at" DESC LIMIT 20

      SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79, 78, 77, 76, 75, 74, 73, 72, 71, 70, 69, 68, 67, 66, 65, 64, 63, 62, 61)

     SELECT "users".* FROM "users" WHERE "users"."id" IN (870, 859, 848, 837, 826, 815, 804, 793, 782, 771, 760, 749, 738, 727, 716, 705, 694, 683, 672, 661)

Problems:

  1. The activities queries are not joined.
  2. Some users (post authors and project owners) are loaded twice.
  3. Some projects are loaded twice.
  4. @activities is an Array, so Rails relation merge methods (other than +) don't work with the code above.

Any ideas to optimize it?


Solution 2

In a nutshell, you can't optimize any further without using SQL. This is the way Rails does business. It doesn't allow access to join fields outside the AR model where the query is posed. Therefore to get values in other tables, it does a query on each one.

It also doesn't allow UNION or fancy WHERE conditions that provide other ways of solving the problem.

The good news is that these queries are all efficient ones (given that trackable_type is indexed). If the size of the results is anything substantial (say a few dozen rows), the I/O time will dominate the slight additional overhead of 7 simple queries versus 1 complex one.

Even using SQL, it will be difficult to get all the join results you want in one query. (It can be done, but the result will be a hash rather than an AR instance. So dependent code will be ugly.) The one-query-per-table is wired pretty deeply into Active Record.
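
To make the one-query-per-table pattern concrete, here is a plain-Ruby sketch that simulates it without a database. Everything in it is an assumption for illustration: the hashes stand in for tables, the ids are made up, and `fetch` is a hypothetical helper, not an Active Record API. It also shows how merging the id lists before each lookup avoids loading the same users and projects twice.

```ruby
# In-memory stand-ins for database tables (hypothetical data, not the asker's schema).
POSTS    = { 1 => { id: 1, author_id: 10, project_id: 100 },
             2 => { id: 2, author_id: 11, project_id: 100 } }
PROJECTS = { 100 => { id: 100, owner_id: 10 },
             101 => { id: 101, owner_id: 12 } }
USERS    = { 10 => { id: 10 }, 11 => { id: 11 }, 12 => { id: 12 } }

QUERY_LOG = []

# Hypothetical helper: simulates `SELECT * FROM <table> WHERE id IN (...)`
# and records every call so the number of "queries" can be counted.
def fetch(table_name, table, ids)
  QUERY_LOG << "#{table_name} IN (#{ids.join(', ')})"
  table.values_at(*ids).compact
end

activities = [
  { trackable_type: 'Post',    trackable_id: 1 },
  { trackable_type: 'Post',    trackable_id: 2 },
  { trackable_type: 'Project', trackable_id: 101 }
]

# Collect the trackable ids per type.
ids_by_type = activities.group_by { |a| a[:trackable_type] }
                        .transform_values { |rows| rows.map { |a| a[:trackable_id] }.uniq }

# One lookup for posts.
posts = fetch('posts', POSTS, ids_by_type.fetch('Post', []))

# One lookup for projects: ids referenced by posts merged with ids tracked directly.
project_ids = (posts.map { |p| p[:project_id] } + ids_by_type.fetch('Project', [])).uniq
projects = fetch('projects', PROJECTS, project_ids)

# One lookup for users: post authors merged with owners of the fetched projects.
user_ids = (posts.map { |p| p[:author_id] } + projects.map { |p| p[:owner_id] }).uniq
users = fetch('users', USERS, user_ids)

puts QUERY_LOG
```

With this sample data the log contains three lookups (posts, projects, users), and user 10, who is both a post author and a project owner, is fetched only once.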

@Mr.Yoshi's solution is a good compromise using minimal SQL except it doesn't let you selectively load either author or project+owner based on the trackable_type field.

Edit

The above is all correct for Rails 3. For Rails 4 as @CMW says, the eager_load method will do the same as includes using an outer join instead of separate queries. This is why I love SO! I always learn something.

OTHER TIPS

A non-rails-4, non-squeel solution is:

def index
  @activities = Activity.limit(20).order("created_at desc")
  @one = @activities.where(trackable_type: 'Post')   .joins(trackable: [:author, :project]).includes(trackable: [:author, :project])
  @two = @activities.where(trackable_type: 'Project').joins(trackable: [:owner])           .includes(trackable: [:owner])
  @activities = @one + @two
end

The combination of joins and includes looks odd, but in my testing it works surprisingly well.

This'll reduce it down to two queries though, not to one. And @activities will still be an array. But maybe using this approach with squeel will solve that, too. I don't use squeel and can't test it, unfortunately.

EDIT: I totally missed the point of this being about polymorphic associations. The above works to force

If you want to use what AR offers, it's a bit hacky but you could define read-only associated projects and posts:

belongs_to :project, readonly: true, foreign_key: :trackable_id
belongs_to :post,    readonly: true, foreign_key: :trackable_id

With those the mentioned way of forcing eager loads should work. The where conditions are still needed, so those associations are only called on the right activities.

def index
  @activities = Activity.limit(20).order("created_at desc")
  @one = @activities.where(trackable_type: 'Post')   .joins(post: [:author, :project]).includes(post: [:author, :project])
  @two = @activities.where(trackable_type: 'Project').joins(project: [:owner])        .includes(project: [:owner])
  @activities = @one + @two
end

It's no clean solution and the associations should be attr_protected to make sure they aren't set accidentally (that will break polymorphism, I expect), but from my testing it seems to work.

Using a simple Switch case in SQL:

def index
  table_name = Activity.table_name
  @activities = Activity.where(trackable_type: ['Post', 'Project'])
                        .order("CASE #{table_name}.trackable_type WHEN 'Post' THEN 'a' ELSE 'z' END, #{table_name}.created_at DESC")
end

Then you can easily add the includes you want ;)
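
To see what that CASE ordering does to the result set, here is a plain-Ruby sketch that applies the same sort key to a few in-memory rows. The `Row` struct and its values are made up for illustration, and integer timestamps stand in for created_at.

```ruby
# Hypothetical rows; created_at is an integer here to keep the example short.
Row = Struct.new(:id, :trackable_type, :created_at)

rows = [
  Row.new(1, 'Project', 30),
  Row.new(2, 'Post',    10),
  Row.new(3, 'Post',    20),
  Row.new(4, 'Project', 5)
]

# Mirrors: ORDER BY CASE <type column> WHEN 'Post' THEN 'a' ELSE 'z' END, created_at DESC
# The first key groups Post rows ahead of everything else,
# the negated timestamp sorts newest first within each group.
sorted = rows.sort_by { |r| [r.trackable_type == 'Post' ? 'a' : 'z', -r.created_at] }

p sorted.map(&:id)  # => [3, 2, 1, 4]
```

All Post rows come first, newest to oldest, followed by the Project rows in the same order, so a single relation can replace the two concatenated arrays.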

I believe you will need at least two AR query invocations (as you currently have) because of the limit(20) clause. Your queries currently give you up to 20 Posts AND up to 20 Projects, so doing an aggregate limit on both activity types in a single query would not give the intended result.
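
The difference between the two kinds of limit can be sketched in plain Ruby. The `Event` struct and the sample data below are assumptions chosen to make the effect visible: 30 recent Post activities and 5 older Project activities.

```ruby
# Hypothetical data: Posts are newer (timestamps 100..71), Projects older (50..46).
Event = Struct.new(:trackable_type, :created_at)

events = Array.new(30) { |i| Event.new('Post', 100 - i) } +
         Array.new(5)  { |i| Event.new('Project', 50 - i) }

newest_first = events.sort_by { |e| -e.created_at }

# One aggregate LIMIT 20 across both types.
combined = newest_first.first(20)

# A separate LIMIT 20 per type, as in the original two queries.
per_type = %w[Post Project].flat_map do |type|
  newest_first.select { |e| e.trackable_type == type }.first(20)
end

p combined.count { |e| e.trackable_type == 'Project' }  # => 0
p per_type.count { |e| e.trackable_type == 'Project' }  # => 5
p per_type.size                                         # => 25
```

With the aggregate limit the newer Posts crowd out every Project, while the per-type limit keeps all five of them.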

I think all you need to do is use eager_load in the query instead of includes to force a single query. The differences between joins, includes, preload, eager_load and references methods are nicely covered here

So, with AR and squeel:

def index
    @activities = Activity.limit(20).order { created_at.desc }
    @one = @activities.where{trackable_type == 'Post'}.eager_load(trackable: [:author, :project])
    @two = @activities.where{trackable_type == 'Project'}.eager_load trackable: [:owner]
    @activities = @one + @two
end

And without the squeel, using just regular ActiveRecord 4:

def index
    @activities = Activity.limit(20).order(created_at: :desc)
    @one = @activities.where(trackable_type: 'Post').eager_load(trackable: [:author, :project])
    @two = @activities.where(trackable_type: 'Project').eager_load(trackable: :owner)
    @activities = @one + @two
end

You don't need squeel. I recently ripped it out of my project because, in my experience, it doesn't work properly for a number of complex queries where AR 4 and Arel were OK.

That's a pretty big query there ... by the looks of it you could do it in one select, but for readability I'll use two, one for projects and one for posts.

This assumes a 1:1 relationship between activity and post/project. If this isn't correct, the problem can be solved with a subquery.

select * from activities a
left join posts p
on p.id = a.trackable_id -- or whatever fields join these two tables
left join users u
on a.user_id = u.id -- this joins to the main table; you may want to join to trackable instead, not sure
left join projects pr -- distinct alias, since p is already used for posts
on a.project_id = pr.id
where a.trackable_type = 'Post' -- WHERE has to come after the joins
order by a.created_at DESC LIMIT 20

Or, if there is a 1:many relationship, something like this:

select * from
(   select * from activities a
    where a.trackable_type = 'Post'
    order by a.created_at DESC LIMIT 20 ) activities
left join posts p
...

Edit: As I read this, I realize that I'm a bit old-fashioned. I think that if you are going to use such large raw SQL queries, you should make a database function rather than coding it into your application.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow