سؤال

I have a set of records of which some, but not all, have a 'path' field, and all have a 'value' field. I wish to select only those which either do not have a path, or have the largest value of all the records with a particular path.

That is, given these records:

Name:  Path:   Value:
A      foo     5
B      foo     6
C      NULL    2
D      bar     2
E      NULL    4

I want to return B, C, D, and E, but not A (because A has a path and it's path is the same as B, but A has a lower value).

How can I accomplish this, using ActiveRecord, ARel and Postgres? Ideally, I would like a solution which functions as a scope.

هل كانت مفيدة؟

المحلول

You could use something like this by using 2 subqueries (will do only one SQL query which has subqueries). Did not test, but should get you in the right direction. This is for Postgres.

scope :null_ids, -> { where(path: nil).select('id') }
scope :non_null_ids, -> { where('path IS NOT NULL').select('DISTINCT ON (path) id').order('path, value desc, id') }
scope :stuff, -> {
  subquery = [null_ids, non_null_ids].map{|q| "(#{q.to_sql})"}.join(' UNION ')
  where("#{table_name}.id IN (#{subquery})")
}

If you are using a different DB you might need to use group/order instead of distinct on for the non_nulls scope. If the query is running slow put an index on path and value.

You get only 1 query and it's a chainable scope.

نصائح أخرى

A straightforward transliteration of your description to SQL would look like this:

select name, path, value
from (
    select name, path, value,
           row_number() over (partition by path order by value desc) as r
    from your_table
    where path is not null
) as dt
where r = 1
union all
select name, path, value
from your_table
where path is null

You could wrap that in a find_by_sql and get your objects out the other side.

That query works like this:

  1. The row_number window function allows us to group the rows by path, order each group by value, and then number the rows in each group. Play around with the SQL a bit inside psql and you'll see how this works, there are other window functions available that will allow you to do all sorts of wonderful things.
  2. You're treating NULL path values separately from non-NULL paths, hence the path is not null in the inner query.
  3. We can peel off the first row in each of the path groups by selecting those rows from the derived table that have a row number of one (i.e. where r = 1).
  4. The treatment of path is null rows is easily handled by the section query.
  5. The UNION is used to join the result sets of the queries together.

I can't think of any way to construct such a query using ActiveRecord nor can I think of any way to integrate such a query with ActiveRecord's scope mechanism. If you could easily access just the WHERE component of an ActiveRecord::Relation then you could augment the where path is not null and where path is null components of that query with the WHERE components of a scope chain. I don't know how to do that though.

In truth, I tend to abandon ActiveRecord at the drop of a hat. I find ActiveRecord to be rather cumbersome for most of the complicated things I do and not nearly as expressive as SQL. This applies to every ORM I've ever used so the problem isn't specific to ActiveRecord.

I have no experience with ActiveRecord, but here's a sample with SQLAlchemy to silent the just-use-SQL crowd ;)

q1 = Session.query(Record).filter(Record.path != None)
q1 = q1.distinct(Record.path).order_by(Record.path, Record.value.desc())

q2 = Session.query(Record).filter(Record.path == None)

query = q1.from_self().union(q2)
# Further chaining, e.g. query = query.filter(Record.value > 3) to return B, E

for record in query:
    print record.name
مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top