Question

I'd like to order by the count of a many-to-many relationship on a table.

My starting point was the (excellent) response to this thread. However, in my case, there is a third table involved and I need my backrefs to be 'joined' rather than 'dynamic'.

The combination of those two changes results in ProgrammingErrors when I try the solution to that question.

Here's my setup. I have the same many-to-many relationship between User and Post. I also have a one-to-many relationship between Post and Heading. What I'd like is the list of all posts ordered by the number of users that like each post.

from flask import Flask
from flask_sqlalchemy import SQLAlchemy  # the flask.ext.* namespace was removed in Flask 1.0
from sqlalchemy.orm import relationship, backref
from sqlalchemy import func

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgres://xxx'
db = SQLAlchemy(app)


likes = db.Table('likes',
                 db.Column('user_id', db.Integer, db.ForeignKey('user.id')),
                 db.Column('post_id', db.Integer, db.ForeignKey('post.id')))


class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(20))

    def __repr__(self):
        return "<User('%s')>" % self.username


class Post(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(255))
    likes = db.relationship('User', secondary=likes,
                            backref=db.backref('posts', lazy='joined'))

    def __repr__(self):
        return "<Post('%s')>" % self.title


class Heading(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(255))

    post_id = db.Column(db.Integer, db.ForeignKey('post.id'))
    post = relationship('Post', backref=backref('headings', lazy='joined'))

    def __repr__(self):
        return "<Heading('%s')>" % self.name

# Three users
user1 = User(username='user1')
user2 = User(username='user2')
user3 = User(username='user3')

# Two posts
post1 = Post(title='post1')
heading1 = Heading(name='heading1')
heading2 = Heading(name='heading2')

post1.headings = [heading1, heading2]
post1.likes = [user1, user2]

post2 = Post(title='post2')
heading3 = Heading(name='heading3')
post2.headings = [heading3]
post2.likes = [user3]

db.create_all()
db.session.add_all([user1, user2, user3, post1, post2, heading1, heading2, heading3])
db.session.commit()

Now if I try the solution in the other question:

In [3]: db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post).order_by('total DESC').all()
---------------------------------------------------------------------------
ProgrammingError                          Traceback (most recent call last)
<ipython-input-3-b804fb4b0893> in <module>()
----> 1 db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post).order_by('total DESC').all()

ProgrammingError: (ProgrammingError) column "heading_1.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...tle AS post_title, count(likes.user_id) AS total, heading_1....
                                                             ^
 'SELECT post.id AS post_id, post.title AS post_title, count(likes.user_id) AS total, heading_1.id AS heading_1_id, heading_1.name AS heading_1_name, heading_1.post_id AS heading_1_post_id \nFROM post JOIN likes ON post.id = likes.post_id LEFT OUTER JOIN heading AS heading_1 ON post.id = heading_1.post_id GROUP BY post.id, post.title ORDER BY total DESC' {}

Naively (and I'm pretty naive when it comes to SQL and SQLAlchemy), if I try to add Heading to my group_by:

In [6]: db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post, Heading).order_by('total DESC').all()
---------------------------------------------------------------------------
ProgrammingError                          Traceback (most recent call last)
<ipython-input-6-8efd6a8314f8> in <module>()
----> 1 db.session.query(Post, func.count(likes.c.user_id).label('total')).join(likes).group_by(Post, Heading).order_by('total DESC').all()

ProgrammingError: (ProgrammingError) invalid reference to FROM-clause entry for table "heading"
LINE 2: ...= heading_1.post_id GROUP BY post.id, post.title, heading.id...
                                                             ^
HINT:  Perhaps you meant to reference the table alias "heading_1".
 'SELECT post.id AS post_id, post.title AS post_title, count(likes.user_id) AS total, heading_1.id AS heading_1_id, heading_1.name AS heading_1_name, heading_1.post_id AS heading_1_post_id \nFROM post JOIN likes ON post.id = likes.post_id LEFT OUTER JOIN heading AS heading_1 ON post.id = heading_1.post_id GROUP BY post.id, post.title, heading.id, heading.name, heading.post_id ORDER BY total DESC' {}

How can I get my desired query to work? I'm probably doing something really dumb.


Solution

As you noted, the solution from the other question fails here because the headings are eager-loaded with a join (lazy='joined'): their columns are added to the SELECT but not to the GROUP BY. There are a few ways around this:

Option-1: disable joinedload for this query

from sqlalchemy import desc
from sqlalchemy.orm import lazyload

q = (db.session.query(Post, func.count(likes.c.user_id).label("total"))
        .options(lazyload(Post.headings))  # disable joined-load for this query
        .join(likes).group_by(Post).order_by(desc("total"))
    )

If you need the headings loaded eagerly, use subqueryload instead, which fetches them in a separate query:

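from sqlalchemy import desc
from sqlalchemy.orm import subqueryload

q = (db.session.query(Post, func.count(likes.c.user_id).label("total"))
        .options(subqueryload(Post.headings))  # load headings in a second query
        .join(likes).group_by(Post).order_by(desc("total"))
    )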
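
Why the heading join has to leave the aggregate query can be seen in plain SQL. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and the SQL is hand-written to approximate what the ORM emits, not captured from SQLAlchemy. SQLite is more lenient about GROUP BY than PostgreSQL, so the joined variant runs here, but each like is then counted once per heading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE likes (user_id INTEGER, post_id INTEGER);
    CREATE TABLE heading (id INTEGER PRIMARY KEY, name TEXT, post_id INTEGER);
    INSERT INTO post VALUES (1, 'post1'), (2, 'post2');
    INSERT INTO likes VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO heading VALUES (1, 'heading1', 1), (2, 'heading2', 1),
                               (3, 'heading3', 2);
""")

# Aggregate query with the eager-load join still attached.
with_headings = conn.execute("""
    SELECT post.title, count(likes.user_id) AS total
    FROM post
    JOIN likes ON post.id = likes.post_id
    LEFT OUTER JOIN heading ON post.id = heading.post_id
    GROUP BY post.id, post.title
    ORDER BY total DESC
""").fetchall()

# Same query without the heading join, as with lazyload or subqueryload.
without_headings = conn.execute("""
    SELECT post.title, count(likes.user_id) AS total
    FROM post
    JOIN likes ON post.id = likes.post_id
    GROUP BY post.id, post.title
    ORDER BY total DESC
""").fetchall()

print(with_headings)     # post1 is inflated: 2 likes x 2 headings
print(without_headings)  # actual like counts
```

So even on a database that accepts the joined query, the totals would be wrong for any post with more than one heading.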

Option-2: use a subquery

# Count likes per post in a subquery; outerjoin keeps posts with no likes (count 0)
subq = (db.session.query(Post.id.label("post_id"),
            func.count(likes.c.user_id).label("num_likes"))
            .outerjoin(likes).group_by(Post.id)
        ).subquery("subq")

# No GROUP BY on the outer query: the aggregation already happened in subq
q = (db.session.query(Post, subq.c.num_likes)
        .join(subq, Post.id == subq.c.post_id)
        .order_by(subq.c.num_likes.desc())
    )
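
Roughly, this option counts the likes in a derived table first and only then joins the result to post, so the eager-loaded heading join never shares a SELECT with the aggregate. A minimal sketch of that SQL shape, assuming Python's built-in sqlite3 as a stand-in for PostgreSQL and hand-written SQL (not the exact statement SQLAlchemy generates). The row post3 is a hypothetical addition to the question's data, there only to show that the outer join keeps posts without likes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE likes (user_id INTEGER, post_id INTEGER);
    CREATE TABLE heading (id INTEGER PRIMARY KEY, name TEXT, post_id INTEGER);
    INSERT INTO post VALUES (1, 'post1'), (2, 'post2'), (3, 'post3');
    INSERT INTO likes VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO heading VALUES (1, 'heading1', 1), (2, 'heading2', 1),
                               (3, 'heading3', 2);
""")

rows = conn.execute("""
    SELECT post.title, subq.num_likes, heading.name
    FROM post
    JOIN (
        -- aggregation is isolated here, one row per post
        SELECT post.id AS post_id, count(likes.user_id) AS num_likes
        FROM post
        LEFT OUTER JOIN likes ON post.id = likes.post_id
        GROUP BY post.id
    ) AS subq ON post.id = subq.post_id
    -- stands in for the eager-load join; it no longer affects the count
    LEFT OUTER JOIN heading ON post.id = heading.post_id
    ORDER BY subq.num_likes DESC, heading.id
""").fetchall()

for row in rows:
    print(row)
```

Each heading row carries the correct like count of its post, and the ORM would collapse the repeated post rows back into one Post object with its headings.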

Option-3: use Hybrid Attributes

It might not be the most efficient, but it keeps things clean.

Add the following to the Post model:

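from sqlalchemy.ext.hybrid import hybrid_property

class Post(db.Model):

    # ...

    @hybrid_property
    def num_likes(self):
        # instance level: count the already-loaded collection in Python
        return len(self.likes)

    # Reuse the name num_likes: in newer SQLAlchemy versions the modifier
    # returns a new hybrid, so a differently named method would not
    # replace Post.num_likes
    @num_likes.expression
    def num_likes(cls):
        # class level: correlated scalar subquery counting this post's likes
        return (db.select([db.func.count(likes.c.user_id).label("num_likes")])
                .where(likes.c.post_id == cls.id)
                .label("total_likes")
                )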

Later you can compose a query in a clean way:

q = db.session.query(Post, Post.num_likes).order_by(Post.num_likes.desc())
# use this if you need the data sorted but do not care how many likes there are
#q = db.session.query(Post).order_by(Post.num_likes.desc()) 
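
At the class level, the hybrid expression is a correlated scalar subquery: for every post row, the database counts the matching rows in likes. A rough sketch of that SQL shape, assuming Python's built-in sqlite3 as a stand-in for PostgreSQL and hand-written SQL rather than the statement SQLAlchemy actually emits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE likes (user_id INTEGER, post_id INTEGER);
    INSERT INTO post VALUES (1, 'post1'), (2, 'post2');
    INSERT INTO likes VALUES (1, 1), (2, 1), (3, 2);
""")

rows = conn.execute("""
    SELECT post.title,
           -- correlated on post.id, evaluated once per post row
           (SELECT count(likes.user_id)
            FROM likes
            WHERE likes.post_id = post.id) AS total_likes
    FROM post
    ORDER BY total_likes DESC
""").fetchall()

print(rows)
```

Because the count lives in a scalar subquery, the outer query needs no GROUP BY, which is why it coexists with joined eager loading.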
Licensed under: CC-BY-SA with attribution