Question

I have Users, Interests and Events. A User has many Interests (many-to-many), and an Event has many Interests (many-to-many), so I have two association tables: user_to_interest and event_to_interest.

I want to select all events that have at least one interest from the user's list of interests (in other words, all events that have tags IN [1, 144, 4324]).

In SQL I'd do it roughly like this:

SELECT DISTINCT event.name FROM event JOIN event_to_interest ON event.id = event_to_interest.event_id WHERE event_to_interest.interest_id IN (10, 144, 432)

How should I do that through SQLAlchemy? (I'm using Flask-SQLAlchemy, in case that matters.)


Solution

Assuming you have a (simplified) model like below:

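from sqlalchemy import Column, ForeignKey, Integer, String, Table, func
# on SQLAlchemy older than 1.4, import declarative_base from sqlalchemy.ext.declarative
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

user_to_interest = Table('user_to_interest', Base.metadata,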
    Column('id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey('user.id')),
    Column('interest_id', Integer, ForeignKey('interest.id'))
    )

event_to_interest = Table('event_to_interest', Base.metadata,
    Column('id', Integer, primary_key=True),
    Column('event_id', Integer, ForeignKey('event.id')),
    Column('interest_id', Integer, ForeignKey('interest.id'))
    )

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Event(Base):
    __tablename__ = 'event'
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Interest(Base):
    __tablename__ = 'interest'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    users = relationship(User, secondary=user_to_interest, backref="interests")
    events = relationship(Event, secondary=event_to_interest, backref="interests")

Version 1: you can run a simple query on the list of interest_ids, which generates essentially the SQL statement you wrote (without the DISTINCT):

interest_ids = [10, 144, 432]
query = session.query(Event.name)
query = query.join(event_to_interest, event_to_interest.c.event_id == Event.id)
query = query.filter(event_to_interest.c.interest_id.in_(interest_ids))

However, if an event has two or more of the interests from the list, the query returns the same Event.name once per matching interest. You can work around that with distinct: query = session.query(Event.name.distinct())

Version 2: alternatively, you can use just the relationships. This generates different SQL (a subquery with an EXISTS clause), but it should be semantically the same:

query = session.query(Event.name)
query = query.filter(Event.interests.any(Interest.id.in_(interest_ids)))

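This version does not have the duplicates problem, because EXISTS only tests whether a matching interest is present, so each event row is returned at most once.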
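To make the difference between the two versions concrete, here is a minimal sketch using Python's built-in sqlite3 module and hand-written SQL. The SQL only approximates the shape of what each version produces (the statements SQLAlchemy actually emits will differ in detail), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE event_to_interest (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        interest_id INTEGER
    );
    -- illustrative data: 'concert' matches two of the listed interests
    INSERT INTO event (id, name)
        VALUES (1, 'concert'), (2, 'hackathon'), (3, 'picnic');
    INSERT INTO event_to_interest (event_id, interest_id)
        VALUES (1, 10), (1, 144), (2, 432), (3, 999);
""")

interest_ids = [10, 144, 432]
placeholders = ", ".join("?" for _ in interest_ids)

# Version 1 shape: plain JOIN, one row per matching (event, interest) pair
join_sql = f"""
    SELECT event.name FROM event
    JOIN event_to_interest ON event.id = event_to_interest.event_id
    WHERE event_to_interest.interest_id IN ({placeholders})
    ORDER BY event.name
"""
# Version 1 with DISTINCT: duplicates collapsed after the join
distinct_sql = join_sql.replace("SELECT event.name", "SELECT DISTINCT event.name", 1)
# Version 2 shape: correlated EXISTS, each event tested once
exists_sql = f"""
    SELECT event.name FROM event
    WHERE EXISTS (
        SELECT 1 FROM event_to_interest
        WHERE event_to_interest.event_id = event.id
          AND event_to_interest.interest_id IN ({placeholders})
    )
    ORDER BY event.name
"""

join_rows = [row[0] for row in conn.execute(join_sql, interest_ids)]
distinct_rows = [row[0] for row in conn.execute(distinct_sql, interest_ids)]
exists_rows = [row[0] for row in conn.execute(exists_sql, interest_ids)]

print(join_rows)      # 'concert' appears twice
print(distinct_rows)  # each name once
print(exists_rows)    # each event once
```

Note that DISTINCT on the name would also merge two different events that happen to share a name, whereas the EXISTS form keeps them as separate rows.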

However, I would take a step back: presumably you get interest_ids from a particular user, so I would write a query that works directly from a user_id (or User.id).

Final version, using any() twice:

def get_events_for_user(user_id):
    #query = session.query(Event.name)
    query = session.query(Event) # @note: I assume name is not enough
    query = query.filter(Event.interests.any(Interest.users.any(User.id == user_id)))
    return query.all()

One can argue that this does not produce the most beautiful SQL statement, but that is exactly the beauty of SQLAlchemy: it hides the implementation details.
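For readers curious what the nested any() calls amount to, the following is a rough sketch of the double-EXISTS query written by hand and run with the built-in sqlite3 module. The helper name events_for_user, the table aliases, and the sample rows are assumptions for illustration; the SQL that SQLAlchemy generates will be laid out differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE interest (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_to_interest (
        id INTEGER PRIMARY KEY, user_id INTEGER, interest_id INTEGER);
    CREATE TABLE event_to_interest (
        id INTEGER PRIMARY KEY, event_id INTEGER, interest_id INTEGER);

    -- illustrative data
    INSERT INTO user (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO interest (id, name)
        VALUES (10, 'music'), (144, 'coding'), (432, 'hiking');
    INSERT INTO event (id, name)
        VALUES (1, 'concert'), (2, 'hackathon'),
               (3, 'trail run'), (4, 'music hackathon');
    INSERT INTO user_to_interest (user_id, interest_id)
        VALUES (1, 10), (1, 144), (2, 432);
    INSERT INTO event_to_interest (event_id, interest_id)
        VALUES (1, 10), (2, 144), (3, 432), (4, 10), (4, 144);
""")


def events_for_user(conn, user_id):
    """Return names of events sharing at least one interest with the user."""
    sql = """
        SELECT event.name FROM event
        WHERE EXISTS (                      -- outer any(): Event.interests
            SELECT 1 FROM event_to_interest AS ei
            JOIN interest ON interest.id = ei.interest_id
            WHERE ei.event_id = event.id
              AND EXISTS (                  -- inner any(): Interest.users
                  SELECT 1 FROM user_to_interest AS ui
                  WHERE ui.interest_id = interest.id
                    AND ui.user_id = ?
              )
        )
        ORDER BY event.name
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]


print(events_for_user(conn, 1))
print(events_for_user(conn, 2))
```

Even though 'music hackathon' shares two interests with user 1, it is listed once, for the same reason Version 2 avoids duplicates.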


Bonus: you might want to give higher priority to events that have more overlapping interests. In that case, the query below could help:

query = session.query(Event, func.count('*').label("num_interests"))
query = query.join(Interest, Event.interests)
query = query.join(User, Interest.users)
query = query.filter(User.id == user_id)
query = query.group_by(Event)
# first order by the number of overlapping interests, then optionally by event.date
query = query.order_by(func.count('*').label("num_interests").desc())
#query = query.order_by(Event.date)
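
As a sanity check on the ranking idea, here is a small sketch of an equivalent hand-written query run with the built-in sqlite3 module. It joins the two association tables directly rather than going through the interest and user tables, which is a simplification; the helper name ranked_events_for_user and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_to_interest (
        id INTEGER PRIMARY KEY, user_id INTEGER, interest_id INTEGER);
    CREATE TABLE event_to_interest (
        id INTEGER PRIMARY KEY, event_id INTEGER, interest_id INTEGER);

    -- illustrative data: user 1 likes interests 10 and 144
    INSERT INTO event (id, name)
        VALUES (1, 'concert'), (2, 'hackathon'),
               (3, 'trail run'), (4, 'music hackathon');
    INSERT INTO user_to_interest (user_id, interest_id)
        VALUES (1, 10), (1, 144), (2, 432);
    INSERT INTO event_to_interest (event_id, interest_id)
        VALUES (1, 10), (2, 144), (3, 432), (4, 10), (4, 144);
""")


def ranked_events_for_user(conn, user_id):
    """Return (event name, shared interest count), most overlap first."""
    sql = """
        SELECT event.name, COUNT(*) AS num_interests
        FROM event
        JOIN event_to_interest AS ei ON ei.event_id = event.id
        JOIN user_to_interest AS ui ON ui.interest_id = ei.interest_id
        WHERE ui.user_id = ?
        GROUP BY event.id, event.name
        -- ties are broken by name here so the order is deterministic
        ORDER BY num_interests DESC, event.name
    """
    return conn.execute(sql, (user_id,)).fetchall()


for name, num_interests in ranked_events_for_user(conn, 1):
    print(name, num_interests)
```

The count is only meaningful if each (user, interest) and (event, interest) pair is stored once; duplicate association rows would inflate it.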
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow