Question

Consider a scenario similar to Stack Overflow. You have content items (questions or answers) and you fetch data about each individual item. Each one is represented by data like:

  • title
  • details
  • score
  • etc.

You'd then also like to attach additional columns representing the current user's interaction with each individual content item:

  • user voted up
  • user voted down
  • user commented
  • user flagged item for spam
  • etc.

This would let me indicate which content items the user interacted with and what each interaction was. Content items are stored in the Posts table, while actions (voting, commenting) are recorded in a separate PostActions table with these columns:

  • PostId
  • UserId
  • ActionType (vote up/down, comment, spam, close etc.)
  • CreationDate

So each item can have several related rows in this table, because a user may have performed several actions on the same item.

I could execute, e.g.:

select p.*, pa.ActionType
from Posts p
    left join PostActions pa
    on ((pa.PostId = p.PostId) and (pa.UserId = @UserId))

but this would result in several rows related to the same content item. To get them all in one result row I could:

  1. Use left join PostActions several times, for each action individually

    select
        p.*,
        iif(paUp.CreationDate is null, 0, 1) as VotedUp,
        iif(paDown.CreationDate is null, 0, 1) as VotedDown,
        ...
    from Posts p
        left join PostActions paUp
        on ((paUp.PostId = p.PostId) and (paUp.UserId = @UserId) and (paUp.ActionType = @UpVoteType))
        left join PostActions paDown
        on ((paDown.PostId = p.PostId) and (paDown.UserId = @UserId) and (paDown.ActionType = @DownVoteType))
        ...

    but this would end up with many left joins to the same table.

  2. I could left join PostActions, concatenate all existing actions using the stuff function, and then parse the result in the middle tier.

  3. Something else, e.g. grouping by PostActions.PostId and PostActions.UserId and then getting that info from the group (if it is at all possible to filter the same group with several different conditions).

  4. Using pivot or apply to get my data
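
For illustration, option 2 could look roughly like the sketch below. It reuses the Posts and PostActions columns described above; the sample @UserId value and the cast to varchar(10) (which assumes ActionType is a small numeric code) are my assumptions, not part of the original schema.

```sql
-- Sketch of option 2: one comma-separated list of the user's actions per post.
declare @UserId int = 1;  -- assumed sample value

select
    p.*,
    stuff((
        -- Concatenate ',<ActionType>' for every action by this user on this post.
        select ',' + cast(pa.ActionType as varchar(10))  -- assumes a numeric code
        from PostActions pa
        where pa.PostId = p.PostId
            and pa.UserId = @UserId
        for xml path('')
    ), 1, 1, '') as Actions  -- stuff drops the leading comma; null when no actions
from Posts p;
```

The middle tier would then have to split the Actions string, which is exactly the coupling this option introduces.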

Question

The main question is: which method (one of the above or any other) would perform best? What would you suggest?

Solution

Final implementation

I experimented a bit; here is the outcome for each of the options listed above:

  1. Not entirely possible with multiple joins alone. Grouping would have to be added as well, which makes the resulting query too complex and much less maintainable.

  2. Requires post-processing in the upper layer to parse the concatenated data. I haven't implemented it because it would require more upper-layer code changes, and that code would be more tightly bound to the data layer than desired.

  3. I prepared a CTE that generates content interaction data grouped by content item and user, and joined its results to the original query. It runs fine and works as expected. The tricky part is filtering data within each group of records so that only particular interactions are counted (see below).

  4. A similar approach to #3 uses the pivot relational operator, and I implemented that as well, although the prepared pivoted data also needs to be grouped beforehand to get one record per item.

Between #3 and #4 I decided to go with #3 as it had a better execution plan.
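
For comparison, a pivot variant in the spirit of #4 might look like the sketch below. This is not the query I benchmarked, only a minimal illustration. pivot needs literal values in its in (...) list, so it assumes action type 1 means up vote and 2 means down vote; those values and the sample @UserId are hypothetical.

```sql
-- Sketch of a pivot-based variant; [1] = up vote and [2] = down vote are assumed codes.
declare @UserId int = 1;  -- assumed sample value

select
    p.Id,
    cast(isnull(pv.[1], 0) as bit) as UpVoted,
    cast(isnull(pv.[2], 0) as bit) as DownVoted
from Posts p
    left join (
        select PostId, [1], [2]
        from (
            -- Keep only the columns pivot needs, so it groups by PostId alone.
            select pa.PostId, pa.ActionTypeId, pa.CreationDate
            from PostActions pa
            where pa.UserId = @UserId
        ) src
        pivot (
            count(CreationDate) for ActionTypeId in ([1], [2])
        ) counted
    ) pv
    on (pv.PostId = p.Id);
```

Having to hard-code the action type values is one practical drawback of this form compared with the CTE in #3, which can use parameters.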

Filtering within record group solution

As stated in #3 (final implementation), to make my query perform well I had to group only once and then filter data within each group of records to get information about individual content interactions.

As I only needed an indication of whether a particular user interacted with a content item, all of these columns return a boolean (SQL bit type). The key is to use an expression inside the count aggregate so that only particular records within the group are counted.

with Interactions (Id, UpVoted, DownVoted)
as (
    select
        p.Id,
        cast(count(iif(pa.ActionTypeId = @UpVoteType, 1, null)) as bit),
        cast(count(iif(pa.ActionTypeId = @DownVoteType, 1, null)) as bit)
    from PostActions pa
        join Posts p
        on (p.Id = pa.PostId)
    where pa.UserId = @UserId
    group by p.Id
)
select
    ...
    isnull(i.UpVoted, 0) as UpVoted,
    isnull(i.DownVoted, 0) as DownVoted
from Posts p
    ...
    left join Interactions i
    on (p.Id = i.Id)
where ...
order by ...

The good thing about count is that it only counts non-null values, so I took advantage of this to pick out specific user interactions within the same group of records for a content item.
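
To see the trick in isolation, here is a small self-contained sketch that runs against a table variable instead of the real tables. The sample rows, the type codes (1 = up vote, 2 = down vote, 3 = something else) and the user id are made up for illustration. The DownVoted column uses max(case ...) instead, a commonly used equivalent, just to show that the same per-group filtering can be written more than one way.

```sql
-- Made-up sample data: user 42 up voted and (say) commented on post 10,
-- down voted post 11; user 7 up voted post 12.
declare @UpVoteType int = 1, @DownVoteType int = 2, @UserId int = 42;

declare @PostActions table (PostId int, UserId int, ActionTypeId int);
insert into @PostActions (PostId, UserId, ActionTypeId) values
    (10, 42, 1),
    (10, 42, 3),
    (11, 42, 2),
    (12, 7, 1);

select
    pa.PostId,
    -- count ignores the nulls produced for non-matching rows
    cast(count(iif(pa.ActionTypeId = @UpVoteType, 1, null)) as bit) as UpVoted,
    -- equivalent formulation: 1 if any row in the group matches, else 0
    cast(max(case when pa.ActionTypeId = @DownVoteType then 1 else 0 end) as bit) as DownVoted
from @PostActions pa
where pa.UserId = @UserId
group by pa.PostId;
-- Post 10: UpVoted = 1, DownVoted = 0
-- Post 11: UpVoted = 0, DownVoted = 1
-- Post 12 does not appear because it belongs to another user.
```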

COUNT is my new love. :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow