Question

I'd like to compare the popularity of tags between two months, ordered by the biggest change.

I've worked out how to count the number of tags in a month, but not how to compare them. Here's what I have so far:

select TOP 10
    tags.tagname, count(*) AS tagcount
from Posts
    INNER JOIN PostTags ON PostTags.PostId = Posts.id
    INNER JOIN Tags     ON Tags.id         = PostTags.TagId
where
    datepart(year,  Posts.CreationDate) = 2011 and
    datepart(month, Posts.CreationDate) = 1
Group by tags.tagname
Order by tagcount DESC

http://data.stackexchange.com/stackoverflow/qe/924/query-count-tags-from-daterange
(note: you can clone, edit in place, and run it)

I'm new to SQL. It seems I just need to create a second query for the previous month (December 2010), and then combine these two queries, with a column that is prevMonth.count - nextMonth.count, and order by that column (getting just the top X, so it doesn't take forever).

But I can't work out how to combine two queries in this way - I think you should be able to nest them, but I can't get it to work. Another way is to create a temporary table - it seems inefficient to me, but maybe that is the right way?

Many thanks for any help!


BTW: what I'd like to do next:

  1. After this, I'd like to find the rate of growth (not just the absolute change in numbers). That's just (new-old)/old. Like velocity, but normalized.

  2. Then, the tags whose rate of growth is increasing the most - i.e. that have exponential growth. This is would require 3 months: calculate the rate of change between months 1 and 2, and between months 2 and 3. The difference between these is the rate of change of the rate of change. It's like acceleration.

[ This is as a signal for new technologies growing, which often start small in a very specific usage. The people in that small area talk to each other, and good ideas get passed on in a chain-reaction: one person tells two, they each tell two more and so on. The that niche might get converted fully after a while, and maybe it spreads to another, similar niche. See "Seeing What's Next", by the Innovator's Dilemma guy. ]


Here's a later version, using JNK's answer: http://data.stackexchange.com/stackoverflow/q/92869/query-tags-with-highest-increase-in-growth-over-3-months

And... the fastest growing tag is... facebook-c#-sdk. Dunno how useful this is, but it's an interesting way to browse SO.

Was it helpful?

Solution

Use #Temp Tables:

-- QUERY: count tags from daterange
-- TODO: compare from two different dateranges...

select TOP 10
    tags.tagname, count(*) AS tagcount
INTO #TagCountTemp1
from Posts
    INNER JOIN PostTags ON PostTags.PostId = Posts.id
    INNER JOIN Tags     ON Tags.id         = PostTags.TagId 
where
    datepart(year,  Posts.CreationDate) = 2011 and
    datepart(month, Posts.CreationDate) = 1
Group by tags.tagname
Order by tagcount DESC

select TOP 10
    tags.tagname, count(*) AS tagcount
INTO #TagCountTemp2
from Posts
    INNER JOIN PostTags ON PostTags.PostId = Posts.id
    INNER JOIN Tags     ON Tags.id         = PostTags.TagId 
where
    datepart(year,  Posts.CreationDate) = 2010 and
    datepart(month, Posts.CreationDate) = 12
Group by tags.tagname
Order by tagcount DESC

SELECT TOP 10
t2.tagname, t2.tagcount as 'Month 1', t1.tagcount as 'Month 2', (t1.tagcount-t2.tagcount) as 'Increase'
FROM #TagCountTemp1 as t1
LEFT JOIN #TagCountTemp2 as t2
  ON T1.tagname = t2.tagname
ORDER BY (t1.tagcount-t2.tagcount) desc​​​​​​​​​​​​​​​

​This worked fine for me!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top