Question

Let's say I have a schools table (cols = "id (int)") and a users table (cols = "id (int), school_id (int), created_at (datetime)").

I have a list of school ids saved in <school_ids>. For each of those schools I want to take the user with the earliest created_at value, group the schools by yearweek(users.created_at) for that user, and for each group list the yearweek value and the number of schools.

In other words, I want to find the earliest-created user for each school, and then group the schools by the yearweek() result for that created_at date, so that I effectively have the number of schools that signed up their first user in each week.

So, I want results like:

| 201301 | 22 |  #meaning there are 22 schools where the earliest created_at user 
                 #has yearweek(created_at) = "201301"

| 201302 | 5  |  #meaning there are 5 schools where the earliest created_at user  
                 #has yearweek(created_at) = "201302"

etc

As a sanity check, the total of the second column across all rows should equal the size of <school_ids>, i.e. the number of ids in the list.

Does that make sense? I can't quite figure out how to get this without doing several queries and storing values in between. I'm sure there's a one-liner. Thanks! max


Solution

You could use a subquery that returns the minimum created_at for every school_id, and then group those minimums by yearweek and count the schools in each group:

SELECT
  yearweek(u.min_created_at) AS yearweek_first_user,
  COUNT(*) AS school_count
FROM
  (
    -- one row per school: the created_at of its earliest user
    SELECT school_id, MIN(created_at) AS min_created_at
    FROM users
    GROUP BY school_id
  ) u
GROUP BY
  yearweek(u.min_created_at)
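
The query above counts every school that appears in users. To limit it to the schools in <school_ids>, as the question asks, a filter can go inside the subquery. This is only a sketch: the ids 1, 2, 3 are placeholders standing in for the contents of <school_ids>, and the school_count alias and the ORDER BY are additions for readability, not part of the original answer.

```sql
-- Sketch: 1, 2, 3 are placeholder ids standing in for <school_ids>.
SELECT
  YEARWEEK(u.min_created_at) AS yearweek_first_user,
  COUNT(*) AS school_count
FROM
  (
    -- earliest user per school, restricted to the schools of interest
    SELECT school_id, MIN(created_at) AS min_created_at
    FROM users
    WHERE school_id IN (1, 2, 3)
    GROUP BY school_id
  ) u
GROUP BY
  YEARWEEK(u.min_created_at)
ORDER BY
  yearweek_first_user;
```

Note that a school in the list that has no users at all produces no row in the subquery, so the sum of school_count can come out smaller than the size of <school_ids>.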
Licensed under: CC-BY-SA with attribution