MySQL - return rows in one table that correspond to minimum date in another (indirectly linked) table

StackOverflow https://stackoverflow.com/questions/21417495

  •  04-10-2022

Question

Table semesters:

semesterID  startDate
1           2013-01-01
2           2013-03-01
3           2013-06-01

Table classes:

classID  class_title  semesterID
1        Math         1
2        Science      1
3        Math         2
4        Science      2
5        Math         3
6        Science      3

Table persons:

personID  firstName  lastName
1         John       Jones
2         Steve      Smith

Table class_person:

classID  personID
1        1
2        1
5        1
6        1
3        2
4        2
5        2
6        2
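
For anyone who wants to reproduce the data, here is a minimal setup sketch. The column types, lengths and key constraints are assumptions, since only the rows themselves are shown above:

```sql
-- Assumed schema: types and constraints are guesses, the data is from the tables above.
CREATE TABLE semesters (
  semesterID INT PRIMARY KEY,
  startDate  DATE NOT NULL
);

CREATE TABLE classes (
  classID     INT PRIMARY KEY,
  class_title VARCHAR(50) NOT NULL,
  semesterID  INT NOT NULL,
  FOREIGN KEY (semesterID) REFERENCES semesters (semesterID)
);

CREATE TABLE persons (
  personID  INT PRIMARY KEY,
  firstName VARCHAR(50) NOT NULL,
  lastName  VARCHAR(50) NOT NULL
);

-- Link table: one row per (class, person) enrolment.
CREATE TABLE class_person (
  classID  INT NOT NULL,
  personID INT NOT NULL,
  PRIMARY KEY (classID, personID),
  FOREIGN KEY (classID)  REFERENCES classes (classID),
  FOREIGN KEY (personID) REFERENCES persons (personID)
);

INSERT INTO semesters (semesterID, startDate) VALUES
  (1, '2013-01-01'), (2, '2013-03-01'), (3, '2013-06-01');

INSERT INTO classes (classID, class_title, semesterID) VALUES
  (1, 'Math', 1), (2, 'Science', 1),
  (3, 'Math', 2), (4, 'Science', 2),
  (5, 'Math', 3), (6, 'Science', 3);

INSERT INTO persons (personID, firstName, lastName) VALUES
  (1, 'John', 'Jones'), (2, 'Steve', 'Smith');

INSERT INTO class_person (classID, personID) VALUES
  (1, 1), (2, 1), (5, 1), (6, 1),
  (3, 2), (4, 2), (5, 2), (6, 2);
```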

I need to get a list of all the people, with the first semester in which they took a class (semester with the oldest startDate).

firstName,  lastName, semesterID, startDate
John        Jones     1           2013-01-01
Steve       Smith     2           2013-03-01

I've spent hours trying to figure this out. Here's the closest I've gotten (although it is not close at all!):

SELECT p.firstName, p.lastName, MIN(s.startDate) AS min_startDate
FROM semesters s
INNER JOIN classes c ON s.semesterID = c.semesterID
INNER JOIN class_person cp ON cp.classID = c.classID
INNER JOIN persons p ON p.personID = cp.personID
GROUP BY cp.personID
ORDER BY min_startDate, p.lastName, p.firstName

Any help would be massively appreciated. Thank you.


Solution

You could end up using a monster like the following:

select persons.firstName, persons.lastName,
       semesters.semesterID, semesters.startDate
from persons, semesters,
(select p.personID,
 (select semesters.semesterID
  from semesters, classes, class_person
  where semesters.semesterID = classes.semesterID
    and classes.classID = class_person.classID
    and class_person.personID = p.personID
  order by semesters.startDate
  limit 1) as semesterID
 from (select distinct personID from class_person) as p
) as ps
where persons.personID = ps.personID
  and semesters.semesterID = ps.semesterID

The subquery p identifies all persons who have taken at least one class. For each of them, ps contains a single row: its personID is simply copied, and its semesterID is computed by a correlated subquery, which sorts that person's semesters by startDate but returns the ID of the earliest one. The outermost query then joins back to semesters to re-add the date.

If you don't really need the semesterID, you could avoid one layer of nesting by having the subquery return the startDate directly. If your semesters are in order, i.e. their IDs have the same order as their startDates, then you could simply use a single query, much like your own, and return min(semesterID) and min(startDate).
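
A minimal sketch of that single-query variant, assuming the tables from the question and, crucially, that a lower semesterID always means an earlier startDate (true for the sample data, but not guaranteed in general):

```sql
-- Only valid if semesterID order matches startDate order;
-- otherwise the two MIN() values may come from different semesters.
SELECT p.firstName, p.lastName,
       MIN(s.semesterID) AS semesterID,
       MIN(s.startDate)  AS startDate
FROM persons p
INNER JOIN class_person cp ON cp.personID = p.personID
INNER JOIN classes c       ON c.classID = cp.classID
INNER JOIN semesters s     ON s.semesterID = c.semesterID
-- Grouping by every non-aggregated column keeps this valid
-- when ONLY_FULL_GROUP_BY is enabled.
GROUP BY p.personID, p.firstName, p.lastName
ORDER BY MIN(s.startDate), p.lastName, p.firstName;
```

With the sample data this should return John Jones with semester 1 (2013-01-01) and Steve Smith with semester 2 (2013-03-01).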

On the whole, this question reminds me a lot of my own question, Select one value from a group based on order from other columns. The answers suggested there will likely apply here as well. In particular, there are approaches using user variables, which I still don't feel comfortable about, but which make this whole mess a lot easier and seem to work well enough. Adapting one of those answers, you get a query like this:

SELECT p.firstName, p.lastName, s2.semesterID, s2.startDate
FROM persons p
INNER JOIN (
 SELECT @rowNum:=IF(@personID=cp.personID,@rowNum+1,1) rowNum,
        @personID:=cp.personID personID,
        s.semesterID, s.startDate
 FROM (SELECT @personID:=NULL,@rowNum:=0) dummy
 CROSS JOIN semesters s
 INNER JOIN classes c ON s.semesterID = c.semesterID
 INNER JOIN class_person cp ON cp.classID = c.classID
 ORDER BY cp.personID, s.startDate
) s2 ON p.personID = s2.personID
WHERE s2.rowNum = 1
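
On MySQL 8.0 or later, which postdates this answer, window functions offer a way to number the rows per person without user variables. The following is only a sketch under that version assumption, using the tables from the question; the alias ranked is an illustrative name:

```sql
-- Requires MySQL 8.0+ (window function support).
SELECT firstName, lastName, semesterID, startDate
FROM (
  SELECT p.firstName, p.lastName, s.semesterID, s.startDate,
         -- Number each person's semesters from earliest to latest;
         -- semesterID breaks ties between equal start dates.
         ROW_NUMBER() OVER (PARTITION BY p.personID
                            ORDER BY s.startDate, s.semesterID) AS rowNum
  FROM persons p
  INNER JOIN class_person cp ON cp.personID = p.personID
  INNER JOIN classes c       ON c.classID = cp.classID
  INNER JOIN semesters s     ON s.semesterID = c.semesterID
) ranked
WHERE rowNum = 1
ORDER BY startDate, lastName, firstName;
```

Unlike the min(semesterID) shortcut, this does not depend on the semester IDs being in date order, because the ID is read from the same row as the earliest startDate.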

I'll leave adapting the other answers as an exercise.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow