Extra Fields with SQL MIN() & GROUP BY
-
05-09-2019 - |
Question
When using the SQL MIN() function, along with GROUP BY, will any additional columns (not the MIN column, or one of the GROUP BY columns) match the data in the matching MIN row?
For example, given a table with department names, employee names, and salary:
SELECT MIN(e.salary), e.* FROM employee e GROUP BY department
Obviously I'll get two good columns, the minimum salary and the department. Will the employee name (and any other employee fields) be from the same row? Namely the row with the MIN(salary)?
I know there could very possibly be two employees with the same (and lowest) salary, but all I'm concerned with (now) is getting all the information on the (or a single) cheapest employee.
Would this select the cheapest salesman?
SELECT min(salary), e.* FROM employee e WHERE department = 'sales'
Essentially, can I be sure that the data returned along with the MIN() function will matches the (or a single) record with that minimum value?
If the database matters, I'm working with MySql.
Solution
If you wanted to get the "cheapest" employee in each department you would have two choices off the top of my head:
SELECT
E.* -- Don't actually use *, list out all of your columns
FROM
Employees E
INNER JOIN
(
SELECT
department,
MIN(salary) AS min_salary
FROM
Employees
GROUP BY
department
) AS SQ ON
SQ.department = E.department AND
SQ.min_salary = E.salary
Or you can use:
SELECT
E.*
FROM
Employees E1
LEFT OUTER JOIN Employees E2 ON
E2.department = E1.department AND
E2.salary < E1.salary
WHERE
E2.employee_id IS NULL -- You can use any NOT NULL column here
The second statement works by effectively saying, show me all employees where you can't find another employee in the same department with a lower salary.
In both cases, if two or more employees have equal salaries that are the minimum you will get them both (all).
OTHER TIPS
SELECT e.*
FROM employee e
WHERE e.id =
(
SELECT id
FROM employee ei
WHERE ei.department = 'sales'
ORDER BY
e.salary
LIMIT 1
)
To get values for each department, use:
SELECT e.*
FROM department d
LEFT JOIN
employee e
ON e.id =
(
SELECT id
FROM employee ei
WHERE ei.department = d.id
ORDER BY
e.salary
LIMIT 1
)
To get values only for those departments that have employees, use:
SELECT e.*
FROM (
SELECT DISTINCT eo.department
FROM employee eo
) d
JOIN
employee e
ON e.id =
(
SELECT id
FROM employee ei
WHERE ei.department = d.department
ORDER BY
e.salary
LIMIT 1
)
Of course, having an index on (department, salary)
will greatly improve all three queries.
The fastest solution:
SET @dep := '';
SELECT * FROM (
SELECT * FROM `employee` ORDER BY `department`, `salary`
) AS t WHERE IF ( @dep = t.`department`, FALSE, ( @dep := t.`department` ) OR TRUE );
Another approach can be using Analytical functions. Here is the query using analytical and ROW_NUM functions
select first_name, salary from (select first_name,salary, Row_NUMBER() over (PARTITION BY DEPARTMENT_ID ORDER BY salary ASC) as row_count from employees) where row_count=1;