Unexpected results while using HAVING without GROUP BY
23-01-2021
Question
Below is a snapshot of the SQL queries:
MariaDB [university]> select * from department;
+------------+----------+-----------+
| dept_name  | building | budget    |
+------------+----------+-----------+
| Biology    | Watson   |  90000.00 |
| Comp. Sci. | Taylor   | 100000.00 |
| Elec. Eng. | Taylor   |  85000.00 |
| Finance    | Painter  | 120000.00 |
| History    | Painter  |  50000.00 |
| Music      | Packard  |  80000.00 |
| Physics    | Watson   |  70000.00 |
+------------+----------+-----------+
7 rows in set (0.000 sec)
MariaDB [university]> select * from department where budget > (select avg(budget) from department);
+------------+----------+-----------+
| dept_name  | building | budget    |
+------------+----------+-----------+
| Biology    | Watson   |  90000.00 |
| Comp. Sci. | Taylor   | 100000.00 |
| Finance    | Painter  | 120000.00 |
+------------+----------+-----------+
3 rows in set (0.000 sec)
MariaDB [university]> select * from department having budget > avg(budget);
+-----------+----------+----------+
| dept_name | building | budget   |
+-----------+----------+----------+
| Biology   | Watson   | 90000.00 |
+-----------+----------+----------+
1 row in set (0.051 sec)
MariaDB [university]> select * from department having budget < avg(budget);
Empty set (0.000 sec)
I am trying to select all the departments where the budget is greater than the average budget.
The expected result includes three entries, but when I use the HAVING clause it selects only one of the rows.
Also, the last SQL statement, select * from department having budget < avg(budget), returns an empty set, which I did not expect. I am using MariaDB 10.3. Any explanation of why this happens would be appreciated! Thanks!
Solution
This is not valid SQL - even though MariaDB and MySQL (until version 5.6) allow it:
having budget > avg(budget)
So, if you get unexpected results, don't worry much. Just do not use invalid SQL syntax ;) If you are not sure which syntax is valid and which is not, you can add ONLY_FULL_GROUP_BY to the SQL mode setting and you'll get errors instead of unexpected / inconsistent results.
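As a minimal sketch of that setting, the mode can be appended for the current session only. The concat_ws/nullif wrapper is just one way to avoid a leading comma when sql_mode is empty, and the exact error text depends on the server version.

```sql
-- Append ONLY_FULL_GROUP_BY to the current session's SQL mode
set session sql_mode = concat_ws(',', nullif(@@session.sql_mode, ''), 'ONLY_FULL_GROUP_BY');

-- Confirm that the mode is now listed
select @@session.sql_mode;

-- With the mode enabled, this should be rejected with an error
-- instead of returning an arbitrary row
select *
from department
having budget > avg(budget);
```

The setting lasts only for the current connection, so it is a low-risk way to check queries before changing the server configuration.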
What is actually happening in the query:
select *
from department
having budget > avg(budget);
Whenever there is an aggregate function (SUM(), COUNT(), AVG()) either in the SELECT list or in the HAVING condition, a GROUP BY over the whole set is implied. So the query is equivalent to:
select dept_name, building, budget
from department
group by ()
having budget > avg(budget);
This means the whole set has to be collapsed into a single row, and only aggregate expressions should be allowed, not columns. So the whole SELECT list is invalid (all 3 columns), and so is the budget in the HAVING condition. However, MySQL and MariaDB have allowed this since very early versions, for some efficiency optimization (in certain cases where it would not create inconsistent results).
In this specific case (and similar ones), where a dept_name, a building and a budget have to be displayed in the SELECT result, an arbitrary row is chosen and displayed. In more complicated cases, you can even see results from different rows, showing department A with the budget of department B!
The same (arbitrary) budget value is used in the HAVING filter, giving the weird results you noticed.
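The numbers in the question can be checked by hand. The sketch below assumes ONLY_FULL_GROUP_BY is disabled and that the server happens to pick the Biology row, as the question's output suggests; which row is picked is not guaranteed.

```sql
-- The implied single group covers all seven rows:
-- (90000 + 100000 + 85000 + 120000 + 50000 + 80000 + 70000) / 7 = 85000
select avg(budget) as avg_budget
from department;

-- With ONLY_FULL_GROUP_BY disabled, the bare columns are taken
-- from one arbitrary row, next to the aggregate over all rows
select dept_name, budget, avg(budget) as avg_budget
from department;
```

If the arbitrary row is Biology, the HAVING filter compares 90000 with 85000: budget > avg(budget) is true, so the single collapsed row is returned, while budget < avg(budget) is false, so the result is an empty set.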
A few more details can be found in MySQL's documentation:
5.6 (old behaviour): MySQL Handling of GROUP BY
8.0 (current behaviour): MySQL Handling of GROUP BY
Your first query is the right way to do it:
select *
from department
where budget > (select avg(budget) from department);
Another way is to use window functions (available in MariaDB 10.2+ and MySQL 8.0+):
select *
from
(
select *,
avg(budget) over () as avg_budget
from department
) as a
where budget > avg_budget ;
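Where window functions are not available, a similar result that also exposes the avg_budget column can be sketched with a one-row derived table. This is an illustrative variant rather than part of the original answer, and the aliases d and a are arbitrary.

```sql
select d.*, a.avg_budget
from department as d
cross join
(
  -- Single-row derived table holding the overall average
  select avg(budget) as avg_budget
  from department
) as a
where d.budget > a.avg_budget;
```

With the data from the question, this should return the same three departments as the subquery version (Biology, Comp. Sci. and Finance), since the average is 85000.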