Question

From reading the MySQL documentation, I can't explain the difference between these two queries in phpMyAdmin:

SELECT * FROM f_ean GROUP BY ean HAVING type = 'media'

--> gives me 57059 results

SELECT ean, type FROM f_ean GROUP BY ean HAVING type = 'media'

--> gives me 73201 results

How can the number of rows returned change just because I select different columns?


Solution

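You should be using WHERE, not HAVING, if you're trying to filter records. WHERE filters individual rows before they are grouped; HAVING filters the groups after GROUP BY has been applied, so it is meant for conditions on grouped columns or aggregates.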
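As a rough sketch of what that looks like for the table in the question (assuming the goal is "every ean that has at least one row of type 'media'", which the question does not state outright):

```sql
-- Filter rows first, then collapse to one row per ean.
-- Only 'media' rows reach the GROUP BY, so the result does not
-- depend on which row MySQL happens to pick for each group.
SELECT ean
FROM f_ean
WHERE type = 'media'
GROUP BY ean;
```

Because the filter runs before grouping, adding or removing columns in the select list should no longer change the row count.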

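Regardless, the issue lies with how MySQL handles GROUP BY. Normally, every selected column either appears in the GROUP BY clause or is wrapped in an aggregate; MySQL relaxes that rule for convenience. When a group contains rows with different type values, MySQL is free to pick any one of them, and HAVING type = 'media' then tests whichever value it picked. The two queries can end up picking different values, so they keep different numbers of groups.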

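The MySQL documentation describes the extension like this (the "preceding query" is an example on that page):

"MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group."

The same page goes on to say that the server is free to choose any value from each group, so unless the values are all the same, the one chosen is indeterminate.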

See "MySQL Extensions to GROUP BY" in the MySQL reference manual.
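If the condition really does belong in HAVING, one option is to phrase it as an aggregate so that it is well defined for the whole group. The two queries below are a sketch of two possible intents, neither of which is confirmed by the question. They rely on MySQL evaluating a comparison such as type = 'media' to 1 or 0 (or NULL when type is NULL).

```sql
-- Intent A: ean values with at least one 'media' row.
SELECT ean
FROM f_ean
GROUP BY ean
HAVING SUM(type = 'media') > 0;

-- Intent B: ean values where every row is 'media'.
-- COUNT(*) counts all rows in the group; SUM(...) counts only the 'media' ones.
SELECT ean
FROM f_ean
GROUP BY ean
HAVING COUNT(*) = SUM(type = 'media');
```

Both conditions look at every row in the group rather than at one arbitrarily chosen row.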

OTHER TIPS

MySQL doesn't promise which value of a non-grouped column will appear in the result set unless you pin it down, either by filtering rows with a WHERE clause or by wrapping the column in an aggregate function (or possibly through a JOIN condition).

Speculating a little, having never seen any documentation to indicate how it chooses which value to include, my assumption is that it uses row order within the most relevant index.

Hence, it's reasonable to assume that SELECTing colA, colZ versus SELECTing * might trigger MySQL to leverage different indexes when compiling the result set, altering the "perceived" order of the rows, and surfacing different values.
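One way to test that guess, as a sketch only, is to ask MySQL for the execution plan of each query and compare them. The `key` column of the EXPLAIN output names the index chosen, if any. Which indexes exist on f_ean is unknown here, so no particular output is implied.

```sql
-- Compare the plans of the two queries from the question.
-- A different value in the `key` column would support the idea that
-- the rows are being read in a different order.
EXPLAIN SELECT * FROM f_ean GROUP BY ean HAVING type = 'media';

EXPLAIN SELECT ean, type FROM f_ean GROUP BY ean HAVING type = 'media';
```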

If you were using a WHERE condition, this wouldn't matter. WHERE conditions are applied before grouping. But, as you're using a HAVING condition on a non-grouped column with potential variation in that column, the discrepancy you're seeing is expected behavior to a large extent.
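To see how much "potential variation" there is, a query along these lines lists the ean values that carry more than one type. These are the only groups for which the two original queries can disagree. The second part is a hypothetical two-row table (f_ean_demo and its values are made up for illustration) that reproduces the ambiguity in miniature.

```sql
-- Groups whose rows do not all share the same type.
SELECT ean, COUNT(DISTINCT type) AS type_count
FROM f_ean
GROUP BY ean
HAVING COUNT(DISTINCT type) > 1;

-- Hypothetical miniature: one ean, two different types.
CREATE TABLE f_ean_demo (
    id   INT PRIMARY KEY,
    ean  VARCHAR(13) NOT NULL,
    type VARCHAR(20) NOT NULL
);

INSERT INTO f_ean_demo (id, ean, type) VALUES
    (1, '4006381333931', 'media'),
    (2, '4006381333931', 'book');

-- With ONLY_FULL_GROUP_BY disabled, MySQL may use either row's type for
-- the group, so this can return one row or none.
-- With ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5),
-- the query is rejected with an error instead.
SELECT ean, type
FROM f_ean_demo
GROUP BY ean
HAVING type = 'media';
```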

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow