Question

Because the vocabulary for describing what I'd like to do is so varied, I haven't had any luck searching for why MySQL 5.6 behaves the way it does here.

Consider the following two simple queries:

#1
select sum(amount) as amt,
    (sum(amount) * (0.000+coalesce((select gas from gas_prices gp where gp.yw=yearweek(gl.stamp,0)),3.50))) as cost
from gas_log gl
where date_format(gl.stamp,'%X')=2013;

#2
select sum(amount) as amt,
    (sum(amount) * (0.000+coalesce((select gas from gas_prices gp where gp.yw=yearweek(gl.stamp,0)),3.50))) as cost
from gas_log gl
where date_format(gl.stamp,'%X')=2013 group by yearweek(gl.stamp,0);

The second query is identical to the first, with a GROUP BY added to get weekly totals instead of the yearly total. Both queries have near-identical EXPLAIN outputs:

+----+--------------------+-------+------+---------------+------+---------+------+------+----------------------------------------------+
| id | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                        |
+----+--------------------+-------+------+---------------+------+---------+------+------+----------------------------------------------+
|  1 | PRIMARY            | gl    | ALL  | NULL          | NULL | NULL    | NULL | 7428 | Using where; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | gp    | ALL  | yw            | NULL | NULL    | NULL |   52 | Using where                                  |
+----+--------------------+-------+------+---------------+------+---------+------+------+----------------------------------------------+


+----+--------------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+--------------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | PRIMARY            | gl    | ALL  | NULL          | NULL | NULL    | NULL | 7428 | Using where |
|  2 | DEPENDENT SUBQUERY | gp    | ALL  | yw            | NULL | NULL    | NULL |   52 | Using where |
+----+--------------------+-------+------+---------------+------+---------+------+------+-------------+

Each gas_log row records when gas was pumped (stamp) and how much was pumped (amount). gas_prices looks like this:

10:11:09> select * from gas_prices limit 2;
+------------+--------+-------+--------+
| date       | yw     | gas   | diesel |
+------------+--------+-------+--------+
| 2013-01-07 | 201301 | 3.235 |  3.870 |
| 2013-01-14 | 201302 | 3.265 |  3.834 |
+------------+--------+-------+--------+

For the first query, MySQL executes the subquery only once and multiplies that single value (the price for the week of the first row retrieved from gas_log) by the sum of all gallons logged in gas_log. The second query does what I'm actually after: it executes the subquery for each of the 52 grouped weeks in 2013 and matches each week's gas price accordingly.

Using WITH ROLLUP on the GROUP BY gives an interesting result: instead of using the first gas price in the time range, the rollup row uses the last one! So it still gives an incorrect cost total for any time range spanning more than one week.

Is there a way to rewrite the query so that MySQL presents one total for a given time range, yet still matches each gas_log row to its corresponding price in gas_prices?
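
One possible rewrite, offered as a sketch and not as a confirmed answer: in query #1 the price lookup sits outside sum(), so it refers to gl.stamp as a non-aggregated column, and MySQL (when ONLY_FULL_GROUP_BY is not enabled) appears free to take that value from whichever row it likes. Moving the price inside the aggregate makes the multiplication happen per row, before summing. The version below uses a LEFT JOIN in place of the correlated subquery and assumes that gas_prices has at most one row per yw; that uniqueness is not stated above, and duplicate yw values would double-count gallons.

```sql
-- Sketch only: assumes gas_prices.yw is unique (one price row per week).
select sum(gl.amount) as amt,
       -- price each gas_log row with its own week's price, then add up;
       -- weeks missing from gas_prices fall back to 3.50 as in the original
       sum(gl.amount * (0.000 + coalesce(gp.gas, 3.50))) as cost
from gas_log gl
left join gas_prices gp
       on gp.yw = yearweek(gl.stamp, 0)
where date_format(gl.stamp, '%X') = 2013;
```

Since the grouped query already produces correct weekly costs, wrapping query #2 in a derived table and summing its amt and cost columns in an outer select should give the same total.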

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange