Question

Currently, I have this table:

CREATE TABLE `plant_data` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `plant_id` BIGINT(20) UNSIGNED NOT NULL,
    `temperature` DECIMAL(5,1) UNSIGNED NOT NULL,
    `light` SMALLINT(5) UNSIGNED NOT NULL,
    `created_at` TIMESTAMP NULL DEFAULT NULL,
    `updated_at` TIMESTAMP NULL DEFAULT NULL,
    PRIMARY KEY (`id`)
);

with the following example data:

INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1623, 14, 22.2, 35, '2020-02-16 09:00:06', '2020-02-16 09:00:06');
INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1622, 5, 22.8, 33, '2020-02-16 09:00:06', '2020-02-16 09:00:06');
INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1621, 14, 22.8, 36, '2020-02-16 08:00:07', '2020-02-16 08:00:07');
INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1620, 5, 23.3, 33, '2020-02-16 08:00:07', '2020-02-16 08:00:07');
INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1619, 14, 23.1, 36, '2020-02-15 07:00:11', '2020-02-15 07:00:11');
INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1618, 5, 23.8, 34, '2020-02-15 07:00:11', '2020-02-15 07:00:11');
INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1617, 14, 24.4, 38, '2020-02-15 06:00:09', '2020-02-15 06:00:09');
INSERT INTO `plant_data` (`id`, `plant_id`, `temperature`, `light`, `created_at`, `updated_at`) VALUES (1616, 5, 24.6, 34, '2020-02-15 06:00:09', '2020-02-15 06:00:09');

I want to get the average values for the last X days, with one row per day. I could do it with X queries, one for each day, like this:

SELECT plant_id, avg(temperature) FROM plant_data WHERE created_at >= '2020-02-16 00:00:00' AND created_at <= '2020-02-17 00:00:00' GROUP BY plant_id;

But I want to know whether it is possible to get the data with a single query, producing a result like this:

+----------+------------------+------------+
| plant_id | avg(temperature) | day        |
+----------+------------------+------------+
|        5 |         24.58000 | 2020-02-16 |
|       14 |         24.42000 | 2020-02-16 |
|        5 |         23.58000 | 2020-02-15 |
|       14 |         23.42000 | 2020-02-15 |
+----------+------------------+------------+

Would be nice if someone has a good idea for doing this to save me query time.

Solution

First, a very nice first post: including both the DDL and sample data makes it easy to recreate the setup. Second, all you have to do is add the date dimension to your query, both in the select list and in the GROUP BY:

SELECT plant_id, date(created_at) as day, avg(temperature) 
FROM plant_data 
WHERE created_at >= '2020-02-15 00:00:00' 
  AND created_at <= '2020-02-17 00:00:00' 
GROUP BY plant_id, date(created_at);
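
As a quick check against the sample rows in the question, the same query with an ORDER BY added (my addition, only to make the row order predictable) should return one row per plant and day. The averages in the comments are worked out by hand from the eight sample rows, so treat them as a sanity check rather than captured output:

```sql
-- Same grouping as above, sorted newest day first, then by plant.
SELECT plant_id, date(created_at) AS day, avg(temperature)
FROM plant_data
WHERE created_at >= '2020-02-15 00:00:00'
  AND created_at <= '2020-02-17 00:00:00'
GROUP BY plant_id, date(created_at)
ORDER BY date(created_at) DESC, plant_id;

-- Hand-computed from the sample data:
--   plant  5, 2020-02-16: (22.8 + 23.3) / 2 = 23.05
--   plant 14, 2020-02-16: (22.2 + 22.8) / 2 = 22.50
--   plant  5, 2020-02-15: (23.8 + 24.6) / 2 = 24.20
--   plant 14, 2020-02-15: (23.1 + 24.4) / 2 = 23.75
```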

You might also be interested in the BETWEEN predicate, which includes both endpoints and is therefore equivalent to the >= and <= pair above:

SELECT plant_id, date(created_at) as day, avg(temperature) 
FROM plant_data 
WHERE created_at BETWEEN '2020-02-15 00:00:00' 
                     AND '2020-02-17 00:00:00' 
GROUP BY plant_id, date(created_at);
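
One thing to keep in mind: because both endpoints are included, a reading stamped exactly '2020-02-17 00:00:00' would match and show up as an extra group for 2020-02-17. If that matters, a half-open range avoids it. A sketch, using the same literal dates as above:

```sql
-- Lower bound inclusive, upper bound exclusive:
-- only readings from 2020-02-15 and 2020-02-16 qualify.
SELECT plant_id, date(created_at) AS day, avg(temperature)
FROM plant_data
WHERE created_at >= '2020-02-15 00:00:00'
  AND created_at <  '2020-02-17 00:00:00'
GROUP BY plant_id, date(created_at);
```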

I haven't checked whether day is a reserved word in your DBMS (in MySQL it is a keyword but not a reserved one, so it works unquoted as an alias). Either way, I prefer to give my identifiers more meaning, so I would probably choose:

SELECT plant_id, date(created_at) as created_date, avg(temperature) 
FROM plant_data 
WHERE created_at BETWEEN '2020-02-15 00:00:00' 
                     AND '2020-02-17 00:00:00' 
GROUP BY plant_id, date(created_at);
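
Since the question asks for the last X days, the literal dates could also be replaced by a range relative to today. A minimal sketch, assuming MySQL (CURDATE() and INTERVAL arithmetic), assuming "last X days" means today plus the X - 1 days before it, and using X = 2 purely as an example. The avg_temperature and avg_light aliases are names I made up for illustration:

```sql
-- Daily averages per plant for today and yesterday (X = 2).
SELECT plant_id,
       date(created_at) AS created_date,
       avg(temperature) AS avg_temperature,
       avg(light)       AS avg_light
FROM plant_data
WHERE created_at >= CURDATE() - INTERVAL 1 DAY  -- X - 1 days back, from midnight
  AND created_at <  CURDATE() + INTERVAL 1 DAY  -- up to, but excluding, tomorrow 00:00:00
GROUP BY plant_id, date(created_at)
ORDER BY created_date DESC, plant_id;
```

Filtering on the bare created_at column, rather than on date(created_at), leaves the condition able to use an index on created_at if one exists.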

As a complement to posting DDL and insert statements, you can also use an online SQL playground such as Fiddle. There are more sites like this; I chose this one for no particular reason.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange