Question

I have a table which stores a list of activities with an interval of time delimited by 2 dates.

Sample:

+------+---------------------+---------------------+-------------+
| name |        start        |         end         | time (calc) |
+------+---------------------+---------------------+-------------+
|  me  | 2017-04-03 11:00:00 | 2017-04-03 11:30:00 |          30 |
|  me  | 2017-04-03 23:45:00 | 2017-04-04 00:15:00 |          30 |
|  me  | 2017-04-04 10:00:00 | 2017-04-04 11:00:00 |          60 |
|  me  | 2017-04-04 10:30:00 | 2017-04-04 11:30:00 |          60 |
|  me  | 2017-04-05 23:00:00 | 2017-04-05 23:30:00 |          30 |
|  me  | 2017-04-05 23:15:00 | 2017-04-07 00:45:00 |        1530 |
+------+---------------------+---------------------+-------------+

I would like to know how many minutes are occupied each day per user (and then per week), so I need to transform current table into where intervals which share a partial space time are merged in a single one, and intervals which are in several days are split, like the next one:

+------+---------------------+---------------------+-------------+
| name |        start        |         end         | time (calc) |
+------+---------------------+---------------------+-------------+
|  me  | 2017-04-03 11:00:00 | 2017-04-03 11:30:00 |          30 |
|  me  | 2017-04-03 23:45:00 | 2017-04-03 23:59:59 |          15 |
|  me  | 2017-04-04 00:00:00 | 2017-04-04 00:15:00 |          15 |
|  me  | 2017-04-04 10:00:00 | 2017-04-04 11:30:00 |          90 |
|  me  | 2017-04-05 23:00:00 | 2017-04-05 23:59:59 |          60 |
|  me  | 2017-04-06 00:00:00 | 2017-04-06 23:59:59 |        1440 |
|  me  | 2017-04-07 00:00:00 | 2017-04-07 00:45:00 |          45 |
+------+---------------------+---------------------+-------------+

To then query it easily in order to obtain minutes per day:

+------+------------+------+
| name |    day     | time |
+------+------------+------+
|  me  | 2017-04-03 |   45 |
|  me  | 2017-04-04 |  105 |
|  me  | 2017-04-05 |   60 |
|  me  | 2017-04-06 | 1440 |
|  me  | 2017-04-07 |   45 |
+------+------------+------+

I were looking for information an I found how to merge multiple date intervals here (mysql - sum interval dates), however I am not able to split an interval in several days.

It is possible to do it in a single query? How could I do it?

Edit:

SQL (structure and data):

CREATE TABLE activities (
    id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
    name VARCHAR(45),
    start DATETIME,
    end DATETIME,
    time INT GENERATED ALWAYS AS (TIMESTAMPDIFF(MINUTE, start, end)) VIRTUAL
);

INSERT INTO activities (name, start, end) VALUES
('me','2017-04-03 11:00','2017-04-03 11:30'),
('me','2017-04-03 23:45','2017-04-04 00:15'),
('me','2017-04-04 10:00','2017-04-04 11:00'),
('me','2017-04-04 10:30','2017-04-04 11:30'),
('me','2017-04-05 23:00','2017-04-05 23:30'),
('me','2017-04-05 23:15','2017-04-07 00:45');

SQL to merge multiple intervals (Soruce: mysql - sum interval dates):

SELECT name, min(start) AS start, end, TIMESTAMPDIFF(MINUTE, MIN(start), end) AS time
FROM (
    SELECT x.name, x.start, min(y.end) AS end 
    FROM activities AS x 
    JOIN activities AS y 
        ON x.name = y.name 
       AND x.start <= y.end 
       AND NOT EXISTS (
           SELECT 1 
           FROM activities AS z 
           WHERE y.name = z.name 
             AND y.end >= z.start 
             AND y.end < z.end
       ) 
    WHERE NOT EXISTS (
        SELECT 1 
        FROM activities AS u 
        WHERE x.name = u.name 
          AND x.start > u.start 
          AND x.start <= u.start
    ) 
    GROUP BY x.name, x.start
) AS v GROUP BY name, end;
Was it helpful?

Solution

First due you need to generate a series of dates, I'd suggest to use a calendar table.

CREATE TABLE if not exists calendar (
    mdate date PRIMARY KEY NOT NULL
);

INSERT INTO calendar values
('20170403'),('20170404'),('20170405'),('20170406'),('20170407'),('20170408');

How do they do it

Just to get overlapped activities I've used the query that you provided on your question.

create view overlaped_activities
as
SELECT name, min(start) AS start, end, TIMESTAMPDIFF(MINUTE, MIN(start), end) AS time
FROM (
    SELECT x.name, x.start, min(y.end) AS end 
    FROM activities AS x 
    JOIN activities AS y 
        ON x.name = y.name 
       AND x.start <= y.end 
       AND NOT EXISTS (
           SELECT 1 
           FROM activities AS z 
           WHERE y.name = z.name 
             AND y.end >= z.start 
             AND y.end < z.end
       ) 
    WHERE NOT EXISTS (
        SELECT 1 
        FROM activities AS u 
        WHERE x.name = u.name 
          AND x.start > u.start 
          AND x.start <= u.start
    ) 
    GROUP BY x.name, x.start
) AS v GROUP BY name, end;

First I calculate minutes from start date till midnight:

if(date(start) = date(end), 
  time_to_sec(timediff(end, start)) / 60, 
  (1440 - time_to_sec(time(start)) / 60)) mstart

Then, if start <> end, I calculate minutes from midnight till end date:

if(date(start) = date(end), 0, time_to_sec(time(end)) / 60) mend

This returns a table like this:

| start               | end                 |     mdiff |  mstart |    mend |
|---------------------|---------------------|----------:|--------:|--------:|
| 03.04.2017 11:00:00 | 03.04.2017 11:30:00 |   30,0000 | 30,0000 |       0 |
| 03.04.2017 23:45:00 | 04.04.2017 00:15:00 |   30,0000 | 15,0000 | 15,0000 |
| 04.04.2017 10:00:00 | 04.04.2017 11:30:00 |   90,0000 | 90,0000 |       0 |
| 05.04.2017 23:00:00 | 07.04.2017 00:45:00 | 1545,0000 | 60,0000 | 45,0000 |

That's nice, but there is another problem here:

| 05.04.2017 23:00:00 | 07.04.2017 00:45:00 | 1545,0000 | 60,0000 | 45,0000 |

Of course: 1545 <> 60 + 45

We need to generate a series of dates between start and end date, and add 1440 minutes to each day.

We can get it using the calendar table:

   select   name,
            mdate date_activity,
            sum(1440) minutes
   from     calendar
   join     overlaped_activities
   on       calendar.mdate > date(start)
   and      calendar.mdate < date(end)
   where    datediff(end, start) > 1
   group by name, mdate

Ok, we got all ingredients, it's time to cook the recipe:

select name, date_activity, sum(minutes) min_activity
from (
       select name, 
              date(start) date_activity,
              if(date(start) = date(end), time_to_sec(timediff(end, start)) / 60, (1440 - time_to_sec(time(start)) / 60)) minutes
       from overlaped_activities

       UNION ALL

       select name, 
              date(end) date_activity,
              if(date(start) = date(end), 0, time_to_sec(time(end)) / 60) minutes
       from overlaped_activities

       UNION ALL

       select   name,
                mdate date_activity,
                sum(1440) minutes
       from     calendar
       join     overlaped_activities
       on       calendar.mdate > date(start)
       and      calendar.mdate < date(end)
       where    datediff(end, start) > 1
       group by name, mdate
    ) act
group by name, date_activity;

Final result:

| name |       date_activity | min_activity |
|------|--------------------:|-------------:|
| me   | 03.04.2017 00:00:00 |      45,0000 |
| me   | 04.04.2017 00:00:00 |     105,0000 |
| me   | 05.04.2017 00:00:00 |      60,0000 |
| me   | 06.04.2017 00:00:00 |    1440,0000 |
| me   | 07.04.2017 00:00:00 |      45,0000 |  

Almost forget it, the recipe: http://rextester.com/EIJOI20983

OTHER TIPS

In MySQL, to split the across days, we need to join with the table having dates as Oracle feature of Connect by level is not supported in MySQL.

Adding more details...

Create table ActivityDay 
(id INT Primary Key NOT NULL AUTO_INCREMENT, ActivityDay Date NOT NULL);

insert into ActivityDate (ActivityDay) values('2017-04-03'), 
values('2017-04-04'),values('2017-04-05'),
values('2017-04-06'),values('2017-04-07'),
values('2017-04-08');

Select id, Name, ActualStart, ActualEnd, 
(Time_to_Sec(ActualEnd) - Time_to_Sec(ActualStart))/60 time
from (
select a.id, a.name, a.start, Date(start) StartDay, Date(end) EndDay, a.time,
       case when DATE(START) = DATE(ActualStart) then Start else ActivityDay end ActualStart,
       case when Date(end) <> Date(ActualEnd) then date_add(ActivityDay, interval 24*60*60 - 1 second) else end end ActualEnd,
       ad.ActivityDay
from Activities a join ActivityDate ad) d
Where ActivityDay between start and end)

If it's just split by dates and you're using DATETIME, it can be quite simple. Just substring the timestamp and group by the date

SELECT
who,
SUBSTRING(date_created, 1, 10) AS date,
COUNT(times) AS time
FROM timings
WHERE date_created > '2017-04-03' AND date_created < '2017-04-10'
GROUP BY date
+------+------------+------+
| name |    day     | time |
+------+------------+------+
|  me  | 2017-04-03 |   45 |
|  me  | 2017-04-04 |  105 |
|  me  | 2017-04-05 |   60 |
|  me  | 2017-04-06 | 1440 |
|  me  | 2017-04-07 |   45 |
+------+------------+------+
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top