How to get the next record with a particular value and then go back one record to calculate session duration from logs?

StackOverflow https://stackoverflow.com/questions/21569183

  •  07-10-2022

Question

General Problem

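Given a starting record for a user, I want to find that user's next record with the same column values, and then step back to the record immediately before it. Each such pair (starting record, record before the next start) marks the beginning and end of one session.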

+----+--------+--------+
| id | userid | action |
+----+--------+--------+
|  1 |      2 | a      |
| 20 |      2 | b      |
| 21 |      2 | c      |
| 22 |      2 | c      |
| 23 |      2 | d      |
| 59 |      2 | a      |
| 60 |      2 | b      |
| 71 |      2 | c      |
| 72 |      2 | c      |
| 83 |      2 | d      |
| 99 |      2 | a      |
+----+--------+--------+

So I would want to return the following:

+--------+---------+----------+
| userid | left.id | right.id |
+--------+---------+----------+
| 2      | 1       | 23       |
| 2      | 59      | 83       |
+--------+---------+----------+
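The pairing above can be sketched procedurally as a single pass over rows sorted by userid and id. This is only an illustrative sketch in plain PHP, not Moodle code: `pair_sessions()`, its `$startaction` parameter, and the array layout are hypothetical names chosen for the example, and action `a` is assumed to be the record that starts a session.

```php
<?php
// Hypothetical helper (not part of Moodle): pairs each start record with the
// last record seen before the same user's next start record.
function pair_sessions(array $rows, string $startaction = 'a'): array {
    // Sort by userid, then id, so each user's records are contiguous and ordered.
    usort($rows, function ($x, $y) {
        return [$x['userid'], $x['id']] <=> [$y['userid'], $y['id']];
    });

    $pairs = [];
    $left = null; // id of the currently open start record, if any
    $prev = null; // previously visited row
    foreach ($rows as $row) {
        if ($prev !== null && $row['userid'] !== $prev['userid']) {
            // New user: the open start record has no following start, so drop it.
            $left = null;
        } else if ($row['action'] === $startaction && $left !== null) {
            // The next start record closes the open one at the previous row.
            $pairs[] = ['userid' => $row['userid'], 'left' => $left, 'right' => $prev['id']];
        }
        if ($row['action'] === $startaction) {
            $left = $row['id'];
        }
        $prev = $row;
    }
    return $pairs;
}

// Sample data copied from the table above (all rows belong to userid 2).
$sample = [[1, 'a'], [20, 'b'], [21, 'c'], [22, 'c'], [23, 'd'], [59, 'a'],
           [60, 'b'], [71, 'c'], [72, 'c'], [83, 'd'], [99, 'a']];
$rows = [];
foreach ($sample as [$id, $action]) {
    $rows[] = ['id' => $id, 'userid' => 2, 'action' => $action];
}
$result = pair_sessions($rows);
foreach ($result as $pair) {
    echo $pair['userid'] . ' ' . $pair['left'] . ' ' . $pair['right'] . "\n";
}
```

For the sample rows this yields the pairs (1, 23) and (59, 83). The final start record (id 99) is not emitted because no later start record closes it.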

Specific example of what I am trying to achieve

I am trying to approximate session duration from the log table in Moodle for reporting purposes.

For example, when a user logs in, a log entry is generated with module = user and action = login. If they log out, a log entry is created with module = user and action = logout, but this happens in only about 20% of cases. A whole series of other log entries follows the login.

One can use this 20% as a sample for average duration calculations, but the report requires that this is approximated for every user.

The current reporting tool integration is MySQL-driven, which is why I want to do this purely in SQL rather than in PHP.

What I have done

So I have built this as a query that uses subqueries as follows:

  1. Find existing login entry
  2. Find the next login entry
  3. Self-join the existing login entry to the last entry before the next login

This seems to work, but performance is quite poor on the entire dataset. There are millions of rows spanning several years, though a typical report only needs a weekly or monthly summary.

Is there a better approach to this?

Wider requirements will grow out of this, such as aggregating duration reports by course, department, etc., so efficient SQL will be crucial.

SQLFiddle

Using subqueries: http://sqlfiddle.com/#!2/42d5ce/6

MySQL

SELECT l.userid, FROM_UNIXTIME(l.time) as start,
       FROM_UNIXTIME(r.time) as end, (r.time - l.time) AS duration
FROM mdl_log AS l 
INNER JOIN mdl_log AS r ON r.id = (
    SELECT n.id
    FROM mdl_log n
    WHERE n.id < (
      SELECT id 
      FROM mdl_log t
      WHERE l.userid = t.userid
        AND t.time > l.time 
        AND t.module = 'user' 
        AND t.action = 'login'
      ORDER BY t.time  -- without an explicit order, LIMIT may not return the next login
      LIMIT 0,1
    )
    AND l.userid = n.userid
    ORDER BY n.id DESC
    LIMIT 0,1
)
WHERE l.module = 'user'
  AND l.action = 'login'

Solution

This may be easier with a bit of PHP. The logout is the key: only the login immediately before each logout is used, and the records need to be in userid + time order.

I would then use a flexible table to display the results, but that needs a bit more work to handle paging.

I'm also using get_recordset_sql() rather than get_records() because there will potentially be a lot of records.

$sql = "SELECT l.userid,
                l.time AS timeaction,
                CASE WHEN l.action = 'login' THEN l.time ELSE 0 END AS timestart,
                CASE WHEN l.action = 'logout' THEN l.time ELSE 0 END AS timeend
        FROM {log} l
        WHERE l.module = 'user'
        AND l.action IN ('login', 'logout')
        ORDER BY l.userid, l.time";

$logs = $DB->get_recordset_sql($sql);
$sessions = array();
$userid = 0;
$timestart = 0;
foreach ($logs as $log) {
    if (!empty($log->timestart)) {
        // Logged in: remember the most recent login for this user.
        $userid = $log->userid;
        $timestart = $log->timestart;
    } else if (!empty($log->timeend) && !empty($timestart) && $log->userid == $userid) {
        // Logged out: pair with the login immediately before it, same user only.
        $session = new stdClass();
        $session->userid = $userid;
        $session->timestart = $timestart;
        $session->timeend = $log->timeend;
        $sessions[] = $session;
        $timestart = 0; // Reset so a second logout cannot reuse this login.
    }
}
$logs->close(); // Required for recordset, even when it is empty.

// Use a flexitable to display the results properly with paging.
// Note: gmdate("H:i:s") wraps around for durations of 24 hours or more.
foreach ($sessions as $session) {
    echo 'Userid : ' . $session->userid .
        ' Time start ' . gmdate("Y-m-d H:i:s", $session->timestart) .
        ' Time end ' . gmdate("Y-m-d H:i:s", $session->timeend) .
        ' Duration ' . gmdate("H:i:s", $session->timeend - $session->timestart) . '<br/>';
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow