Question

I have some data about when, for how long, and on what channel people are listening to the radio. I need to make a variable called sessions that groups all entries which occur while the radio is on. Because the data may contain some errors, I would like to say that if less than five minutes pass from the end of one channel period to the start of the next, it is still the same session. Hopefully a brief example will clarify.

id  obs  Entry_date   Entry_time  duration(in secs) channel
 1   1    01/01/12      23:25:21    6000               2
 1   2    01/03/12      01:05:64     300               5
 1   3    01/05/12      12:12:35     456               5
 2   4    01/05/12      16:45:21     657               8

I want to create the variable sessions so that

id   obs  Entry_date   Entry_time  duration(in secs) channel   session
 1    1    01/01/12      23:25:21    6000               2    1
 1    2    01/03/12      01:05:64     300               5    1
 1    3    01/05/12      12:12:35     456               5    2
 2    4    01/05/12      16:45:21     657               8    1

To define a session I need to use entry_time (and the date, if listening runs from 11pm into the next morning), so that if entry_time + duration + 5 minutes < entry_time of the next channel, the session changes. This has been killing me, and simple arrays won't do the trick, or at least my attempt using arrays has not worked. Thanks in advance.

The following code works well, but does not start the session over when the id changes:

data sirius1;  /*creates sessions*/
set sirius;
by account_number entry_date_est entry_time_est; /* put in to check data is sorted correctly */
retain session 1; /* initialise session with value 1 */
session+(dif(dhms(entry_date_est,0,0,entry_time_est))-lag(duration_seconds)>300); /*  increment session by 1 if time difference > 5 minutes */
run;

Solution

if first.account_number then session=1; *(or first.id or whatever...);

You just need to reinitialize session at each new ID. (You may need to reinitialize more often than that, for example at every channel.) I'm not sure how your example data corresponds to your code, so you may need to modify your BY statement to match. For your example data, you need

by id channel;

at minimum - so you can say

if first.channel then session=1;

as it looks like you need to reset for each channel.
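
For reference, here is a minimal sketch of how the reset might be combined with the data step from the question. It assumes the dataset and variable names used there (sirius, account_number, entry_date_est, entry_time_est, duration_seconds) and that the data is already sorted by those BY variables; the helper variable gap is hypothetical and only there for readability. The gap is computed on every row so that dif() and lag() always see the previous observation, and the reset takes precedence over the increment so the first row of a new account is never bumped to 2 by the gap to the previous account's last row.

```sas
data sirius1;
  set sirius;
  by account_number entry_date_est entry_time_est;
  retain session;

  /* Seconds between the end of the previous entry and the start of this one. */
  /* Computed unconditionally so the dif()/lag() queues advance on every row. */
  gap = dif(dhms(entry_date_est, 0, 0, entry_time_est)) - lag(duration_seconds);

  /* New account: start again at 1, ignoring the gap to the previous account. */
  if first.account_number then session = 1;
  /* Same account: open a new session if more than 5 minutes have passed.    */
  else if gap > 300 then session + 1;

  drop gap; /* hypothetical helper, not needed in the output */
run;
```

To reset per channel instead, the same pattern should apply with the BY statement and the first. variable changed accordingly (for example first.channel with by id channel).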

Licensed under: CC-BY-SA with attribution