Question

Given a list of timestamped actions for different users, I want to find the average time at which each user plays a song.

user  time                 action
A     2013-03-25T14:12:24Z PLAY
B     2013-03-28T14:54:30Z LIKE
C     2013-04-18T18:51:10Z LIKE
D     2013-05-07T18:06:24Z PLAY
B     2013-04-23T12:18:41Z PLAY
D     2013-04-29T12:00:16Z PLAY
A     2013-03-27T12:09:37Z PLAY
A     2013-04-16T18:31:44Z PLAY

I only want to include times where action equals PLAY.

Thanks in advance

No correct solution

OTHER TIPS

The following code returns the average hour of the day at which each user plays a song. It averages only the hour component of each timestamp, so minutes and seconds are ignored:

DF <- 
read.csv(text=
"user,time,action
A,2013-03-25T14:12:24Z,PLAY
B,2013-03-28T14:54:30Z,LIKE
C,2013-04-18T18:51:10Z,LIKE
D,2013-05-07T18:06:24Z,PLAY
B,2013-04-23T12:18:41Z,PLAY
D,2013-04-29T12:00:16Z,PLAY
A,2013-03-27T12:09:37Z,PLAY
A,2013-04-16T18:31:44Z,PLAY",stringsAsFactors=FALSE)

# filter by PLAY
plays <- DF[DF$action == "PLAY",]

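# compute the mean hour of day for each user
byRes <- 
by(plays, plays$user,
   FUN=function(grp){
        # the format stops before the trailing "Z"; the times are read as GMT
        dates <- as.POSIXlt(grp$time,format="%Y-%m-%dT%H:%M:%S", tz = "GMT")
        # one row per user: the user name and the mean of the hour field
        data.frame(user=grp$user[1],AvgHour=mean(dates$hour))
   })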

# put the "by" result into a data.frame
res <- do.call(rbind,byRes)


# result:
> res
  user  AvgHour
A    A 14.66667
B    B 12.00000
D    D 15.00000
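
The result above averages whole hours only. If minutes and seconds should count as well, one possible variant is sketched below. It is not part of the original answer: it converts each timestamp to fractional hours and groups with aggregate(), and the names stamps, hour_frac and res2 are only illustrative.

```r
# Sample data copied from the question.
DF <- read.csv(text =
"user,time,action
A,2013-03-25T14:12:24Z,PLAY
B,2013-03-28T14:54:30Z,LIKE
C,2013-04-18T18:51:10Z,LIKE
D,2013-05-07T18:06:24Z,PLAY
B,2013-04-23T12:18:41Z,PLAY
D,2013-04-29T12:00:16Z,PLAY
A,2013-03-27T12:09:37Z,PLAY
A,2013-04-16T18:31:44Z,PLAY", stringsAsFactors = FALSE)

# Keep only the PLAY rows.
plays <- DF[DF$action == "PLAY", ]

# Parse the timestamps as GMT, same format string as in the answer above.
stamps <- as.POSIXlt(plays$time, format = "%Y-%m-%dT%H:%M:%S", tz = "GMT")

# Time of day as fractional hours
# (assumption: minutes and seconds should contribute to the average).
plays$hour_frac <- stamps$hour + stamps$min / 60 + stamps$sec / 3600

# Mean fractional hour per user, one row per user.
res2 <- aggregate(hour_frac ~ user, data = plays, FUN = mean)
names(res2)[2] <- "AvgHour"
print(res2)
```

Like the code above, this takes a plain arithmetic mean of clock times, so plays shortly before and shortly after midnight would average out to around noon. If a user's plays straddle midnight, a different kind of average would be needed.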
Licensed under: CC-BY-SA with attribution