Question

I have a data.frame with task assignments from a ticket tracking system.

Assignments <- data.frame('Task'=c(1, 1, 2, 3, 2, 2, 1), 'Assignee'=c('Alice', 'Bob', 'Alice', 'Alice', 'Bob', 'Chuck', 'Alice'))

I need to summarize the data for some monthly reports. Here is what I have so far:

ddply(Assignments, 'Task', 
      summarize, 
      Assignee.Count=length(Assignee), 
      Unique.Assignees.Involved=length(unique(Assignee)),
      Assignees.Involved=paste(Assignee, sep=", ", collapse=", "))

And that nets me:

  Task Assignee.Count Unique.Assignees.Involved Assignees.Involved
1    1              3                         2  Alice, Bob, Alice
2    2              3                         3  Alice, Bob, Chuck
3    3              1                         1              Alice

In the Assignees.Involved column, I'd like to summarize the data further. In row 1, for example, I'd like it to say "Alice 2, Bob 1". It feels like I need some other plyr function to take the assignees for each task, sort them, run them through the rle function, and paste the values and lengths back together, but I can't figure out how to do that within the summarize function.

Here is the same approach applied to the entire data.frame, ignoring tasks:

paste(rle(as.vector(sort(Assignments$Assignee)))$values,
      rle(as.vector(sort(Assignments$Assignee)))$lengths,
      sep=" ", collapse=", ")

Results:

[1] "Alice 4, Bob 2, Chuck 1"
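To make the sort, rle, paste idea concrete, here is a minimal sketch of the same steps applied to the assignees of Task 1 only. The names task1, runs, and task1_summary are made up for illustration; the values come from the example data above.

```r
# Assignees for Task 1 only, taken from the example data above
task1 <- c("Alice", "Bob", "Alice")

# rle() only merges adjacent equal values, so the vector is sorted first
runs <- rle(sort(task1))
runs$values   # the distinct names, in sorted order
runs$lengths  # how many times each name occurs

# Pair each name with its count, then join the pairs into one string
task1_summary <- paste(runs$values, runs$lengths, sep = " ", collapse = ", ")
task1_summary
```

This is the per-task string that the summarize call would need to produce for each group.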

Solution

I figured this out while posting the question :)

The trick is that inside the expressions passed as arguments to summarize, you refer to the data frame's columns by bare name: Assignments$Assignee is written as just Assignee, with no data frame prefix and no quotes.

So once I had figured out that the rle function could get me where I needed to be, I had what I needed.

library(plyr)  # provides ddply and summarize

ddply(Assignments, 'Task', 
      summarize, 
      Assignee.Count=length(Assignee), 
      Unique.Assignees.Involved=length(unique(Assignee)), 
      # sort first so rle sees each assignee's entries as one run
      Assignments=paste(rle(as.vector(sort(Assignee)))$values, 
                        rle(as.vector(sort(Assignee)))$lengths, 
                        sep=" ", collapse=", "))

Gives:

  Task Assignee.Count Unique.Assignees.Involved             Assignments
1    1              3                         2          Alice 2, Bob 1
2    2              3                         3 Alice 1, Bob 1, Chuck 1
3    3              1                         1                 Alice 1
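As a possible alternative to calling rle twice, the counting can be wrapped in a small helper built on table(). This is only a sketch: count_label is a hypothetical helper name, not part of plyr or base R, and the grouping here uses base R's tapply so the example runs without plyr.

```r
# Example data from the question
Assignments <- data.frame(
  Task = c(1, 1, 2, 3, 2, 2, 1),
  Assignee = c("Alice", "Bob", "Alice", "Alice", "Bob", "Chuck", "Alice")
)

# count_label is a hypothetical helper name used only for this illustration
count_label <- function(x) {
  # as.character() avoids zero counts for unused factor levels
  counts <- table(as.character(x))
  # table() orders its names, so no explicit sort is needed
  paste(names(counts), as.vector(counts), sep = " ", collapse = ", ")
}

# Base R grouping: apply the helper to the assignees of each task
per_task <- tapply(Assignments$Assignee, Assignments$Task, count_label)
per_task
```

Inside the ddply call, the same helper should be usable in place of the paste(rle(...)) expression, written as Assignments=count_label(Assignee), provided the helper is defined at the top level of the session.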
Licensed under: CC-BY-SA with attribution