How to label ties when creating a variable capturing the most frequent occurrence of a group?

StackOverflow https://stackoverflow.com/questions/20898512

  •  23-09-2022

Question

In the following example, how do I ask R to identify a tie as "tie" when I want to determine the most frequent value within a group?

I am following on from a previous question, which used which.max or which.is.max and a custom function (Create a variable capturing the most frequent occurence by group), but I want ties to be reported as ties. Any ideas?

df1 <-data.frame(
id=c(rep(1,3),rep(2,3)),
v1=as.character(c("a","b","b",rep("c",3)))
)

I want to create a third variable freq that contains the most frequent observation in v1 by id, but also identifies ties as "tie".

From previous answers, this code works to create the freq variable, but just doesn't deal with the ties:

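myFun <- function(x){
  tbl <- table(x$v1)
  # which.max() returns only the first maximum, so ties go unnoticed
  x$freq <- rep(names(tbl)[which.max(tbl)],nrow(x))
  x
}

library(plyr)  # provides ddply() and .()
ddply(df1,.(id),.fun=myFun)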

Solution

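You could slightly modify your function to test whether the maximum count occurs more than once. The expression sum(tbl == max(tbl)) counts how many values share the maximum count. If that number is 1, assign the most frequent value as before; otherwise assign "tie".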

df1 <-data.frame(
  id=rep(1:2, each=4),
  v1=rep(letters[1:4], c(2,2,3,1))
)

myFun <- function(x){
  tbl <- table(x$v1)
  nmax <- sum(tbl == max(tbl))
  if (nmax == 1)
    x$freq <- rep(names(tbl)[which.max(tbl)],nrow(x))
  else
    x$freq <- "tie"
  x
}

library(plyr)  # provides ddply() and .()
ddply(df1,.(id),.fun=myFun)

  id v1 freq
1  1  a  tie
2  1  a  tie
3  1  b  tie
4  1  b  tie
5  2  c    c
6  2  c    c
7  2  c    c
8  2  d    c
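
For readers who would rather not depend on plyr, the same tie test can be sketched in base R with ave(). This is only an illustration under stated assumptions: the helper name mode_or_tie is hypothetical (it does not appear in the original answer), and v1 is assumed to be a character column rather than a factor, which is why stringsAsFactors = FALSE is set explicitly.

```r
# Hypothetical helper (name is an assumption, not from the original answer):
# returns the most frequent value of a vector, or "tie" when two or more
# values share the highest count.
mode_or_tie <- function(v) {
  tbl <- table(v)
  top <- names(tbl)[tbl == max(tbl)]  # every value that reaches the maximum
  if (length(top) == 1) top else "tie"
}

# Same example data as in the answer; v1 is kept as character so that
# "tie" can be assigned without factor-level problems.
df1 <- data.frame(
  id = rep(1:2, each = 4),
  v1 = rep(letters[1:4], c(2, 2, 3, 1)),
  stringsAsFactors = FALSE
)

# ave() applies the helper within each id group and recycles the single
# result across all rows of that group.
df1$freq <- ave(df1$v1, df1$id, FUN = mode_or_tie)
df1
```

The helper works on a plain vector, so it can be checked on its own (for example mode_or_tie(c("a", "b", "b"))) before it is applied per group.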
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow