Question

I have the following data frame, call it df, which consists of three vectors: "Name," "Age," and "ZipCode."

df=      
  Name Age ZipCode
1  Joe  16   60559
2  Jim  20   60637
3  Bob  64   94127
4  Joe  23   94122
5  Bob  45   25462

I want to delete the entire row of df if the Name in it appears fewer than 2 times in the data frame as a whole (and flexibly 3, 4, or x times). Basically keep Bob and Joe in the data frame, but delete Jim. How can I do this?

I tried to turn it into a table:

> table(df$Name)

Bob Jim Joe 
  2   1   2 

But I don't know where to go from there.


Solution

You can use ave like this:

df[as.numeric(ave(df$Name, df$Name, FUN=length)) >= 2, ]
#   Name Age ZipCode
# 1  Joe  16   60559
# 3  Bob  64   94127
# 4  Joe  23   94122
# 5  Bob  45   25462

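This answer assumes that df$Name is a character vector, not a factor. ave returns a vector of the same type as its first argument, which is why the counts are wrapped in as.numeric before the comparison.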
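If Name might be stored as a factor, one possible workaround is to count over row positions rather than over the names themselves, so the counts come back as integers regardless of the column type. This is a minimal sketch using the data from the question; the variable names n_per_name and kept are only illustrative.

```r
# Example data from the question; Name is deliberately stored as a factor here
df <- data.frame(
  Name = factor(c("Joe", "Jim", "Bob", "Joe", "Bob")),
  Age = c(16, 20, 64, 23, 45),
  ZipCode = c(60559, 60637, 94127, 94122, 25462)
)

# Group the row positions by Name and replace each position with its group size
n_per_name <- ave(seq_along(df$Name), df$Name, FUN = length)

# Keep only rows whose Name occurs at least twice
kept <- df[n_per_name >= 2, ]
```

Because the first argument to ave is an integer vector, no as.numeric conversion is needed.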


You can also continue with table as follows:

x <- table(df$Name)
df[df$Name %in% names(x[x >= 2]), ]
#   Name Age ZipCode
# 1  Joe  16   60559
# 3  Bob  64   94127
# 4  Joe  23   94122
# 5  Bob  45   25462
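
Since the question asks for a flexible threshold (2, 3, 4, or x), the table approach could be wrapped in a small function. The sketch below is one way to do that; keep_frequent and its arguments data, col, and min_n are hypothetical names, not part of base R.

```r
# Example data from the question
df <- data.frame(
  Name = c("Joe", "Jim", "Bob", "Joe", "Bob"),
  Age = c(16, 20, 64, 23, 45),
  ZipCode = c(60559, 60637, 94127, 94122, 25462),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose value in `col` occurs at least `min_n` times
keep_frequent <- function(data, col, min_n = 2) {
  counts <- table(data[[col]])                    # occurrences of each distinct value
  frequent <- names(counts)[counts >= min_n]      # values that meet the threshold
  data[data[[col]] %in% frequent, , drop = FALSE]
}

at_least_2 <- keep_frequent(df, "Name", 2)  # drops Jim, keeps Joe and Bob
at_least_3 <- keep_frequent(df, "Name", 3)  # no name occurs three times here
```

With this data, raising min_n to 3 leaves a data frame with zero rows, since no name appears more than twice.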
Licensed under: CC-BY-SA with attribution