How to add a new column in an R data frame with count based on factor column?
10-09-2020
During data analysis we often work with factor data and want the frequency of each combination of a factor level and another variable. These counts make it easier to compare values within and between factor levels. To get them, add a count column by grouping the data frame with group_by and then calling n() inside mutate; both functions come from the dplyr package. Each row then carries the number of rows that share its combination of values.
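To make the idea concrete before the larger example, here is a minimal sketch on a tiny hand-made data frame. The name toy and its values are hypothetical and chosen only so the counts can be checked by eye; the sketch assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data: three rows for group A, two for group B
toy <- data.frame(
  Group  = factor(c("A", "A", "B", "A", "B")),
  Rating = c(1L, 1L, 2L, 3L, 2L)
)

# n() returns the size of the current group, so each row receives the
# number of rows that share its Group/Rating combination
toy_with_count <- toy %>%
  group_by(Group, Rating) %>%
  mutate(count = n())

print(toy_with_count)
```

The combination A/1 occurs twice, B/2 occurs twice and A/3 occurs once, so the count column holds 2, 2, 2, 1, 2 in row order. mutate keeps every original row, unlike summarise, which would collapse each combination to a single row.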
Example
Consider the following data frame:
> Group<-rep(c("A","B","C","D","E"),times=10)
> Rating<-sample(1:10,50,replace=TRUE)
> # stringsAsFactors=TRUE keeps Group a factor on R 4.0 and later
> df<-data.frame(Group,Rating,stringsAsFactors=TRUE)
> head(df,20)
Output
   Group Rating
1      A      1
2      B      6
3      C      2
4      D      4
5      E      9
6      A      3
7      B      5
8      C      7
9      D      1
10     E      9
11     A      9
12     B      8
13     C      9
14     D      2
15     E      6
16     A      2
17     B      2
18     C      2
19     D      2
20     E      2
> tail(df,20)
Output
   Group Rating
31     A      1
32     B      7
33     C     10
34     D      8
35     E      6
36     A      8
37     B      4
38     C      4
39     D     10
40     E      4
41     A      6
42     B      4
43     C      3
44     D      7
45     E      5
46     A      1
47     B      6
48     C      7
49     D      1
50     E      6
Load the dplyr package and compute the count:
> library(dplyr)
> df_with_count<-df%>%group_by(Group,Rating)%>%mutate(count=n())
> head(df_with_count,20)
Output
# A tibble: 20 x 3
# Groups:   Group, Rating [17]
   Group Rating count
   <fct>  <int> <int>
 1 A          1     4
 2 B          6     3
 3 C          2     3
 4 D          4     1
 5 E          9     2
 6 A          3     1
 7 B          5     1
 8 C          7     2
 9 D          1     3
10 E          9     2
11 A          9     1
12 B          8     1
13 C          9     1
14 D          2     3
15 E          6     3
16 A          2     1
17 B          2     1
18 C          2     3
19 D          2     3
20 E          2     1
> tail(df_with_count,20)
Output
# A tibble: 20 x 3
# Groups:   Group, Rating [17]
   Group Rating count
   <fct>  <int> <int>
 1 A          1     4
 2 B          7     1
 3 C         10     2
 4 D          8     1
 5 E          6     3
 6 A          8     1
 7 B          4     2
 8 C          4     1
 9 D         10     1
10 E          4     1
11 A          6     1
12 B          4     2
13 C          3     1
14 D          7     1
15 E          5     2
16 A          1     4
17 B          6     3
18 C          7     2
19 D          1     3
20 E          6     3
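The "# Groups:" line in the output shows that the result is still grouped by Group and Rating, so later dplyr verbs would keep operating per combination. The sketch below shows two ways to end up with an ungrouped result: calling ungroup() after mutate, or using dplyr's add_count(), which adds the same per-combination count in one step. The toy data frame is hypothetical and the sketch assumes a dplyr version that supports the name argument of add_count().

```r
library(dplyr)

# Hypothetical toy data, small enough to verify by eye
toy <- data.frame(
  Group  = factor(c("A", "A", "B", "A", "B")),
  Rating = c(1L, 1L, 2L, 3L, 2L)
)

# Option 1: the group_by/mutate pipeline, followed by ungroup()
ungrouped <- toy %>%
  group_by(Group, Rating) %>%
  mutate(count = n()) %>%
  ungroup()

# Option 2: add_count() counts rows per Group/Rating combination and
# returns the data with the grouping of its input (none here)
shortcut <- toy %>%
  add_count(Group, Rating, name = "count")

is_grouped_df(ungrouped)  # FALSE: the grouping has been dropped
```

Both options give the same count column. Which one to use is mostly a matter of taste: the explicit pipeline is easier to extend with further per-group columns, while add_count() is shorter when only the count is needed.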