Question

I am trying to add new rows for 3 new factor levels to an existing data frame; please refer to the sample data below for an example. My starting data frame has 18 levels for col1, all 12 months in column mon, and the past 20 years in column year. I then impute values and add new columns, but I need the new factor levels to be added for further analysis.

Each new level should have one row for every mon and year combination.

Sample df:

# col1 and col2 are shorter than mon and year; cbind() recycles them to 16 rows
col1 <- c(rep("a",4),rep("b",4))
col2 <- c(1:4)
mon <- c(rep(c("Jan","Feb", "Mar","Apr"), 4))
year <- c(rep("2016",8), rep("2015",8))
df <- as.data.frame(cbind(col1,col2,mon,year))

head(df,8) # edited to make it readable

  col1 col2   mon year  
1 a    1      Jan 2016  
2 a    2      Feb 2016  
3 a    3      Mar 2016  
4 a    4      Apr 2016  
5 b    1      Jan 2016  
6 b    2      Feb 2016  
7 b    3      Mar 2016  
8 b    4      Apr 2016  

Expected Output

   col1 col2   mon year  
1  a    1      Jan 2016  
2  a    2      Feb 2016  
3  a    3      Mar 2016  
4  a    4      Apr 2016  
5  b    1      Jan 2016  
6  b    2      Feb 2016  
7  b    3      Mar 2016  
8  b    4      Apr 2016
9  c    NA     Jan 2016  # New level c for each mon and year
10 c    NA     Feb 2016  # New level c for each mon and year
11 c    NA     Mar 2016  # New level c for each mon and year
12 c    NA     Apr 2016  # New level c for each mon and year

How do I go about reaching the expected df?

Solution

There are several possibilities. For example, to add c for the mon-year combinations that already exist in your data frame (columns 3 and 4):

rbind(df, transform(df[!duplicated(df[, 3:4]), ], col1="c", col2=NA))
#     col1 col2 mon year
# 1      a    1 Jan 2016
# 2      a    2 Feb 2016
# 3      a    3 Mar 2016
# 4      a    4 Apr 2016
# 5      b    1 Jan 2016
# 6      b    2 Feb 2016
# 7      b    3 Mar 2016
# 8      b    4 Apr 2016
# 9      a    1 Jan 2015
# 10     a    2 Feb 2015
# 11     a    3 Mar 2015
# 12     a    4 Apr 2015
# 13     b    1 Jan 2015
# 14     b    2 Feb 2015
# 15     b    3 Mar 2015
# 16     b    4 Apr 2015
# 17     c <NA> Jan 2016
# 21     c <NA> Feb 2016
# 31     c <NA> Mar 2016
# 41     c <NA> Apr 2016
# 91     c <NA> Jan 2015
# 101    c <NA> Feb 2015
# 111    c <NA> Mar 2015
# 121    c <NA> Apr 2015
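
The one-liner above can be unpacked into separate steps. The following is a minimal sketch, assuming the sample data is rebuilt with character columns (stringsAsFactors = FALSE) and col2 kept as an integer; the names combos, new_rows and result are illustrative and not part of the original answer. It selects mon and year by name rather than by column position.

```r
# Rebuild the sample data (assumption: character columns, integer col2)
df <- data.frame(
  col1 = rep(rep(c("a", "b"), each = 4), times = 2),
  col2 = rep(1:4, times = 4),
  mon  = rep(c("Jan", "Feb", "Mar", "Apr"), times = 4),
  year = rep(c("2016", "2015"), each = 8),
  stringsAsFactors = FALSE
)

# Step 1: keep one row per distinct mon-year combination present in df
combos <- unique(df[, c("mon", "year")])

# Step 2: build the rows for the new level, with col2 left missing
new_rows <- data.frame(col1 = "c", col2 = NA, combos, stringsAsFactors = FALSE)

# Step 3: append the new rows and reset the row names
result <- rbind(df, new_rows)
rownames(result) <- NULL
```

Resetting the row names avoids the irregular labels (17, 21, 31, ...) seen in the printed output above, which come from rbind() making duplicated row names unique.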

To add c for all possible combinations of existing mon values and existing year values:

rbind(df, data.frame(col1="c", col2=NA, expand.grid(mon=levels(df$mon), year=levels(df$year))))

To add c for all possible combinations of all month names (the built-in constant month.abb) and existing year values:

rbind(df, data.frame(col1="c", col2=NA, expand.grid(mon=month.abb, year=levels(df$year))))
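
Note that levels() returns NULL for character vectors, and since R 4.0.0 data frames no longer convert strings to factors by default, so the two expand.grid() calls above may not work as written on a recent R version. A hedged alternative, assuming character columns, is to use unique() instead of levels(); the names all_existing, all_months, res_existing and res_months are illustrative only.

```r
# Rebuild the sample data (assumption: character columns, integer col2)
df <- data.frame(
  col1 = rep(rep(c("a", "b"), each = 4), times = 2),
  col2 = rep(1:4, times = 4),
  mon  = rep(c("Jan", "Feb", "Mar", "Apr"), times = 4),
  year = rep(c("2016", "2015"), each = 8),
  stringsAsFactors = FALSE
)

# All combinations of the mon and year values that occur in df
all_existing <- expand.grid(mon = unique(df$mon), year = unique(df$year),
                            stringsAsFactors = FALSE)
res_existing <- rbind(df, data.frame(col1 = "c", col2 = NA, all_existing,
                                     stringsAsFactors = FALSE))

# All combinations of the twelve month abbreviations and the years in df
all_months <- expand.grid(mon = month.abb, year = unique(df$year),
                          stringsAsFactors = FALSE)
res_months <- rbind(df, data.frame(col1 = "c", col2 = NA, all_months,
                                   stringsAsFactors = FALSE))
```

If the columns are factors instead, unique() still returns the values that are present, whereas levels() would also include unused levels.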

and so on.
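
The question mentions three new levels rather than one. A possible extension, sketched under the same assumptions (character columns; the level names c, d and e are hypothetical), crosses the new levels with the existing mon-year combinations using merge() with by = NULL, which produces a Cartesian product of the two data frames.

```r
# Rebuild the sample data (assumption: character columns, integer col2)
df <- data.frame(
  col1 = rep(rep(c("a", "b"), each = 4), times = 2),
  col2 = rep(1:4, times = 4),
  mon  = rep(c("Jan", "Feb", "Mar", "Apr"), times = 4),
  year = rep(c("2016", "2015"), each = 8),
  stringsAsFactors = FALSE
)

# Hypothetical names for the three new levels
new_levels <- c("c", "d", "e")

# Distinct mon-year combinations already present in df
combos <- unique(df[, c("mon", "year")])

# Cross join: every new level paired with every mon-year combination
new_rows <- merge(data.frame(col1 = new_levels, stringsAsFactors = FALSE),
                  combos, by = NULL)
new_rows$col2 <- NA

# Put the columns in the same order as df before appending
result <- rbind(df, new_rows[, names(df)])
rownames(result) <- NULL
```

To cover every possible combination instead of only the existing ones, combos could be replaced by an expand.grid() result over the desired month and year values.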

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange