Question

I wasn't sure what to title this.

I have a dataset of people, years, and activities:

df <- data.frame("id" = c("1", "1", "1", "2", "2","3"), "years" = rep(1971, 6),
                      "activity" = c("a","b","c","d","e","e"))
  id years activity
1  1  1971        a
2  1  1971        b
3  1  1971        c
4  2  1971        d
5  2  1971        e
6  3  1971        e

I want to combine the years and activity columns into a single column. First, though, for each year in the original years column, I want to generate the years within +/- 3 of it, while keeping the association with the id.

If I did this in 2 steps: For id "1" the original year is 1971, so +/-3 years for ID 1 would result in:

 id   all_years 
 1    1968
 1    1969
 1    1970
 1    1971
 1    1972
 1    1973
 1    1974

In step 2, I want to combine this all_years column with the activity column from the original df, keeping the ids. So id "1" has 3 activities (a, b, c) and 7 years (1968:1974), so id "1" would appear 10 times in the new combined column.

So ultimately, I would end up with something like this:

  id   year_and_activities 
  1    a
  1    b
  1    c
  1    1968
  1    1969
  1    1970
  1    1971
  1    1972
  1    1973
  1    1974
  2    d
  2    e
  2    1968
...
  2    1974
...
  3    e
...

As always, Thank you!


Solution

I couldn't really follow your question, but given the initial data frame, you can get your final data frame using melt:

require(reshape2)

##To get your +/- 3: repeat every row 7 times, once per offset in -3:3
dd = data.frame(id=rep(df$id, each=7), activity=rep(df$activity, each=7),
   years=rep(df$years, each=7) + rep(-3:3, times=nrow(df)))

##Pretty much gives you what you want
df_melt = melt(dd, id=1)

##Remove the unnecessary column
df_melt = df_melt[,c(1,3)]
##Rename 
colnames(df_melt) = c("id","year_and_activities")

##Drop the duplicated rows (each id should list a value only once)
df_melt = unique(df_melt)
##Order the column
df_melt[with(df_melt, order(id, year_and_activities)),]
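If you would rather avoid reshape2, here is a minimal base R sketch of the two steps from the question. It assumes each id should list every activity and every year only once, with activities first and years after, as in the expected output. The names offsets, base_years, years_long, acts_long and combined are just illustrative.

```r
df <- data.frame(id = c("1", "1", "1", "2", "2", "3"),
                 years = rep(1971, 6),
                 activity = c("a", "b", "c", "d", "e", "e"),
                 stringsAsFactors = FALSE)

offsets <- -3:3

# Step 1: expand each distinct id/year pair into year - 3 ... year + 3
base_years <- unique(df[, c("id", "years")])
years_long <- data.frame(
  id = rep(base_years$id, each = length(offsets)),
  year_and_activities = as.character(
    rep(base_years$years, each = length(offsets)) + offsets),
  stringsAsFactors = FALSE)

# Step 2: the activities, one row per distinct id/activity pair
acts_long <- unique(data.frame(id = df$id,
                               year_and_activities = df$activity,
                               stringsAsFactors = FALSE))

# Stack activities on top of years; unique() guards against repeated
# values, e.g. if an id had two base years with overlapping windows
combined <- unique(rbind(acts_long, years_long))

# order() leaves ties in their original order, so within each id the
# activities stay ahead of the years
combined <- combined[order(combined$id), ]
rownames(combined) <- NULL
combined
```

With the example data, id "1" ends up with 10 rows (3 activities plus 7 years), which matches the count in the question.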

As an aside, I would suggest that having a column as a mixture of "characters" and "years" is probably a bad idea - but you may have a good reason.
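To illustrate the aside, one possible alternative (only a sketch, and it may not suit your use case) is to keep the expanded years and the activities in two separate tables that share the id column. That way the years stay numeric. The names years_long and activities are made up for the example.

```r
df <- data.frame(id = c("1", "1", "1", "2", "2", "3"),
                 years = rep(1971, 6),
                 activity = c("a", "b", "c", "d", "e", "e"),
                 stringsAsFactors = FALSE)

offsets <- -3:3

# Expanded years per id, kept numeric; unique() drops the repeats that
# come from an id having several rows with the same year
years_long <- unique(data.frame(
  id = rep(df$id, each = length(offsets)),
  year = rep(df$years, each = length(offsets)) + offsets,
  stringsAsFactors = FALSE))

# Activities per id, kept as character
activities <- unique(df[, c("id", "activity")])

# Numeric comparisons on the years still work
subset(years_long, id == "1" & year >= 1972)
```

The two tables can still be joined on id with merge() whenever you need them side by side.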

Licensed under: CC-BY-SA with attribution