Call your data frame x:
x <- read.table(header=TRUE, text='CompanyID ProjectID Year
A 1 2010
B 3 2011
C 1 2010
D 5 2012
E 1 2010')
Choose the rows whose ProjectID occurs more than once:
(mx <- x[ave(seq(nrow(x)), x$ProjectID, FUN=length) > 1,])
## CompanyID ProjectID Year
## 1 A 1 2010
## 3 C 1 2010
## 5 E 1 2010
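To see why that filter works, it may help to look at what ave() returns on its own. This is only an illustrative sketch: it rebuilds the same example data so it runs by itself, and the name counts is introduced here and is not part of the answer above.

```r
# Same example data as above, rebuilt so this snippet is self-contained.
x <- read.table(header=TRUE, text='CompanyID ProjectID Year
A 1 2010
B 3 2011
C 1 2010
D 5 2012
E 1 2010')

# For every row, ave() returns the size of that row's ProjectID group.
counts <- ave(seq_len(nrow(x)), x$ProjectID, FUN=length)
counts
## [1] 3 1 3 1 3

# Rows with a count above 1 share their ProjectID with at least one other row.
x[counts > 1, ]
```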
Now for the magic:
do.call(rbind,
        by(mx, mx$ProjectID,
           FUN=function(d)
             # all unordered pairs of companies within one project, one pair per row;
             # as.character() works whether CompanyID is a factor or a character column
             t(combn(as.character(d$CompanyID), 2))
        )
)
## [,1] [,2]
## [1,] "A" "C"
## [2,] "A" "E"
## [3,] "C" "E"
With your example data, you get the same pairs without wrapping the call in do.call(rbind, ...),
but the wrapper is needed when more than one ProjectID is shared, because by() then returns one matrix per ProjectID.
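As a rough illustration of the multiple-ProjectID case, here is a sketch on made-up data in which projects 1 and 3 are each shared. The data frame y and the names my, pairs_by_project and pairs are hypothetical and not taken from the question.

```r
# Hypothetical data: project 1 is shared by A, C, E and project 3 by B, D.
y <- read.table(header=TRUE, text='CompanyID ProjectID Year
A 1 2010
B 3 2011
C 1 2010
D 3 2011
E 1 2010')

# Keep only rows whose ProjectID occurs more than once (here, all of them).
my <- y[ave(seq_len(nrow(y)), y$ProjectID, FUN=length) > 1, ]

# by() produces one matrix of company pairs per project.
pairs_by_project <- by(my, my$ProjectID,
                       FUN=function(d) t(combn(as.character(d$CompanyID), 2)))

# do.call(rbind, ...) stacks those per-project matrices into one matrix.
pairs <- do.call(rbind, pairs_by_project)
pairs
##      [,1] [,2]
## [1,] "A"  "C"
## [2,] "A"  "E"
## [3,] "C"  "E"
## [4,] "B"  "D"
```

Without the do.call(rbind, ...) step, pairs_by_project stays a list-like by object holding a 3-row matrix for project 1 and a 1-row matrix for project 3.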