Quickly generate the cartesian product of a matrix

https://stackoverflow.com/questions/1428174

07-07-2019
|

Question

Let's say I have a matrix x which contains 10 rows and 2 columns. I want to generate a new matrix M that contains each unique pair of rows from x - that is, a new matrix with 55 rows and 4 columns.

E.g.,

x <- matrix (nrow=10, ncol=2, 1:20)

M <- data.frame(matrix(ncol=4, nrow=55))
k <- 1
for (i in 1:nrow(x))
for (j in i:nrow(x))
{
    M[k,] <- unlist(cbind (x[i,], x[j,]))
    k <- k + 1
}

So, x is:

      [,1] [,2]
 [1,]    1   11
 [2,]    2   12
 [3,]    3   13
 [4,]    4   14
 [5,]    5   15
 [6,]    6   16
 [7,]    7   17
 [8,]    8   18
 [9,]    9   19
[10,]   10   20

And then M has 4 columns, the first two are one row from x and the next 2 are another row from x:

> head(M,10)
   X1 X2 X3 X4
1   1 11  1 11
2   1 11  2 12
3   1 11  3 13
4   1 11  4 14
5   1 11  5 15
6   1 11  6 16
7   1 11  7 17
8   1 11  8 18
9   1 11  9 19
10  1 11 10 20

Is there either a faster or simpler (or both) way of doing this in R?

Solution

The expand.grid() function useful for this:

R> GG <- expand.grid(1:10,1:10)
R> GG <- GG[GG[,1]>=GG[,2],]     # trim it to your 55 pairs
R> dim(GG)
[1] 55  2
R> head(GG)
  Var1 Var2
1    1    1
2    2    1
3    3    1
4    4    1
5    5    1
6    6    1
R>

Now you have the 'n*(n+1)/2' subsets and you can simple index your original matrix.

OTHER TIPS

I'm not quite grokking what you are doing so I'll just throw out something that may, or may not help.

Here's what I think of as the Cartesian product of the two columns:

expand.grid(x[,1],x[,2])

You can also try the "relations" package. Here is the vignette. It should work like this:

relation_table(x %><% x)

Using Dirk's answer:

idx <- expand.grid(1:nrow(x), 1:nrow(x))
idx<-idx[idx[,1] >= idx[,2],]
N <- cbind(x[idx[,2],], x[idx[,1],])

> all(M == N)
[1] TRUE

Thanks everyone!

Inspired from the other answers, here is a function implementing cartesian product of two matrices, in the case of two matrices, the full cartesian product, for only one argument, omitting one of each pair:

cartesian_prod <- function(M1, M2) {
if(missing(M2)) {  M2 <- M1
     ind  <- expand.grid(1:NROW(M1), 1:NROW(M2))
     ind <- ind[ind[,1] >= ind[,2],] } else {
                                          ind  <- expand.grid(1:NROW(M1), 1:NROW(M2))}
rbind(cbind(M1[ind[,1],], M2[ind[,2],]))

}

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow