Question

I have a bipartite graph stored in an adjacency matrix A (100*1900), 100 rows, 1900 columns. Simply, I denote 100 rows representing factorA, 1900 columns representing factorB. The graph tells the connection between 100 factorA and 1900 factorB, thus it is a bipartite graph.

Thus, the matrix is |factorA|*|factorB|, the dimension of the matrix is 100*1900.

I need to find the pairwise overlap between factorB. A way of doing so is a get A and transpose of A, denote as T(A).

Then get A' = T(A)*A, so A' will be 100*100 matrix, then the items A'[i,j] corresponds to the number of factorB shared by factorA i, and factorA j.

Why is the above algorithm working ? Any reference publication or mathematical proof could be given ?

Was it helpful?

Solution

Let A' = A * transpose(A).

A'[i,j] is the inner product of the ith row of A and the jth row of A. Suppose these two look like the following:

row(A,i) = [0, 0, 1, 0, 1, 1, 0, 1]
row(A,j) = [1, 0, 1, 1, 0, 1, 1, 0]

The element-wise product of these two is

row(A,i) .* row(A,j) = [0, 0, 1, 0, 0, 1, 0, 0]

The inner product of the two rows is the sum of these values, 2. This is the intuition for why A'[i,j] is the number of shared connections between row i and row j.

If you look at transpose(A) * A, you will similarly be able to find shared connections between columns.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top