Question

Does anyone have a pointer to a resource or, even better, a tip to provide on how to efficiently generate a very large matrix representing a connected graph.

Graph can be randomly created although I would ideally generate a graph of desired size and topology similar to what one can do with JGraphT.

My general intent is to create a very large graph representation (billions of nodes and edges) in parallel by first generating an adjacency matrix to ensure connectedness and then create a representation (RDF, etc) in parallel.

Any further suggestions or alternative approaches are welcome.

Was it helpful?

Solution

Yes there is an easy way to generate a random DAG. If a graph is a DAG it can be separated into levels or ranks. The idea is to rank the nodes and randomly generate edges from lower ranked nodes to higher ranked nodes. A little bit of Googling finds the exact same question asked on stackoverflow a few months ago (with sample code!).

My only concern is that an adjacency matrix representation for a graph with billions of nodes/edges won't fit in memory. However, it's possible I misunderstood your intent in that part of the question.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top